
Lec11) Convolutional Networks for NLP

Tony Lim 2021. 6. 5. 13:29

We can apply filters and let each one represent a feature we care about; for example, the first filter might learn to detect something like "politeness," and so on.
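As a toy PyTorch sketch of a single filter acting as a feature detector (the 4-dim embeddings, 7-word sentence, and the "politeness" interpretation are illustrative assumptions):

```python
import torch
import torch.nn as nn

# A sentence as a sequence of 7 word vectors: (batch=1, embed_dim=4, seq_len=7).
sentence = torch.randn(1, 4, 7)

# One filter of width 3: it slides over every 3-word window and emits a score,
# so after training it could act as a detector for some feature ("politeness").
politeness_filter = nn.Conv1d(in_channels=4, out_channels=1, kernel_size=3)

scores = politeness_filter(sentence)
print(scores.shape)  # torch.Size([1, 1, 5]): one score per 3-word window
```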

  

We have 2 channels (filters) for each kernel size (2, 3, 4), so on the 7-word example sentence the feature maps have lengths (6, 5, 4). We take a 1-max pool from each channel, concatenate the pooled values into a single 6-dimensional vector, and feed that into a softmax to classify the sentence as positive or negative.
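A compact sketch of that whole pipeline in the style of Kim (2014); the vocabulary size, embedding dimension, and module names here are my own assumptions, not the lecture's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128,
                 kernel_sizes=(2, 3, 4), channels=2, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One Conv1d per kernel size, each with `channels` output filters.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, channels, k) for k in kernel_sizes]
        )
        # After 1-max pooling, each conv contributes `channels` values.
        self.fc = nn.Linear(channels * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # For kernel size k the feature map has length seq_len - k + 1
        # (6, 5, 4 for a 7-word sentence); 1-max pooling keeps one value each.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled, dim=1)        # (batch, channels * num_kernels)
        return self.fc(features)                   # softmax is applied in the loss

model = SentenceCNN()
logits = model(torch.randint(0, 10000, (1, 7)))   # one 7-word sentence
print(logits.shape)                               # torch.Size([1, 2])
```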

 

Regularization 

Dropout: create a masking vector r of Bernoulli random variables, each with probability p of being 1, and multiply it elementwise with the activations z during training.

This prevents overfitting. At test time we don't compute r * z; instead we scale the weights W by p, so the expected activations match what the network saw during training.
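A minimal sketch of that train/test asymmetry using raw tensors (p here is the keep probability, assumed to be 0.5; in practice nn.Dropout is used, which rescales during training instead):

```python
import torch

p = 0.5                    # probability of a unit being kept (r_i = 1)
z = torch.randn(8)         # some layer's activations
W = torch.randn(4, 8)      # the next layer's weights

# Training: mask activations with a Bernoulli(p) vector r.
r = torch.bernoulli(torch.full_like(z, p))
train_out = W @ (r * z)

# Test: no masking; scale W by p so the expected input matches training.
test_out = (p * W) @ z
```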

 

Shortcut connection

The first one's semantic meaning: since the input x is passed straight through and added back (F(x) + x), the block only has to learn the deviation from doing nothing.

The second one's semantic meaning is a bit more complex: a highway connection adds a learned gate T(x), computing T(x) * F(x) + (1 - T(x)) * x, so the network decides how much transformed signal versus plain x to let through.
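A minimal PyTorch rendering of the two blocks (my own sketch; the channel count and padded width-3 convolutions are assumptions that keep the sequence length unchanged):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """F(x) + x: the conv only has to learn the deviation from identity."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x)) + x

class HighwayBlock(nn.Module):
    """T(x) * F(x) + (1 - T(x)) * x: a learned gate T mixes the two paths."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.gate = nn.Conv1d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        t = torch.sigmoid(self.gate(x))   # gate values in (0, 1)
        return t * torch.relu(self.conv(x)) + (1 - t) * x

x = torch.randn(1, 16, 7)                # (batch, channels, seq_len)
print(ResidualBlock(16)(x).shape, HighwayBlock(16)(x).shape)
```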

 

Batch Normalization

It takes the conv block's output and applies a Z-transform: subtract the batch mean and divide by the batch standard deviation. This tends to keep activations on the same scale from layer to layer, making training less sensitive to initialization and learning rate.
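A minimal sketch of a conv block followed by nn.BatchNorm1d, which performs this per-channel Z-transform over the batch (the shapes are assumptions):

```python
import torch
import torch.nn as nn

block = nn.Sequential(
    nn.Conv1d(in_channels=16, out_channels=16, kernel_size=3, padding=1),
    nn.BatchNorm1d(16),   # per-channel Z-transform over the batch
    nn.ReLU(),
)

out = block(torch.randn(32, 16, 7))   # (batch, channels, seq_len)
# After BatchNorm each channel is ~zero mean, unit variance across the batch.
print(out.shape)                      # torch.Size([32, 16, 7])
```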

 

1×1 Convolutions

See also: pytorchZeroToAll) CNN, Advanced CNN (inception) (tonylim.tistory.com)

 


A 1×1 convolution mixes information across channels at each position without looking at neighbors, so we can use it to reduce dimensionality (the number of channels) cheaply, as in Inception's bottleneck layers.
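A minimal sketch (the 64-to-16 channel counts are made up for illustration):

```python
import torch
import torch.nn as nn

# A 1x1 convolution recombines channels at each position; seq_len is untouched.
reduce = nn.Conv1d(in_channels=64, out_channels=16, kernel_size=1)

x = torch.randn(1, 64, 7)   # (batch, channels, seq_len)
print(reduce(x).shape)      # torch.Size([1, 16, 7]): 64 channels -> 16
```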

