We can apply filters and let each one represent what we want; for example, the first filter might learn to detect a "polite thing", and so on.
We have 2 channels for each kernel size (2, 3, 4), so for an input of length 7 the output feature maps have lengths (6, 5, 4). We take 1 max-pooled value from each channel, concatenate everything, and put the result into a softmax to tell positive from negative.
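A minimal PyTorch sketch of this setup (the vocabulary size, embedding dimension, and two output classes are illustrative assumptions, not lecture values):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceCNN(nn.Module):
    """Sentence classifier: one conv per kernel size (2, 3, 4) with 2
    channels each, max-over-time pooling, concatenation, then softmax."""
    def __init__(self, vocab_size=10000, embed_dim=128,
                 kernel_sizes=(2, 3, 4), channels=2, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, channels, k) for k in kernel_sizes])
        self.fc = nn.Linear(channels * len(kernel_sizes), num_classes)

    def forward(self, x):                       # x: (batch, seq_len)
        e = self.embed(x).transpose(1, 2)       # (batch, embed_dim, seq_len)
        # each conv yields (batch, channels, seq_len - k + 1);
        # max-over-time pooling keeps 1 value per channel
        pooled = [F.relu(c(e)).max(dim=2).values for c in self.convs]
        z = torch.cat(pooled, dim=1)            # (batch, 6) here
        return self.fc(z)                       # logits for positive/negative

model = SentenceCNN()
logits = model(torch.randint(0, 10000, (1, 7)))  # length-7 sentence
print(logits.shape)                              # torch.Size([1, 2])
```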
Regularization
Dropout: create a masking vector r of Bernoulli random variables, each equal to 1 with probability p, and multiply it element-wise with the feature vector z during training.
This prevents overfitting. At test time we don't compute r * z; instead we scale the final weights W by p, so the expected activations match training.
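A minimal sketch of that train/test asymmetry, where p is the probability of keeping a unit (the tensor shapes are illustrative):

```python
import torch

p = 0.5                                    # probability a unit is kept
z = torch.randn(6)                         # pooled feature vector
W = torch.randn(2, 6)                      # final softmax weights

# training: sample a Bernoulli mask r and drop features element-wise
r = torch.bernoulli(torch.full_like(z, p))
train_logits = (r * z) @ W.T

# test: no mask; scale W by p so expected activations match training
test_logits = z @ (p * W).T
```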
Shortcut connection
The first one's semantic meaning is to learn the deviation from doing nothing: the output is F(x) + x, so the block only has to model the difference from simply passing x through.
The second one's semantic meaning is a bit more complex.
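A sketch of both connections, under the assumption that the second one is the gated highway-style block the lecture shows next to the residual block (the channel count and kernel size are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """x + F(x): the conv only has to learn the deviation from identity."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + F.relu(self.conv(x))

class HighwayBlock(nn.Module):
    """t * F(x) + (1 - t) * x: a learned gate t mixes the transformed
    signal with the untouched input, hence the more complex semantics."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, 3, padding=1)
        self.gate = nn.Conv1d(channels, channels, 3, padding=1)

    def forward(self, x):
        t = torch.sigmoid(self.gate(x))
        return t * F.relu(self.conv(x)) + (1 - t) * x
```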
Batch Normalization
It takes the conv block's output and applies a Z-transform to it (per-channel standardization over the batch), which tends to keep activations on the same scale.
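A quick sketch checking that batch norm at training time (with its default scale and shift) really is a per-channel Z-transform over the batch:

```python
import torch
import torch.nn as nn

x = torch.randn(8, 2, 6) * 5 + 3       # conv output: (batch, channels, length)
bn = nn.BatchNorm1d(2)                 # fresh module, training mode

y = bn(x)
# manual Z-transform of channel 0 across batch and length positions
z0 = (x[:, 0] - x[:, 0].mean()) / x[:, 0].std(unbiased=False)
print(torch.allclose(y[:, 0], z0, atol=1e-4))  # True, up to BN's eps
```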
1*1 Convolutions
We can use it to reduce dimensionality: the 1*1 kernel mixes channels at each position, shrinking the feature-map depth without changing the sequence length.

See also: pytorchZeroToAll) CNN, Advanced CNN (inception) (tonylim.tistory.com)
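A minimal sketch of that depth reduction (the channel counts are illustrative):

```python
import torch
import torch.nn as nn

# a 1*1 convolution mixes channels at each position independently:
# here it shrinks 64 feature maps down to 16 at every time step
reduce = nn.Conv1d(in_channels=64, out_channels=16, kernel_size=1)

x = torch.randn(1, 64, 7)      # (batch, channels, length)
print(reduce(x).shape)         # torch.Size([1, 16, 7]): depth cut, length kept
```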