Most word vectors are represented as row vectors.
Gradient Descent
Computing the gradient naively over the whole corpus takes too much time.
Stochastic gradient descent (SGD) = randomly choose a small sample (or batch) at each step and do the same regular gradient descent update, which makes computation much faster.
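A minimal sketch of the idea in Python (the parameter vector `theta`, dataset `data`, and `grad_loss` helper are hypothetical names, not from the lecture):

```python
import numpy as np

def sgd(theta, data, grad_loss, lr=0.05, batch_size=32, steps=1000):
    """Plain SGD: each step uses a random mini-batch instead of the whole corpus."""
    n = len(data)
    for _ in range(steps):
        # sample a small batch of examples uniformly at random
        idx = np.random.choice(n, size=batch_size, replace=False)
        batch = [data[i] for i in idx]
        # regular gradient step, but the gradient is computed only on the batch
        theta = theta - lr * grad_loss(theta, batch)
    return theta
```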
Skip-gram = you have 1 center word and predict all the 'outside' words in the context
Continuous Bag of Words (CBOW) = predict the center word from the context words
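To make the two setups concrete, here is a small illustrative sketch (my own, not the lecture's code) of how training examples come out of a context window:

```python
def make_examples(tokens, window=2):
    """Build (skip-gram pairs, CBOW examples) from one tokenized sentence."""
    skipgram, cbow = [], []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        # skip-gram: the center word predicts each outside word separately
        skipgram += [(center, ctx) for ctx in context]
        # CBOW: all context words together predict the center word
        cbow.append((context, center))
    return skipgram, cbow

pairs, bags = make_examples("the quick brown fox jumps".split())
```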
Negative Sampling
We try to minimize the objective function:
1. We want the observed (center, outside) word pairs to have high probability.
2. We choose K random words and give them low probability.
With this small change we can somewhat reduce the problem caused by very frequent words.
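A rough sketch of the loss for a single (center, outside) pair under negative sampling (vector names and shapes are my assumptions, following the standard word2vec formulation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_loss(v_c, u_o, U_neg):
    """v_c: center vector, u_o: observed outside vector, U_neg: K x d matrix of negative vectors."""
    pos = -np.log(sigmoid(u_o @ v_c))             # 1. observed pair -> push its probability up
    neg = -np.sum(np.log(sigmoid(-U_neg @ v_c)))  # 2. K sampled words -> push their probability down
    return pos + neg
```

In word2vec the K negatives are drawn from the unigram distribution raised to the 3/4 power, which is what dampens the influence of very frequent words.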
GloVe
count-based + distributional (prediction-based) approaches
Using probe words (solid, gas, water, ...) we can measure the relationship between words (ice, steam) through their co-occurrence probabilities.
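Written out, the ratio the probe words are checking (as in the GloVe paper's ice/steam example) is:

```latex
\frac{P(k \mid \text{ice})}{P(k \mid \text{steam})}
\quad
\begin{cases}
\text{large}  & k = \text{solid} \\
\text{small}  & k = \text{gas} \\
\approx 1     & k = \text{water, fashion}
\end{cases}
```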
f is a weighting function that reduces the influence of very high-frequency words.
GloVe tries to capture in its objective the overall corpus statistics of how often these words appear together (the co-occurrence counts).
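For reference, the weighted least-squares objective GloVe minimizes (the standard form from the GloVe paper; this is where the weighting function f enters):

```latex
J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,
\qquad
f(x) = \begin{cases} (x / x_{\max})^{\alpha} & \text{if } x < x_{\max} \\ 1 & \text{otherwise} \end{cases}
```

f caps the contribution of extremely frequent co-occurrences, which is the "reducing the power of high-frequency words" mentioned above.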