unlike plain RNNs/LSTMs, with attention we can somehow tell the decoder where in the source it should focus
positional encoding = injects word-order information into the embeddings; gives the network a significant performance boost
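a minimal sketch of the sinusoidal positional encoding from the original Transformer paper (assuming that's the variant these notes refer to):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: even dims use sin, odd dims use cos."""
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1) token positions
    i = np.arange(d_model // 2)[None, :]         # (1, d_model/2) dimension pairs
    angles = pos / (10000 ** (2 * i / d_model))  # wavelength grows with dimension
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# added to the token embeddings so the network knows word order:
# embeddings = token_embeddings + positional_encoding(seq_len, d_model)
```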
top-right attention block = 3 connections going into it: key and value (encoding of the source sentence), query (encoding of the target sentence)
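a rough sketch of that wiring; all names (enc_out, dec_state, W_K/W_V/W_Q) and data are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                # model width (made up)
enc_out = rng.normal(size=(5, d))    # encoder output, one row per source token (dummy)
dec_state = rng.normal(size=(3, d))  # decoder states, one row per target token (dummy)

W_K = rng.normal(size=(d, d))
W_V = rng.normal(size=(d, d))
W_Q = rng.normal(size=(d, d))

K = enc_out @ W_K     # keys   <- source sentence encoding
V = enc_out @ W_V     # values <- source sentence encoding
Q = dec_state @ W_Q   # query  <- target sentence encoding
```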
dot product of a key and a query = measures the angle between these 2 vectors; if both point in the same direction the result will be large
by taking a softmax over the dot products we can (softly) index V2 (the value for K2), the most relevant source token
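a minimal sketch of this dot-product + softmax step (the sqrt(d_k) scaling is from the Transformer paper, an addition beyond these notes):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # dot products, large where query and key align
    weights = softmax(scores, axis=-1)       # soft index over the source tokens
    return weights @ V                       # weighted mix, dominated by the best-matching value
```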
V is a bunch of information we might find interesting about the source, and K is a representation (index, address) of each value
query = "i would like to know a certain thing" (name, height, something like this); we find it by taking the dot product with each key, and the softmax picks out the matching V (value, [name, height, ...])
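a toy version of that name/height analogy; all vectors here are made-up numbers, just to show the softmax landing on the matching value:

```python
import numpy as np

# keys act as addresses; values hold the information stored at each address
K = np.array([[1.0, 0.0],    # K1: "name" slot
              [0.0, 1.0]])   # K2: "height" slot
V = np.array([[0.3, 0.7],    # V1: the name, encoded (made-up numbers)
              [1.8, 0.1]])   # V2: the height, encoded (made-up numbers)

q = np.array([0.1, 0.9])     # query: "i want the height" -> points toward K2

scores = K @ q                                    # dot product of the query with every key
weights = np.exp(scores) / np.exp(scores).sum()   # softmax over the keys
print(weights)                                    # ~[0.31, 0.69]: K2 wins
print(weights @ V)                                # close to V2, the matching value
```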