AI 51

Lecture 4: Theoretical Fundamentals of Dynamic Programming Algorithms

Contraction mapping: if we take a sequence that is convergent in the space and apply the transformation T to it, we get another convergent sequence, one that converges to T(x), i.e. T of the limit of the original sequence. Fixed point: a point x such that applying the transform T to x gives back the original x, T(x) = x. Banach Fixed Point Theorem: for a contraction mapping, we know the sequence x, T(x), T(T(x)), ... converges to the unique fixed point x*. The Bellman optimality oper...
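
A minimal sketch of this idea in Python: value iteration is just fixed-point iteration of the Bellman optimality operator T, which is a gamma-contraction (in the sup norm) whenever the discount factor gamma < 1, so by Banach's theorem the iterates converge to the unique fixed point V* from any starting point. The tiny 2-state, 2-action MDP (the P and R arrays) is a made-up example, not from the lecture.

import numpy as np

# Hypothetical 2-state, 2-action MDP for illustration.
# P[s, a, s'] = transition probability, R[s, a] = expected reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9  # discount < 1 makes T a contraction

def bellman_optimality_operator(V):
    # (T V)(s) = max_a [ R(s, a) + gamma * sum_s' P(s'|s, a) V(s') ]
    return np.max(R + gamma * P @ V, axis=1)

# Fixed-point iteration: start anywhere, apply T repeatedly.
V = np.zeros(2)
for _ in range(1000):
    V_new = bellman_optimality_operator(V)
    # sup-norm distance to the fixed point shrinks by a factor gamma per step
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

print(V)  # unique fixed point V* = T(V*), independent of the starting V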

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Transformer = uses attention. ELMo just concatenates hidden vectors from a left-to-right and a right-to-left LSTM, so it has 2 half-blind models, which is suboptimal; we want a single model that looks left and right simultaneously. GPT is built as a language model, so it only looks left to right = generating language. BERT = masked LM: replace words with [MASK]. Pretrained with 2 tasks: 1. feed 2 input sentences and guess whether the second is a reasonable next sentenc...
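
A rough sketch of the masked-LM corruption step in plain Python, following the 80/10/10 rule from the BERT paper: 15% of tokens are selected for prediction, and of those, 80% become [MASK], 10% become a random token, and 10% stay unchanged. The toy vocabulary and token list here are made-up placeholders, not BERT's real WordPiece vocabulary.

import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Corrupt a token list for masked-LM pretraining (BERT's 80/10/10 rule).

    Returns (corrupted_tokens, labels), where labels[i] is the original
    token at selected positions and None elsewhere (not predicted).
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:              # position selected for prediction
            labels.append(tok)
            r = rng.random()
            if r < 0.8:
                corrupted.append(MASK_TOKEN)      # 80%: replace with [MASK]
            elif r < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)             # 10%: keep original token
        else:
            labels.append(None)                   # position not predicted
            corrupted.append(tok)
    return corrupted, labels

# Toy usage with a made-up vocabulary:
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
tokens = ["the", "cat", "sat", "on", "the", "mat"]
print(mask_tokens(tokens, vocab, seed=0))

Keeping 10% of selected tokens unchanged forces the model to keep a meaningful representation for every input token, since it cannot tell which positions were corrupted.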

AI/Yannic Kilcher 2021.11.21