We are just trying to optimize our policy, even under the 2 bad conditions we are given (the practical setting).

Infinity norm == just take the largest absolute entry of the given vector.

It doesn't need to converge to the optimal value function. We run only a few steps (n) and then evaluate whether the resulting policy is good.

Performance of AVI

Initial error == how far my first value function is from the optimal one.

The performance loss after n steps should be bounded by those 2 error terms.

Performance of A..
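A small numerical sketch of the claim above that the n-step loss is bounded by two error terms. The notes do not spell out the bound, so this assumes the standard form: loss <= 2*gamma/(1-gamma)^2 * eps + 2*gamma^(n+1)/(1-gamma) * (initial error), where eps is an assumed cap on the per-step approximation error in infinity norm. The random MDP, the uniform noise standing in for approximation error, and every function name here are hypothetical and chosen only for illustration.

```python
import random

def sup_norm(u, v):
    """Infinity norm of u - v: the largest absolute componentwise gap."""
    return max(abs(a - b) for a, b in zip(u, v))

def q_values(P, R, V, gamma, s):
    """One-step lookahead value of every action in state s."""
    return [R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
            for a in range(len(R[s]))]

def bellman(P, R, V, gamma):
    """Optimal Bellman operator applied to V."""
    return [max(q_values(P, R, V, gamma, s)) for s in range(len(V))]

def greedy(P, R, V, gamma):
    """Policy that is greedy with respect to V."""
    policy = []
    for s in range(len(V)):
        q = q_values(P, R, V, gamma, s)
        policy.append(q.index(max(q)))
    return policy

def evaluate(P, R, policy, gamma, sweeps=2000):
    """Value of a fixed policy, found by iterating its own Bellman operator."""
    V = [0.0] * len(policy)
    for _ in range(sweeps):
        V = [q_values(P, R, V, gamma, s)[policy[s]] for s in range(len(V))]
    return V

def avi_demo(n_states=5, n_actions=3, gamma=0.9, eps=0.05, n_steps=10, seed=0):
    rng = random.Random(seed)

    # Hypothetical random MDP: P[s][a] is a distribution over next states.
    P = []
    for _ in range(n_states):
        rows = []
        for _ in range(n_actions):
            w = [rng.random() for _ in range(n_states)]
            total = sum(w)
            rows.append([x / total for x in w])
        P.append(rows)
    R = [[rng.random() for _ in range(n_actions)] for _ in range(n_states)]

    # Reference optimal value function from (essentially) exact value iteration.
    V_star = [0.0] * n_states
    for _ in range(2000):
        V_star = bellman(P, R, V_star, gamma)

    # Approximate value iteration: each backup is perturbed by at most eps
    # (assumption: uniform noise stands in for the approximation error).
    V0 = [0.0] * n_states
    V = V0
    for _ in range(n_steps):
        V = [t + rng.uniform(-eps, eps) for t in bellman(P, R, V, gamma)]

    # Loss of the greedy policy obtained after n_steps, measured in infinity norm.
    pi = greedy(P, R, V, gamma)
    loss = sup_norm(V_star, evaluate(P, R, pi, gamma))

    # Two error terms: per-step approximation error and initial error.
    approx_term = 2 * gamma / (1 - gamma) ** 2 * eps
    initial_term = 2 * gamma ** (n_steps + 1) / (1 - gamma) * sup_norm(V_star, V0)
    return loss, approx_term + initial_term

if __name__ == "__main__":
    loss, bound = avi_demo()
    print("loss:", loss, "bound:", bound)
```

The initial-error term shrinks geometrically with n, while the approximation term does not. This is why only a few steps can already be enough to judge the policy, without waiting for convergence.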