Mid-way Quiz
The best way to learn and to avoid the illusion of competence is to test yourself. This will help you to find where you need to reinforce your knowledge.
Q1: What are the two main approaches to find optimal policy?
Q2: What is the Bellman Equation?
Solution
The Bellman equation is a recursive equation that works like this: instead of starting for each state from the beginning and calculating the return, we can consider the value of any state as:
Rt+1 + gamma * V(St+1)
The immediate reward + the discounted value of the state that follows
Q3: Define each part of the Bellman Equation
Solution
Q4: What is the difference between Monte Carlo and Temporal Difference learning methods?
Q5: Define each part of Temporal Difference learning formula
Solution
Q6: Define each part of Monte Carlo learning formula
Solution
Congrats on finishing this Quiz 🥳, if you missed some elements, take time to read again the previous sections to reinforce (😏) your knowledge.
< > Update on GitHub