HW 11 -- Game Theory
Please Log In for full access to the web site.
Note that this link will take you to an external site (https://shimmer.mit.edu) to authenticate, and then you will be redirected back to this page.
Note: All problems for HW 11 are \textbf{OPTIONAL} and not taken for a grade.
1) Tic Tac Toe§
Yes | |
No |
2) Value and Q Functions §
Recall the value function for a zero-sum two-player complete information alternating game (when there is just a single terminal payoff):
-
What is the optimal strategy \pi_\text{min}(s) for player min in state s?
Input your answer as a Python formula. You can use variabless
,a
, ands_prime
, and also functionsV_max
andP
.V_max
takes a single variable, andP
takes three variables: new state, state, action, in that order. We also provide you with a functionargmin(variable, function)
andsum(variable, function)
. For exampleargmin(a, R(s, a, s_prime))
is \underset{a}{\mathrm{argmin}} \: R(s, a, s') andsum(s_prime, R(s, a, s_prime))
is \Sigma_{s'} R(s, a, s').
-
We could rewrite this formula in terms of Q. Mark all that are true about doing it that way:
It would be more expensive to compute It would be easier to determine the optimal strategy Q_\text{min}(s, a) = - Q_\text{max}(s, a)
3) Chess§
-
If you knew the optimal minimax strategy for chess, could you get a higher score by playing a different strategy against a suboptimal player whose strategy you know? The reward is +1 for a win, 0 for a draw, and -1 for a loss.
Choose one:Yes No -
Consider a variation on chess where you get score \dfrac{1}{k} for winning on move k (and \dfrac{-1}{k} for losing on move k). If you knew the optimal minimax strategy for that game, could you get a higher score by playing a different strategy against a suboptimal player whose strategy you know?
Yes, by using a different version of the minimax algorithm. Yes, by solving an MDP No, the optimal strategy stays the same No, we are already optimizing for the case that we know the suboptimal's player strategy in the minimax algorithm
4) Penny Matching§
Here is the payoff matrix for the “penny matching” game:

-
What is player A’s dominant strategy?
Mark all that are true.Action 1 Action 2 None exists -
Does this game have a deterministic Nash Equilibrium?
True False -
What games does this remind you of?
Prisoner's Dilemma Chicken Rock-Paper-Scissors -
If player A is going to play a mixed strategy, what probability should they assign to action 1?
- The strategy in which both players always pick action 1 is not a nash equilibrium. Why:
Mark all that are true.
Because player A could improve their payoff by deviating Because player B could improve their payoff by deviating Because a deterministic strategy is never a Nash equilibrium Because they could jointly improve their payoff by both deviating
5) Feedback§
-
How many hours did you spend on this homework?
-
Do you have any comments or suggestions (about the problems or the class)?