Home / HW 11 -- Game Theory

HW 11 -- Game Theory

The questions below are due on Wednesday May 14, 2025; 11:59:00 PM.
 
You are not logged in.

Please Log In for full access to the web site.
Note that this link will take you to an external site (https://shimmer.mit.edu) to authenticate, and then you will be redirected back to this page.

Note: All problems for HW 11 are \textbf{OPTIONAL} and not taken for a grade.

1) Tic Tac Toe§

In tic-tac-toe, is the state of the board (where the X’s and O’s are) a sufficient state representation?
Yes
No

2) Value and Q Functions §

Recall the value function for a zero-sum two-player complete information alternating game (when there is just a single terminal payoff):

\begin{aligned} V_\text{max}(s) & = \begin{cases} +1 & \text{if $s$ is a win for max} \\ -1 & \text{if $s$ is a win for min} \\ \max_a \sum_{s'} P(s' \mid s, a) V_\text{min}(s') & \text{otherwise} \end{cases} \\ V_\text{min}(s) & = - V_\text{max}(s) \end{aligned}

  1. What is the optimal strategy \pi_\text{min}(s) for player min in state s?
    Input your answer as a Python formula. You can use variables s, a, and s_prime, and also functions V_max and P. V_max takes a single variable, and P takes three variables: new state, state, action, in that order. We also provide you with a function argmin(variable, function) and sum(variable, function). For example argmin(a, R(s, a, s_prime)) is \underset{a}{\mathrm{argmin}} \: R(s, a, s') and sum(s_prime, R(s, a, s_prime)) is \Sigma_{s'} R(s, a, s').

  2. We could rewrite this formula in terms of Q. Mark all that are true about doing it that way:
    It would be more expensive to compute
    It would be easier to determine the optimal strategy
    Q_\text{min}(s, a) = - Q_\text{max}(s, a)

3) Chess§

  1. If you knew the optimal minimax strategy for chess, could you get a higher score by playing a different strategy against a suboptimal player whose strategy you know? The reward is +1 for a win, 0 for a draw, and -1 for a loss.

    Choose one:
    Yes
    No

  2. Consider a variation on chess where you get score \dfrac{1}{k} for winning on move k (and \dfrac{-1}{k} for losing on move k). If you knew the optimal minimax strategy for that game, could you get a higher score by playing a different strategy against a suboptimal player whose strategy you know?

    Yes, by using a different version of the minimax algorithm.
    Yes, by solving an MDP
    No, the optimal strategy stays the same
    No, we are already optimizing for the case that we know the suboptimal's player strategy in the minimax algorithm

4) Penny Matching§

Here is the payoff matrix for the “penny matching” game:

  1. What is player A’s dominant strategy?

    Mark all that are true.
    Action 1
    Action 2
    None exists

  2. Does this game have a deterministic Nash Equilibrium?

    True
    False

  3. What games does this remind you of?

    Prisoner's Dilemma
    Chicken
    Rock-Paper-Scissors

  4. If player A is going to play a mixed strategy, what probability should they assign to action 1?

Please input your answer as a number.

  1. The strategy in which both players always pick action 1 is not a nash equilibrium. Why:
    Mark all that are true.
    Because player A could improve their payoff by deviating
    Because player B could improve their payoff by deviating
    Because a deterministic strategy is never a Nash equilibrium
    Because they could jointly improve their payoff by both deviating

5) Feedback§

  1. How many hours did you spend on this homework?

  2. Do you have any comments or suggestions (about the problems or the class)?