HW 11 -- Game Theory

The questions below are due on Wednesday May 14, 2025; 11:59:00 PM.

You are not logged in.
Please Log In for full access to the web site.
Note that this link will take you to an external site (https://shimmer.mit.edu) to authenticate, and then you will be redirected back to this page.

Note: All problems for HW 11 are \textbf{OPTIONAL} and not taken for a grade.

1) Tic Tac Toe§

In tic-tac-toe, is the state of the board (where the X’s and O’s are) a sufficient state representation?

	Yes
	No

2) Value and Q Functions §

Recall the value function for a zero-sum two-player complete information alternating game (when there is just a single terminal payoff):

\begin{aligned} V_\text{max}(s) & = \begin{cases} +1 & \text{if $s$ is a win for max} \\ -1 & \text{if $s$ is a win for min} \\ \max_a \sum_{s'} P(s' \mid s, a) V_\text{min}(s') & \text{otherwise} \end{cases} \\ V_\text{min}(s) & = - V_\text{max}(s) \end{aligned}

What is the optimal strategy \pi_\text{min}(s) for player min in state s?
Input your answer as a Python formula. You can use variables s, a, and s_prime, and also functions V_max and P. V_max takes a single variable, and P takes three variables: new state, state, action, in that order. We also provide you with a function argmin(variable, function) and sum(variable, function). For example argmin(a, R(s, a, s_prime)) is \underset{a}{\mathrm{argmin}} \: R(s, a, s') and sum(s_prime, R(s, a, s_prime)) is \Sigma_{s'} R(s, a, s').
We could rewrite this formula in terms of Q. Mark all that are true about doing it that way:

It would be more expensive to compute

It would be easier to determine the optimal strategy

Q_\text{min}(s, a) = - Q_\text{max}(s, a)

3) Chess§

If you knew the optimal minimax strategy for chess, could you get a higher score by playing a different strategy against a suboptimal player whose strategy you know? The reward is +1 for a win, 0 for a draw, and -1 for a loss.

Choose one:

Yes

No

Consider a variation on chess where you get score \dfrac{1}{k} for winning on move k (and \dfrac{-1}{k} for losing on move k). If you knew the optimal minimax strategy for that game, could you get a higher score by playing a different strategy against a suboptimal player whose strategy you know?

	Yes, by using a different version of the minimax algorithm.
	Yes, by solving an MDP
	No, the optimal strategy stays the same
	No, we are already optimizing for the case that we know the suboptimal's player strategy in the minimax algorithm

4) Penny Matching§

Here is the payoff matrix for the “penny matching” game:

What is player A’s dominant strategy?

Mark all that are true.

Action 1

Action 2

None exists
Does this game have a deterministic Nash Equilibrium?

True

False
What games does this remind you of?

Prisoner's Dilemma

Chicken

Rock-Paper-Scissors
If player A is going to play a mixed strategy, what probability should they assign to action 1?

Please input your answer as a number.

The strategy in which both players always pick action 1 is not a nash equilibrium. Why:

Mark all that are true.

	Because player A could improve their payoff by deviating
	Because player B could improve their payoff by deviating
	Because a deterministic strategy is never a Nash equilibrium
	Because they could jointly improve their payoff by both deviating

5) Feedback§

How many hours did you spend on this homework?
Do you have any comments or suggestions (about the problems or the class)?

	It would be more expensive to compute
	It would be easier to determine the optimal strategy
	Q_\text{min}(s, a) = - Q_\text{max}(s, a)

	Action 1
	Action 2
	None exists

	True
	False

	Prisoner's Dilemma
	Chicken
	Rock-Paper-Scissors