
HW 6 -- Discrete graphical models

The questions below are due on Monday October 23, 2023; 11:59:00 PM.
 

This week's Colab notebook can be found here: Colab Notebook.

1) Bayes Nets§

Consider the following Bayes net describing the health of a satellite, based on the status of its components:

  1. Suppose we want to compute a conditional marginal in an arbitrary tree-structured graph, e.g., P(A \mid Z = T), where A and Z are two (not necessarily adjacent) nodes in the graph.

    Indicate which of the following statements are true:
    Belief propagation can't be used to find an accurate answer in this case.
    To answer this question it's okay to remove the parts of the factor tree that are separated from A by Z.
    For the above query, in any factor involving Z, you can remove rows corresponding to Z = F.

  2. What if our factor graph is not a tree?

    Indicate which of the following statements are true:
    BP can't be used to find an exact answer in this case.
    BP can't be used to find an approximate answer in this case.
    It's possible to turn a factor graph that is not a tree into a factor graph that is a tree (although it might have exponentially larger tables), and use BP to find an exact answer.

  3. Consider the following factor graphs:

Which of the above graphs are well-formed factor graphs?
a
b
c
d

  4. Consider the following factor graphs:

Which one of the above factor graphs corresponds to the satellite Bayes net shown above (for the first four questions of this section)?

2) You've Got Potentials§

2.1) Warming Up to Potentials§

Write a function that returns a potential following the description in the docstring below.

For reference, our solution is 10 line(s) of code.

2.2) Warm Up Part 2§

Write a function that queries the given potential for the specific variable values described in the docstring.

For reference, our solution is 2 line(s) of code.

2.3) Potential Multiplication§

Write a function that multiplies a list of potentials together. (Make sure to refer to the top of the Colab notebook, especially the functions iter_joint_values and get_sub_assignment.)

For reference, our solution is 30 line(s) of code.
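
To make the shape of the computation concrete, here is a minimal self-contained sketch. It uses a stand-in representation (a potential as a (variables, table) pair, where the table maps value tuples to floats, and all domains are assumed binary); it is NOT the notebook's Potential class, whose iter_joint_values and get_sub_assignment helpers play the corresponding roles.

import itertools

def multiply_potentials_sketch(potentials, domain=(0, 1)):
    # Union of all variables, preserving first-seen order.
    all_vars = []
    for variables, _ in potentials:
        for v in variables:
            if v not in all_vars:
                all_vars.append(v)
    table = {}
    # Iterate over every joint assignment of the combined variables...
    for joint in itertools.product(domain, repeat=len(all_vars)):
        env = dict(zip(all_vars, joint))
        value = 1.0
        # ...and multiply in each potential's entry for its sub-assignment.
        for variables, tbl in potentials:
            sub = tuple(env[v] for v in variables)
            value *= tbl[sub]
        table[joint] = value
    return all_vars, table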

2.4) Marginalization§

Write a function that marginalizes out given variables of a potential to create a new potential.

Hint: you may want to use the array.sum() function in numpy.

For reference, our solution is 12 line(s) of code.
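
For intuition, here is a sketch using a plain numpy array as the table, with axes ordered to match the variable list (again a stand-in, not the notebook's Potential class):

import numpy as np

def marginalize_sketch(variables, table, vars_to_remove):
    # Sum the table over the axes of the variables being removed.
    axes = tuple(variables.index(v) for v in vars_to_remove)
    new_vars = [v for v in variables if v not in vars_to_remove]
    return new_vars, table.sum(axis=axes)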

2.5) Neighbor Relations§

Write a function that, given a potential and a variable attached to it, returns the set of other variables attached to that potential. (Make sure your function returns a set and not a list or tuple.)

For reference, our solution is 2 line(s) of code.
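
The idea, as a sketch (with potential_vars standing in for however you access the potential's variable list):

def neighbors_sketch(potential_vars, var):
    # Every other variable attached to the same potential, as a set.
    return set(potential_vars) - {var}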

3) You Just Have to Belief (Propagation) in Yourself§

3.1) Sum-Product 1: Simple is Better§

Let's start with the simplest undirected graphical model: just two nodes. Both nodes have the same domain \{0, 1\}.

Consider the following node potential functions \phi_a(X_a) and \phi_b(X_b):

\phi_{a}(X_a = 0) = 1, \phi_{a}(X_a = 1) = 3.

\phi_{b}(X_b = 0) = 1, \phi_{b}(X_b = 1) = 2.

We also have the edge potential:

\phi_{ab}     X_b = 0   X_b = 1
X_a = 0          1         2
X_a = 1          3         4

Now let's consider using the Sum-Product algorithm to compute the marginal for X_b, that is, p(X_b).

  1. Compute the message from node a to node b. That is, m_{a\rightarrow b}(X_b) = \sum_{X_a} \phi_a(X_a) \phi_{ab}(X_a, X_b). You don't need to do any normalization.

Input a Python list of two elements: [m_{a\rightarrow b}(X_b=0), m_{a\rightarrow b}(X_b=1)]. For example: [1, 1].

  2. Compute the marginal probability distribution for node X_b based on \phi_b and \phi_{ab}.
    Input a Python list of two elements: [p(X_b=0), p(X_b=1)]. You can also enter Python expressions that evaluate to numbers (e.g., [3*2 + 4, 7/11.0]).
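
If you want to check your by-hand arithmetic for the two parts above, here is a minimal numpy sketch (the variable names are ours, not the notebook's): m_a_to_b is the quantity asked for in part 1, and p_b the one in part 2.

import numpy as np

phi_a = np.array([1.0, 3.0])            # [phi_a(X_a=0), phi_a(X_a=1)]
phi_b = np.array([1.0, 2.0])            # [phi_b(X_b=0), phi_b(X_b=1)]
phi_ab = np.array([[1.0, 2.0],
                   [3.0, 4.0]])         # rows indexed by X_a, columns by X_b

# Message from a to b: sum over X_a of phi_a(X_a) * phi_ab(X_a, X_b).
m_a_to_b = phi_a @ phi_ab

# Unnormalized belief at b, then normalize to get p(X_b).
belief_b = phi_b * m_a_to_b
p_b = belief_b / belief_b.sum()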

3.2) Sum-Product 2: Let's Go Three§

Now let's move on to three nodes. Consider the following undirected graph with 3 nodes, where all nodes have the same domain \{0, 1\}.

This time, we will only have potentials associated with edges.

\phi_{ab}     X_b = 0   X_b = 1
X_a = 0          1         2
X_a = 1          3         4

\phi_{bc}     X_c = 0   X_c = 1
X_b = 0          1         1
X_b = 1          2         2

Initialize the messages for all random variables to [1, 1]. Now let's do belief propagation.

  1. Compute the message from node X_a to X_b at the first iteration of belief propagation.

    Your input is a Python list of two elements: [m^{(1)}_{a\rightarrow b}(0), m^{(1)}_{a\rightarrow b}(1)]. You don't need to do any normalization.

  2. Compute the message from node X_c to X_b at the first iteration of belief propagation.

    Your input is a Python list of two elements: [m^{(1)}_{c\rightarrow b}(0), m^{(1)}_{c\rightarrow b}(1)]. You don't need to do any normalization.

  3. Compute the message at node X_b after the first iteration of belief propagation.

    Your input is a Python list of two elements: [m^{(1)}_{b}(0), m^{(1)}_{b}(1)]. You don't need to do any normalization.

  4. Move on to the second iteration of belief propagation. Compute the message from node X_b to X_a, based on the computed message at node X_b after the first iteration.

    Your input is a Python list of two elements: [m^{(2)}_{b\rightarrow a}(0), m^{(2)}_{b\rightarrow a}(1)]. You don't need to do any normalization.
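
Again, a small numpy sketch you can use to check your work on the four parts above (our variable names, and no node potentials, per the setup):

import numpy as np

phi_ab = np.array([[1.0, 2.0],
                   [3.0, 4.0]])          # rows X_a, columns X_b
phi_bc = np.array([[1.0, 1.0],
                   [2.0, 2.0]])          # rows X_b, columns X_c

ones = np.ones(2)                        # leaf messages initialized to [1, 1]

m_a_to_b = ones @ phi_ab                 # part 1: sum over X_a
m_c_to_b = phi_bc @ ones                 # part 2: sum over X_c
m_b = m_a_to_b * m_c_to_b                # part 3: message at node b

# Part 4: b's message to a uses only what b heard from c (equivalently,
# the message at b with a's own contribution divided back out).
m_b_to_a = phi_ab @ m_c_to_b             # sum over X_b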

3.3) Sum-Product 3: The Code Does Them All§

Write a function that runs sum-product to determine the marginal probability of a single random variable assignment. Feel free to use the simple graphs studied above for debugging. (These graphs are available in the Colab as create_mrf_X.) However, be sure that your code can deal with potential functions that involve more than just two nodes. You can assume that the factor graph corresponding to the potential functions is a tree.

Hint: You may find the provided helper function neighbor_dict useful.

For reference, our solution is 19 line(s) of code.

In addition to all the utilities defined at the top of the Colab notebook, the following functions are available in this question environment: marginalize, multiply_potentials, neighbors. You may not need to use all of them.
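
As a reminder, the standard sum-product recursions on a factor tree are

m_{x \rightarrow f}(x) = \prod_{g \in N(x) \setminus \{f\}} m_{g \rightarrow x}(x)

m_{f \rightarrow x}(x) = \sum_{\mathbf{x}_{N(f) \setminus \{x\}}} f(\mathbf{x}_{N(f)}) \prod_{y \in N(f) \setminus \{x\}} m_{y \rightarrow f}(y)

p(x) \propto \prod_{f \in N(x)} m_{f \rightarrow x}(x)

where N(\cdot) denotes the neighbors of a node in the factor graph; a leaf variable sends the all-ones message, and a leaf factor sends the factor itself.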

4) Sampling§

4.1) Gibbs Sampling§

Consider the following 3-node factor graph:

The table on the right gives three different potential tables for the factor. For one of these potential tables, Gibbs sampling did a poor job of generating samples that reflect the joint distribution of the three variables. Which was it?

Select one of these as the problematic potential for Gibbs sampling.
a
b
c
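
If it helps to recall the mechanics: one sweep of Gibbs sampling resamples each variable from its conditional given the current values of all the others. A generic sketch, not tied to the tables in the figure (cond_weight is a hypothetical callback returning an unnormalized conditional weight):

import random

def gibbs_sweep(assignment, variables, domains, cond_weight):
    # One sweep: resample each variable in turn from p(var | all others),
    # sampling in proportion to the unnormalized conditional weights.
    for var in variables:
        weights = [cond_weight(var, val, assignment) for val in domains[var]]
        assignment[var] = random.choices(domains[var], weights=weights)[0]
    return assignment

Because Gibbs changes one variable at a time, it mixes poorly when the conditionals put (nearly) zero weight on the single-variable moves needed to travel between high-probability configurations.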

4.2) Sampling in a Bayes net§

Consider the following Bayes net.

We are interested in drawing samples from the distribution described by this network, conditioned on evidence, using ancestral sampling with rejection.

  1. Is this a good way to sample from P(K \mid B = b)?

    True
    False

  2. Is this a good way to sample from P(B \mid K = k)?

    True
    False
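
Concretely, ancestral sampling with rejection looks like the following sketch (sample_once is a hypothetical function that draws one complete ancestral sample from the net as a dict of node values):

def rejection_sample(sample_once, evidence_var, evidence_val, n_samples):
    # Draw full ancestral samples and keep only those that agree with
    # the evidence; the survivors are samples from the conditional.
    kept = []
    while len(kept) < n_samples:
        sample = sample_once()
        if sample[evidence_var] == evidence_val:
            kept.append(sample)
    return kept

The fraction of samples kept equals the marginal probability of the evidence, so rejection is wasteful when the evidence is unlikely; when the evidence node has no unobserved ancestors, you can instead clamp it and sample forward without rejecting at all.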

4.3) Markov Blanket§

Consider the Bayes net from the above question. Recall that the Markov blanket of a node in a Bayes net consists of its parents, its children, and its children's other parents; in a Markov random field, it is simply the node's neighbors.

  1. Enter the set of nodes in the Markov Blanket of node G, e.g. ['B','D'].

  2. Enter the set of nodes in the Markov Blanket of node D, e.g. ['B','D'].

Consider the following Markov random field.

  1. Enter the set of nodes in the Markov Blanket of node B, e.g. ['B','D'].

  2. Enter the set of nodes in the Markov Blanket of node D, e.g. ['B','D'].

5) Coding: Localization with Viterbi§

In this section, we will implement an HMM for a robot that is moving around randomly in a 2D grid with obstacles. The robot has sensors that allow it to detect obstacles in its immediate vicinity. It knows the grid map, with the locations of all obstacles, but it is uncertain about its own location in the grid. We will use Viterbi to determine the most likely locations for the robot given a sequence of local and potentially noisy observations.

Concretely, we will represent the 2D grid with obstacles as a list of lists, where 1s represent obstacles and 0s represent free space. Example:

obstacle_map = [
  [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0],
  [1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1],
  [1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
  [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
]

The state of the robot is its location in the grid, represented as a tuple of ints, (row, col). Transitions are uniformly distributed amongst the robot's current location and the neighboring free (non-obstacle) locations, where "neighboring" means the 4 cardinal directions (up, down, left, right).

Each observation is a 4-tuple listing which directions have obstacles, in order [N, E, S, W], with 1 for an obstacle and 0 for no obstacle. Directions that point "off the map" read as 1, as though they were obstacles. For instance, in the map above, if there were no observation noise, then the observation for the top left corner (state=(0, 0)) would be (1, 0, 1, 1). Observations can also be corrupted with noise; see the create_observation_potential docstring for more details.
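
For example, here is a sketch of the noiseless part of the observation model (the function name is ours, not one from the notebook):

def noiseless_observation(obstacle_map, state):
    # Look N, E, S, W from the state; off-map directions read as obstacles.
    row, col = state
    n_rows, n_cols = len(obstacle_map), len(obstacle_map[0])
    obs = []
    for dr, dc in [(-1, 0), (0, 1), (1, 0), (0, -1)]:   # N, E, S, W
        r, c = row + dr, col + dc
        if 0 <= r < n_rows and 0 <= c < n_cols:
            obs.append(obstacle_map[r][c])
        else:
            obs.append(1)
    return tuple(obs)

On the example map, noiseless_observation(obstacle_map, (0, 0)) returns (1, 0, 1, 1), matching the description above.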

Our ultimate task will be to take in a sequence of observations and return the corresponding sequence of most likely states.

5.1) Conditioning Warmup§

Write a function that, given a potential and a variable that occurs in it, returns a new potential conditioned on the specified variable having value 0.

For reference, our solution is 2 line(s) of code.
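
The core operation, sketched with a plain numpy table whose axes follow the variable list (a stand-in for the notebook's Potential class):

import numpy as np

def condition_on_zero_sketch(variables, table, var):
    # Slice the table at index 0 along var's axis and drop that variable.
    new_vars = [v for v in variables if v != var]
    return new_vars, np.take(table, 0, axis=variables.index(var))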

5.2) HMM§

Write a function that creates a random variable for a state at a given time step in an obstacle HMM. The domain of the state variable should be a list of (row, col) indices into the map. Only free positions (not obstacles) should be included in the domain of the state variable. See docstring for more description.

For reference, our solution is 4 line(s) of code.
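
The domain construction amounts to the following (free_positions is our hypothetical name):

def free_positions(obstacle_map):
    # Domain of each state variable: every free (row, col) cell.
    return [(r, c)
            for r, row in enumerate(obstacle_map)
            for c, cell in enumerate(row)
            if cell == 0]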

5.3) Transition potential creation§

Write a function that creates a potential for the transition distribution between states s_{t} and s_{t+1}. Refer to the previous question for more information about the state variables and their domains.

For reference, our solution is 13 line(s) of code.
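
One way to compute a single entry of this potential, as a sketch (hypothetical name; your actual function must build the full potential table over the state domains):

def transition_weight(obstacle_map, s, s_next):
    # Uniform over the current cell plus its free 4-neighbors; zero elsewhere.
    row, col = s
    n_rows, n_cols = len(obstacle_map), len(obstacle_map[0])
    options = [s]
    for dr, dc in [(-1, 0), (0, 1), (1, 0), (0, -1)]:
        r, c = row + dr, col + dc
        if 0 <= r < n_rows and 0 <= c < n_cols and obstacle_map[r][c] == 0:
            options.append((r, c))
    return 1.0 / len(options) if s_next in options else 0.0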

5.4) Observation variable creation§

Write a function that creates a random variable for an observation at a given time step in an obstacle HMM. See docstring for description.

For reference, our solution is 3 line(s) of code.

5.5) Observation potential creation§

Write a function that creates a potential for the observation distribution between s_{t} and z_{t}.

For reference, our solution is 29 line(s) of code.

5.6) Viterbi Decoding§

Complete the implementation of Viterbi decoding. Note: in general, it is a good idea for numerical stability to implement Viterbi in log space. However, for the purpose of this homework, we are not expecting your implementation to be in log space. Moreover, the provided helper functions (e.g., multiply_potentials) assume that you are not working in log space.

For reference, our solution is 38 line(s) of code.
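
The structure of the computation, independent of the notebook's Potential API, is the classic recurrence \delta_{t+1}(s') = \max_s \delta_t(s) \, T(s, s') \, p(z_{t+1} \mid s') with backpointers. A generic dict-based sketch (init, trans, and obs_lik are hypothetical stand-ins for the variables and potentials you build above):

def viterbi_sketch(states, init, trans, obs_lik, observations):
    # delta[s]: score of the best path ending in state s at the current step.
    delta = {s: init[s] * obs_lik(s, observations[0]) for s in states}
    backpointers = []
    for z in observations[1:]:
        new_delta, pointers = {}, {}
        for s2 in states:
            best = max(states, key=lambda s1: delta[s1] * trans(s1, s2))
            new_delta[s2] = delta[best] * trans(best, s2) * obs_lik(s2, z)
            pointers[s2] = best
        delta = new_delta
        backpointers.append(pointers)
    # Trace the best final state back to the start.
    path = [max(states, key=lambda s: delta[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))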

5.7) Viterbi Visualization§

We have created a visualization tool for you in the Colab notebook to visualize the inference result. Run the code blocks there to generate the inferred trajectory step by step and to compare the observed trajectory with the "noiseless" observations.

If you are not passing the test, you can use the simple map visualize_viterbi_test1 for debugging. Hint: one debugging strategy is to modify your Viterbi code to print out the likelihood you get at each step.

6) Feedback§

  1. How many hours did you spend on this homework?

  2. Do you have any comments or suggestions (about the problems or the class)?