
HW 2 -- Inference in GMs

The questions below are due on Monday February 24, 2025; 11:59:00 PM.

This week's Colab notebook can be found here: Colab Notebook (Click Me!)

1) CSP as a Factor Graph §

Last time, we considered the following map-coloring CSP:

Map-coloring Image

Let’s represent it as a factor graph, assuming that each of the variables has domain \{R, G, B\} and that we have binary constraints requiring the nodes they connect to take different values, as well as unary constraints that reduce the domains of some of the nodes (so, for example, there is a constraint that V_4 = B).

  1. Indicate every graph that can encode this CSP (even if it is not the most natural encoding).

A
B
C

  2. In factor graph B above, what would be the factor \phi_{2}(V_2) that exactly encodes the unary domain-restricting constraint on node V_2? Fill in the table representing the constraint, where each a_i \in \{0, 1\} (for simplicity):

Hint: Fundamentally the idea is that we put in an entry of 1 for combinations that are "legal" (allowed by the constraint) and 0 for the ones that are "illegal".

\phi_{2}(V_2)
V_2 = R a_1
V_2 = G a_2
V_2 = B a_3

Enter the table values as a python list [a1, ..., a3]
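The hint above can be sketched in code. This is an illustrative toy for a hypothetical unary constraint V = G (deliberately not one of this problem's constraints): legal values get an entry of 1, illegal values get 0.

```python
# Toy sketch: encoding a hypothetical unary constraint V = G over the
# domain {R, G, B} as a 0/1 factor table. Not an answer to this question.
domain = ["R", "G", "B"]

def unary_constraint_table(allowed_value):
    """Return [phi(V=R), phi(V=G), phi(V=B)]: 1 for legal values, 0 otherwise."""
    return [1 if v == allowed_value else 0 for v in domain]

print(unary_constraint_table("G"))  # [0, 1, 0]
```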

  3. In factor graph B above, what would be the factor \phi_{24}(V_2, V_4) that exactly encodes the general map-coloring constraint between V_2 and V_4? Fill in the table representing the constraint, where each a_i \in \{0, 1\} (for simplicity):
\phi_{24} V_4 = R V_4 = G V_4 = B
V_2 = R a_1 a_2 a_3
V_2 = G a_4 a_5 a_6
V_2 = B a_7 a_8 a_9

Enter the table values as a python list [a1, ..., a9]

  4. In factor graph A above, what would be the factor \phi_{24}(V_2, V_4) that encodes the binary constraint between nodes V_2 and V_4? Fill in the table representing the constraint, where each a_i \in \{0, 1\} (for simplicity):
\phi_{24} V_4 = R V_4 = G V_4 = B
V_2 = R a_1 a_2 a_3
V_2 = G a_4 a_5 a_6
V_2 = B a_7 a_8 a_9

Enter the table values as a python list [a1, ..., a9]

  5. Can we compute the marginals using message passing?
    Yes
    Yes, if our factor graph is not loopy
    No

2) Bayes Nets§

  1. Suppose we want to compute a conditional marginal between two nodes in an arbitrary tree-structured graph, e.g., P(A | Z = T), where A and Z are two (not necessarily adjacent) nodes in the graph.

    Indicate which of the following statements are true:
    Belief propagation can't be used to find an accurate answer in this case.
    Thinking of A as the root of the tree (so, “holding the tree up by node A, and letting everything hang down”), for the above query, it’s okay to remove any variables that are in the subtree rooted at Z.
    For the above query, in any factor involving Z, you can remove rows corresponding to Z = F.

  2. What if our factor graph is not a tree?

    Indicate which of the following statements are true:
    BP can't be used to find an exact answer in this case.
    BP can't be used to reliably find an approximate answer in this case.
    It's possible to turn a factor graph that is not a tree into a factor graph that is a tree (although it might have exponentially larger tables), and use BP to find an exact answer.

  3. Consider the following factor graphs:

Which of the above graphs are well-formed factor graphs?
a
b
c
d

  4. Consider the following Bayes Net and candidate factor graphs:

Which one of the above factor graphs corresponds to the Bayes net shown above?

3) Factor Graph §

Consider the following factor graph:
  1. Say each variable can take on values \{1,2,3,4,5\}. You find out evidence that X = 3 and that Y \ne 2. How can you change the factor graph to express this? Choose a minimal but sufficient set of changes.

    Create a new factor connected to just X, with 0 for all values except 3, and any positive number for 3
    Create a new factor connected to just X, with any equal positive number for all values except 3, and 0 for 3
    Create a new factor connected to just Y, with 0 for all values except 2, and any positive number for 2
    Create a new factor connected to just Y, with any equal positive number for all values except 2, and 0 for 2

  2. Suppose that, instead, we edit the table associated with factor \phi_{XY}, replacing with 0 any entry inconsistent with the evidence (that X = 3 and that Y \ne 2). Will this sufficiently and correctly incorporate the evidence?

    Yes
    No

4) You've Got Potentials§

We are going to build up to an implementation of belief propagation (this week) and variable elimination (next week). It rests on a representation of, and a set of operations on, the factors in a factor graph. In this problem, we will represent the factors as potentials using the classes we have given you, and then ask you to implement operations on these potentials:

  • multiplication takes two or more potentials and returns a new one, indexed by the union of the variables in the original potentials. Each entry in the resulting potential is the product of the corresponding entries from each original potential; for example, if we have f_1(A, B) and f_2(B, C), and we multiply the potentials, the entry for the resulting potential will have values, for all a, b, c:
    f'(A=a, B=b, C=c) = f_1(A=a, B=b) \cdot f_2(B=b, C=c).
  • marginalization takes a potential and a subset of the variables in the potential, and returns a new potential only over the remaining variables. Each entry in the resulting potential is the sum, over all possible values of the variables we are marginalizing out, of the corresponding entries in the original potential; for example, if we have f(A, B, C) and are marginalizing out A and C, then the resulting potential will have, for all b:
    f'(B = b) = \sum_{a, c} f(A=a, B=b, C=c).

We will also ask you to do some basic operations on the factor graph.
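As a rough sketch of these two operations, here is what they look like on plain numpy arrays with made-up tables (the real representation is the Potential class from the Colab, so this is illustrative only):

```python
import numpy as np

# Hypothetical mini-representation: a potential is (variables, table),
# where the table's axes line up with the variable list in order.
f1_vars, f1 = ["A", "B"], np.array([[1.0, 2.0],
                                    [3.0, 4.0]])   # f1(A, B)
f2_vars, f2 = ["B", "C"], np.array([[1.0, 0.0],
                                    [2.0, 5.0]])   # f2(B, C)

# Multiplication: align each factor on the union of variables (A, B, C)
# via broadcasting, then take the elementwise product.
prod = f1[:, :, None] * f2[None, :, :]             # f'(A, B, C)
assert prod[1, 0, 1] == f1[1, 0] * f2[0, 1]        # f'(a,b,c) = f1(a,b) * f2(b,c)

# Marginalization: sum out A and C (axes 0 and 2), leaving f'(B).
marg = prod.sum(axis=(0, 2))
print(marg)  # [ 4. 42.]
```

The broadcasting trick above only works because the variables were aligned by hand; the general version you will write has to compute that alignment from the variable lists.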

4.1) Warming Up to Potentials§

Write a function that returns a potential following the description in the docstring below.

For reference, our solution is 10 line(s) of code.

4.2) Warm Up Part 2§

Write a function that queries the given potential for the specific variable values described in the docstring.

For reference, our solution is 2 line(s) of code.

4.3) Potential Multiplication§

Write a function that multiplies a list of potentials together. (Make sure to refer to the top of the colab notebook, especially the functions iter_joint_values and get_sub_assignment.)

For reference, our solution is 30 line(s) of code.

4.4) Marginalization§

Write a function that marginalizes out given variables of a potential to create a new potential.

Hint: you should use the array.sum() function in numpy. You can sum over multiple axes at once by passing a tuple of axes to sum over with the axis argument. Example: y = np.sum(x, axis=(1,3)), which will sum the array over the 2nd and 4th axes. If you find yourself writing many nested for-loops, come to office hours or post on piazza!
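A quick check of the axis argument described in the hint, on a throwaway array:

```python
import numpy as np

# Summing over multiple axes at once: axes 1 and 3 (the 2nd and 4th)
# are summed out, leaving an array indexed by axes 0 and 2.
x = np.arange(24).reshape(2, 3, 2, 2)
y = np.sum(x, axis=(1, 3))
print(y.shape)  # (2, 2)
```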

For reference, our solution is 12 line(s) of code.

4.5) Neighbor Relations§

Write a function that, given a potential and a variable attached to it, returns a set of the other variables attached to that potential. (Make sure your method returns a set and not a list or tuple.)

For reference, our solution is 2 line(s) of code.

5) You Just Have to Belief (Propagation) in Yourself§

5.1) Sum-Product 1§

Let's start from a simple factor graph: just two nodes. Both nodes have the same domain \{0, 1\}.

Assume we have the following three factors:

  1. \phi_a(X_a):

\phi_{a}(X_a = 0) = 1, \phi_{a}(X_a = 1) = 3.

  2. \phi_b(X_b):

\phi_{b}(X_b = 0) = 1, \phi_{b}(X_b = 1) = 2.

  3. \phi_{ab}(X_a, X_b):
\phi_{ab} X_b = 0 X_b = 1
X_a = 0 1 2
X_a = 1 3 4

Let’s use belief-propagation to compute the marginal p(X_b). We know it’s proportional to the product of the messages coming in from its two adjacent factors:

p(X_b) \propto \mu_{\phi_b \rightarrow X_b}(X_b) \cdot \mu_{\phi_{ab} \rightarrow X_b}(X_b).

  1. First we will find the value of \mu_{\phi_{ab} \rightarrow X_b}. We observe that
    \mu_{\phi_{ab} \rightarrow X_b}(x_b) = \sum_{x_a} \mu_{X_a \rightarrow \phi_{ab}}(x_a) \cdot \phi_{ab}(x_a, x_b)
    and that
    \mu_{X_a \rightarrow \phi_{ab}}(x_a) = \phi_a(x_a).
    You don't need to do any normalization.

Enter a Python list of two numbers, [\mu_{\phi_{ab} \rightarrow X_b}(0), \mu_{\phi_{ab} \rightarrow X_b}(1)]. For example: [1, 1].

  2. Compute the marginal probability distribution for node X_b.

Input a Python list of two elements: [p(X_b = 0), p(X_b = 1)]. You can also enter a python expression that will evaluate to a number (e.g., [3*2 + 4, 7/11.0]).
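To see the mechanics end to end, here is the same two-message computation on a different, made-up set of tables (deliberately not the numbers from this problem):

```python
import numpy as np

# Toy two-node factor graph with made-up tables; not this problem's numbers.
phi_a  = np.array([2.0, 5.0])        # phi_a(X_a)
phi_b  = np.array([1.0, 1.0])        # phi_b(X_b)
phi_ab = np.array([[1.0, 0.0],
                   [0.0, 1.0]])      # phi_ab(X_a, X_b)

# mu_{X_a -> phi_ab}(x_a) = phi_a(x_a): the variable just forwards its
# only other incoming message, which here is the unary factor.
mu_a_to_ab = phi_a

# mu_{phi_ab -> X_b}(x_b) = sum_{x_a} mu_{X_a -> phi_ab}(x_a) * phi_ab(x_a, x_b)
mu_ab_to_b = (mu_a_to_ab[:, None] * phi_ab).sum(axis=0)
print(mu_ab_to_b)  # [2. 5.]

# p(X_b) is proportional to the product of the incoming messages;
# normalize at the very end.
unnorm = phi_b * mu_ab_to_b
p_b = unnorm / unnorm.sum()
print(p_b)         # [2/7, 5/7]
```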

5.2) Sum-Product 2§

Now let's move on to a factor graph with three nodes. All nodes have the same domain \{0, 1\}.

This time, we will only have two factors:

\phi_{ab} X_b = 0 X_b = 1
X_a = 0 1 2
X_a = 1 3 4

\phi_{bc} X_c = 0 X_c = 1
X_b = 0 1 1
X_b = 1 2 2

Let’s compute all the marginals, by selecting X_c to be the root node.

  1. What is \mu_{\phi_{ab} \rightarrow X_b}?

Your input is a Python list of two elements. You don't need to do any normalization.

  2. What is \mu_{X_b \rightarrow \phi_{bc}}?

Your input is a Python list of two elements. You don't need to do any normalization.

  3. What is \mu_{\phi_{bc} \rightarrow X_c}?

Your input is a Python list of two elements. You don't need to do any normalization.

  4. What is P(X_c)?

Your input is a Python list of two elements.

  5. What is \mu_{\phi_{bc} \rightarrow X_b}?

Your input is a Python list of two elements. You don't need to do any normalization.

  6. What is P(X_b)?

Your input is a Python list of two elements.

  7. What is \mu_{\phi_{ab} \rightarrow X_a}?

Your input is a Python list of two elements. You don't need to do any normalization.

  8. What is P(X_a)?

Your input is a Python list of two elements.

5.3) Sum-Product 3: The Code Does Them All§

Write a function that runs sum product to determine the marginal probability of a single random variable assignment. Feel free to use the simple graphs studied above for debugging. (These graphs are available in the Colab as create_mrf_X). However, be sure that your code can deal with potential functions that involve more than just two nodes. You can assume that the factor graph corresponding to the potential functions is a tree.

Hint: You may find the provided helper function neighbor_dict useful.

For reference, our solution is 19 line(s) of code.

In addition to all the utilities defined at the top of the Colab notebook, the following functions are available in this question environment: marginalize, multiply_potentials, neighbors. You may not need to use all of them.

6) Everything is Loopy §

Consider the following directed graphical model:

  1. We wish to find the marginal P(W) using variable elimination. Our variable elimination order is S, C, R. What factors do we have at the very beginning of the algorithm?

    Enter the factors as a python list of strings ["XYZ", ..., "X"] in any order where XYZ denotes a factor between variables X, Y, and Z.

  2. What factors do we have left after we eliminate S?

    Enter the factors as a python list of strings ["XYZ", ..., "X"] in any order where XYZ denotes a factor between variables X, Y, and Z.

  3. What factors do we have left after we eliminate C?

    Enter the factors as a python list of strings ["XYZ", ..., "X"] in any order where XYZ denotes a factor between variables X, Y, and Z.

  4. How many variables were in the largest table we had to compute as an intermediate result in this whole process?

    Enter an integer.

  5. If we instead used elimination order C, S, R, how many variables would have been in the largest table we had to compute as an intermediate result?

    Enter an integer.
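For intuition, a single elimination step on a made-up two-variable toy (not this problem's graph) looks like: multiply together every factor that mentions the variable, then sum the variable out, and replace those factors with the result.

```python
import numpy as np

# Toy elimination step with made-up binary tables; not this problem's graph.
phi_S  = np.array([0.4, 0.6])        # phi(S)
phi_SC = np.array([[0.9, 0.1],
                   [0.2, 0.8]])      # phi(S, C)

# Eliminate S: multiply the factors mentioning S over the union of their
# variables (S, C), then sum out S (axis 0). The intermediate table here
# has 2 variables; tracking that size answers questions like 4 and 5 above.
new_phi_C = (phi_S[:, None] * phi_SC).sum(axis=0)
print(new_phi_C)  # [0.48 0.52]
```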

7) Representations §

Let's think about the role of the algorithms and representations we have been thinking about in the context of an agent that is reasoning about the world state.

  1. In a CSP we can think of a complete assignment as representing

    a single state of the world
    a set of possible states of the world
    a distribution over possible states of the world

  2. In a CSP, we can think of a partial assignment as representing

    a single state of the world
    a set of possible states of the world
    a distribution over possible states of the world

  3. In a PGM, we can think of a set of factors as representing

    a single state of the world
    a set of possible states of the world
    a distribution over possible states of the world

  4. In a PGM, we can think of a set of factors together with some additional conditioning variables as representing

    a single state of the world
    a set of possible states of the world
    a distribution over possible states of the world

8) Feedback§

  1. How many hours did you spend on this homework?

  2. Do you have any comments or suggestions (about the problems or the class)?