HW 3 -- Continuous, conditional, and conformant planning
Please Log In for full access to the web site.
Note that this link will take you to an external site (https://shimmer.mit.edu) to authenticate, and then you will be redirected back to this page.
This week's Colab notebook can be found here: Colab Notebook (Click Me!)
1) Robot Motion Planning§
1.1) Configuration Space§
-
True or False: Consider a robot that is a cube with sides 2r and the obstacles are all cuboids (rectangular prisms). If a pure translation path exists in a problem with a point robot and all the dimensions of the obstacles increased by r (each edge is pushed r apart from the center), a path exists in the original problem.
True False -
True or False: Consider the same robot and environment. If a pure translation path exists in the original problem. then a pure translation path exists in a problem with a point robot and all the dimensions of the obstacles increased by r (each edge is pushed r apart from the center).
True False -
True or False: Consider a robot that is a sphere with radius r and your obstacles are all cuboids (rectangular prisms). If a path exists in a problem with a point robot and all the dimensions of the obstacles increased by r (each edge is pushed r apart from the center), a path exists in the original problem.
True False -
True or False: Consider the same (spherical) robot and environment. If a path exists in the original problem, then a path exists in a problem with a point robot and all the dimensions of the obstacles increased by r (each edge is pushed r apart from the center).
True False -
True or False: Consider a complicated robot, with many arms, legs, and antennae, but it can be enclosed in a cube of sides 2r and your obstacles are all cuboids (rectangular prisms). If a pure translation path exists in a problem with a point robot and all the dimensions of the obstacles increased by r (each edge is pushed r apart from the center), a path exists in the original problem.
True False -
True or False: Consider the same complicated robot and environment. If a pure translation path exists in the original problem, a path exists in a problem with a point robot and all the dimensions of the obstacles increased by r (each edge is pushed r apart from the center).
True False -
If your robot has four 6-dof arms and the ability to fly freely, what is the dimension of its configuration space?
Enter the dimension as an integer. -
If your robot is 2 x 1 rectangle and can move in X, Y in the plane, and there is a single obstacle, which is a 1x1 square, what is the shape of the configuration-space obstacle?
-
If your robot is a 2 x 1 rectangle and can move in X, Y, and \theta (rotation in the plane) and there is a single obstacle, which is a 1x1 square, what is the shape of the configuration-space obstacle?
1.2) Planning Algorithms§
For each of the following statements, indicate whether or not it’s true for the standard RRT with linear-interpolation extensions.
-
Mark all that are true.
The algorithm is probabilistically complete. The algorithm is correct (if a plan is found, it can be executed safely). The algorithm will tend to put more samples near the start state than most other parts of the space. It is necessary to have a description of the shape of the obstacles in configuration space. It is applicable to robot arms. It is applicable to fixed-wing flying robots. It is applicable to cars.
Artie suggests the following new planning method for robots.
- Sample 1000 points randomly in a bounded configuration space. Include the start and goal points.
- Test each pair of points to see whether they can be connected by a collision-free straight line in configuration space.
- Search the resulting graph of connected points for a path from the start to the goal.
- If a path cannot be found, throw away the current samples and graph, and return to the first step. Otherwise, return the path found.
For each of the following statements, indicate whether or not it’s true for Artie's algorithm.
-
Mark all that are true.
The algorithm is probabilistically complete. The algorithm is correct (if a plan is found, it can be executed safely). There is a parameter you can tune to take advantage of information you might have about the tightest clearance in the problem. The algorithm will tend to put more samples near the start state than most other parts of the space. It is necessary to have a description of the shape of the obstacles in configuration space. It is applicable to robot arms. It is applicable to fixed-wing flying robots. It is applicable to cars. -
True or False: Is Artie's algorithm the same as the Probabilistic Roadmap (PRM) algorithm that we have seen in the lecture?
True False
1.3) RRT§
The figure above visualizes an intermediate step in the RRT procedure.
- Configurations A through F are currently in the tree with the edges shown.
- Goal is the goal configuration.
- M is the configuration sampled by the RRT algorithm at this iteration.
- Of the configurations that are in the tree, D is the closest to M.
The figure shows several other configurations. Of these, which configurations and edges will be added to the RRT at this iteration?
-
Enter the set of nodes that will be added to the RRT. Your input should be a list of strings, e.g.,
['A', 'B', 'C']
-
Enter the set of edges that will be added to the RRT. Your input should be a list of strings. Each string has the form
parent_node-child_node
, e.g.,['A-B', 'B-C', 'C-D']
2) Implementing RRT §
You should understand extend_path
before you tackle the RRT implementation.
2.1) Understanding extend_path§
Below we provide pseudocode for the extend_path
utility:
- Input: initial configuration [x_0, y_0]; goal configuration [x_1, y_1].
- Compute the distance along x and y dimensions: d_x = |x_1 - x_0|, d_y = |y_1 - y_0|.
- Divide d_x and d_y by
max_diff
and round up, which yields n_x and n_y. - Let n = \max(n_x, n_y). n is the number of steps we need to take from [x_0, y_0] to [x_1, y_1].
- Do a linear interpolation from [x_0, y_0] to [x_1, y_1], with evenly-spaced n + 1 nodes (inclusive of the initial and goal).
- Iterate over the generated linearly-interpolated path, check for collision.
- If a collision is found, return the prefix of the path that does not have any collision.
Make sure you have understood what extend_path
is doing. Answer the following questions.
Suppose we have a 2D robot.
The initial configuration is [0, 0].
The goal configuration is [2, 4].
The hyperparameter max_diff
is set to 0.4.
There is an axis-aligned rectangular obstacle from [0.5, 0.9] to [2, 2].
We now run extend_path
based on the above arguments.
successful
)?
path
)? Input your answer as a python list of tuples. For example [(1, 2), (3, 4)]
2.2) Extend Path in a Robot Configuration Space§
In this section, you will implement an important utility for doing motion planning in a continuous space: extend_path
.
The function takes in a RobotPlanningProblem. It makes a linear interpolation between the start and end
configurations in the robot's configuration space.
Its output is a path, that is, a list of robot configurations along the line from start
to end
.
When the start
and end
cannot be directly connected with a line because of collision,
your function should return a "prefix" of the path.
In other words, you should try to move along the linearly-interpolated path towards end
until you reach some obstacles in the environment.
Take a close look at the docstring of the RobotPlanningProblem
class before you implement this function.
Note that:
- Although our test cases are for a two-dimensional configuration [x, y].
The code forextend_path
is written so that it can work for any number of dimensions. Furthermore, the implementation of RRT should also work for any dimension without change. - Our solution does not use
numpy
. However, if you are comfortable withnumpy
, feel free to use it to shorten your implementation.
For reference, our solution is 18 line(s) of code.
2.3) Extend Path Visualization§
We have created a visualization tool for you in colab to visualize the path. You can run the code blocks and they will generate the visualized map and the path.
2.4) RRT§
In this section, you will implement the RRT algorithm we talked about in class to find a path from one configuration to the goal configuration in a 2D environment.
We have implemented some utility classes (RRTNode, RRT) for you,
as well as their corresponding visualization tools.
You should take a close look at their docstrings before you implement
this function. In particular, RRT.__init__, RobotPlanningProblem.sample, RRT.extend, RRT.nearest, RRTNode.backtrace, and extend_path from 2.2 should be helpful. To run in Colab, you will need an implementation of extend_path
,
if you did not write your own, use the solution in Catsoop.
Hint: Recall that there is a hyperparameter in RRT, which balances how frequently we want to test if the goal configuration can be reached from a node in the current RRT tree. Our suggested value for this hyperparameter is 0.1.
For reference, our solution is 19 line(s) of code.
In addition to all the utilities defined at the top of the Colab notebook, the following functions are available in this question environment: extend_path
. You may not need to use all of them.
2.5) RRT Visualization§
We have created a visualization tool for you in colab to visualize the RRT tree during your search. You can run the following code blocks and they will generate the visualized the map as well as some lines showing the edges in the RRT.
If you run this cell multiple times, you will see different RRT trees due to different random seeds. You will find that sometimes the tree has very few nodes -- indicating that your RRT algorithm finds a solution very quickly. Sometimes it takes many iterations to find the solution. This suggests that the algorithm is very sensitive to random seeds, which is bad for practical usage. Thus, in practice, we shuold always use improved RRT algorithm variants such as BiRRT (bidirectional RRT).
We recommend you try in your colab notebook to see how the hyperparameter goal_sample_probability
will affect the search process.
Specifically, try to set that probability to 0.05 or 0.2 and visualize your result.
3) Trajectory Optimizaton §
In this section, you will implement trajectory optimization by minimizing cost functions.
Similar to the robot motion planning section, we will consider \mathbb{R}^2 as both the configuration space and workspace (i.e., the robot is a point). A trajectory optimization problem consists of:
- an initial configuration
initial
, - a goal configuration
goal
, and - a cost field over \mathbb{R}^2.
The objective of a trajectory optimization problem is to find a trajectory, a path between
initial
and goal
, that minimizes the "cumulative cost" and the trajectory length.
Since we are dealing with a continuous state and action space, the "cumulative cost" becomes a path integral over the cost field.
We have provided you with a base class, TrajOpt2DProblem
, an abstraction for a general trajectory optimization problem.
Further, we have provided two subclasses: the GMTrajOpt2DProblem
and the RobotPlanningTrajOpt2DProblem
.
- A
GMTrajOpt2DProblem
is a trajectory optimization problem whose cost field is a mixture of (scaled) Gaussians. The cost field of this problem is similar to the reward field of theFractalProblem
you have seen in HW01. In particular, minimizing the negation of a cost field is equivalent to treating the cost field as a reward field and then maximizing cumulative rewards. One of our predefined problems, namedgm-2
, implements this idea. - A
RobotPlanningTrajOpt2DProblem
has a similar setup asRobotPlanningProblem
from the previous section. Recall that in aRobotPlanningProblem
, the objective is to find a path from initial to goal that avoids some obstacles. To achieve the same objective, aRobotPlanningTrajOpt2DProblem
defines its cost field in terms of the obstacles. In particular, at any point p \in \mathbb{R}^2, the cost field is:\mathsf{cost}(p) = \max_{o \in \mathsf{obstacles}} \max\left(0, o_\mathsf{radius} + \mathsf{margin} - \| o_\mathsf{center}, p \| \right)Here, \mathsf{margin} is a non-negative "safety margin" by which we enlarge each obstacle. The expression \| o_\mathsf{center}, p \| means the Euclidean distance between the obstacle's center and the point p. Therefore, the cost field given above is zero outside of all obstacles' margins and positive otherwise. Further, the costs are higher at points closer to an obstacle's center.
3.1) Numerical Integrator Warmup§
In this problem, you will implement an integrator for a general TrajOpt2DProblem
using numerical integration.
Given a TrajOpt2DProblem
instance, we will use
numerical integration
implemented in scipy
to approximately compute the path integral over the cost field of the problem.
The path integral follows the formula as in AIMA (Section 26.5, P949).
Note that since our robot is a point, the inner integration over all
points of the robot's body is unnecessary. So our formula will be a simpler
version of the one in AIMA.
Given the two endpoints a and b of a line segment, we parameterize the line segment as \tau(t) = (b - a) \cdot t + a for t \in [0, 1]. The path-integral of cost c over the line segment is then:
In words, the path integral of the line segment is the segment's length times the integral of the value of cost field over the line segment.
For this problem, we have implemented most of the code for you.
You should only need to fill in the integrand
function in the template.
For reference, our solution is 10 line(s) of code.
Note
Numerical integration is simple to implement --- especially with an
existing package like scipy
that already does all the heavy lifting for us.
It will work sufficiently well for the small scale problems in this homework.
However, In practice, it is almost always better to use closed-form
integration instead of numerical integration (if closed-form integration
is possible, that is).
Therefore, we have provided you with more efficient closed-form integrators
for the problems that you will work with.
When you implement and run the actual optimization (i.e., traj_opt
), we
encourage you to use our closed-form integrators since they should be faster
and thus easier for you to try out different parameters and settings.
3.2) Trajectory Optimization§
Implement trajectory optimization traj_opt
.
The traj_opt
function finds a piecewise linear trajectory by performing
optimization.
It minimizes the sum of two terms J_\textit{cost\_field} and J_\textit{eff}
and trades off between the two using \lambda_\textit{eff}:
For our setup:
- J_\textit{eff} = \sum_{p \in \text{segments} } \| p \| is the total length of the trajectory.
- J_\textit{cost\_field} = \sum_{p \in \text{segments}} \texttt{lineseg\_costf}(p) is the sum of the costs returned by
lineseg_costf
, for all line segments in the piecewise linear trajectory.
We will use the BFGS optimizer
implemented in scipy
to carry out the minimization.
Here's a rough overview of the algorithm:
- Create an initial path with
num_waypoints
number of intermediate points by linearly interpolating between the initial configuration and the goal configuration, This step is similar toextend_path
. - "Slice" the initial path to contain only the intermediate way points --- these are the parameters of the optimization problem.
- Add a small random noise perturbation to the initial parameters. (What potential problem are we trying to solve by doing this?)
- Build the cost functions J_\textit{cost\_field}, J_\textit{eff} and J.
- Minimize J using
scipy.optimize.minimize
. - Recover the complete trajectory from
scipy
's optimization result.
Hint: the following numpy
functions might be quite handy (though not strictly necessary):
np.linspace,
np.random.normal,
np.stack,
np.vstack,
np.reshape,
np.flatten.
In general, leveraging numpy
could make your life easier! If you get stuck or unsure how to use these functions,
please feel free to reach out and we will be happy to help!
Note Sometimes scipy
might return a result with a success=False
flag.
For our example problems, one common reason of this success=False
flag is the
loss of accuracy during the optimization.
Usually, this is fine and scipy
will still return a fairly good solution.
However, if you do run into wierd results and get stuck, please reach out!
For reference, our solution is 36 line(s) of code.
In addition to all the utilities defined at the top of the Colab notebook, the following functions are available in this question environment: LineSegCostNumericalIntegrator
. You may not need to use all of them.
3.3) Trajectory Optimization Visualization§
We have created a visualization tool for you in colab to visualize the result of your traj_opt
.
You can run the code blocks and they will generate the visualized cost field and the found trajectory.
As you do the visualization, try to:
- Visualize the different problems in
get_traj_opt_problems
:gm-1
,gm-2
,robot-planning-1
,robot-planning-2
,robot-planning-3
- Vary the parameter \lambda_{\textit{eff}} --- are larger or smaller values more likely to produce an obstacle-avoiding path in
robot-planning-*
problems?
3.4) Tuning \lambda_{\textit{eff}} §
Let us visualize the effect of \lambda_{\textit{eff}}.
For this problem, fix num_waypoints
as 6.
Then, answer the following questions by visualizing the found trajectories by traj_opt
.
-
For
robot-planning-1
, which of the following setting(s) of \lambda_{\textit{eff}} produced a path without collision?0 0.5 1.0 10.0 15.0 -
For
robot-planning-1
, which of the following setting(s) of \lambda_{\textit{eff}} produced a path without collision, but the path is wasteful (i.e., longer than necessary)?0 0.5 1.0 10.0 15.0 -
For
robot-planning-2
, which of the following setting(s) of \lambda_{\textit{eff}} produced a path without collision?0 0.5 1.0 10.0 15.0
3.5) Got Stuck? §
Trajectory optimization some times return solutions that are locally optimal but not globally optimal (with respect to the cost function J).
-
Figure A. and B. depict two found trajectories and costs for
gm-2
. Which of the above trajectories cannot be globally optimal?A B
-
Figure C. and D. depict two found trajectories and costs for
robot-planning-3
. Which of the above trajectories cannot be globally optimal?C D -
Consider the the following found trajectory by
traj_opt
:
traj_opt
to produce a non-colliding trajectory?
Assume that traj_opt
always produce a locally (or globally) optimal trajectory.Add to the cost function J a positive constant | |
Add to the cost function J a negative constant | |
Increase \lambda_{\textit{eff}} | |
Decrease \lambda_{\textit{eff}} | |
Increase the value of margin in RobotPlanningTrajOpt2DProblem | |
Decrease the value of margin for RobotPlanningTrajOpt2DProblem |
4) And-Or Search§
In this section, we will consider and-or search in non-deterministic domains. In AIMA Section 4.3.1 you will find a definiton of a two-room "erratic vacuum world". What makes it erratic are the outcomes of the Suck action:
-
When applied to dirty room the action cleans the room and sometimes cleans up dirt in an adjacent room too.
-
When applied to a clean room the action sometimes deposits dirt on the carpet in that room (but does not affect adjacent rooms).
In this problem, we ask you to define a three-room erratic vacuum world. The action set will remain the same. The transition definition is as in the two-room case, except that since the middle room has two adjacent rooms, the Suck action in a dirty middle room could also clean the dirt in (exactly) one of the two adjacent squares non-deterministically.
We will refer to the rooms in the three-room problem as "left", "middle", "right". You should define states by 3 letters indicating either Clean (C) or Dirty (D) for each of the rooms (left, middle, right) and one final letter defining the location of the robot (one of L,M,R).
In the Colab, you will find the definition of the two-room problem using some helper code for defining graphs. In the code box below, define the three-room problem by returning a dictionary, from a state to a list of successor states, which will be used to define a Graph
. Use the style for defining a dictionary illustrated in the vacuum_world
definition in the Colab, which is relatively compact.
4.1) Three room erratic vacumm world§
Write code to create and return a dictionary that defines a Graph representing transitions in the three room erratic vacuum world (try not to write the dictionary by hand).
Use letter M
to denote that the agent is in the middle room.
For example, CDCM
means the left room and right rooms are clean, the middle room is dirty; the robot is in the middle room
.
For reference, our solution is 27 line(s) of code.
4.2) Very Erratic Vacuum World 1§
Now, consider the "very erratic vacuum world" (VEVW) in which the vacuum when applied to a dirty square sometimes fails to remove dirt in the room when it sucks. It is still the case that an adjacent room may become clean.
Assume that the robot starts out in the Left room and all the rooms are dirty (state DDDL
) and the goal is to have all the rooms clean. What is the maximum depth of the optimal plan found?
depth 5: Suck, Right, Suck, Right, Suck | |
depth 9: Suck, Right, Suck, Right, Suck, Left, Suck, Right, Suck | |
There is no guaranteed plan without loops |
4.3) Very Erratic Vacuum World 2§
In the VEVW , assume that the robot starts out in the Middle room and only the middle room is dirty (state CDCM
). Assume that we can make plans with loops. Which of the following sequences, when looped, are guaranteed to (eventually) reach a state in which the middle room is Clean?
Repeat the sequence 'Suck' until the middle room becomes Clean | |
Repeat the sequence 'Left, Suck, Right' until the middle room becomes Clean | |
Repeat the sequence 'Suck, Left, Suck, Right, Suck, Right, Suck, Left' until the middle room becomes Clean |
5) Set-Based Uncertainty§
In this section, we will consider a planning problem in a non-deterministic domain. Specifically, consider a robot moving on a 2D grid world.
The robot can take 4 actions, moving N
(up) by 1 grid cell, moving S
(down), moving W
(left), and moving E
(right). If an action would run the robot into a wall, then the robot does not move. The state is simply a tuple (r, c), indicating the robot's position (row and column).
We will consider the following 4 by 4 map with walls surrounding it. There are no interior obstacles. For example, the following figure shows a visualization of the state (1, 2) (X is the robot position and # indicates wall).
0123
####
0# #
1# X #
2# #
3# #
####
We will be using sets to represent belief states. For example, if the belief of the robot's position is either (0, 0) or (1, 1), the belief state can be represented as {(0, 0), (1, 1)}
. In the problems below, belief states will be input as lists [(0, 0), (1, 1)]
, but the order of the elements doesn't matter.
5.1) Conformant Plan 1§
Consider the following setting.- The initial belief is that the robot could be anywhere.
- There are no observations.
-
What is the belief state after the action sequence:
['N', 'N', 'N']
? Enter a Python list of tuples. -
What is a shortest sequence of actions to reach the belief state
[(2, 2)]
? Enter a Python list of strings.
5.2) Conformant Plan 2§
Now, what if your wheels are slippery, but only to the south. So, if you execute action south, you might (non-deterministically) move 1 or 2 squares downward, except, of course, that you can’t move through walls. In this setting, consider the same planning task:
- The initial belief is that the robot could be anywhere
- There are no observations.
- The goal is to be certain it is in (2, 2).
[(2, 2)]
? Enter a Python list of string or 'None'
if one does not exist.
5.3) Too Slippery§
Now, what if your wheels are slippery, but both to the north and south, in the same sense as above, which is that they non-deterministically move 1 or 2 squares.- The initial belief is that the robot could be anywhere.
- There are no observations
[(2,2)] | |
[(1,2),(2,2)] | |
[(0,2),(1,2),(2,2)] | |
[(0,2),(1,2),(2,2),(3,2)] | |
[(0,1),(1,1),(2,1),(3,1),(0,2),(1,2),(2,2),(3,2)] | |
[(0,0),(1,0),(2,0),(3,0),(0,1),(1,1),(2,1),(3,1),(0,2),(1,2),(2,2),(3,2)] |
5.4) Using Sensors§
Now let's use the sensors. Specifically, suppose we have a sensor that can give us observations of walls next to us. Specifically, the observation at each state is a tuple of length 4, containing 4 values (n, s, e, w) (Note the order.). Each entry in the tuple indicates whether there is a wall at the grid north/south/east/west. For example, (0, 0, 0, 0) indicates that the robot is in a space that is not adjacent to any walls.Consider the following setting.
- The initial belief is that the robot is in one of the internal squares
[(1, 1), (1, 2), (2, 1), (2, 2)]
. - After each action, it will make the following observation: in each direction whether there is a wall.
- The robot has deterministic N, S, E, W movement. We are no longer slippery.
- The goal is to be certain it is in (2, 2).
-
What's the belief state (before considering the observation) if we take an action to move South? Enter a Python list of tuples.
-
What's the belief state if we start in the initial belief, take an action move South, and get the observation
(0, 1, 0, 0)
? -
Starting from the initial state, suppose that we moved one step South and got the observation
(0, 1, 0, 0)
, then, in the second step, if we move East and get the observation(0, 1, 0, 0)
, what's the belief state?
5.5) Conditional Plan in Belief Space§
Now let's consider making a simple conditional plan to reach the belief state [(2, 2)]. We have already done some of the work, and decided to start by executing actions 'S' and then 'E'. Now the robot will make an observation, and we need you to determine what subsequent actions it would need to execute to reach the target belief.
The initial belief is that the robot is in one of the internal squares [(1, 1), (1, 2), (2, 1), (2, 2)]
.
For each of the possible observations below, give an unconditional
sequence of actions (a conformant plan) that reaches the goal if such
a sequence exists or None
if none exists. The empty sequence []
means to not move.
-
Observation is
(0, 0, 0, 0)
. Enter a list representing an action sequence (as a list of strings) orNone
. -
Observation is
(0, 0, 1, 0)
. Enter a list representing an action sequence (as a list of strings) orNone
. -
Observation is
(0, 1, 0, 0)
. Enter a list representing an action sequence (as a list of strings) orNone
. -
Observation is
(0, 1, 1, 0)
. Enter a list representing an action sequence (as a list of strings) orNone
.
6) Feedback§
-
How many hours did you spend on this homework?
-
Do you have any comments or suggestions (about the problems or the class)?