Leslie Kaelbling: 6.01 Lecture 11

Search transcript...

PROFESSOR KAELBLING: OK. So this is the second lecture on solving problems via search. What we saw last time was a couple of basic strategies for thinking about how to control a machine. So just remember, the recap is that we have some kind of a system. It's a robot wandering through the hallways. It's a farmer trying to get his possessions across the river. You're trying to solve an 8 puzzle. It's some kind of a system where we're trying to find a way to get that system into some state. And what we do is we describe just a few simple things about the system.

We have to describe how it is that you can-- when you provide an input to the system, it changes the system state. You have to describe what state the system starts in. And you have to describe what set of possible states are sufficient for termination. What are the good goal states in the system? And then the search algorithm, taking those pieces of description, will basically totally generically find you a sequence of inputs such that if you feed those inputs into the system one by one, it will drive that system into the state that you want.

So we last time looked at some different ways of finding paths through such a system. So here's an example of finding paths on a silly little map. And the idea again was that, well, you could think of representing all the possible paths from, say, if A was our starting state in the world-- this is the robot standing in City A, and its goal is to arrive at some other city-- then we could think of all the paths that start at City A by drawing a tree. And this tree has nodes in it. And just one more time to reinforce the fact that the nodes in this tree, the little circles in this tree, stand for partial paths through the roadmap.

So think of this as a road map and this as a description of paths through the roadmap. So this node here in this tree stands for the path ABC. You also see that there are several different occurrences of the same letter in here. So there's a D here and a D here. This D here stands for the path ABED. This D here stands for the path ADGD. Right? ADGD. So this is a road network. This is, implicitly, all the paths that you could take through that road network.

OK. So our goal is going to be to find shortest paths. That's what it has been so far. And we want to do it without making that whole tree. And in fact, we want to kind of explore the fewest choices that we can and still come up with the shortest possible path. OK. So we did this at pretty great length last time. And I'm not going to go through it in detail. In fact, it's sort of recapped in the slides. But we looked at a version of the search called depth first, where what we do is repeatedly-- the way the algorithm works is we sort of create the tree as we're going along. And what we do is repeatedly take a node in the tree that we have created so far and put its children into the tree.

And really, all the variation in the search algorithms is about which one of the nodes in the existing tree we take out and examine and expand and put its children back into the tree. If we take out the node that we put into the tree most recently, then we get depth first search. And it has the character of expanding this node and then this one and this one and this one. It'll go down the left. A depth first search that picks its nodes in a different order which expands the rightmost child first will go down this way. But it's still a depth first search in the sense that it's going down and down and down.

We also looked at breadth first search, which has the character of going through the search tree in ranks. It says-- first, it looks at all the paths of length two. So it looks at AB and AD. Then it's going to look at all the paths of length three. And then it's going to look at all the paths of length four and so on.

And what we found out about breadth first search-- I don't know. What did we find about breadth first search? So if you wanted to be guaranteed to find the shortest-- in the sense of the number of hops-- path in a tree, which search algorithm would you use? You'd use breadth first search. Depth first is not guaranteed to find you the shortest path in terms of the number of hops, but breadth first is. OK. So that's depth first and breadth first search.

And dynamic programming was this idea that said that well, if we're trying to find the shortest path from one point to another point, it's going to be made up of shortest paths to its intermediate points. And so when we're thinking about finding paths in the tree, we only ever care about the shortest path from the start state to somewhere. Once we found the shortest path from the start state to somewhere, we don't need any more paths from the start state to that place. So that was the dynamic programming idea. OK. Good.

So what we're going to do today now is generalize the framework a little bit and think about problems where not every step has the same cost. So I have a cool GPS in my car. And it plans paths. And it uses algorithms internally that are going to be somewhat like the ones that we'll talk about today. But it's pretty clear that it doesn't-- when I ask for a path from my house to MIT or something, it doesn't count the number of intersections. So you might imagine that intersections are the nodes, are the states.

When my GPS is trying to find a way for me to get to MIT, you might think of the intersections, the places where you have a choice as being the states in the search. That would be sort of like the graph of the road network that we made. But it doesn't make sense to find the path with the fewest intersections, which is what breadth first search would do. It would try to find the fewest intersections. What kind of path would you like? If I was talking to my GPS, and I was trying to tell it what kind of a path I wanted from home to MIT, what features should that path have?

STUDENT: The shortest.

PROFESSOR KAELBLING: Shortest, measured how?

STUDENT: In distance between intersections.

PROFESSOR KAELBLING: Good. So it could be shortest in terms of distance. So you could say, all right. I'm actually going to-- when I look at each little segment of road, I could label it by how long it is, how long it is to drive on it. And so if I did that, then I might be interested in finding the path that had the shortest sum of those costs along the road segments. So it could be distance. Is that always the best cost?

STUDENT: Time.

PROFESSOR KAELBLING: Time. Time's a good cost. So now I have-- my current GPS is pretty cool. And it listens on the radio for traffic information. And it also-- actually, it uploads my data. And it downloads your data if you have one. And it has estimates of how long it takes to traverse a particular road at a particular time. So I have it set up to try to optimize time for me, not distance. Because sometimes you can go a lot faster by going around a longer way. So you could imagine having all different kinds of costs. And in fact, even in my GPS, you can modify some other things about the cost. You can tell it that you don't want to go on toll roads, you hate freeways, or you love freeways.

You could try to say that I want to take the-- now I'm fabulating. Mine doesn't do this. I could try to say, I want to take the scenic route. Or I don't want the sun to be in my eyes. There's all kinds of different things that you could want of a path. And you could probably describe those in terms of the value to you of being on this piece of pathway as opposed to the other one.

OK. So what we're going to do today then is think about how we can do search effectively in problems where we actually want to minimize some measure of the cost along the path that we find. And the measure of cost can be anything you want. OK. So here's a contorted example. Here's our old little network that we were looking at before. Now imagine that actually C is way over here. So it costs a lot. It's a long distance to get to C. So now, how can we change our search algorithm so that we can find a shortest path in distance in this graph, as opposed to the shortest path in terms of the number of hops? I will try to say least cost during this lecture. But I'll say shortest a lot by mistake. So you can just throw something at me if I do that.

OK. So in breadth first search with dynamic programming-- I'm just going to buzz through this as if it were a movie-- it's going to go like this. It's going to look at the nodes that are two away. And then the ones that are three away. And then the ones that are-- oh, and then it finds a path. OK. And what we do in breadth first search is that we are exploring or expanding nodes in order of path length, that is to say the number of hops. But if these guys are really long, then we're going to find a path of cost 12, where in fact, we should be able to find one of cost 4.

OK. So what we found last time was this kind of cool thing, that changing between depth first search and breadth first search was really a very small change in the search algorithm. We'll find that finding the least cost path is also a very small change in the search algorithm. OK. So instead now-- so we're going to explore an algorithm called uniform cost search. And the way I like to think about it is that it's going to search outward from the starting node, from the starting state. Breadth first search, it's like we search out in contours of the number of hops. Uniform cost search, you can kind of think of it as searching out from the starting state in contours of cost. So it's going to expand paths. And it's always going to expand the least cost paths first.

So the job now, when we pick a node out of the agenda, we're not going to pick the node that we put in last or the node we put in first. We're going to pick out the node that has the smallest path cost. That is to say, the smallest-- so a node represents a path. And the cost of a path is the sum of the costs along the way. So we're always going to pick out the node that has the least total path cost. And we're going to implement the agenda with something called a priority queue. We'll get to that.

OK. So what's a priority queue? A priority queue is an interesting data structure problem in computer science. And if you take 6.006 or 6.046, you'll learn a lot about how to implement a priority queue efficiently. So a priority queue is a thing that has the same external interface as a stack or a queue. It has an operation called push and an operation called pop. But when we push an item, we push it with a score. So it's actually a slightly different interface. So in a priority queue, we're going to put something in the queue. But we're going to put it in with a score.

And then we pop, instead of popping the first one or the last one, we pop the one with the smallest score. No matter whether it was put in recently or a long time ago, when we ask to pop, we always pop the one with the smallest score. If there are multiple ones with the same smallest score, then typically, it's unspecified which one you'll get. You shouldn't depend on that. So that's what a priority queue is. You can put things into it. But to get something out, you get out the smallest one. So here's an example of how a priority queue would work. So here we're going to make an instance of a Python priority queue. We can push A with a score of 3, B with a score of 6, C with a score of 1. When we pop, we'll get C because that's the guy who had the lowest score. We pop again, we'll get A.

So the order that we pushed them doesn't matter. It's just the score. So how do you implement a priority queue? Well, that's the interesting thing. So there's really interesting ways to implement a priority queue so that those operations of either pushing or popping don't take too long. Here we have a very simple implementation where we just store our data in a list. So here's our priority queue. Its data is a list. If we push something, we just push it onto the end of the list, append it to the end. But when it's time to pop, we actually use argmax. Anyway, what we do is we basically run along the list looking for the guy with the lowest cost. And we pull it out.
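In Python, that simple list-based implementation might look something like this. This is a minimal sketch; the class and method names are illustrative, not the official course code.

```python
# A minimal sketch of the list-based priority queue described above.
class PQ:
    def __init__(self):
        self.data = []                       # list of (score, item) pairs

    def push(self, item, score):
        self.data.append((score, item))      # just append; keep no ordering

    def pop(self):
        # Scan the whole list for the entry with the smallest score.
        best = min(range(len(self.data)), key=lambda i: self.data[i][0])
        (score, item) = self.data.pop(best)  # remove it and return the item
        return item

    def isEmpty(self):
        return self.data == []

# The example from the lecture:
pq = PQ()
pq.push('A', 3)
pq.push('B', 6)
pq.push('C', 1)
print(pq.pop())    # C -- the smallest score wins, not the push order
print(pq.pop())    # A
```

Pushing is cheap here, but every pop scans the whole list, which is exactly the inefficiency discussed next.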

So that means if we have a priority queue with 100 elements in it, it might take us 100-- we might have to examine all 100 items. In fact, in general, we will have to examine all 100 items in order to decide which one to take out. So that's kind of inefficient. And there are better ways to do it, which you'll learn about in future classes maybe. But for our purposes, our queues don't get that big. This works fine.

OK. We're also going to make a small change to the definition of what a search node is. So remember before, a search node, a node in our search tree-- it knew the state that was associated with it, the action that got us there from the parent node, and the parent node. And that was enough to encode a node in the tree. We're just going to add one more field here, one more attribute, which is the cost. And so the cost of a node in a search tree is going to be the sum of the costs to get from the root node down to the node in the tree. And so we can say, well, when we create a new search node, all we have to do is say how much the last step cost.

So imagine that we have-- we're just adding this guy to the tree. So this is a new node that we're adding. This is the parent. The node is labeled by a state inside it. There's a state inside. There's the action that we took to get here. And then to compute the cost of this whole path, we say the cost of the whole path is the cost of the parent-- which is how much it cost to get this far-- plus the action cost. And that gives us the cost to get from the root node all the way down to this guy. All right. So just important to keep in mind that when we talk about a path cost, that's the sum of the costs of the individual actions starting at the root and coming down to here in the tree.
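A sketch of that node class in Python might look like this. The names are illustrative, not necessarily the official lab code.

```python
# A sketch of a search node that carries its path cost.
class SearchNode:
    def __init__(self, action, state, parent, actionCost):
        self.state = state
        self.action = action     # action that got us here from the parent
        self.parent = parent     # None for the root node
        if self.parent:
            # cost of the whole path = cost to reach the parent + last step
            self.cost = self.parent.cost + actionCost
        else:
            self.cost = actionCost

    def path(self):
        # Walk back up through the parents to recover the (action, state) path.
        if self.parent is None:
            return [(self.action, self.state)]
        else:
            return self.parent.path() + [(self.action, self.state)]
```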

OK. Uniform cost search then is the following. It requires two conceptual changes to the search that we had before. It's a lot like breadth first search in the sense that we're going to search out in a kind of way that grows out from the starting state. But it's going to grow out in contours of cost instead of number of hops. So the first thing is that the agenda now, instead of being a stack or a queue, is going to be a priority queue, a PQ. And there's another important feature, which I'll tell you about now but which you can appreciate when we do an example. And that is that we are not going to test to see if a state is a goal state until we pull it out of the agenda.

And that's because it can happen that we find a path to the goal first that's longer than a path that we will find later. So we're going to expand nodes in order of their cost. But we won't necessarily put them into the agenda in the order of their cost. I'm going to write this on the board because this is an important-- this is the critical thing. So in uniform cost search, we expand nodes in order of path cost. And remember that expand means take out of the agenda and put its children in. So as we take nodes out of the agenda, we're going to be systematically taking them out in the order of how long their paths are. But we do not necessarily put them into the agenda, or discover them, in that order.

In breadth first search, as soon as we found the goal, we could just return it knowing that we had found the shortest path to it. In uniform cost search, we can't be sure that we have found the shortest path to the goal until we pull that goal node out of the agenda. So we can see the goal, we can find it as a child, but we have to put it in the agenda and wait till we get it out to declare that we found the shortest path. So we'll see this in an example.

OK. I'll come back to the dynamic programming. Well, no. OK. I'll tell you about dynamic programming. OK. So the dynamic programming principle is actually the same as it was in breadth first search. So before we said, the shortest path from x to z that goes through y is made up of the shortest path from x to y and the shortest path from y to z. So that remains true. In breadth first search, we were thinking of shortest as being the number of hops. And now we're going to think of shortest as being the sum of the path costs. But it's still a true thing.

And what that means is again that because we expand nodes in order of path cost, that the pruning principle is going to be that we should never expand a node if we've already expanded a node that goes to the same state. So again, we're going to put that test in a slightly different place in the code than we did for breadth first search.

So here's how it goes. We're going to keep a list called expanded. Expanded is going to be a list of states. And what it means for a state to be on the expanded list is that we have found the shortest path to that state. We would have called it that, but it doesn't fit very well in the program. List of states to which we have found the shortest path-- that's really what expanded should be called. States to which we have found the shortest path. OK. So initially, we haven't found the shortest path to any states. Well, except A. We don't really have to worry about that too much.
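Putting those pieces together, the whole loop might look like this in Python -- a minimal sketch, reusing the PQ and SearchNode sketches above and assuming a successors function that returns (action, newState, cost) triples for a state.

```python
# A sketch of uniform cost search with the expanded-list pruning.
def ucSearch(startState, goalTest, successors):
    agenda = PQ()
    agenda.push(SearchNode(None, startState, None, 0), 0)
    expanded = []        # states to which we have found the shortest path
    while not agenda.isEmpty():
        node = agenda.pop()                 # least-cost path so far
        if node.state not in expanded:
            expanded.append(node.state)     # shortest path to this state found
            if goalTest(node.state):        # test only when we pull it OUT
                return node.path()
            for (action, newState, cost) in successors(node.state):
                newNode = SearchNode(action, newState, node, cost)
                agenda.push(newNode, newNode.cost)
        # no else: a path to an already-expanded state just gets dropped
    return None          # no goal state is reachable
```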

OK. So now, here we go. And let's just walk through the algorithm one more time just so that we can really understand how it's going. So what happens here? We have the agenda. We take something out of the agenda. That's going to be the least cost unexpanded path in our tree. If we have already expanded the state of this guy that we just popped-- I mean, excuse me. If we haven't already expanded it, then we are going to say, OK, now we've expanded it. We've now found the shortest path to this state.

So if we haven't already found the shortest path to this state, we're going to say that now we have found the shortest path. We're going to check to see if it's a goal. Because if it is, we found the shortest path to the goal. And we're done. And then what we're going to do-- if it's not the goal, then we're going to go ahead and put its children into the agenda. And if, when we pull a node out of the agenda, we discover that we've already found the shortest path to the state that this is a path to, we just throw it on the floor. So this if doesn't have an else. So if, when we pull something out of the agenda, we consider this path, and we've already been to that state before, and we've already found the shortest path to that state, we just ignore it and go around again. So that's how the dynamic programming is going to work in this case. And basically, we're always going to do dynamic programming in uniform cost search. There's really no reason not to. It's just one little extra bit of bookkeeping.

All right. So let's see how that works in this simple example. So again, we're in A. We want to try to get to I. So we put A in the agenda. Now our agenda is always going to have costs associated with it. So we put A in the agenda with cost 0. We pull something out of the agenda. The only thing we have to pull out is A. So we pull out A. And we expand it. So we put in its children. So we put in AB and AD. And they each have cost 1.

OK. We'll take out AB and put in its children. OK. So the children of AB are ABC. That has cost 6. And ABE. That has cost 2. All right. So we've added those to the agenda. Now because this is a uniform cost search, we're going to pick out the node with the least cost, the least cost path. So the least cost path is AD. So we'll take out AD and put in its children, ADE and ADG. They're each 2.

Now at this next moment, breadth first search would have just taken the next guy, which would have been ABC. But we're not doing breadth first search. We're doing uniform cost search. So we skip that guy because it has a high cost. We pick out one of the ones with a low cost. It could be any of these cost 2 guys. Our implementation will pick out this one. But again, it's not a critical thing. Put in its children, ABEF and ABEH. We'll take out ADE.

OK. So ADE, where can that take us? ADEF and ADEH. Wait. Excuse me. All right. ADG. ADG goes to ADGH. OK, good.

So now, what happened? OK, so ADGH. Now we've done all the 2's. We're going to take out a 3. We take out ABEF. And from ABEF, we can go ABEFC or ABEFI. So we put those guys in with their costs. ABEH, ABEHI. OK. There we go. Now we have two 4's, a 3, a 6, and an 8. So we're going to take this 3 out. ADGH. ADGH. From ADGH we could add ADGHI.

OK. So why is ADGHI not in there? I actually don't know. Excuse me? H has already been expanded. Indeed, it has. Right there. ABEH. So we've already found the shortest path to H.

STUDENT: [INAUDIBLE].

PROFESSOR KAELBLING: Excuse me?

STUDENT: [INAUDIBLE] two shortest paths.

PROFESSOR KAELBLING: All right. You know what? I have to do this with a tree. So let's do it with a tree. Yes. OK. We're going to do it with a tree. Here we go. All right. So we start with A. And from A, we can go to-- you guys have to help me because it's small here. B, D. And the cost to go to B is 1. And the cost to go to D is 1. OK.

Now from B, we can go to C or E. C. The cost to C is what? Path cost?

STUDENT: 5.

PROFESSOR KAELBLING: Path cost. 6. OK. So that's the cost of the whole path. Good. Now ABE. Path cost 2. OK. AD-- we can go to E. Path cost 2. ADG, path cost 2. All right. So there we go. Now we're going to take one of the 2's. So what we took here was ABC. No, we didn't. We took ABE. ABE. OK. So ABE. From E, we can go to D, F, or H. But we've already expanded D. So we don't do it. Right? So I'm going to draw D but with a-- like that. Because we've already expanded it.

So F or H. F or H. And what are the costs to go to F and H, the path costs?

STUDENT: 3.

PROFESSOR KAELBLING: 3. So all the way down. 3 to F, 3 to H. OK, good. Now, what's the next thing we can do? E. Should we do something with E? Why? We already visited it. Excuse me. We already expanded it. We already found the shortest path to E. So never mind that guy. OK G. ADG. From G, we can go to D or H. D is uninteresting. We could go to H with a cost of 3. OK. Good.

So now, our frontier-- we've got 6, 3, 3, and 3. So we pick a 3. We could pick this F. From F, we can go to C, E, or I. So we could consider C briefly. Well, no. We could consider C. E. Oh, no that's dumb. I. OK. So what's the cost of that path? 8. OK. [INAUDIBLE] get there. What's the cost of this one? That's going to be 4. OK. H. From H, we can go to E, G, or I. E, uninteresting. G, already expanded. I, OK.

OK. Who's next? This H over here. Oh, but wait. We already did H. All right? So now, who's next? I. We pull out I. Oh, I was the goal. Yay. OK. So we end up with this as our solution. Yup?

STUDENT: When we got to ADE--

PROFESSOR KAELBLING: ADE.

STUDENT: Juan told us not to expand D again.

PROFESSOR KAELBLING: Because we had expanded E. So the rule is if you've ever expanded a path to a state, never do it again.

STUDENT: [INAUDIBLE] we won't expand it again.

PROFESSOR KAELBLING: Yeah. Just to get the vocabulary straight, which is tricky-- so if we have expanded any node representing a path to state S, we don't expand any other nodes that are paths to state S. So a node in the tree represents a path. Paths are paths to a state. So this is a path to D. This was a path to D. And so this node represents a path to D. This node represents a path to D. Because we had already expanded this node, which was a path to D, this node here, which is also a path to D, is not expanded. OK. Yup?

STUDENT: In the example, we have ABC. That has cost 6. And then we have ABEFC, which has 8. Is it supposed to be [INAUDIBLE]?

PROFESSOR KAELBLING: Excuse me? So right. So AB-- Oh, you're saying we have two paths to C here in the tree right now. You're saying, well, this is pretty silly because in some sense in our head, we should already know that we don't need this guy if we have this guy. Is that it? That's right. And it'll work out fine as a byproduct of the way the algorithm is set up. You could try harder in advance to not put this guy in the agenda at all. As it is, he'll kind of languish in there. This is a pretty efficient implementation as it is. But you're right, that we know intellectually, having found this path to C, that we don't need this one. But it's not because we found this one chronologically later. It's because we found it with a longer path. Other questions? OK. Good.

STUDENT: [INAUDIBLE] code as the list of the nodes we've expanded.

PROFESSOR KAELBLING: It's called expanded.

STUDENT: Oh. OK. That's not on our notes. Never mind.

PROFESSOR KAELBLING: It's not? Oh, I'm sorry.

STUDENT: Oh, yeah, it is. Sorry.

PROFESSOR KAELBLING: OK. OK. Flip the page. Yeah, OK. OK. So good. So this is uniform cost search. It'll find the shortest path in terms of cost, least cost path, [INAUDIBLE] something. The implementation's pretty straightforward. We really mostly just put it in a priority queue. But we also have to be sure that we test for the goal when we take something out of the queue. It has to be that there aren't any negative costs or 0 costs. Negative costs are especially troublesome. Why is that? What if we had negative costs? We might end up going around in a loop decreasing our cost. So it might be that there is no least cost path or that the least cost path is infinite.

And zeroes are trouble too for similar reasons. So all costs have to be positive. And it will always return the least cost path if a goal state is reachable from the start state.

OK. So that is good. But it has the property-- uniform cost search has the property of, as I said-- OK. So here's the United States. I have really bad geography. Imagine we're in Chicago, and we're trying to go to Boston. And we turn the GPS on. And the GPS starts doing uniform cost search. It's going to kind of consider these states. And then it's going to kind of consider some over here. And then it's going to kind of consider some over here. And pretty soon, you're going to be thinking about South Dakota. Is that a good idea, thinking about South Dakota?

I don't know. Maybe if you live in South Dakota, it is. But if you're trying to go from Chicago to Boston, it probably isn't. OK. So there's nothing in uniform cost search that has the goal in mind at all. All it does is it goes, and it says, all right, I'm just going to radiate out from where I'm starting in order of how long it takes me to get to places. And every time I find a shortest path to a state, I'm going to say, oh, is this a goal? If it is, cool. But that's it. That's the only way the goal enters into the computation.

So another thing that's really useful if you have some way of doing it is to try to inform the search process. So what we're going to do now is think about how if you had some idea about how to help the search algorithm out, you could use it. So do you have an idea about how to help this search algorithm out? Why is it not good to be thinking about South Dakota when you're trying to go to Boston from Chicago? What? It's in the wrong direction. Right?

I mean, intuitively, we all know. Everybody knows. It's the wrong direction. OK. So we have an underlying idea roughly when we're solving a problem that involves driving around in a map that probably we're going to need to go this way. Now we accept the fact that the roads aren't totally lined up between here and where we want to go. And there probably isn't a totally straight line freeway from Chicago to Boston that we could just hop on and go. But it seems like a good idea to be trying to go in this direction. As much as you can, it seems like you should be trying to go in that direction.

So we have a way to build into our algorithms the notion of trying to go in a direction. So this is just an example of less interesting searching out from the middle. OK. So the trying to go in a direction idea is using something called a heuristic function. And the word heuristic means a whole bunch of things to a whole bunch of people. And if you start reading definitions of it, you'll get all kind of confused. So we're just going to use a particular technical sense.

So a heuristic function is a function that takes a state S as input. And it gives you back an estimate of the minimum cost to reach a goal from S. All right? That's what a heuristic function is. So in the example of driving around in the map, if Boston is the only goal, our goal is Boston, then an estimate of the minimum cost to reach a goal from S-- well, an estimate would be just the Euclidean distance. Or if our costs are in terms of distance, that would be a reasonable heuristic.

If our costs are in terms of time, then maybe a reasonable heuristic would be the distance divided by 60 in hours. Maybe we go 60 miles an hour, depends on where we are. OK. So everybody understand what a heuristic function is? I haven't said what we're going to do with it yet. But just what it is. It's an estimate. And sometimes it's easy to come up with these estimates. Sometimes it can be kind of hard to think about what a good heuristic estimate is. But think of it as a "hestimate."
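For the map example, a "hestimate" might be sketched in Python like this. Note that coordinates is a hypothetical function giving each state an (x, y) position, and the 60-miles-an-hour top speed is just the assumption from above.

```python
# A sketch of a map heuristic: straight-line distance to the goal.
import math

def distanceHeuristic(state, goal):
    # coordinates(state) is a hypothetical helper returning an (x, y) pair.
    (x1, y1) = coordinates(state)
    (x2, y2) = coordinates(goal)
    return math.sqrt((x1 - x2)**2 + (y1 - y2)**2)

def timeHeuristic(state, goal, maxSpeed=60.0):
    # estimated hours, as if we could drive straight there at top speed
    return distanceHeuristic(state, goal) / maxSpeed
```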

OK. So here's our "hestimate," estimate of how to get from Chicago to Boston, how much it will cost to go from Chicago to Boston. OK. So now what we're going to do is we're going to make a tiny variation on uniform cost search. So in uniform cost search, we would always take the node out of the agenda to expand that had the least cost so far. The small change we're going to make is that we're going to take the node out of the agenda that has the least sum of the cost so far plus the estimated cost to go. So maybe these two locations have the same cost so far, the same path cost in the uniform cost search sense.

But what we're going to do now in order to decide which one to take out and expand is to add up the cost so far plus the heuristic. And for this guy, the cost so far plus the heuristic. So this guy's estimated cost to Boston is probably considerably higher than this guy's estimated cost to Boston. So that means we would take this node out of the agenda to expand first.

So that's going to be the idea. Instead of just expanding based on the path cost so far, we'll add in an estimate of how much farther we have to go. So here in this-- oh, yeah. There's this funny phrase that comes up probably only in AI search heuristics. The Manhattan distance. The Manhattan distance from A to E is how long it would take to walk there in Manhattan, that is to say, where you can't cut diagonally between A and E. You can't go through the field. There's no field to go through. You have to go down and over.
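As a quick sketch in Python, assuming states are (x, y) grid positions:

```python
# Manhattan distance on a grid: the difference in x-coordinates plus the
# difference in y-coordinates, since you can't cut diagonally.
def manhattanDistance(state, goal):
    (x1, y1) = state
    (x2, y2) = goal
    return abs(x1 - x2) + abs(y1 - y2)
```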

So the Manhattan distance between A and E is 2. Between A and H, it's 3. That also happens to be the shortest distance in this particular graph. So in this graph, Manhattan distance is like a really, really, really excellent heuristic. In fact, it pretty much tells you the exact cost to continue. So imagine that you wanted to go from E to I. Then you could put in E with the cost-- so let's say that E is the starting state. And I is the goal. You would put in E with your cost so far-- you haven't gone anywhere yet-- plus the heuristic distance to the goal. And if your heuristic distance is the Manhattan distance, that's easy to compute. Then you would put E in with a cost of 2. It's kind of an estimated cost. You're not sure. But you think the cost is probably 2.

Now you take E out, and you put in its neighbors. And you put in the path cost plus the Manhattan distance. So for B, it costs 1 to go from E to B. But then the Manhattan distance from B to I is 3. So that goes in with a total of 4. Same for D. For F and H, the path cost is 1 also. But the Manhattan distance is 1. So then now, when you take somebody out, you take the guy out with the least sum of the path cost so far and the heuristic cost.

So you would take out EF, and you'd get EFI. And you would-- let's see. You would take out another 2. Again, you take out the 2's in whatever order you want to. But you would get to the goal EFI much more quickly and without messing about over here. So the fact that we were using this heuristic, this estimate of how much farther we had to go means that we don't end up expanding these nodes that are way far out over here.

All right. Here's a much bigger and more interesting example. It's some crazy city network that I made up. But the costs are sort of reasonable. Or the road lengths are correct given this embedding in the plane. Anyway, imagine that we start here, and we're trying to get to X. So we start at G. We're trying to get to X.

If we do uniform cost search, this is the order that we are considering the nodes in. So this is just in order of path cost. And you can see, it's kind of going out. And it's kind of meandering around. And it's taking a long time to get over to X. In fact, X is the last node it gets to. That's because I made the example to be particularly dramatic. OK. So that's the path we get. That's the best way to get to X.

But now if we add the heuristic in there, we'll get something better. So adding the heuristic, we get this-- uniform cost search with the heuristic added is called A star. All right. So here's G. But now from each node, we consider the Euclidean distance as the heuristic. And so already, this T's pretty attractive because although we had to work kind of hard to get to T, its estimated cost to the goal is pretty good. And it's better than F's or I's. So we expand T. And we get to R. And R-- oh, R is pretty tasty. Oh, look. F looks better. But now after that, we just buzz right on down to where we're going.

So we were pulled down to X by this intuitive idea that we wanted to basically head in the right direction. So in map type examples, it's pretty easy to think about a good way to get pulled to the goal. In other examples, it's trickier. We'll talk about that a little bit. OK. So what do we have to change? So before, we had uniform cost search. This is the same uniform cost search we had before. There's a teeny little bit of red stuff in my slide. I don't know how to-- well, I didn't make it red in the handouts. It's red in the online stuff you can look at. All the way down here, all the way down here where if we're making a new node-- we're making a new node. And we're putting it in the agenda.

And before when we put the node into the agenda, we did it just with the cost that it took to get down there. Now we're going to put it in with the cost plus the heuristic cost. That's really the only change. And then we'll get this A star algorithm. Yeah?

STUDENT: [INAUDIBLE] as your heuristic. How do you find it without already knowing the shortest path?

PROFESSOR KAELBLING: Ah. Great question. So back in this example-- well, actually let me just draw a picture here in keeping with my geographic brilliance here. OK. So the question is the Manhattan distance seems like a particularly silly thing because if you're really living in Manhattan, you don't need to do a search anyway. Really, you could just kind of compute the path.

But what if you're living in Manhattan, only some of the streets aren't there? I don't know. Some of them aren't, right? So you might have a grid like this that doesn't have all the links in it. Manhattan distance is still easy to compute. It's the difference in x-coordinates plus the difference in y-coordinates. But that's now not the actual distance. It might be longer because some of the streets might not be there. So you might have to go up and around or something. So there is a difference. Even in a totally grid-like world, there's potentially a difference between the Manhattan distance and the shortest path distance if some of the arcs aren't there. Yeah. Good question. Other questions?
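To make the one-line change concrete, here's what A star might look like in Python -- a sketch reusing the earlier PQ and SearchNode sketches, with heuristic as an assumed function from a state to an estimated cost to go.

```python
# A sketch of A*: identical to the ucSearch sketch except for the priority
# used when pushing a node.
def aStarSearch(startState, goalTest, successors, heuristic):
    agenda = PQ()
    agenda.push(SearchNode(None, startState, None, 0), heuristic(startState))
    expanded = []
    while not agenda.isEmpty():
        node = agenda.pop()
        if node.state not in expanded:
            expanded.append(node.state)
            if goalTest(node.state):
                return node.path()
            for (action, newState, cost) in successors(node.state):
                newNode = SearchNode(action, newState, node, cost)
                # the only change: path cost so far PLUS the heuristic estimate
                agenda.push(newNode, newNode.cost + heuristic(newState))
    return None
```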

OK. We'll kind of introduce one more important idea. This is our last really important idea for this lecture, one that we'll explore a bit. So what I said before was that uniform cost search is guaranteed to find the least cost path. But we were sort of unhappy with it because it seemed a little bit aimless. So we added this idea of a heuristic to help it be more driven toward the goal.

So now the question is, well, if we do this, do we still find the best path? Do we still find the least cost path to the goal? And the answer is that we do find the least cost path if the heuristic is admissible. OK. So now we have to think about what admissible means. So an admissible heuristic always underestimates. Actually, that isn't exactly right. It's always an underestimate of, or equal to, the actual cost.

So it's optimistic. One way to think about an admissible heuristic is that it's an optimistic heuristic. It says, oh yeah. Yeah, yeah. I could get there. I could get there in just a couple minutes. No problem. That's an admissible heuristic for you. It's optimistic. If your heuristic is optimistic, then you'll always find the shortest path. Why is that?

So what if your heuristic was pessimistic? What if your heuristic said, oh, yeah. From-- where is that? Albany? I don't know. From Albany, it's really far. Albany to Boston, oh, I don't know. That's like 3,000 miles. I think that's really not a good idea. That's a pessimistic heuristic. Albany is really far from Boston. If that's true, it's going to basically force you to not go through Albany. If you're going along doing your search, and you say, oh, should I go through here? And then you'll get a cost in your search tree of however far it is from Chicago to Albany plus 3,000. And then you'll basically never get that node out of the agenda. So you will kill the possibility of going through there, probably. And you'll end up going some other way instead.

OK? So if the heuristic overestimates the cost, if it makes an option seem more expensive than it really is, then that could keep you from exploring that option. And that could keep you from finding the shortest path. So underestimates won't keep you from exploring options. OK. So what makes a heuristic good? It should be admissible. If you want to find the shortest path, you'd like your heuristic to be admissible.

The fact is that if you want your search to run fast, and you don't care about getting actually the shortest path, in many cases, people use non-admissible heuristics for really big problems, like probably for searching in the GPS. I actually don't know what they use in there. But they might not use an admissible heuristic. The closer your heuristic values are to the actual costs, the more efficient that will make your search. Because it'll just say, oh yeah, yeah, yeah. This is really the way to go. And you'll get a very focused search if it's close to the actual cost.

So the best heuristic in the whole world-- this is-- I lost track of the person who asked this question now. But anyway, the best heuristic in the whole world is, in fact, the actual cost to the goal. If somehow you miraculously knew the actual shortest cost and used that as your heuristic, you would expand hardly any extra nodes at all. And you would just go right over there. But of course, that's hard to calculate. So your heuristic needs to be-- it's a trade-off.

You'd like it to be close to the actual path cost. But you'd like it to be easy to calculate. And it should be in the same units as the cost function. That's relevant even for the lab today. So we're going to ask you to find a heuristic function for doing something. It's important that the heuristic be in the same units. So if you are trying to optimize time, and all your costs are in terms of time, then your heuristic needs to be in terms of time also. Because we're adding the heuristic estimate together with the path cost so far. So those costs have to be in the same units. But if your heuristic is admissible, you'll find the shortest path.

OK. So is Manhattan distance an admissible heuristic for navigating in Manhattan?

STUDENT: Well, there's Broadway. So I guess not.

PROFESSOR KAELBLING: Oh, good. OK, good. So people who know something seem to think that there's diagonal roads in Manhattan. So if there's diagonal roads in Manhattan, Manhattan distance is not an admissible heuristic. Because the Manhattan distance between here and here. Now I can't even tell you where the-- I don't even know. But OK. The Manhattan distance between here and here is this plus that. But the actual distance is to go down the diagonal.

So now, let's do this one. So this is now switching to something, to a problem that is not at all like driving around in a city. So this is the old-- the tile puzzle. We talked about it last time. You slide the little tiles into the gaps, and you try to get the numbers to line up. Or you throw it out the window, whichever comes first. So here are three possible heuristic functions for the tile puzzle. So you remember, a heuristic function takes a state. In this case, a state is an arrangement of the tiles on the puzzle. It takes the state, and it tells you how much it thinks it's going to cost to finish the puzzle off.

So there's going to be some goal configuration. It could be whatever you want. Maybe this is the goal configuration. Or maybe you have to make the picture of the dolphin line up on the front. Whatever it is, there's a goal configuration. So now the question is-- so a state is an arrangement of the tiles. And a goal is an arrangement of the tiles. And what we're interested in is finding the minimum number of moves that you need to take in order to get from the starting to the goal arrangement of tiles.

And so a heuristic would be an estimate for whether this arrangement of tiles-- how many more moves is it going to take to get you from there to the goal? That's what the heuristic is. It's an estimate of that. So here are three possible heuristic functions. One always returns 0 for every state. One returns the number of tiles that aren't where they're supposed to be. And another one returns the sum for each tile over the Manhattan distance from that tile to where it's supposed to be.

OK. So this is a very complicated check yourself. I'm going to ask you some smaller questions to work up to this. First of all, is heuristic A-- is it admissible? Yes. It is an underestimate of the cost to go. Is it legal? Is it OK? Yeah. It's legal. You can always have 0 as your heuristic. If 0 is your heuristic, your algorithm turns into uniform cost search. So it's a legal heuristic. It's admissible. What about the number of tiles out of place? Is that admissible?

Yeah. Because each of those tiles is going to have to get moved at least once in order to get put home. So that's an admissible heuristic. And the sum over the tiles of the Manhattan distances to their goals-- is that admissible? Yeah. That is too. If somebody thinks so, it's right. Because for instance, to get tile five to where it needs to go-- well, it just needs to move one to the right. So imagine tile three starting here, and we want to know, how much would it cost to get it there? Well, it has to at least go down, over, over or over, down, over or something. So tile three has to move at least 4 times in order to get where it's going.

So they're all admissible. All right. Now let's think about the number of moves in the best solution. So how does the number of moves in the best solution depend on the heuristic? So what I really want to know is between one and three, which one of those things do you think is true? Put your hand up if it's one. OK. Good. Yeah, I should have said that. OK, good.

Right. All equal. Why is that? You shouldn't have had to think about-- once we decided that they were all admissible, what we know about A star search is that if you do it with an admissible heuristic, you're guaranteed to find the least cost path. So those guys are all admissible, so they'll always find you a least cost path. So they'll find you paths of the same cost.

OK. The number of states expanded during the search. So this is kind of an in general question. But generally speaking, will they expand the same number of states? Or will-- so will two hold or will four hold? I promise you it's one of those. So put your fingers up. Two or four? OK, good. Four. Good. Four.

So the idea here is that with a heuristic of 0, you're not driven towards the goal at all. As the heuristic cost gets closer to the actual cost, you will tend to expand fewer nodes in your search. So the number that you expand with heuristic C will be less than the number you expand with B will be less than the number you expand with A in general.

These things are not absolutely, absolutely hard and fast, because some small details of which order the nodes get expanded in can sometimes change these things around. But generally speaking, that's right. OK. So the idea for how to find a heuristic is-- so sometimes, it's really mystifying. How in the world am I going to find a heuristic?

So you might say, if I already knew something about how to get to the goal, wouldn't that mean I could tell you exactly how to get to the goal? Well, roughly, somehow, right? But one thing that I just kind of want to tell you about is the idea of a relaxation of a problem. So a good way to find a heuristic is to make an easier version of your problem. So for instance, an easier version-- not necessarily easier in the sense of actually executing it in the world, but easier to think about.

So an easier version of finding your way to Boston in the road network as it exists is finding your way to Boston in a road network that's as perfect as you could want it to be, where there's straight lines between all of the cities. OK? And so the Euclidean distance heuristic is the actual cost in a problem that we made easier by letting ourselves have any roads that we wanted.

If you look at these heuristics, you can also think of them as solutions to relaxed problems. So let's consider C. Imagine that we changed the way the tile puzzle works, and you could have multiple tiles sitting on the same square at once. That's relaxing the problem. It's taking away actually the essential feature of the tile puzzle, the thing that makes it sort of fun. But if you took that away, then you could just move the little tiles around. And multiple tiles could stand on the same square. If that were true, then heuristic C in fact would be an exact cost for that problem. Because you just have to march each tile to where it needed to go. And that would be done.

So that's easier than solving the problem we have. It takes fewer steps. And therefore, the path cost in that problem is a heuristic for the real problem. Does that make sense? Another kind of relaxation that can sometimes be useful to think about is now imagine that I'm trying to plan a complicated path-- so not only am I trying to plan the path through the United States, but I also have to be worried about being sure that the car is full of gas, and my occupants don't have to go to the bathroom, and a bunch of things like that. So the space is kind of more complicated. And still, I have to get from Chicago to Boston.

Well, that's a very, very, very hard search problem. I might solve a simpler search problem to get a heuristic for the hard one. And the simpler search problem that I might solve could be actually just finding paths in the graph to Boston. So I could imagine using this problem as a heuristic for solving an even harder problem. So it's just an idea of solving a simpler problem if you can find one that has the property that solutions to the simpler problem cost less than solutions to the harder problem.

OK. So here's an example, just in the little eight puzzle with those three heuristics, using our implementation of A star. They all find a path of the same length, 22. But with heuristic A, which was 0-- so that's basically uniform cost search-- it expands 121,000 nodes. That's a really big tree, 121,000 nodes. But as we add these heuristics, which are better and better at estimating how far it is to get to the end, we expand many, many fewer nodes.

So it's really worth spending time thinking about a heuristic, trying to find some guidance that you can put into your search. I'm skipping that. I'm skipping that.

OK. So I'm just going to do a little bit of summing up because in fact, this is the last lecture. Woo. So let me just try to now say something about what we've been trying to do and why. OK. So the main goal here-- so the goal here has been to teach you a bunch of really interesting and important technical material but also to give you some important high level ideas. The most important one that I think of is that in order to understand or construct a really big and complicated thing, you have to be able to do it in terms of understanding little pieces and how they get put together.

And we've been doing that over and over in a variety of different kinds of systems to help you see how that process can go. So that when you're confronted with a big, complicated system of a different kind, you can find some appropriate way to decompose it so that you can solve it and understand it more directly. So we have thought about this in a bunch of different circumstances. We've looked at how data combines, little pieces of it get put together into big things, procedures, state machines, terminating state machines, polynomials, signals, systems, circuits, distributions, plans.

All these things are-- we start with a small vocabulary of primitives. And we study ways to make a more complicated thing out of collections of simple things. So if there's even one high level idea you get out of this whole class, it's that when you're confronted with some big, hairy thing, you have to find a system to decompose it. Or you just won't get anywhere.

For those of you who are interested in going on in EECS, there's a bunch of literature on the EECS website. And I'm not going to grind through the detailed requirements for you. But good courses to take to follow on here are actually-- yeah, here we go. So there's 6.02. So that's the other half of a lab class. For those of you who were in this for lab credit, you got half of a lab credit for this class. And you get half a lab credit for 6.02 if that's something you want to do.

You can also then go on to the next level of subjects in EECS. So 002 has 18.03 and 6.01 as a prerequisite. So you could take 002 next if you wanted to. You could take 005 next if you wanted to. As long as you have 6.042. It's in italics, so I guess it's a co-requisite. Ah, yes. Danny says yes. Go, Danny. Excellent. So you could take 042 and 005 at the same time next term if you wanted to.

If you've taken 042, you could take 006. You could take 007. That has the best number of any course in our whole subject, of course, 6.007. So what is 6.042? So let me just say something about 6.042. 6.042 is discrete math. It is, for some people, the first place that they've been exposed to the idea of proof. It's done in discrete situations, pretty much. There's discrete probability, some set theory, that kind of thing, induction. So 042 is a good thing to take because it's a prerequisite for a bunch of other things on the computer science side.

On the EE side, in terms of background, you're going to take 18.03 and 6.041, which is a probability class. I personally think that everybody should take 6.041 because I think probability is the best thing ever. But it is not yet a requirement for everybody in the department to take it. But everybody should. Yeah. I feel really strongly about that, actually.

Anyway, you can then pick a specialization. So the cool thing is, the way the curriculum is designed now, is that you can think of some topic area that you're really interested in, like maybe cryptography and security or maybe control theory or maybe robotics. And then you can look and actually plan a structure-- we use the word course so much here now. I don't have a word for it anymore. But anyway, a set of subjects that you can take that will satisfy the requirements for the major but will also focus in an area that you care about.

So rather than sitting and deciding whether you should be a 6.1 or a 6.2 or a 6.3, what you should sit and try to think about is, well, what am I excited about? Am I excited about circuits or nano fabrication or robots or algorithms or cryptography or what? And then you can kind of figure out how to take the courses that will let you explore that. So we have these tiers. They're foundations which really kind of get you going. We have header subjects, which are built on top of the foundations. And then we have advanced undergraduate subjects. And I think one of the most inspirational things to do is to go down the list of advanced undergraduate subjects, figure out which ones sing to you and then try to think about what the prerequisite chains for those would be.

So that's kind of how we want you-- if you are thinking about being an EECS major, this is the way we like you to think about that. There's also some new things. There's course 6/7, which is a combination of biology and EECS. These slides were most recently edited by [INAUDIBLE] before me. So he's in course 8 and 6. So he added course 8.Flex here as another way to get some interdisciplinary connection. There's also 18C. That's another way to get an interdisciplinary connection.

So good. So that's pretty much it. If you can help us in the following ways-- when it's time to do the MIT-- I actually don't know if this URL is correct. It's not available yet. When the subject evaluation URLs come out, I will email everybody. Also next Monday at this time and place, we'll have a feedback session. So we're not going to talk. But you are. If you come, you can tell us what you thought was good, what you thought was bad, how to fix 6.01 for next time. We'd like to hear that.

And let me just say what's happening this week. So be quiet for one more minute. OK. So this week there's software lab and design lab as usual. There's going to be a contest in design lab with cool prizes. But you need to be there and get checked off in both places. On Thursday evening are the nano quiz make ups. Be absolutely sure that you come allowing enough time to finish the number of nano quiz make ups that you want to finish. And you should allow a half an hour for each one. Well, maybe 20 minutes. 20 minutes or half an hour for each one. May 14, feedback session. We'll have review sessions for the exam. The exam's on May 21. That's Monday afternoon. If you need special exam accommodations, please get in touch with me right away. OK, thanks. We'll see you later.