Women of MIT: Celebrating Science and Engineering Breakthroughs (Session III)
CHISOLM: I'm Penny Chisholm. I'm chairing this next session, and I would like to add my voice to the chorus of thanks, to Ed for organizing this, and also to Nancy for everything she's done for women faculty at MIT over the years. She's really absolutely changed our lives. I refer to the initial group in this study as the uprising when I talk about it with my colleagues around the country, and they've all watched the progress here, and it's largely due to Nancy's leadership and the fantastic administration that joined us in that activity.
So now we have three exciting talks that cover the cosmos, tiny swimmers, and patterns in language. And that's just so MIT when I think about it. So our first talk is by Regina Barzilay, who's an associate professor of electrical engineering and computer science, has degrees from Ben Gurion University and a Ph.D. in computer science from Columbia. She has won a number of awards -- a National Science Foundation CAREER award, the Microsoft Research Faculty Fellowship, and the MIT Technology Review award for young innovators under 35. She's going to be talking today about teaching machines to behave using natural language.
BARZILAY: Thank you for the introduction. So my field, my general field of study is statistical natural language processing, and I'm sure many of you are familiar with this field, primarily from the Hollywood perspective. For a very long time, since the '30s, we have seen machines that can understand us and speak to us in natural language. And for quite a long time, we've seen natural language technology only in the movies.
Luckily for us, in the last decade, the situation changed, and now we've seen some really interesting technologies that are used by millions of people. So one example would be Google Translate. Maybe a month ago, you saw the question answering system from IBM, Watson, which won a Jeopardy! game against human champions. Many times you are using natural language technology without even realizing that you're using it. So one of the examples is SAT scoring, which since 2003 has been done using statistical natural language processing techniques.
So one question you can ask yourself -- what exactly happened here, that this technology was nowhere in real life and all of a sudden, it moved towards real applications? It's actually very interesting when you look at the change in technology within the field. So when artificial intelligence was very young, people had this naive idea that you could take a computer and feed it with all the human knowledge, and if you put enough knowledge into the computer, you would be able to make a computer which can understand language. So people fed in grammars, people fed in dictionaries and word meanings for very, very small domains. And the hope was, OK, let's just add a bit more and a bit more, and then the computer will start understanding language.
So guess what. For 20, 30 years, they tried, and the computer was never able to really understand language or to scale up. So it was sort of a spectacular failure of the wrong scientific methodology. And when I took a natural language processing class for the first time, the professor couldn't show me even one system that was working. And then a great thing happened. In the mid-'90s, the field underwent a statistical revolution, when the technology completely changed.
Rather than trying to feed the machine with knowledge, the idea was that we can give the machine a lot of data, and the machine will be able to infer the right knowledge from this data. So this is called machine learning. And let me be more concrete here. Let's say you want to design a program which, given a review, predicts whether it is a positive review or a negative review. This is a very simple task that machines can do very well.
So here you can see one positive review and one negative review, and you can see that they have different words in them. Correct? So if we don't want the machine to fully understand English, but we still want it to solve this task, what you can do, you can say, maybe the machine can just analyze the distribution patterns in both cases, and predict which distributions are more likely to appear with positive reviews. So you can have one example, two examples, three, and four. You would see that there are certain spikes -- certain words which are positive and words which are negative.
So the way to sort of formalize this intuition is to think about every text as one big vector, and for every word in English you can have an entry in this vector, and if the word is present in the text, you would put a one. Otherwise, zero. And for each one of these vectors, you can say, does it have positive or negative sentiment? So for those of you from the field of science and engineering, this already reminds you -- you can think about it as a statistical problem. You can just try to fit a statistical model which, given enough of those examples, can make predictions about new unseen texts and predict their sentiment.
So what we're doing here, we're taking many parameters of the model and adjusting these parameters so that they make the right predictions. So these parameters would say, for instance, that the word delicious is predictive of positive sentiment. They would also tell how much we can trust this word relative to other words. So our statistical model will combine all the evidence and make the inference.
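The bag-of-words setup described above can be sketched in a few lines. This is a toy illustration with made-up reviews and a simple perceptron-style weight update, not the actual model discussed in the talk:

```python
# Toy bag-of-words sentiment classifier: one 0/1 vector entry per vocabulary
# word, one learned weight per word (hypothetical data for illustration).

def featurize(text, vocab):
    """Map a text to a 0/1 vector: entry i is 1 if vocab word i appears."""
    words = set(text.lower().split())
    return [1 if w in words else 0 for w in vocab]

def train_perceptron(examples, vocab, epochs=10):
    """Learn one weight per word; 'positive' words end up with positive weight."""
    w = [0.0] * len(vocab)
    for _ in range(epochs):
        for text, label in examples:      # label: +1 positive, -1 negative
            x = featurize(text, vocab)
            score = sum(wi * xi for wi, xi in zip(w, x))
            if score * label <= 0:        # mistake: nudge weights toward label
                w = [wi + label * xi for wi, xi in zip(w, x)]
    return w

vocab = ["delicious", "great", "bland", "terrible", "food"]
examples = [
    ("the food was delicious", +1),
    ("great food", +1),
    ("the food was bland", -1),
    ("terrible food", -1),
]
w = train_perceptron(examples, vocab)
predict = lambda text: sum(
    wi * xi for wi, xi in zip(w, featurize(text, vocab)))  # > 0 means positive
```

After training on these four toy examples, the weight on "delicious" is positive and the weight on "bland" is negative, exactly the kind of spikes the talk describes.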
So this is obviously a very simple example, and we can apply the same statistical ideas to much more complex mappings. So for instance, machine translation for Google Translate. Here, you can feed the machine, as training data, instances of parallel sentences. So here you can see a piece of Dostoyevsky in Russian and English. You can imagine, for instance, Bible or Europarl proceedings, where you have the same content translated into several languages.
And from this kind of data, the machine can learn the transformation, again, using more complex statistical models. Another example here is semantic interpretation: you have a piece of text, you have some lambda calculus expression which encodes the semantics, and you can learn the mapping.
So one thing that may be striking for you is that the machine today can translate without actually ever understanding what the word staircase means. You just have a lot of pairs. You learn a statistical model. There is no connection to what is going on in the real world -- what is the real semantics of the word staircase? And in many applications, you can do it, and you can do it quite successfully.
However, there are many cases where we actually want to make a connection to the real world. And my talk today is actually about grounding the interpretation of language in the real world. So let me give you a very concrete example here. So this is an example of a task which some of you are doing on your own computers on a weekly basis: installing new software on your machine.
So here, you have some text which tells you how to install a driver on your machine. And you as a human need to read it and do the clicks on your machine. So the question is, can we automate it? Can we take this piece of text and automatically map it to the right clicks, to make the machine do the right thing? So in this case, the machine would be very literally interpreting what should be done.
Another example, which is maybe more complex when I'm talking about behavior, is suppose you want the machine to bid for you in some auction, or you want the machine to play a game. And machines can do these things, but what they cannot do today is take a piece of text which tells you what the right strategy is, or take general human knowledge -- what is good for you to do -- and translate it into better choices, into better behavior. So these are the types of questions that I will be talking about -- how we can take a piece of text, interpret it, and translate it into better behavior in reality.
And the idea behind the approach is actually very, very intuitive. You can say, OK, whenever I'm interpreting text, and I really did it correctly, I would assume I will get good behavior, assuming I have some measure of goodness. So in this case, let's say if I interpreted the instruction correctly, my Windows would behave the way I expect. If I did it wrongly, I may get a mistake in my operating system.
I may crash my computer. So you can try these things in your operating system, see what happens, get feedback, and then based on this feedback, you can update your behavior. It's pretty much like my four-year-old son -- this is the way he updates his behavior based on feedback.
So here, the question is, can we design algorithms that can change the way they interpret language based on the feedback they observe? And here, for instance, for this [INAUDIBLE], the feedback is very, very clear. It would be some concrete reaction from the operating system. And this is sort of a summary of the approach. It's not going to blow anybody's mind, but the idea behind it is actually shown in this cartoon.
Let me be more concrete here about what I mean by interpreting language and acting in the world. So what the computer will see at every point will be the state of its environment -- for instance, the operating system environment: which windows are open or closed, what you can and cannot do. It will have a piece of text, an instruction.
So at every point, the computer will take one instruction, do an action in the real world, and then the state of the world will change. For instance, here, we will see that some new windows will be opened and closed. And then it will continue doing this execution for some time until it gets feedback; it could be a failure, or it could be some positive feedback that it succeeded in carrying out the instructions.
And the question is -- I'm outlining for you the general frame of the question -- how you can actually do the learning. So to give intuition on how you can do the learning, I will take a very simplified view of the problem, but I think it will give a good understanding of the intuition behind the math here.
So you can say, OK, let's say I start with some guess of what the correct mapping is between the words in English and the clicks that I need to do on my machine. And let's make a very simplifying assumption and say that every word is just translated to some click on some object. It's a one-to-one mapping.
So now you start with your hypothesis, and you can take a piece of text and translate it using your current hypothesis. You translate it. You can run it on your machine and see what happens. Let's say in this particular case you fail. And you can keep trying. You have a very big set of instructions out there. You can actually try them one after another. In all the cases you fail but one.
So it succeeded with 20% of accuracy. Not great, but you already have some ideas about the mapping. You can say, maybe the parameters which control the translation of this green document, the document in which we succeeded, in this particular case, they were the right ones. The parameters which control the words which appear in all other bad documents, they were the wrong ones.
And I need to update them to give more weight to the good ones and to take weight away from the bad ones. And that's the idea. So we're going to change the parameters. Try again. So now we have a new possible mapping. Now we succeed on two documents, and we continue this kind of update of the parameters based on the feedback.
And what I presented to you here is actually a very simplified description of the policy gradient algorithm, which is a reinforcement learning algorithm. And it does, more generally, the things that I described. It starts with some set of guesses of parameters. Based on these parameters, it selects what is the best action or what is the likely action to take given these parameter values, tries them, executes them, observes the feedback, and corrects the parameters.
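The loop just described -- guess parameters, sample actions, execute, observe feedback, correct -- can be sketched as a tiny REINFORCE-style policy gradient learner. Everything here is invented for illustration (a four-word vocabulary, a hidden word-to-click mapping standing in for the real environment), not the actual system:

```python
import math, random

# Toy policy-gradient sketch of the word -> click learning loop. The
# "environment" is hypothetical: it rewards +1 only when every word in an
# instruction is mapped to its hidden correct click, and -0.1 otherwise.

random.seed(0)
WORDS = ["open", "click", "close", "save"]
ACTIONS = [0, 1, 2, 3]
TRUE_MAP = {"open": 0, "click": 1, "close": 2, "save": 3}  # unknown to learner

theta = {w: [0.0] * len(ACTIONS) for w in WORDS}  # one score per (word, action)

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def sample_action(word):
    probs = softmax(theta[word])
    return random.choices(ACTIONS, weights=probs)[0]

def run_episode(instruction, lr=0.5):
    chosen = [(w, sample_action(w)) for w in instruction]
    reward = 1.0 if all(TRUE_MAP[w] == a for w, a in chosen) else -0.1
    # REINFORCE update: move theta along reward * grad(log pi)
    for w, a in chosen:
        probs = softmax(theta[w])
        for i in ACTIONS:
            grad = (1.0 if i == a else 0.0) - probs[i]
            theta[w][i] += lr * reward * grad
    return reward

for _ in range(2000):
    instr = random.sample(WORDS, 2)   # a random two-word "instruction"
    run_episode(instr)

# Read off the most likely click for each word after training.
learned = {w: max(ACTIONS, key=lambda a: theta[w][a]) for w in WORDS}
```

The only supervision is the episode-level success signal, which is exactly the point of the approach: documents that lead to success pull their parameters up, failing ones push them down.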
And this particular algorithm has very nice mathematical convergence properties, so when you are applying it, you can actually assess how it will behave. There is a good theoretical understanding of the algorithm. Now, I showed you how to apply this kind of technique to map single words, and this may make us a bit uncomfortable, because in general we do not translate word for word. We need to do many more things. We need, for instance, to chunk the sentences into pieces. We need to reorder the instructions, because sometimes we change the order in which we execute them. This algorithm can actually do all this very complex mapping using the same underlying mathematical technique.
So you can do this very complex mapping from the text to the instruction using this nice mathematical framework, where we can try stuff, observe the feedback, and correct ourselves. So I will just tell you how it actually works. So we apply this method to translate Windows 2000 Help documents, so when you have a problem with your machine, Microsoft has a website where you can go, and it will tell you how to solve your problem.
So before I show you the numbers, I want to ask you a question. If you are a person who does this task, you are given the instruction, and you want to go and implement it, how likely do you think human will succeed in this task? Do you think you can do it 100% correctly? A number? Can somebody volunteer a number?
BARZILAY: OK. So I will show you the numbers, which are optimistic numbers. Those are the numbers for students in computer science who are graduate students at MIT and do research in systems. If you need an expert, those are the experts. It's like our upper bound on performance. So they can do this task with 80% accuracy. So how well can MIT professors do this stuff? I'm not sure, but I'm definitely not there.
So you can see that the machine can do this kind of mapping with 53% accuracy. And I don't want to make hard claims, but I can think of many people [INAUDIBLE] at MIT and outside who would only be able to do it with 50% accuracy, because some of these instructions are actually quite complex. So we can really see that here, this simple feedback and a nice mathematical formulation enable us to do this task in a reasonable fashion.
So the first example that I showed you is actually an easy example, in the sense that here you had instructions from Microsoft which were written for lay people, where they tell you exactly what to do. Now in how many situations in life are you given such detailed, exact advice? Apparently, in very few. You can see there are plenty of books which tell you how to do things, but they don't tell you every single step that you need to take.
So maybe a more interesting question to ask is, can we take general advice and make the machine understand this very general advice, which is not point by point? Can we get an understanding of this general knowledge? And I will talk about this task in the context of the game Civilization. How many of you have ever played Civilization? We see some. We see very few players. It's interesting. Oh, oh, oh. OK. The next speaker plays Civilization. If I ask this question to a computer science audience, almost every single person raises a hand.
But the reason that I selected Civilization is because it's a very complex game. It's [INAUDIBLE] a test, because you need to build this whole civilization, to grow crops, to fight other nations. You can play it for days and days. And more interestingly, the branching factor, which means what actions you can take at each point -- the space is huge. It's on the order of 10 to the 20th, millions of billions of possibilities. You have plenty of choices, and if you want to win, you obviously need to make the right ones. It's a very hard computer game for the machine to play.
So what we want to do is to say, now, if I give the machine some knowledge, what are the good moves to take? So for instance, this is a manual which tells you that if you want to build a city, you need to make sure that there is water nearby, because otherwise, you cannot grow your crops, and your population will starve. The question is, can a machine take advantage of this knowledge?
And before I go and tell you how we are solving this problem, the answer is we can. Otherwise, I would not have brought it here. But first I will tell you how machines, [INAUDIBLE] computers, can play games. You know that 10 or 15 years ago there was this breakthrough with chess, so computers could play a lot of games quite successfully.
And the way computer plays the games is that they represent the world as a state, so the state is just what do you have in your universe? How many cities? How many people? All the relevant information. And at every point, you can take an action. It's like a play. You can decide, do you want to invade, or do you want to irrigate, or whatever you want to do.
So you start by being in some particular state of the world, and now you need to select an action. And this is where the computer needs to say, what action should I take? So I am going to explain to you in a few slides how you can do it using Monte Carlo simulation. This is a very, very popular technique that is used out there.
So the idea will be the following. You take your current state, and you just copy it into some other place. You copy it, and then you start doing simulations. So for instance, you can decide that now you want to irrigate land. You irrigated the land. You take some actions. You look for a few steps, and you observe how well do you do.
You can think about it like chess. In chess, when we're doing simulations, you say, which piece am I going to move? OK, I'm going to move this piece. Then I'm going to try and see how the game would go. It's obviously a non-deterministic prediction because of the opponent, but you would say, OK, I played five moves, and I saw what happened.
And you record the state -- and here, remember, these are the vectors that I showed you in the very beginning -- which records various information about the state and the action. You try to sort of capture how the action maps to the state. It will tell how many terrains there are, what unit was moved in that state, and you say that if this is the state of the world and the action that I took, I got some score. And then you try some other action, and you again record the vector.
And at the very end, you decide what action you are going to take. But more interestingly, whenever you did these simulations, each time you observed how good they were. So now your learning task is to predict how a particular situation maps to a score. Again, think about chess, for those of you who are not familiar with Civilization. You just try to say, in these particular circumstances, it's not good for me to take this move, because I'm going to get a lower score later. And the patterns that are good or bad are actually learned automatically by the machine when it does the simulations.
So now you've selected your next move, but next time, when you are in state number 2, you already remember something about good and bad moves. So when you are trying all possible moves, you will use this knowledge to look at the right sample, because the space is so large that you cannot afford to just randomly try stuff. So the knowledge that you gain through the simulation helps you to focus on the right part of the space. And you continue. This is a Monte Carlo session [INAUDIBLE].
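The copy-the-state-and-simulate procedure described above can be sketched as follows. The game, the actions, and the per-move scores are toy stand-ins invented for illustration, not Civilization itself:

```python
import random

# Monte Carlo action selection for a toy "civilization" turn: after each
# candidate first action, roll out a few random moves and keep the action
# with the best average outcome. The scores below are made up.

random.seed(0)
ACTIONS = {"irrigate": 2, "wait": 0, "raid": 1}   # points gained per move (toy)

def rollout(first_action, depth=5):
    """Play `first_action`, then `depth` random moves; return total score."""
    score = ACTIONS[first_action]
    for _ in range(depth):
        score += ACTIONS[random.choice(list(ACTIONS))]
    return score

def choose_action(n_rollouts=200):
    """Simulate from a copy of the state and average the returns per action."""
    avg = {}
    for a in ACTIONS:
        avg[a] = sum(rollout(a) for _ in range(n_rollouts)) / n_rollouts
    return max(avg, key=avg.get)

best = choose_action()   # "irrigate" wins on average in this toy setup
```

In the real game the rollouts are guided by the learned score predictor rather than being uniformly random, which is exactly the "focus on the right part of the space" point above.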
So now you can ask yourself, OK, it plays. Why do I need text now? I do my simulation. Life is good. Why do we need to make life more complex? So look at this particular example that I started with, where you need to have water to irrigate your crops. How would this statistical model learn this information? It needs to do lots and lots of simulation to make the right choice, and obviously, we're talking about a huge space. It may learn it, or it may not learn it. And obviously, you don't want it to take three years to plan the next move, just trying all the possibilities exhaustively.
So the idea here is, can I somehow incorporate this information into the simulation? Can I focus it on the right space within the simulations? And the way we are going to do it, we are now going to provide the machine -- in addition to the information about the state itself -- with information from the manuals.
So the machine now needs to find, for each state, the right sentence in the manual that it can use, and then, for instance, use all the words from the sentence as features in our vector. You can say, but how does the machine know which sentence is a good one? It doesn't. But that's exactly where it learns this in the process. Sometimes it will select bad sentences, and it will get bad feedback for that. And with time, it will learn to select better and better sentences.
Again, using the same type of thinking, you can say, maybe I just don't want to throw in all the words, because what I care about is the action that I want to take, and I want to know the attributes of the state. Using the same type of intuition, you can make the machine parse the sentence into different semantic categories and add these categories to your feature vector. And it learns using the same feedback mechanism.
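One way to picture the augmented feature vector is a sketch like this. The manual sentences are invented, and the word-overlap rule for picking a sentence is a naive stand-in -- in the system described, sentence relevance is learned from game feedback, not hard-coded:

```python
# Sketch: combine state features, the chosen action, and words from a
# manual sentence into one feature set (all data here is hypothetical).

MANUAL = [
    "build your city near water so you can irrigate crops",
    "attack enemy units with veteran troops",
]

def features(state, action):
    feats = {f"state:{k}={v}" for k, v in state.items()}
    feats.add(f"action:{action}")
    # Naive heuristic: pick the manual sentence that mentions the action.
    best = max(MANUAL, key=lambda s: len(set(s.split()) & {action}))
    feats |= {f"word:{w}" for w in best.split()}
    return feats

f = features({"near_water": True}, "irrigate")
# f now links the irrigation action to manual words like "water" and "crops".
```

Once manual words sit in the same vector as state and action features, the same score-prediction machinery can learn that "water" plus "irrigate" predicts a good outcome.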
So for those of you who are familiar with neural nets, this whole learning process is actually formalized as a multi-layered neural net. You start with the game state [INAUDIBLE] and text, and at each level of the neural net, you make different analysis decisions. You select the sentence. You identify semantic pieces of the sentence, and eventually, you select the good move. And there are, again, formal mathematical methods to train those things that allow you to do it in a proper fashion.

So let's see how it works. Before I show you the numbers, I will tell you how we evaluated the game. As the opponent in this game, we used the AI game player. For those of us who don't play games, the AI game player is something that the company ships when it produces a game. It's a product that they develop, so when you buy your game, you can play against it.
And it's very well-engineered. It has a lot of knowledge and connections to all the parts of the game, so it's a very serious opponent. It should be good enough to play against a human. So the red part here is how the machine can play if it only uses Monte Carlo methods. Without any knowledge of the manual, you can see in this case, it loses to this AI opponent. It only wins 47% of the games against the AI player.
However, if you add knowledge of the manual together with the Monte Carlo player, you can see that it actually wins most of the games -- 80% against the AI player. And again, it's very impressive, because the AI player is something that has perfect knowledge and a lot of engineered strategies. We can clearly see that you can learn to behave if you read relevant material in a smart way.
And this is pretty much the end of my talk. I would just want to summarize the highlights of the approach, so we learn language by interacting with the world and observing the result. And what this approach enables is to pull together multiple sources of information to do language interpretation better.
And this direction we started just two years ago with my student, [? Branavan, ?] who's here. There are lots and lots of things that we can do here, and I'm a firm believer that if we are successful, we would enable computers to actually act competently in the world with a lot of unstructured and [? imprinted ?] information. Thank you very much.
CHISOLM: We have time for a few quick questions.
AUDIENCE: I love to ask questions. That was very, very interesting. I have sort of two questions. One is, I think I missed where the actual semantic labeling occurs. So the first question is, how does the system identify certain phrases in terms of their function? And the second question, which is really a broader context: one of the many hats I wear is that I work in assessment, and I'd love to understand more about how the SAT is scored using methods like this and, in particular, whether they're Monte Carlo methods.
BARZILAY: OK. So let me just answer the first question. First, can you please bring my slides back? It's like when you put a human in the loop, then it makes it harder. So the idea here is that the semantic interpretation we're talking about is relatively shallow semantic interpretation, where you just try to understand the action words, or the action -- which is like build city or irrigate -- and the attributes of the state that are desirable.
And what happens here at each point, you can think about it as we did in the beginning. You sort of randomly guess what is the annotation? What is the semantic label of each phrase? And each phrase will have a lot of features. It will have syntactic information. You would have location. You would have lexical information.
And now, for some particular labels that you put there, when you added them to the game and you observed the output, those would be positive outcomes. In some cases, you put your labels, and they happen to be very wrong, and you're going to get a negative outcome. And through this interaction process, you will be able to refine and correctly do this type of semantic tagging.
Now let me go to your second question about the annotation for SAT scoring. So it actually doesn't use this particular technique. It uses a technique which is maybe close to the positive and negative sentiment that I described. The way it is done, they have a huge database of essays, which are graded by humans.
And you can use a lot of linguistic representations from these essays and say, those are the features which are predictive of good essays, and those are the features which are predictive of bad essays. And you can use syntactic constructions, the way information is ordered, a variety of things, and then you just train a very large model, and you do the assessment in the end.
In many cases, it is difficult to get information with annotations. But they have these huge databases of millions of graded essays, so it wasn't a problem. And let me just say how it is used. So originally, these essays were graded by two humans, and when the humans disagreed, a third human, an expert, would read it. In this case, it's graded by one human and the machine, and when these two disagree, the human expert reads it. And they actually found out that the machine is much more consistent in scoring than the human readers. Yes.
AUDIENCE: Hi. I have one probably too silly and one too serious question. So the silly question is have you tested computer science students against the Civilization machine?
BARZILAY: Sorry. Can we--
AUDIENCE: Have you tested your Civilization player against the computer science students of MIT?
BARZILAY: We actually haven't. We haven't tried it against humans, no. We haven't.
AUDIENCE: What would you expect?
BARZILAY: I don't know. Actually now, and [INAUDIBLE], there is a website where humans actually can go and play against each other, and we are going to participate in this competition. And I would let you know how it goes. But if we make an analogy from other games like poker and others, we clearly see that for many games, machines for sure do better than humans. So I hope we can do better.
AUDIENCE: Good. So here's the serious question. Can we also teach machines to behave in a socially competent way?
BARZILAY: So, no. You can say that the general question is, what exactly does that mean? And if there is a way to properly formalize what socially appropriate behavior means, maybe you can try to do it. But this is not my area of expertise -- there are people in the Media Lab here who actually look at how to make robots social. I think it's [INAUDIBLE] who does a lot of research. You may want to look at her papers.
CHISOLM: Thank you. I think we'll move on. Thank you, Regina.
So our next speaker is Peko Hosoi, who's an associate professor of mechanical engineering, and she's the director of the Hatsopoulos Microfluids Laboratory. She has degrees in physics from Princeton and the University of Chicago and a number of distinguished teaching awards. She's also a MacVicar Fellow and has been a Radcliffe Institute fellow and a Doherty assistant professor of ocean utilization. She studies the optimization of biological fluid systems for locomotion, and today she is going to talk to us about small swimming lessons: optimizing low Reynolds number locomotion.
HOSOI: OK. Thanks, Penny, for that introduction, and thanks very much to the organizers for putting this workshop together. I have to confess that I've met most of the speakers in various capacities at MIT, but I've never heard any of them talk about their research. So I think this is a wonderful opportunity to sort of fill that gap. But swimming and optimization-- so today I'm going to talk to you about some of the challenges you face if you are very tiny and you live in a fluid environment.
So the first thing you need to know about systems like this is that-- let's see. How do I make this go? Do I have to point this somewhere? Yes? No? The first-- OK, you're only going to see my title slide. [INAUDIBLE]. Oh, did that go? This one?
CHISOLM: That's how you move the slide.
HOSOI: Oh, good. Oh, OK. Good. OK. I don't-- OK. I'll point over here. OK. So the first thing you need to know is that size matters in these systems. And most of you I think have an intuitive feel for this, because so, for example, if you look at a fluid system like this at a scale that you're used to dealing with on an everyday basis, you sort of know exactly what's going to happen in that system.
But if you compare that to something like what an ant would do, so an ant can actually go to a pool and drink, or it can pick up a drop of water and carry it around. So clearly the physics is different between the system on the right and the system in the middle there. And my slides are advancing in a funny way. So yes. Do I have to point this at something? Over here? OK, good. OK.
So clearly, the dominant physics in these situations is different, and in general, big things tend to be dominated by things like gravity or inertia, and small flows tend to be dominated by things like surface tension and viscosity. So the idea that I want you to take away in this talk is that because the physics is different at these different ends of the spectrum, you might expect that to be reflected in the behavior or in the morphology of biological systems that have to interact with these kinds of environments.
And, of course, I've cheated a little bit here, because big and small are relative terms. So I have to tell you what I mean by big and small, which will come up in the following slides. So I showed you a video on surface tension, so surface tension is what's dominating in that video with the ants. I'm now going to show you a video of what happens when viscosity is dominant in these systems.
So this video comes from the-- I am just not good with this pointer. Hello? This way? There we go. OK. Thanks. OK. So this video that I'm going to show you comes from the National Committee for Fluid Mechanics Films, and so many of you may not be aware that we have a National Committee for Fluid Mechanics Films, but we do. And because this is a celebration of the history of MIT, I think it's worth saying a few words about this.
This committee was actually founded by Asher Shapiro, who was faculty in the department of mechanical engineering, and it's a wonderful resource if you're looking for visualizations of fluid systems. It's also historically interesting, because all of the scientists in there look like they came out of a movie from the '50s.
But the fluid visualizations are incredible. So the movie I'm going to show you was actually not done by Asher Shapiro. It was done by another giant in the field of fluid mechanics, who is GI Taylor. And I'm pointing at the screen now. Like this? No. Oh, there. I'm pointing up there. OK. I'm still pointing. OK. GI Taylor. OK. Maybe I need to move over here.
So it's by GI Taylor, and he did the following experiment. So he said, all right, let's take a cylinder, two concentric cylinders. This is-- point at the screen? Hello? OK. Good. OK. So he used two concentric cylinders, and the idea is that he fills the gap between those cylinders with viscous fluid. And then what he's going to do is he's going to put a drop of dye in that gap, which you will see any minute now as soon as I get this pointer to work, and after the dye appears, he's going to slowly rotate the inner cylinder in order to mix that dye into the existing fluid.
So here's the video, which will also appear any minute now. There we go. So there's the cylinder. That's GI Taylor's hand you see, so this is the dye that he's putting into this small gap. So the small gap is filled with the viscous fluid now. And he's now going to spin the inner cylinder. In a minute, you'll see. So there's the dye. So there's the apparatus. He's now going to turn the handle, so he turns it slowly to mix the thing in. You can see it being mixed. There you go. There's one turn.
I think he turns it maybe three or four times, so you could get it sort of nicely mixed in. Three and four. And then just for good measure, he is now going to turn around and go the other way. So now he's going to turn this thing-- give it one turn this way, and go two. And you can see that as he comes around now-- so that was the third turn-- as he comes around for the final turn, the dye actually unmixes.
So two messages to take away from that film. Number one is that when you change scales in fluid mechanics, you can no longer trust your intuition, because very strange things can happen. The second message to take away is that when viscous forces dominate, as they did in this system, the flows are reversible. They are time reversible, and this is something that you can show very easily by looking at the mathematical structure of the equations.
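To make the reversibility message concrete, here is a minimal numerical sketch (not from the talk; all parameter values are illustrative). Tracer "dye" particles are advected by the steady circular Couette flow between two concentric cylinders; because each particle simply moves along its circular streamline at a radius-dependent rate, turning the inner cylinder forward and then back by the same amount returns every particle exactly to where it started, which is the unmixing seen in the film.

```python
# Toy demonstration of kinematic reversibility in creeping (Stokes) flow.
# Dye particles between two concentric cylinders are sheared apart when the
# inner cylinder turns, and return to their start when it turns back.

def couette_omega(r, r_inner=1.0, r_outer=2.0, omega_inner=1.0):
    # Angular velocity of the fluid at radius r for a rotating inner
    # cylinder and a stationary outer one (classical Couette solution,
    # omega(r) = A + B / r**2 with no-slip constants A and B).
    eta = r_inner / r_outer
    A = -omega_inner * eta**2 / (1 - eta**2)
    B = omega_inner * r_inner**2 / (1 - eta**2)
    return A + B / r**2

def advect(particles, direction, steps=1000, dt=0.01):
    # Advance each (radius, angle) particle along its circular streamline.
    # direction = +1 turns the inner cylinder forward, -1 turns it back.
    out = []
    for (r, theta) in particles:
        theta = theta + direction * couette_omega(r) * steps * dt
        out.append((r, theta))
    return out

# A blob of "dye" at nearby radii gets sheared apart (mixed) ...
blob = [(1.2, 0.0), (1.4, 0.0), (1.6, 0.0)]
mixed = advect(blob, +1)
# ... and un-mixed when the handle is turned the other way.
unmixed = advect(mixed, -1)
```

The shear is what "mixes" the dye: particles at different radii rotate at different rates, so the blob is smeared into a streak, yet the reversed boundary motion undoes it exactly.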
So this is a challenge if you're a tiny swimmer. So a tiny swimmer-- so now I'm going to tell you what I mean by a tiny swimmer. So a tiny swimmer-- I'm just going to hold this and hold the button down and point. So a tiny swimmer is defined-- when I say tiny swimmer, I'm defining it in terms of the Reynolds number, so the Reynolds number is a dimensionless number which tells you the relative effects of inertial and viscous forces.
So it can be defined as I've written up there. There are two parameters that are material parameters of the fluid, which is rho, the density of the fluid, and mu, the viscosity, but more importantly for our purposes, it also has the size of the organism and the swimming velocity of the organism. So you can see from this definition that if viscous forces dominate, the Reynolds number is going to be small, and then you're going to have these problems of reversibility that I just showed you. And one way to make the Reynolds number small is to make the size of your organism very tiny.
So to calibrate yourself, what do I mean by very tiny? So there's a person. So a person has a Reynolds number of about 10 to the fifth. So that does not count as tiny. If I move down, so for example, a tuna fish has a Reynolds number of about 10 to the sixth, because they're about the same size as a person, but they're better swimmers, so they have a better Reynolds number. Ducks are about 1,000. If you go down to the size of an ant, an ant has a Reynolds number of about 1.
So even an ant, you're not really down at small Reynolds number. You're not totally dominated by viscous effects. In order to be dominated by viscous effects, you have to move sort of over here. I don't know if I dare turn on another pointer, but OK. You have to move over here, so this is the region where you're at the size of about single cell organisms.
So this is the line where human vision cuts out, so beyond this, you can't really see without the aid of a microscope. And bacteria, this is roughly the smallest thing that you can see with a microscope. So the organisms that I'm going to be talking about in my talk live sort of in this region over here.
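As a rough calibration of those numbers, the Reynolds number definition from the slide, Re = rho * U * L / mu, can be evaluated directly. The swimmer sizes and speeds below are illustrative guesses (chosen to land near the orders of magnitude quoted in the talk), with water as the fluid:

```python
# Reynolds number Re = rho * U * L / mu: rho and mu are fluid properties,
# L the size of the organism, U its swimming speed. The sizes and speeds
# below are rough illustrative values only.

def reynolds(length_m, speed_m_per_s, rho=1000.0, mu=1.0e-3):
    # Water at room temperature: rho ~ 1000 kg/m^3, mu ~ 1e-3 Pa*s.
    return rho * speed_m_per_s * length_m / mu

swimmers = {
    # name:      (size in m, speed in m/s)
    "person":    (1.5,  0.1),    # Re ~ 10^5: inertia dominates
    "ant":       (2e-3, 5e-4),   # Re ~ 1: neither effect dominates
    "bacterium": (2e-6, 3e-5),   # Re << 1: viscosity dominates
}

for name, (L, U) in swimmers.items():
    print(f"{name:10s} Re ~ {reynolds(L, U):.1e}")
```

The point of the exercise is the spread: shrinking the organism by a few orders of magnitude in size drags the Reynolds number down with it, into the viscosity-dominated regime where the reversibility problem bites.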
So before I tell you about the work we've actually done, I'm going to put up two slides on biology, and I am not a biologist. Let me put that caveat up front. And so the first slide I'm going to show you is a fluid dynamicist's view of biology. So here it is. This is a fluid dynamicist's view of biology. There it is. That's what it looks like to us. And there's two reasons why this is a fluid dynamicist's view of biology.
One is that we find it very confusing, and the second is that instead of organizing this by sort of traditional biological criteria, the person who drew this-- this was actually a famous image that was done by James Lighthill in a lecture that he did in 1975. And you could tell this was done in the days before PowerPoint when you actually had to draw all your slides by hand.
So he's organized this not by any sort of biological criteria, but he's organized this by how things swim. And so if you look at the inner circle here, so everything in this inner circle to a physicist looks like a head, which is a sphere, with a flexible tail on it, where you might have n tails, and n is a small number-- usually 1, 2, or 3. So everything in that circle basically looks like that, a sphere with a tail.
And then the second thing he did is he now classified these into prokaryotic and eukaryotic cells. And that division comes over here. There's a green line up there, so the prokaryotic cells are up here, and the eukaryotic cells are down here. And again, for the purpose of this talk, what you need to know is that the swimming mechanisms are very different for both of those. And I'm going to focus on the eukaryotic cells, which live down here.
So that's my first slide on biology. My second slide on biology is to tell you something about the mechanisms that eukaryotic cells use to swim. And there's something that's sort of remarkable about these cells, and that is that if you look at the tails that they use to swim, they all have fundamentally the same structure. And the reason this is remarkable is if you think back to the biology you learned in high school, you remember that basically everything is a eukaryotic cell.
Prokaryotes are basically bacteria, and eukaryotes are basically everything else. And so if you look at the structure of the tails on these cells, which are flagella or cilia or something like that, they all have this thing that's called a 9 plus 2 microtubule structure, which looks like this. So there are a bunch of little tubules that surround a central pair of microtubules. And this doesn't matter if this is the cilia in your lung or the tail on a green algae or name your favorite eukaryote-- grasshoppers.
Everything has got this structure in it. And the remarkable thing about that is that that means that the diameter of all the tails are basically the same across all species. So when I say basically the same, I mean it's between 250 and 400 nanometers. So that's the first thing you need to know to understand what's going to come up in this talk.
The second thing you need to understand is that these tails are not passive structures. These tubules can slide relative to one another like this, and by doing that, they can induce a local bending moment in the tail. By inducing a local bending moment, they can actually select the shape as a function of time. So those are the two things that you need to keep in mind as we go forward.
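A minimal sketch of what "selecting the shape as a function of time" means: since the sliding tubules set a local bending moment, one can prescribe the curvature along the tail's arc length and integrate it to recover the tail's shape at any instant. The traveling-wave form and all parameter values below are illustrative choices, not the optimal solution from the research discussed here:

```python
# Toy tail shape from a prescribed curvature. The curvature along arc
# length s at time t is a traveling wave, kappa = A * cos(k*s - omega*t).
# Integrating curvature gives the tangent angle; integrating the tangent
# gives the (x, y) positions of points along the tail.
import math

def tail_shape(t, amplitude=2.0, k=2 * math.pi, omega=2 * math.pi,
               n_points=200, length=1.0):
    ds = length / n_points
    x, y, angle = 0.0, 0.0, 0.0
    points = [(x, y)]
    for i in range(n_points):
        s = i * ds
        # Curvature is the rate of change of the tangent angle with s.
        angle += amplitude * math.cos(k * s - omega * t) * ds
        x += math.cos(angle) * ds
        y += math.sin(angle) * ds
        points.append((x, y))
    return points

# The wave has period 2*pi/omega = 1, so the shape at t = 0 and t = 1
# should coincide: the pattern travels down the tail and repeats.
shape_a = tail_shape(0.0)
shape_b = tail_shape(1.0)
```

Sweeping t through one period animates a wave of bending traveling down the tail, which is the qualitative picture of how these appendages propel the cell.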
So here's the question that I want to ask. I want to know, can we use this information to predict morphology and kinematics of microorganisms from purely hydrodynamic considerations? And the basic idea here is that biological systems, we could hypothesize, have been optimized to perform certain tasks very well through millions of years of evolution. And so if we look at this structure of the hydrodynamical equations and look for optimal structures, do those structures then show up in biology?
So the first thing I'm going to do is actually I'm going to do kinematics, because that's a little bit easier. And then we'll come back to morphology, and optimizing kinematics looks something like this. So the first thing you want to do is you want to take a swimmer with a fixed geometry. So here's my swimmer-- for example, a swimmer with a fixed geometry. There you go. So that swimmer has a fixed number of limbs and a fixed way he can move his limbs around.
And you can say, all right, please actuate this swimmer for me in an optimal way. So you could choose to actuate this thing in any one of these possible ways. If you said, OK, I want to do something that is very efficient, you might say do something like this. If I want to do something that's very fast, you might do something like this.
And I'm now going to ask the same question for microorganisms. So I'm going to say, if you're given a fixed geometry that looks something like this, please tell me, how should you actuate this thing in order to do something that's most efficient or fastest or whatever it is you're trying to accomplish if you're this microorganism?
So let's see. So here's the model. I'm not going to go into the details of the model. I'm going to tell you that we started with the simplest structure that we saw in that big complicated diagram that I showed you that Lighthill drew, where this is the organism. So it has a spherical head, and it has a single tail. And I'm just going to ask, what are the shapes that this tail should take in order to become an optimal swimmer?
And for those of you who are interested in the details, you can go and see how it works in this paper. But I'll just show you the answer. One of the remarkable things about this answer is it's incredibly robust. So no matter sort of what kind of crazy initial guess you give our algorithm, it will always converge to the same answer, which looks like that.
And so a couple of nice things about this solution is that A, it looks exactly like what you see in biological systems, exactly meaning-- I haven't said anything quantitative yet, but qualitatively, it looks pretty good. You always get a traveling wave solution. That wave is characterized by regions of high curvature that are connected by segments with low curvature, which is also what you see in biological solutions. If you actually do a kind of a quantitative comparison, there's a few things you can measure, and you do pretty darn well.
So for example, our computed ratio of amplitude to wavelength-- so there's the amplitude, there's the wavelength-- is 0.21. This is the best possible swimmer you could have. What you measure in biological systems is about 0.20. If you look at the number of periods per tail, so that's sort of the number of wavelengths along here, the optimal thing you can do is 0.123. And the measurement is a little bit bigger but still pretty close.
So we were pretty pleased with that. That seemed very promising. The problem with this is that kinematics is very difficult to measure in microorganisms. So we thought, OK, maybe we should go back, and we should look at something that's maybe a little harder to compute, but easier to measure. So let's move on and talk about optimizing morphology.
And it turns out there is lots and lots of data available from morphology of biological organisms. And so we actually decided to constrain our space a little bit more and look at a specific type of cell. So the cell that we decided to use is a sperm cell, and the reason we decided to look at sperm cells is because sperm cells have an extremely well-defined objective function. And this is not necessarily something you can say in other biological systems.
We don't know what the biological system is trying to accomplish, which is why the sperm cell actually gives you a very nice model to work with, because their objective is to take genetic material and move it from one place to another. They don't have to worry about eating. They don't have to worry about talking to each other. They don't have to worry about any of this stuff. They're basically transporting a packet of genetic material from one spot to another.
So here's the question you can ask. You can say, all right, suppose- I wasn't going to show you that yet, but OK. So suppose you want to take your genetic material, which is stuffed in this head right here. You can ask, how long of a tail should I put on this in order to optimally move it?
All right. So we can compute that. We put that in the model. Fine. Here's a measure of efficiency of the swimming. Here's the length of the tail, and each of these curves represents calculations for a different head. So here's one head size and one tail size. Each of these points represents an optimization calculation, so we optimize the kinematics here. So the swimming stroke here might look different from the swimming stroke here, but each one of these is the best you can possibly do.
And then you find that, oh, this looks great. So for a given head size, I can find an optimal length tail. Then the next thing you say is, OK, well, there's really only two length scales in the problem, so I should be able to collapse this onto one curve, which we do. So there it is. It collapses beautifully. And now I can tell you why this is my favorite problem.
So this is my favorite problem that any of my students have worked on, because there is an answer. And here is the answer. The answer is 12. That's it. That's the whole answer. So remember, what was the question? The question is, how long of a tail should I put on my head if I want to carry this packet of genetic material around? And my answer to you is I don't care what species it is. I don't care what this thing is made out of. I don't care what it's carrying. The answer is that the tail should be 12 times as long as the head. Done.
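The structure of that optimization can be sketched in a toy model. The real calculation (by Daniel Tam, in the paper mentioned above) solves the full low-Reynolds-number hydrodynamics; the caricature below only illustrates the trade-off, with entirely made-up coefficients: a longer tail generates more thrust, but also costs more power to wave, so swimming efficiency peaks at some intermediate tail-to-head ratio. It is not claimed to reproduce the value 12.

```python
# Toy version of the tail-length optimization: find the tail length (in
# units of head radius) that maximizes a caricature swimming efficiency.
# All coefficients are illustrative, not from the actual hydrodynamics.

def efficiency(tail_len, head_radius=1.0):
    if tail_len <= 0:
        return 0.0
    # Swimming speed: thrust grows with tail length, but is resisted by
    # the drag of both head and tail, so it saturates for long tails.
    speed = tail_len / (6.0 * head_radius + 2.0 * tail_len)
    # Power to actuate the tail grows with its length (plus head overhead).
    power = 1.0 + 0.05 * tail_len
    # Efficiency: transport speed achieved per unit power expended.
    return speed / power

# Simple grid search over candidate tail lengths.
lengths = [0.1 * i for i in range(1, 400)]
best = max(lengths, key=efficiency)
```

The interior maximum is the structural point: too short a tail produces no thrust, too long a tail wastes power, and the optimum sits in between, which is why a single dimensionless ratio can come out of the full calculation.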
So the next thing to do is obviously to go and look at the biological data to see if it supports this. And I had a fantastic graduate student who worked on this project, Daniel Tam, who went and looked at data for I think it was 427 mammalian species or something like that. It was over 400. It was a huge number. And the data from the biological literature looks like this. No. Like this.
It's 12. It's great. So having worked on different problems in biology, getting something as clean is that from an optimization calculation, we were very excited. But the other thing that I was very excited about this project is that so I can give this talk to engineers and physicists, and they all say, oh, this is wonderful. We understand something about the system.
And I give this talk to biologists, and they say, oh, well, this is the boring part of the data, because clearly you understand what's going on there. The interesting part of the data is the outliers, which sit over here, because the outliers, the reason they're outliers-- either they're suboptimal, which doesn't seem very likely, or these guys out here were subject to different evolutionary pressures and constraints. And that's an interesting question.
So we took a look at the outliers. So I'm going to tell you something about the furthest outlier over there, and usually if I'm in a small audience, I ask people to guess what the outlier is. But since we're in an auditorium, I won't call on anyone. But everybody in your mind decide what you think that outlier point is, bearing in mind that these are all mammals. So what is the weirdest mammal you can think of that might be standing out there? Yes. And you are correct. It is the bandicoot.
Now why the bandicoot? So for those of you who don't know what the bandicoot is, a bandicoot is a marsupial. So it's not terribly strange that it should be an outlier, but why is the bandicoot an outlier? So we went back, and we said, OK, what makes bandicoots unique in this pile of data? So here you go. I have to go back to the slide I showed earlier on the structure of cilia and flagella, where I said, OK, all tails have the same diameter. All of them you can select the shape as a function of time, et cetera, et cetera.
What I should have said on this slide is not exactly this. What I should have said is the structure of flagella, all tails have the same radius, except for bandicoots. So it picked out the exception, and the reason bandicoots are an exception, so we went and looked at this. There's actually a very nice paper over here, the bandicoot spermatozoon, an electron microscope study of the tail, published in 1958, in which people actually looked at this.
And the bandicoot tail looks like this. Let's see. Let me forward through. So the bandicoot tail-- so this is the structure that we saw before. The bandicoot tail actually looks something like this, and if you look closely, hopefully, we could see this. Yes. Right. So you can see that that structure that we had before is hidden in the middle of this big sheath. So the effective radius of the tail is an order of magnitude larger than what we were assuming in our initial calculation. So two things happen there.
One is we've got the radius wrong, so it's not surprising that it doesn't sit on the curve. And second, the other thing that sort of as engineers we know is that as you make these thin structures thicker and thicker, the energetic cost to bend them becomes more and more expensive, and it gets expensive very rapidly. It goes up like the fourth power of the radius.
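That fourth-power scaling is just the bending stiffness of a slender elastic rod, E*I, where the second moment of area of a solid circular cross-section is I = pi * r**4 / 4. A quick back-of-the-envelope check (radii here are normalized, illustrative values):

```python
# Bending stiffness of a slender rod scales like the fourth power of its
# radius: stiffness = E * I, with I = pi * r**4 / 4 for a solid circular
# cross-section. An order-of-magnitude thicker tail is therefore ~10^4
# times more expensive to bend.
import math

def bending_stiffness(radius, youngs_modulus=1.0):
    # E * I for a solid circular cross-section.
    return youngs_modulus * math.pi * radius**4 / 4.0

typical = bending_stiffness(1.0)     # ordinary flagellum (normalized radius)
bandicoot = bending_stiffness(10.0)  # ~10x larger effective radius
ratio = bandicoot / typical          # ratio ~ 10**4
```

So a tenfold thicker sheathed tail pays roughly a ten-thousandfold bending penalty, which is why the optimum shifts to a longer, straighter tail.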
So what that means is that these things are becoming expensive to bend, and suddenly you have to pay a cost for bending, which moves the optimum over to a longer length. So that's the story about the bandicoots. So let me just say one more thing, which is that there's another group of outliers that I should point out, which I do not know why they are outliers.
So I'm going to leave you with a question, an open question that I don't know the answer to. So the colors in this histogram indicate order, so the guys in the middle-- so I think we've got a lot from Chiroptera. I think we've got a lot of rodents. And if you plot any subset of these colors, they tend to give you sort of a Gaussian-like distribution right around 12.
This is true, except for these yellow guys over here, which I've put in yellow deliberately so they stand out. So notice, this is an order that is not even close to 12 and has its maximum somewhere around six instead. So I will tell you what that order is, and I don't know why it's off, but it's the only one that is systematically off. So that guy is the even-toed ungulates. I don't know why the even-toed ungulates are off. If anybody has any suggestions, I would be delighted to hear that.
So I just have one more slide just to sort of give you a big picture here. So let me actually fast forward. So in looking at optimization, I think it's very nice to look at bandicoots and even-toed ungulates and all that, but I think there's a bigger picture here that makes this topic sort of interesting. So there's a very nice article that was published in Nature by William Sutherland, where he talks about the best solution and about optimization of biology. And he says, "There are increasing calls for biology to be predictive. Optimization is the only approach biology has for making predictions from first principles. The wider adoption of these ideas right across biology should reap ample rewards."
And I would say that while I would be very hesitant to put the word "only" in this sentence, I think that he does have a point that there are a lot of interesting problems that are ripe to be tackled now. And the reason I think they are ripe to be tackled is, A, as many of you know, optimization problems are computationally expensive. And right now, we have computational resources that we didn't have 20 years ago that makes these problems accessible.
And the second thing that we have right now is a wealth of biological data. When we did that calculation, we just went to the literature and got data on 400 different morphologies of sperm cells. So all of that is available, which I think makes these problems particularly appealing at this time. And then the last thing I'm going to add is that one of things that I think is particularly exciting about this is that once you understand the underlying principles in these biological structures, then you can move on and use that to inform engineering design.
And so this is now a topic for another talk, which I won't talk about today. And I think that with that, I'll close and just thank the members of my group, especially-- actually let me-- I think there's one more-- Daniel Tam. So he's the one who did all the work that I showed you today. Linda Turner and Susan Suarez, who taught me a lot about biology, and NSF for funding this. So thank you for your attention.
CHISOLM: We have time for some questions. Here comes one.
AUDIENCE: The bandicoot is-- are there other marsupials that are also outliers, or is this uniquely the marsupial that lies--
HOSOI: Yes. That's a great question. So the bandicoot is the only marsupial that I know of in the outliers. There are other outliers that are not necessarily-- so one of them is, I think, a Chinese hamster, and so what all of them have in common is that they have the sheath around the tail. So they all have the thick tail, although they aren't necessarily the same order or the same species.
CHISOLM: Any other questions?
HOSOI: I don't-- I wish-- the bandicoot is my new favorite animal. And I don't know if you noticed, I credited that photo-- I got it from scarysquirrel.org.
AUDIENCE: I wanted to make an observation that that was a great talk, and I think it's really been fantastic to hear other engineers talk about biological things. So I'm a biologist who knows not so much about engineering, and I have this kind of sense that life works because it works, not because there's necessarily a logic behind every decision that's made. And I think it would be a fantastic conversation to sit down with the engineering perspective of life, which is kind of how it works, and there must be some kind of underlying energy and structural logic, with the logic of biologists, which is, well, it's like this, because this is what's worked historically.
And I kind of wonder that with the data that you have obtained. How much is, well, it works, and you can find the underlying cause, but it doesn't really have an engineering logic to it. It just has an effect where it works. So that's more of a philosophical comment than a question, but I think it's really a fascinating conversation that biologists and engineers don't have-- biological engineers maybe don't have as much as they should.
HOSOI: Oh, yeah. No, exactly. I think that's actually a fantastic comment, and I think one of the things that I've learned from doing the study is how different those philosophies are in the different perspectives. So I think very often in biology, what you said is exactly true, because biology-- it's just complicated.
Like that image I showed of a fluid dynamicist's view of biology, it's just really complicated, and some things, sometimes in biology, you might not do something because it's the best. You might do it because you happen to have this molecule lying around from a previous incarnation, and so it's cheap to just use that one.
So I think there are very interesting conversations to be had about that about when is this a useful approach, and when do we throw our hands up and say, it works?
CHISOLM: Any other-- oh, we've got another one.
AUDIENCE: Not being a scientist at all, I'm just wondering is there a reason for you using the tiny swimmers, or does this, as an engineering principle, also work on a macro basis?
HOSOI: Oh, yeah. Yes. So I think that the questions regarding optimization are good at all scales. The particular model we were using was relevant to small swimmers, because we were considering the case where viscous effects are dominant, which is, again, that sort of reversibility principle that I showed you. So that's the case where inertia is not important, and usually when you're big, inertia is important. The optimization you could certainly do at any scale, but you'd have to develop the correct model for the larger scale swimmers.
CHISOLM: OK. Thank you.
So our next speaker is Nergis Mavalvala-- I had to practice that-- who's a professor of physics at MIT, got her undergraduate degree from Wellesley, and her PhD from MIT. She's a recipient of the MacArthur Genius Award. "Genius Award"-- it's always in quotes. And she's a fellow of the American Physical Society, and her research links the world of quantum mechanics to the most powerful yet elusive forces in the cosmos. So today she's talking about exploring the warped side of the universe.
MAVALVALA: Thank you. So I'm just going to test if this actually works, and it does. Good. So I'm going to start just by, as many before me, thanking the organizers. And also, I just want to acknowledge that I stand here before you because of the work of people like Nancy Hopkins and the mentoring I received from many of the people in the room. And in honor of Nancy, on my hat today, I have a sticker, which was given to [INAUDIBLE] friends and women and gender studies, and it says, this is what a feminist looks like. And I decided if Nancy could say it, so can I.
So now I have this task of bringing you away from the wonderful things we've heard about and taking you into the slightly uncomfortable side of the universe, which is sort of warped and violent. So my story today actually is going to center around this character here, Albert Einstein. And it's a story about a work I got involved in with many, many others, and it is to study the universe using a completely new type of messenger. And that messenger is gravitational radiation, and gravitational radiation is an essential [INAUDIBLE] the waves that travel to us from distant sources. And they're an essential piece of Einstein's theory of general relativity.
I'll tell you about the detectors that we use, and these are not just for us to make these measurements. They are the most sensitive position meters ever operated, so they measure changes in distance to an exquisite level, to such an exquisite level, in fact, that quantum mechanics becomes a big player here. So you have to sort of imagine that you're going to go on a journey where you go from the farthest reaches of the universe, where these waves first emanate, and then they come to our detectors, and then you have to become really, really small and microscopic and see the world of quantum mechanics in action.
So that's the idea, and I'll just throw in there that even though today we think of these as sort of a done deal-- quantum mechanics and the existence of gravitational radiation-- Einstein actually struggled with both these ideas quite a bit. So we are all used to looking out into the sky and seeing these lovely starry nights, quiescent stars shining at us.
But when we point our telescopes, and most of the telescopes that we point are far more sophisticated than this one now, we actually see rather violent parts of the universe. This particular object here, a little bit over 300 years ago, was a nice little star, actually somewhat larger than our sun. But it was just a star that sort of shined on its merry way, until it ran out of the nuclear fuel that makes stars shine.
And at that point, it exploded into what we now know are supernovae. And this particular image, this is Cassiopeia A, and this particular image is rather impressive, because it's actually made up of three different wavelengths of light, so from three different telescopes. The red colors are infrared light. The yellowish greenish colors are visible light, and the blue colors are actually x-rays, so very, very energetic, very high energy photons. And if you really zoom in on the little dot right there in the middle, what you see is a small little blue dot.
And this is only visible to us in the x-ray, and this little blue dot is the small little star that got left behind after the big parent star exploded and shed most of its material flying outward. And this small little star is one of two classes of stars. Usually it's either a neutron star, so you can think of a neutron star as taking all of the mass of our sun and scrunching it into a small little object that's about a few kilometers big. So this is like startling, because you're scrunching our sun down by about five orders of magnitude in radius when you do that.
And the other kind of object that could show up in the middle of this supernova explosion after all the smoke and dust settles is a black hole. Black holes are actually even more bizarre objects. They're called black holes, because their gravity is so strong, they're so compact, there's so much mass crunched into this little space that they actually-- even light photons cannot escape, hence the name black hole.
So if you now took your telescope and pointed it towards a black hole, what would you see? So the question is, can color vision, this lovely plot of three different wavelengths of light, can it tell us the whole story? And the one thing you could ask as well, if you pointed your telescope at the black hole, you might see something like this. And in case anybody gets very excited that I'm showing you an image of a black hole, I'm not.
This is an artist's picture of a black hole, because in fact, when we shine our telescopes at it, we shouldn't be able to see it. But we do see some very interesting things in this image in any case, which is we see the surrounding gas that's swirling around the immense gravity of this black hole. And from the properties of this gas and from the light that this gas gives off because of all these frictional forces that are around, we can learn something about the black hole.
And this particular black hole, it's a real object that's been observed by an x-ray observatory, the Chandra X-ray Observatory. We see that this gas actually flickers at a few hundred hertz, at about 450 hertz. And that's believed to be because this black hole that's in the center is spinning, rather rapidly you can tell-- 450 hertz.
So now when we asked the question of whether light tells the whole story, well, clearly, if we want to see a black hole directly, for example, we're going to need another messenger. And this messenger is gravity's messenger. Now to motivate how this messenger comes about and brings information to us, I need to step back a little bit and tell you about our picture of gravity.
In the 17th century, there was Newton. And Newton had a very, very successful theory of gravity. He actually successfully predicted, explained why apples fell from trees, and it also explained why planets orbited around suns, and moons orbited around planets. And it was actually all captured in this rather simple formula, which said that the force between these two objects is proportional to their masses and inversely proportional to the square of their distance.
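That "rather simple formula" is Newton's law of universal gravitation, F = G * m1 * m2 / r**2. As a quick illustration with standard textbook values, here is the pull between the Earth and the Moon:

```python
# Newton's law of universal gravitation: the force between two masses is
# proportional to both masses and inversely proportional to the square of
# the distance between them.
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2

def gravitational_force(m1, m2, r):
    return G * m1 * m2 / r**2

earth_mass = 5.97e24   # kg
moon_mass = 7.35e22    # kg
distance = 3.84e8      # m, mean Earth-Moon distance
force = gravitational_force(earth_mass, moon_mass, distance)  # ~2e20 N
```

The same one-line formula covers the falling apple and the orbiting Moon, which is exactly what made it so successful.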
Now Newton worried about something, and, in fact, even long before him, Aristotle worried about this, which was that how is it that two objects that are far away from each other can influence each other, exert a force on each other? So this puzzle actually wasn't solved until the early 20th century when our next hero of gravity shows up, and that's Einstein.
And Einstein actually gave us a picture of gravity that says, don't worry about objects exerting forces over distances. Think of all of space as a fabric. Think of it as the surface of a cushion, and when we put massive objects in the center of our cushion, that cushion will form a dent. And when we put a small little playing marble at the edge of that cushion, it will fall in towards the massive object because of the curvature.
So Einstein told us that spacetime is curved when you have massive objects around, and he actually encapsulated that in this really innocuous-looking formula. It looks almost as nice as this one, but this formula is very, very difficult to work with and describes a very, very non-linear theory. So in Einstein's picture, if we take spacetime, and we put a massive object like a star or the Earth on this spacetime fabric, we form a dent because of the gravitational pull of this object. And now when we have a small moon or something that's orbiting it, it will follow these curved lines and fall into the object or orbit it just as we see.
Now the additional piece of this, which Newton never could have imagined and which had to come out of Einstein's theory, is the question of what happens when you take this massive object in the center of your cushion and you vibrate it? What happens when it starts to bounce around? And now, just as you would see if you did this experiment with a bowling ball on a stretched cushion or membrane, the whole surface of the cushion will start to form ripples. And those ripples will spread out from this bouncing object.
And that, in fact, was what Einstein proposed, and that is the gravitational wave-- those ripples of the spacetime itself emanating outwards from this vibrating, massive object. So a few gravitational wave basics-- we already talked about the fact that they are ripples in the spacetime fabric. According to Einstein's theory, they travel at the speed of light. Very important for us in designing detectors: you can ask, what effect could these have away from the source? And the effect they have is that as they propagate through space, they stretch and squeeze the space transverse to their direction of propagation.
So if you look at this animation here, if you think of all of spacetime as this square grid, then if a gravitational wave was propagating normal to the screen, the whole grid of spacetime stretches and squeezes. Now how strong is a gravitational wave? It's usually measured in units of strain, so this quantity h is the amplitude of the wave. And it's measured as the change in distance over the distance, so it's a relative change in the coordinates on that grid. And we know that they're emitted by accelerating masses. And here's an example, the Crab pulsar, the neutron star left behind by a supernova, which also, we believe, could be a good emitter.
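The stretching and squeezing of the grid can be sketched as a toy calculation. For a "+"-polarized wave, lengths along one axis grow by roughly a factor (1 + h/2) while the perpendicular axis shrinks by (1 - h/2); the exaggerated h below is purely illustrative:

```python
# Toy '+'-polarized gravitational wave acting on a ring of test points:
# x-coordinates stretch by (1 + h/2), y-coordinates squeeze by (1 - h/2),
# and half a cycle later the roles swap.
import math

def displace(points, h):
    """Apply a small '+'-polarized strain h to (x, y) test-mass positions."""
    return [(x * (1 + h / 2), y * (1 - h / 2)) for x, y in points]

# Eight test masses on a unit circle, like the grid in the animation
ring = [(math.cos(a), math.sin(a)) for a in [k * math.pi / 4 for k in range(8)]]
stretched = displace(ring, 0.1)  # wildly exaggerated h so the effect is visible
print(stretched[0])  # the point (1, 0) moves out to (1.05, 0.0)
```

A real astrophysical strain is around 1e-21, which is why the animation has to exaggerate so enormously.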
So what could emit this gravitational wave? So let me just say upfront, objects that you can imagine making in the laboratory would have ridiculously weak emissions. So we never think about doing that. It doesn't make sense with our present day technologies, so we look out into the skies. And the ingredients that we need-- we need lots of mass, and we need it to be scrunched in to make it dense and compact. So neutron stars and black holes are good candidates for that.
And we need rapid acceleration. We need these masses to accelerate trajectories like orbits when objects orbit each other or when things explode and blow matter off or when they collide with each other. So you can see we are already getting into this really warped violent side of the universe. You need the spacetime warpage, and then you also need it to be doing collisions and explosions.
And so we talked a little bit about neutron stars and black holes, and if you take either one of these, and you put them in orbit around each other, eventually they will collide, and that will be a big burst of gravitational radiation. Supernovae are expected, because they blow off all this material that you'll see a lot of accelerating material. Now one of the most interesting and compelling sources of gravitational radiation comes actually from the very, very early universe immediately after the Big Bang itself.
And I want to draw your attention to this picture here, because what we see is this is an arrow looking back into time, and here we are. This is our present day universe. And as we go further back in time towards this sort of bright area where we have no information, if you see the edge of information here, this is the first information we could get for the early universe that was carried by photons, light.
So when we use light to look into the early universe, actually, the first glance we get is when the universe was 400,000 years old. And at times before that, the universe was such a hot, dense soup of matter that the photons could not escape. So only after it had expanded and cooled down enough that the photons start carrying information to us.
Now gravitational waves, on the other hand, are streaming to us from the very first moments, because they are not subject to these interactions. And I'll say a little bit more about that momentarily. And then, of course, I could tell you more about the unknown, but your guess is as good as mine about what other sources of gravitational waves there are that we don't know about. So if you were an astronomer, and you wanted to decide whether you wanted to do astronomy with gravitational waves versus the more conventional method of light, you could make a table like this. How do you make light? Well, you accelerate charge. How do you make a gravitational wave? You accelerate mass.
If you were using light, you point your telescope at an object. The wavelength of the light is much, much smaller than the size of the object you're trying to image, and you can make these very, very pretty pictures. Now in the case of gravitational waves, the wavelengths that we're talking about are very, very, very large, and therefore, imaging is not possible. So we actually look for gravitational waves with something called a wave form, which is basically as a function of time on the horizontal axis, the amplitude of the wave H on the vertical axis.
And just as you can take these images and make pretty pictures, what we tend to do with these kinds of wave forms is we'll make pretty sounds. Now one property of photons, which I already alluded to when we discussed why we didn't see photons from earlier than 400,000 years after the Big Bang, is that they're extremely friendly. If you take a photon, and you send it out into space, it will meet every electron and proton and try to interact with it. It will hang out. It will chat. It will get absorbed. It will scatter. It will disperse.
Now gravitational waves, on the other hand, are extremely aloof. They simply see some matter, and they go right by. So they're actually wonderful messengers, because as astronomers, we don't have to worry about what's between the source and us as observers. Now, of course, this is actually a rather big problem as well, because for the same reason that the gravitational wave goes through most of the material between the source and us, it also interacts very weakly with our detectors. So we have a bit of a challenge.
Now the last thing I'll say is usually when you think about light, you think about making detections that are 100 megahertz and higher, and with gravitational waves, it's very hard to think of accelerating these very massive objects with very high frequencies. So we usually think of 10 kilohertz and lower. So typically, audio band. So again, think about sounds. This is the band in which the human ear works.
So this is a movie, which was actually a real tour de force-- it starts a little sooner than it should, but that's fine. We'll play it again. This movie is a simulation, one of the very first simulations in which that horrendous equation I showed you, from Einstein's general relativity theory, was solved numerically for a pair of two black holes. And what you see here is-- can we play the movie again, please? What you see here is-- I think it has it.
Good. Two black holes, and early in their life, they're orbiting each other. And then as they orbit each other, they're emitting gravitational waves, and those gravitational waves carry away energy. And that makes the orbits get closer and closer to each other, and eventually, and you can see these red contours are the gravitational wave emission right near the source. And eventually, the two black holes will collide, and to conserve angular momentum, it will form one nice little round object, and eventually, the gravitational wave source will turn off.
Now if you wanted to take a system like this and encode its gravitational wave waveform in sound, then it would sound something like this. Could we have the sound, please?
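A cartoon of that sound can be generated in a few lines. This is only a toy sketch, not a real general-relativity waveform: the frequency and amplitude sweep upward as the orbit shrinks, and the signal cuts off at the merger, just as in the recording:

```python
# Toy inspiral "chirp": frequency and amplitude sweep up until merger, then silence.
# A cartoon of the sound played in the talk, not a physical waveform.
import math

def toy_chirp(t, t_merge=1.0, f0=50.0, f1=400.0):
    """Sample a toy chirp at time t (seconds); returns 0 after the merger at t_merge."""
    if t >= t_merge:
        return 0.0  # the source has "turned off"
    frac = t / t_merge
    freq = f0 + (f1 - f0) * frac**3   # pitch rises, fastest just before merger
    amp = 0.1 + 0.9 * frac**2         # amplitude grows as the orbit shrinks
    return amp * math.sin(2 * math.pi * freq * t)

rate = 8000  # samples per second -- squarely in the audio band
samples = [toy_chirp(n / rate) for n in range(int(1.2 * rate))]
```

Written to an audio file, this would give the rising hum and abrupt cutoff heard in the demonstration; the ringdown after merger is omitted from this sketch.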
So it's basically just a hum, and then you'll hear the last collision [? throw ?]. This is where they collided, and the gravitational wave turns off after a short ringdown. So a little bit of history-- gravitational radiation was first introduced by Einstein in 1916, in his original paper on general relativity. And then two years later, he gave the first correct formulation of gravitational waves.
But he remained uneasy with it. He was not just uneasy about how immeasurably weak they were, but also about their very existence. He wasn't always very sure, and he actually published a retraction in an obscure paper in 1937. Then he retracted the retraction, and the doubts and controversy finally subsided after 1955, after people like Feynman weighed in.
But whenever you think about a theory and controversy, there is only one way to put it to rest, and that's through experiment and observation. And there was an observation. It was considered the first indirect observation of gravitational waves, where Hulse and Taylor observed two neutron stars orbiting each other in the same way as I showed you in the simulation. And in this particular system, they were using a radio telescope and looking at the lighthouse beacon beams of one of the neutron stars.
And they noticed that the timing of that little beacon beam was changing, and when they plotted that change in the timing over three decades-- you can see their first data began in 1975, and this goes out to 2005-- what they noticed was that the two neutron stars were getting closer to each other; their orbits were shrinking because of the energy being lost to gravitational waves.
And that's this curve. The curve here is not a fit. It's the exact prediction that Einstein's general relativity would give for gravitational wave emissions from two neutron stars that were orbiting each other with the parameters that were measured in other ways. So this is widely considered evidence, and there was a Nobel Prize for it in 1993.
So how strong is this? And now comes my tale of woe. The only formula I really want to put up here is this particular one, and the reason I put it up is that this is the amplitude of the gravitational wave. And I just want to show you that the numbers in the denominator are frightening: c, the speed of light, appears to the fourth power in the denominator, and r, the distance between the observer and the source, is in the denominator too. So we are in trouble.
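The formula being described is, schematically, the quadrupole formula for the strain amplitude. This is a common textbook form; the exact numerical prefactor depends on conventions and is not from the talk:

```latex
h \;\sim\; \frac{G}{c^{4}}\,\frac{\ddot{I}}{r}
```

Here $\ddot{I}$ is the second time derivative of the source's mass quadrupole moment, $r$ is the distance to the source, $G$ is the gravitational constant, and $c$ is the speed of light. With $c^{4} \approx 10^{34}$ in SI units sitting in the denominator, the "tale of woe" is immediate.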
The gravitational constant is what it is, and this is the second derivative of the moment of inertia, which for all purposes is a number close to unity. So you take a typical neutron star, take the Hulse-Taylor binary, and you ask, what is the gravitational wave strain we would see from it? It turns out to be 10 to the minus 18. If you take that same binary system and put it further away, in the Virgo cluster, which is a cluster of galaxies where we expect many, many more such sources to be observable, that number shrinks to 10 to the minus 21.
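Since the strain falls off as 1/r, the Virgo-cluster number follows from simple scaling. The distances below are rough illustrative values I've inserted, not figures from the talk:

```python
# Strain falls off as 1/r: scale the Hulse-Taylor estimate out to the Virgo cluster.
# Distances are rough illustrative values, not from the talk.
KPC = 1.0  # work in kiloparsecs; only the ratio of distances matters

h_hulse_taylor = 1e-18      # strain quoted for the Hulse-Taylor binary
r_hulse_taylor = 6.4 * KPC  # ~6.4 kpc (rough)
r_virgo = 16500 * KPC       # Virgo cluster, ~16.5 Mpc (rough)

h_virgo = h_hulse_taylor * (r_hulse_taylor / r_virgo)
print(f"{h_virgo:.1e}")  # a few times 1e-22, i.e. the 1e-21 ballpark in the talk
```

Three extra orders of magnitude of distance cost three orders of magnitude in strain, which is why the target number is 10 to the minus 21.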
So it's just a number, h is 10 to the minus 21. Now how do we go about detecting it? Using the property that the gravitational wave shrinks and stretches spacetime as it passes by, we take a laser interferometer. We split the light on a beam splitter here; one beam reflects off this mirror and comes back, and the beam on the other side reflects from that mirror, comes back, and is detected on a photodetector. Now when a gravitational wave comes by, the distances shrink and stretch, and we can measure that as a change in the travel time between the photons in this arm and the photons in that arm.
And if we were to do that with an interferometer that we built in our laboratories, say about a meter big, then we would have to measure relative changes in the positions of these mirrors to be 10 to the minus 21 meters. Now that is really, really hard. You would never be foolish enough to try that. So instead, what we do is we say, well, we have this little quantity l to play with. The change in distance we measure is proportional to the gravitational wave amplitude and to the distance itself.
So we build longer detectors, and this is one of the detectors of the Laser Interferometer Gravitational-wave Observatory. These are four kilometers long, and now you see that we have a much simpler task. We only have to measure displacements at the level of 10 to the minus 18 meters, and that's only 1,000 times smaller than a proton.
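The arithmetic behind "build longer detectors" is just the strain definition run in reverse, dL = h * L. As a sketch (the proton size is a rough order-of-magnitude value):

```python
# Displacement a detector must resolve: dL = h * L.
h = 1e-21          # target gravitational wave strain
L_lab = 1.0        # a 1 m laboratory interferometer
L_ligo = 4000.0    # LIGO's 4 km arms
proton_diameter = 1e-15  # m, rough order of magnitude

dL_lab = h * L_lab    # 1e-21 m -- hopeless
dL_ligo = h * L_ligo  # 4e-18 m -- "only" a few hundred times smaller than a proton
print(dL_lab, dL_ligo)
```

Making the arm 4,000 times longer buys you a displacement 4,000 times larger for the same strain, which is the whole design argument.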
So that seems more doable. So now let me just take you on a little tour. We're not the only ones foolish enough to think this is doable. Observatories like this are scattered across the planet, and there's even a planned observatory in space.
I'm going to tell you a little bit more about the LIGO observatories, which are the two US detectors in eastern Washington state and in Louisiana. So to put it in a nutshell, to measure the gravitational wave, what we do is we measure the displacements of mirrors, and we measure those by measuring the light travel time of the laser beams.
What makes it hard? The gravitational wave amplitude is small, and everything else in the world you can think of wants to push on that mirror more than the gravitational wave does. And then the last piece of our headaches are that at the precision we need, the very quantization of the light, the fact that light is made up of photons that come in discrete quanta and that quantum uncertainty limits how well we can do.
So a quick tour. This is the LIGO observatory, an aerial view of the Louisiana observatory. If you looked inside of these beam tubes that go for four kilometers, you would see a stainless steel tube like this, which holds a high vacuum in which laser beams run the four-kilometer length. This is a protective cover that has been useful more than once.
Inside of the observatory, you see structures like this. Each of these is a vacuum chamber that holds a mirror. Now, to put a scale on this: if I were to stand here, my head would be just below this row of [INAUDIBLE]. You might ask, why do you need them to be so big? You need them to be so big because those mirrors are engineered to be very still.
And so all of the vibration isolation systems like this one have to fit into that vacuum chamber. So this is a vibration isolation system that keeps the mirrors still. And this is the mirror itself. In the first generation of detectors, it was a 10-kilogram, 25-centimeter diameter mirror that hangs like a pendulum. And that's a zoom in of the mirror.
Now what do you do with this detector? Well, in the years between 2005 and 2007, the first generation of LIGO detectors was operational. Those photographs I showed you are a reality. And there was a great deal of science that came out of it, but let me just say squarely, there were no direct detections. So we looked out into the sky, and we did not see anything that we could confidently say was a gravitational wave. But we had a chance to do some real science nonetheless, and one example of that was a gamma-ray burst explosion.
So gamma-ray bursts are incredible events, because what they are, basically, are explosions of very high-energy photons that light up the night sky many millions of times brighter than our sun would be over its entire lifetime. And this particular kind of gamma-ray burst is believed to come from when neutron stars or black holes collide.
So what could be better? If it's colliding, LIGO should see it. Well, LIGO happened to be on the air when this very energetic and rather nearby gamma-ray burst went off. LIGO went to look for a signal in the same time period, and we saw nothing. And seeing nothing led to the conclusion, with 99% confidence, that this particular gamma-ray burst was not caused by the merger of two neutron stars or black holes in the location where we believed it was. So it's a nice example of being able to do science with zero positive signal.
So the next generation of LIGO detectors is on its way. And the question we ask ourselves is can we listen to more distant sounds? Can we look out further into the gravitational wave universe? Because the further we can look out, the more objects we should be able to see. And the answer is yes, we could.
The technologies had evolved enough that we could put them into place, so we are now building a detector that's 10 times more sensitive to gravitational waves and that lets us look 10 times further out in distance, because our detector's sensitivity goes as 1 over the distance to the object. And so we expect that when you make a detector with the sensitivity to look out into a sphere around us that's 10 times larger in radius, you get an event rate that's 10 cubed, or 1,000 times, bigger.
So this is actually the first time I'm showing you real data from LIGO. Notice the horizontal axis is frequency in hertz, and we go here from 10 hertz to 10 kilohertz. So it's just squarely the human audio band, just as promised and just as you heard in the simulation that I showed you. And on the vertical axis, we're just measuring our ability. How well can we do in measuring that 10 to the minus 18 meters in terms of the displacement?
And this red curve is the initial LIGO target design, and then the subsequent blue and green curves are the levels at which we have operated LIGO in the last three to four years. And the black curve is the thing that we are building for advanced LIGO, a 10 times more sensitive instrument. And one of the things that we're doing to make a 10 times more sensitive instrument is we're increasing our laser power. It turns out that in the region where we are limited by laser noise, increasing the power increases our signal to noise ratio.
Now I just want to point out, since we're celebrating MIT, this is Rai Weiss. He was my PhD advisor and the founder of this whole field of gravitational wave detection using interferometers. And this red curve-- this is something that blows my mind when I look at the history of the field. This red curve was almost exactly spelled out in a paper he wrote in 1972, and we built it in 2005. I can't think of a better example of prescience.
Now the last thing I wish to say is talking about where the quantum limit itself comes in. So the fundamental question is we use light to measure the position of a particle. This particle happens to be our mirror. We shine light at it. Light reflects off. We measure how long the light took to come back to us, and that tells us where our mirror was. And there's a little cartoon of that.
Now notice that light, which is made of photons, carries momentum. And there's nothing quantum about that; this was told to us by classical electromagnetism. And those photons that carry momentum, when they impinge on our mirror, transfer that momentum to the mirror. So the light kicks the mirror.
Now the light is built of photons. It does have quantum uncertainty, so the momentum that's transferred to our mirror also has that same quantum uncertainty. And this is now the big dilemma, which is, how can we know the position of our mirror, to what accuracy can we know it, if our very attempt to measure it kicks it around? And you see this completely in play in the advanced LIGO design, where in this region of the curve, we are using enough laser power that we can make a really exquisite measurement of the light travel time. We can do it really, really well.
But what happens-- the price we pay is that in this region of the curve, the momentum kicks from this quantum limited light fuzzes out where our mirror is. So if you ask now, what is the limit, the limit is set by the fact that we have this quantum light and the back action of that light kicking back on our mirror. So advanced LIGO has this incredible sort of mind-blowing property, which is that the 40 kilogram mirrors of advanced LIGO are going to be so well-shielded from all other forces that they will be in the state of acting like a quantum particle.
So imagine this. This is incredible. We've always learned that quantum mechanics is only important on microscopic scales, atomic scales and smaller. And here, we may observe it on a truly human scale. A 40-kilogram object is almost twice the mass of my three-year-old son.
And it will be quantum mechanical. Now he never could be. So I leave you with this thought. When the elusive gravitational wave is captured by our detectors, we will have tests of general relativity. It will be the first time we'll directly observe spacetime itself. We will certainly do lots of astrophysics, where we'll be able to observe black holes and neutron stars, gravitational waves from the early universe and, of course, objects well beyond our imagination. And it will also give us an opportunity, for the first time, to observe quantum mechanics in truly human-scale objects.
So my collaborators and my team, my wonderful graduate students, and the MIT LIGO laboratory, which I'm a part of-- and I should also mention that the LIGO science is carried out by a collaboration called the LIGO Scientific Collaboration, and everything I talked about is funded by the National Science Foundation. And I will leave you with just a message. Oh. This was not the message I wanted to leave you with. Thank you.
CHISOLM: Lunch awaits, but we have time for one quick question. We're a little behind schedule, so anybody have a question? Here's one. Here's two. Well, we'll take two. Quick, go.
AUDIENCE: Thanks for the nice talk. Is the detector directional, and are there particular directions that you'd expect to see more signals?
MAVALVALA: Yes. So that's an excellent question. I don't have a picture of it. The detector is actually only very slightly directional. If you think about the L shape of the interferometer on a plane like this, you can think of the detector sensitivity as a peanut, two lobes on either side. So it has no sensitivity in the plane of the L shape, and it has higher sensitivity normal to it. And then it's falling off with a cosine squared angle function.
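The simplified "peanut" response described in this answer can be sketched directly. This is only the speaker's simplified picture, with theta measured from the normal to the detector plane; the full LIGO antenna pattern also depends on polarization and azimuth and is more complicated:

```python
# Toy version of the directional response described above: no sensitivity in the
# plane of the L, maximum sensitivity normal to it, falling off as cosine squared.
# This is the simplified "peanut" picture, not the full LIGO antenna pattern.
import math

def toy_response(theta):
    """Relative sensitivity; theta = angle from the normal to the detector plane."""
    return math.cos(theta) ** 2

print(toy_response(0.0))          # 1.0: source straight overhead, maximum sensitivity
print(toy_response(math.pi / 2))  # ~0: source in the plane of the L, detector is blind
```

The two cos-squared lobes above and below the detector plane are the "peanut" in the answer.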
AUDIENCE: First, let me call out some of your mistakes, factual mistakes.
MAVALVALA: Please do.
AUDIENCE: The factual mistake-- you say the Einstein-Rosen paper was retracted. They didn't retract it.
MAVALVALA: No, I did not. [INAUDIBLE]. Oh, yes.
AUDIENCE: They got rejected, but they [INAUDIBLE] it to something else.
MAVALVALA: But it got published. That's why I said it was obscure. Yes, the story is that the Einstein-Rosen paper was sent to a mainstream journal, and it was rejected there, by Robertson in particular, because he thought it was wrong. So then they published it in the Franklin journal. But it's true-- Robertson thought it was wrong, and Einstein was not totally correct, only partially correct. Number one.
AUDIENCE: Number two, you talk about Hulse and Taylor experiment.
MAVALVALA: So you know what? I'll tell you something.
AUDIENCE: No, let me finish then.
MAVALVALA: I'll give you only one more question, because there's maybe hopefully someone else waiting to ask a question.
AUDIENCE: [INAUDIBLE]. Hulse Taylor, a trigger extended. [INAUDIBLE], the [INAUDIBLE] of MIT, went to Princeton to talk to Taylor about his experiment. He couldn't explain it. You know why? Because the calculation was wrong. So what you learned today [INAUDIBLE] is about 15 years outdated. And also, [INAUDIBLE] theory is wrong. [INAUDIBLE], he has a tendency to make up things, which Einstein didn't say. He said, Einstein didn't count on the tidal wave. It's wrong. So I suggest if you do this problem, you have to start from the beginning.
MAVALVALA: I will do that tonight. Thank you.