3.3 Reading Minds: Stimulus Reconstruction

siliconvalleystudent 2022. 10. 23. 13:22

We're going to finish up today's lecture with a discussion of reconstruction.

I think we all share the dream that one day we might be able to record brain activity during our sleep and, in the morning, play back our dreams.

So while the dream state is still not well understood, how close are we to being able to reconstruct even awake sensory experience from neural activity?

We can apply the methods that we've discussed in the past two lectures to think about how to do that.

So in the previous parts of this lecture, we talked about methods to find an estimate of the stimulus using Bayesian decoding.

So now we'd like to extend our decoding procedures to the case of responses and stimuli that vary continuously in time.

Let's go through a simple example of a decoding strategy that meshes with the problem set that you have been working on.

Some of you, anyways.

Let's say that we wanted to find an estimator, $s_B$ ("s Bayes"), that gives us the best possible estimate of our stimulus, $s$, given that we've observed some response, $r$.

So, how should we compute $s_B$?

So one strategy that makes sense is to ask for an estimator that is, on average, as close as possible to our stimulus.

So I'm going to introduce some error function, which we'll call $L$, and then minimize this error, averaged over all possible stimulus choices that are consistent with our response $r$.

So now we need to choose a form for this error function, and a very natural choice is the mean squared error: we'll take $L$ to be the squared difference between our estimator and the true stimulus, averaged over all stimuli consistent with the response,

$$L = \int ds\, (s - s_B)^2\, p(s \mid r).$$

Now, to derive an expression for $s_B$ that solves this problem, we need to minimize the average error.

So remember how we minimize a function: we take the derivative of that function with respect to the parameter that we're interested in, here $s_B$, and set it equal to zero.

So let's just do that calculation. We want to take the derivative with respect to $s_B$ of the average error:

$$\frac{d}{ds_B} \int ds\, (s - s_B)^2\, p(s \mid r).$$

The only term that depends on $s_B$ is the squared difference, and the derivative of the square is just $-2\,(s - s_B)$, so this becomes

$$\int ds\, (-2)\,(s - s_B)\, p(s \mid r),$$

and now we set that equal to zero.

So hopefully you can see that the solution is of this form. How did we get that? Let's just write it out. We can separate the two terms and put them on the two sides:

$$\int ds\, s\, p(s \mid r) = \int ds\, s_B\, p(s \mid r).$$

Now, $s_B$ is just a constant, so it comes out of the integral on the right. And since the probability distribution $p(s \mid r)$ is normalized, its integral over $s$ is just one, so the right-hand side is equal to $s_B$.

And so here's our solution:

$$s_B = \int ds\, s\, p(s \mid r),$$

which is the mean of the posterior distribution.
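To make this concrete, here's a small numeric check (my own sketch, not from the lecture): on a discretized posterior, the posterior mean does at least as well, in mean squared error, as any other candidate estimate.

```python
import numpy as np

# Numeric check (illustrative only): on a discretized posterior p(s|r),
# the posterior mean minimizes the expected squared error.
s = np.linspace(-5.0, 5.0, 1001)            # stimulus grid
p = np.exp(-0.5 * (s - 1.2) ** 2 / 0.7)     # an arbitrary, assumed posterior
p /= p.sum()                                # normalize: sum(p) == 1

s_bayes = np.sum(s * p)                     # s_B = sum over s of s * p(s|r)

def expected_squared_error(estimate):
    """Squared error of a candidate estimate, averaged over p(s|r)."""
    return np.sum((s - estimate) ** 2 * p)

# The posterior mean beats every other candidate value on the grid.
candidates = np.linspace(-5.0, 5.0, 51)
assert all(expected_squared_error(s_bayes) <= expected_squared_error(c) + 1e-9
           for c in candidates)
print(f"s_Bayes = {s_bayes:.3f}")
```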

Now, I want you to take a look at that for a moment and see if you recognize it.

So, what if our response is just a single spike?

So what does this expression amount to?

Well, it's the spike-triggered average, right? It's the stimulus averaged conditional on the response, the spikes.

So we're going to take all the stimuli, weight them by the probability that they occurred in response to a spike, and average over them all.

So how do we apply this to reconstructing a simple stimulus?

So imagine that this is our spike triggered average.

So now, every time there's a spike in our measured spike train, we're going to paste in the spike-triggered average, our conditional average.

So at low firing rates, this is not looking very good.

But at higher firing rates, you see that we're getting closer and closer to a smoothly varying function.
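Here's a minimal sketch of that paste-in procedure (my own illustration, assuming a binary spike train on a regular time grid and an STA that has already been measured): convolving the spike train with the STA places one copy of the kernel at each spike time.

```python
import numpy as np

# Minimal sketch of STA-based reconstruction (assumed setup: binary spike
# train sampled on a regular grid, STA measured beforehand).
def reconstruct(spike_train, sta):
    """Paste one copy of the STA at every spike; 'same' keeps the length."""
    return np.convolve(spike_train, sta, mode="same")

# Toy usage: a sparse random spike train and a Gaussian-bump STA.
rng = np.random.default_rng(0)
spikes = (rng.random(1000) < 0.05).astype(float)
t = np.linspace(-3.0, 3.0, 41)
sta = np.exp(-t ** 2)                         # assumed kernel shape
estimate = reconstruct(spikes, sta)           # smoother at higher rates
```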

You might have realized already that there's an issue with a filter, or feature, of this exponential form as we drew before: it can never capture a negative fluctuation in the input.

This is actually an issue with the fly neuron data that you've looked at in the problem set.

The fly has two h1 neurons.

One that encodes leftward motion and another that encodes rightward motion.

So if you try to reconstruct a velocity stimulus with only one of the H1 neurons, you'll only ever be able to recover either leftward or rightward motion.

In the book Spikes, which is a very nice exposition of this kind of reconstruction at considerably more depth than I can give here, the authors actually simulate the other H1 neuron by playing the original stimulus with the opposite sign.

And that now gives us enough information to reconstruct both positive and negative inputs.
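A hedged sketch of that two-channel trick (function names and the shared kernel are my assumptions): decode the rightward- and leftward-preferring spike trains separately, then subtract, so negative velocities can be recovered too.

```python
import numpy as np

# Sketch of two-channel decoding (assumed setup: two direction-selective
# H1-like spike trains, each decoded with the same STA kernel).
def decode_two_channels(spikes_right, spikes_left, sta):
    right = np.convolve(spikes_right, sta, mode="same")   # rightward motion
    left = np.convolve(spikes_left, sta, mode="same")     # leftward motion
    return right - left                                   # signed velocity
```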

So now let's see this kind of decoding in action.

The movie you're about to see is based on the activity of multiple neurons in the lateral geniculate nucleus (LGN) of the cat.
This is work by Yang Dan of Berkeley, done about 15 years ago when she was a PhD student.

Convolving the spike trains from multiple neurons in the LGN with the spatiotemporal receptive fields of those neurons allows a noisy but comprehensible reconstruction of the scene.

So in this case, it's a cat being recorded from while anesthetized.

But the LGN neurons are giving a pretty good reconstruction of what the cat is looking at.

Hopefully you can see Yang's advisor looming into view.

That's Joe Atick, whose work applying information theory to understanding receptive field structure is coming up next week. Now let's fast-forward a few years, to 2011.

I'm going to finish up this week's lecture with a rather impressive example of decoding that starts to get us closer to that mind-reading fantasy.

It also neatly brings together the ideas we covered last week and this week.

In this set of experiments, Jack Gallant, also at Berkeley, and his colleagues recorded from the visual cortex of humans viewing movies, using fMRI.

They then used the recordings to reconstruct one-second-long movie sequences.

So here you're seeing reconstructions of single scenes.

But these are in fact stills from one-second-long movies.

Nonetheless I hope you get a sense of how impressive these reconstructions are.

So how did they do this?

So here's the basic idea.

Going back to this model that we've used over and over again.

The researchers here are trying to find the movie clip $s$ that maximizes the a posteriori distribution $p(s \mid r)$.

So they use a library of 18 million clips and take the prior $p(s)$ to be uniform across those samples.
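Written out (a reconstruction of the standard Bayes' rule step, not a formula quoted from the lecture): with a uniform prior over the library, maximizing the posterior reduces to maximizing the likelihood,

$$\hat{s} = \arg\max_{s \in \text{library}} p(s \mid r) = \arg\max_{s \in \text{library}} \frac{p(r \mid s)\, p(s)}{p(r)} = \arg\max_{s \in \text{library}} p(r \mid s),$$

since $p(s)$ is the same for every clip and $p(r)$ doesn't depend on $s$.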

So what's missing is the likelihood.

To compute the likelihood of a given clip from the database, they developed an encoding model, fitted on a different training set of movies, so that they can evaluate the predicted response to an arbitrary input.

Then they can evaluate this likelihood measure by computing how well the predicted response to a movie from the library matches the true response.
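One simple way to score that match (my own sketch; the paper's actual noise model may differ) is a Gaussian log-likelihood of the measured responses around the model's prediction:

```python
import numpy as np

# Hypothetical scoring function: how well does the encoding model's
# predicted response to a candidate clip match the measured responses?
def log_likelihood(r_measured, r_predicted, sigma=1.0):
    # log p(r|s) under an assumed Gaussian noise model,
    # up to an additive constant
    return -np.sum((r_measured - r_predicted) ** 2) / (2.0 * sigma ** 2)
```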

Let's take a peek at the encoding model, as it uses several of the ideas that we developed last week.

So here's the model that predicts responses.

As you might recall, we mentioned last week that fMRI relies on blood oxygenation, or BOLD, signals, so it has a slower response time than neural activity.

So in this model, the neural response, the spiking, is separated from the BOLD signal and the two are fit separately. So how is the neural response modeled?

Let's zoom in. It's predicted, as in the models of last week, by first filtering the input.

There are a couple of different filtering stages that extract certain features from the input.

Let's focus on this part, in which the image is filtered through a pair of oriented filters at different phases, just as we described last week for complex cell responses in V1.

Now the outputs of those two filters are squared and summed.

This means that one gets a large response independent of spatial phase, as we also mentioned last week.

Then the output of that filtering stage is passed through a compressive nonlinearity.

In this case the function is taken to be a log function.

And then this is temporally downsampled.

That is, it's smoothed from a 15 hertz signal to a one hertz signal in order to reduce noise.

That's taken to be the predicted neural response.

This neural response is then passed through an additional filter that accounts for the slow response of the blood oxygenation level.
So now here's the full procedure.

An encoding model like the one we just saw is fitted for each voxel, each volume unit in the brain region being imaged.

And then that model is used to predict the response to the millions of images in the database.

The stimuli with the highest likelihood, which in this case is equivalent to the highest posterior, are those whose predicted responses best account for the true response.

So here are the predicted responses, in this column.

The MAP solution, the maximum a posteriori solution, would be to simply read off the clip with the maximum value.

Because the clips are full of highly specific detail, one can in this case do a lot better by averaging those out.

So what they're going to do is to rank these images by the degree to which their predicted responses fit the true response.

And take the top sequence of images that have the highest match.

So here they're drawing the top 30 highest-posterior clips, the 30 clips that have the highest degree of match to the predicted response.

So one could simply take this single best value, but because of all the specific detail in these sample images from the prior, one does better by combining them.

So now if you look at the cumulative average of many of these high-probability clips, what you see is that as one gets to larger and larger numbers of them, you're getting quite a good match.
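As a sketch of that final step (illustrative only; `clips`, the score array, and their shapes are my assumptions, with scores coming from something like the `log_likelihood` sketch above): rank the library by likelihood and average the top 30 clips.

```python
import numpy as np

# Illustrative ranking-and-averaging step (assumed shapes: `clips` is
# (N, T, H, W); `scores` holds each clip's log-likelihood).
def reconstruct_from_library(clips, scores, top_k=30):
    order = np.argsort(scores)[::-1]        # best-matching clips first
    # Averaging the top clips washes out clip-specific detail and keeps
    # the shared gestalt that matches the measured response.
    return clips[order[:top_k]].mean(axis=0)
```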

So remember this is one of these examples.

Perhaps this one. So in this case, you can see the effect of that averaging.

So now you no longer see the crisp image that you would get from a single choice from your prior distribution.

Instead, you average over many of them.

But now, what that does is to remove specific features and give you a general gestalt that's much more similar to the stimulus that was presented.

So I hope this demonstrates that we are within reach of that dream: that we will be able to look at neural activity and, using clever models of the type that I showed you just now, reconstruct naturalistic images from that neural activity.

So that brings us to the end of my lecture for this week.

You'll also find online a special guest lecture by my colleague Fred Rieke, a world-acknowledged wizard of retinal
processing.

Next week we'll be moving on to a consideration of information: how is information defined?

What exactly does it quantify, and how can it be useful in neuroscience?

I hope you've enjoyed this week and that we'll see you back next week.