Homework #3

Motion topics: short-range, reverse-phi, higher-order and non-Fourier motion


Short-range Motion

Background on Apparent Motion. Many early studies of visual motion processing used apparent motion stimuli in which a spot or bar of light was displayed at time T and then shifted by a distance DX and displayed again at time T+DT. In other words, a stimulus pattern was flashed twice with a separation in space and time. For appropriate DX and DT, the stimulus would appear to move from one point to the other. If DX was too large, or DT too long, motion was not perceived, just two separate flashes. One natural question arose: What is the maximum separation, Dmax, that supports motion perception? The answer tended to depend on the stimulus pattern, and separations from a few minutes of arc up to more than ten degrees could support motion perception. For an early review of visual motion processing, see Nakayama, 1985.

In 1974, Oliver Braddick made the important observation that, if a patch of a Julesz random dot pattern is suddenly shifted by DX within an otherwise homogeneous random dot field, that patch appears to move only if the shift is less than about 15 minutes of arc. This critical value did not depend strongly on the size of the random dots, and it was much smaller than the Dmax values associated with many apparent motion displays in which a single, recognizable target was moved, for example a light bar on a black background.

This caused Braddick to propose that there were two processes for motion analysis in the brain, a "short-range" process, which is most likely associated with the direction selective neurons in V1, and another higher-level process that might involve something more akin to identifying and tracking recognizable objects or features. He referred to the latter as a "higher interpretive system" that was involved in long-range apparent motion.
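If you would like to build a two-frame display of the kind described above for yourself, the following is a minimal Python (NumPy) sketch. The function name, the sizes, the seed, the choice to keep the surround identical across the two frames, and the choice to refill the strip uncovered by the patch with fresh random dots are all assumptions made for illustration; they are not taken from Braddick's displays.

```python
import numpy as np

def two_frame_patch_shift(size=64, lo=24, hi=40, dx=3, seed=0):
    """Return two random dot frames; a square patch jumps dx columns to the right.

    Hypothetical helper for illustration only. Requires hi + dx <= size.
    """
    rng = np.random.default_rng(seed)
    # Frame 1: binary random dots (0 = black, 1 = white).
    frame1 = rng.integers(0, 2, size=(size, size))
    frame2 = frame1.copy()
    # Copy the patch from frame 1 into frame 2, displaced by dx columns.
    frame2[lo:hi, lo + dx:hi + dx] = frame1[lo:hi, lo:hi]
    # The strip the patch moved away from gets new, uncorrelated dots.
    frame2[lo:hi, lo:lo + dx] = rng.integers(0, 2, size=(hi - lo, dx))
    return frame1, frame2

frame1, frame2 = two_frame_patch_shift(dx=3)
```

Neither frame on its own contains a visible square. The patch is defined only by the correlation between the two frames, which is why this kind of display isolates motion detection from feature tracking.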

To demonstrate these ideas, please download this PDF file and open it in a viewer that allows you to switch quickly back and forth between the frames (Mac Preview or Adobe Acrobat should work, but just viewing this in a web browser window may not). Try the following:

  • (a) Switch back and forth between pages 1 and 2. What do you see?

  • (b) Switch back and forth between pages 2 and 3. How is this different from 1-2?

  • (c) Switch back and forth between pages 3 and 4. Do you see motion, or just changes in the dot patterns?

  • (d) Now, switch back and forth between pages 5 and 6. These squares have the same DX as the dot patches in 1-2 above.

  • (e) Switch back and forth between pages 6 and 7. These have the same separation as 2-3 above.

  • (f) Do the same with 7 and 8, which have the same separation as 3-4 above. Does the square appear to move? How does this compare to the case above where the patch of random dots jumped by the same distance?

Question 1. Simply write down your observations for the demos above.

The important concepts to note are: (1) it is plausible that short-range motion processing is associated with direction selective (DS) neurons in V1, and (2) there is a distinction between seeing (short-range) motion of arbitrary image patches, by noticing a correlation across space and time in the absence of clear features, and seeing motion by recognizing that a distinct object has moved. The former, short-range process seems to be less feature-based, whereas the higher process may depend on the awareness of the position of particular features. This dichotomy will be referred to below.

Reverse-Phi

The term phi was introduced by Max Wertheimer (1912) and is commonly associated with certain types of apparent motion stimuli (see Steinman et al., 2000, for a careful definition). A basic diagram of first-order, sampled motion (Figure 3.1 A) shows a space-time (X-T) plot of a black and white pattern (first row, time T=1) that moves to the right as time increases (downward). This rightward motion is easily recognized as slanted (oriented) stripes in the X-T plot. This oriented structure can be detected by oriented space-time filters, such as that shown in panel C, which form the basis of simple models of direction selective (DS) neurons in the primary visual cortex.

In 1970, Stuart Anstis reported an intriguing psychophysical discovery: in an apparent motion sequence like that of Figure 3.1 A, if you periodically invert the contrast (i.e., swap white and black), then the motion will appear to go in the reverse direction. To appreciate this, please view the following movies (it may help if you turn on looping on your movie player, or simply repeatedly play the movie):

  • Stuart Anstis, four spots.   Here, the upper dots move to the right (and then back) without changing contrast and show typical apparent motion, while the lower dots move to the right but invert contrast, before shifting back. The lower dots nevertheless appear to move oppositely to the upper dots. You must appreciate that, for example, the lower left "BW" dot is subsequently plotted (white) to the right of the black dot, but motion appears to go to the left, nevertheless.

  • Stuart Anstis, annulus.   This shows typical apparent motion, with the dots rotating clockwise.

  • Stuart Anstis, annulus, reverse-phi.   This shows reverse-phi, where the dots still rotate clockwise (CW), but they now reverse their sign (change between black and white) as they move. If you fixate in the center, you should see counter-clockwise (CCW) motion (and note that you may see motion after-effect if you continue to fixate after the movie stops). If instead, you track a clump of dots with your fixation, you should see that the clumps of dots move CW.

  • Patrick Cavanagh, reverse-phi.   Fixating on the central dot, you should experience reverse-phi motion. But if you track one set of bars with your eyes and/or mouse, you will see that the bars move clockwise.
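The way such a stimulus is built can be sketched in a few lines of Python (NumPy). This is only an illustration under stated assumptions: the function name, the random starting pattern, the wrap-around shift, and the coding of black as -1 and white as +1 are choices made here, not details of the movies above.

```python
import numpy as np

def xt_plots(n_x=24, n_t=12, dx=1, seed=0):
    """Return (reference, reverse_phi) space-time arrays of shape (n_t, n_x).

    Hypothetical helper. Row 0 corresponds to time T=1, row 1 to T=2, etc.
    """
    rng = np.random.default_rng(seed)
    first_row = rng.choice([-1, 1], size=n_x)  # -1 = black, +1 = white
    # Reference: each frame is the first frame shifted right by dx per time step
    # (np.roll wraps around at the edges, an assumption made for simplicity).
    reference = np.stack([np.roll(first_row, dx * t) for t in range(n_t)])
    # Reverse-phi: same positions, but contrast is inverted on every other frame.
    reverse_phi = reference.copy()
    reverse_phi[1::2] *= -1
    return reference, reverse_phi

reference, reverse_phi = xt_plots()
```

Displaying both arrays as images (time running downward) gives X-T plots in the style of Figure 3.1, which you can compare against your own answer to Question 2 below.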

Question 2. Demonstrate your understanding of the construction of a reverse-phi stimulus by coloring in (blackening) the appropriate boxes in Figure 3.1 B. The reference stimulus is shown in panel A. The procedure is: invert the contrast of the reference stimulus on every other line. The odd-numbered lines (time T=1, 3, ... 11) have been done for you; they have simply been copied over as is. The even-numbered lines (gray arrows) need to be completed.

Question 3. In terms of the appearance, or features, of your completed X-T plot, can you explain (predict) how the perception of this "reverse-phi" stimulus will compare to that of the original reference stimulus in Panel A?

Question 4. Given your observations from Question 2, do you think that "reverse-phi" is an illusion? Does it involve seeing motion that is not there? Would DS neurons in the primary visual cortex respond to reverse-phi stimuli?

The above examples may be compelling, but we can argue that reverse-phi is expected to hold more generally by considering that any visual pattern or image can be broken down into a sum of sinusoidal components. For simplicity, we will consider the 1-dimensional case. In Figure 3.1 D, a sinusoidal luminance profile is plotted that represents the visual display at time T (top trace). Imagine that this profile is then inverted (negated) and displayed at time T+1 (lower trace).

Question 5a: Would an observer perceive either rightward or leftward motion in such a two-flash stimulus display?

Question 5b: If instead the original sinusoidal grating was simply shifted slightly to the right going from time T to T+1 (panel E, thick red arrow), what direction of motion would the observer perceive?

Question 5c: Now, in Panel F, if the sinusoid displayed at T+1 is shifted a bit to the right (dashed line at T+1) and inverted in contrast (solid line at T+1), what direction would be perceived?

Question 5d: What is being indicated by the thinner and thicker red arrows in panels E and F, and how does this relate to the perceived direction of motion?

Question 6: A real image is composed of many different spatial frequencies. When the image is shifted by an amount, DX, from time T to time T+1, different frequencies will shift by different amounts of phase, P, even though they move by the same absolute distance DX. Can you argue that any sinusoidal spatial frequency component that appears to move to the right for a regular apparent motion display (in which a fixed DX is shifted on each frame), will move to the left in the corresponding reverse-phi stimulus? You may find it helpful to use a phase diagram (Figure 3.1 G), where each arrow plots the phase of the sinusoidal component at successive time steps, T, T+1, ...

Non-Fourier Motion, or Higher-order Motion

Up to now, we have discussed mainly first-order motion. Loosely speaking, first-order motion occurs whenever a pattern of luminance moves coherently, for example, a darker than average region shifts to the right, or a lighter than average region shifts upwards, etc. An example of motion that is not first-order is the following:

Dynamic stripes.   This stimulus is created by multiplying a square wave, varying from 0 to 1, by a Julesz-style dynamic random dot pattern, where black dots have a negative value, -1, and white dots have a positive value, 1 (note, mean gray is zero).

The construction of this stimulus is shown in terms of its luminance profile in Figure 3.2. The trace in A shows a horizontal spatial cross-section through the square wave that multiplies the random luminance pattern shown in B. The resulting stimulus (shown in C) is displayed at a particular time, T. On each frame, the square wave (A) shifts to the right, and a new, independent random luminance pattern (B) is chosen, so that the modulated epochs from C move to the right (D) over time.
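The construction just described can be sketched in Python (NumPy) as follows. The function name and the particular values for the number of samples, the square wave period, and the shift per frame are assumptions chosen for illustration; only the recipe itself (a 0-to-1 square wave that shifts rightward, multiplied by a fresh random pattern of -1 and +1 on every frame) comes from the description above.

```python
import numpy as np

def dynamic_stripes(n_x=64, n_t=16, period=16, step=2, seed=0):
    """Return a space-time array (n_t rows, n_x columns) with values in {-1, 0, 1}.

    Hypothetical helper. 0 is mean gray, -1 is black, +1 is white.
    """
    rng = np.random.default_rng(seed)
    x = np.arange(n_x)
    frames = []
    for t in range(n_t):
        # Square wave (trace A): 1 for the first half of each period, else 0,
        # displaced to the right by `step` samples on each frame.
        envelope = (((x - step * t) % period) < period // 2).astype(int)
        # New, independent random dot pattern (trace B) on every frame.
        noise = rng.choice([-1, 1], size=n_x)
        frames.append(envelope * noise)  # modulated pattern (traces C, D)
    return np.array(frames)

stimulus = dynamic_stripes()
```

Because the dots are redrawn on every frame, no individual dot moves. The only thing that shifts to the right is the region in which dots are visible, which you can confirm by displaying `np.abs(stimulus)` as an image.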

Figure 3.2 E shows a space-time diagram of the stimulus, where it can be seen that the random dot patches shift to the right over time.

Question 7: Why would an oriented space-time filter, like that shown in Figure 3.2 F, fail to detect the motion in the stimulus in E, but succeed at detecting motion in G (a classical first-order stimulus, where a grating drifts to the right)?

Question 8: What simple mathematical operation could be performed on the stimulus in E (or equivalently, the traces in C and D) to transform the stimulus into a form that would restore the ability of the filter in F to respond in a direction selective manner?

Finally, we discuss why these higher-order stimuli are also called non-Fourier motion stimuli. When first-order moving stimuli are represented in the frequency domain, it turns out that their power lies in certain quadrants. Restricting ourselves to 2D space-time plots for simplicity, Figure 3.3 A shows a sine grating that is moving leftward over time (left column) and the power spectrum of that stimulus (right column). Note, in taking the Fourier transform of the stimulus, we are transforming our axes from space and time to spatial frequency (SF) and temporal frequency (TF). The power spectra must be symmetrical about the origin, and zero frequency is in the middle of the power spectra plots. The diagram at the upper right indicates that any power in the upper right and lower left quadrants is associated with motion to the right, and any power in the remaining quadrants is associated with motion to the left.

In the first example (panel A), a grating that drifts to the left has all of its power concentrated at two points in opposite quadrants in the frequency domain (two white dots). This corresponds to the fact that a sinewave transforms to a pair of delta-functions in the frequency domain, and the orientation of that sinewave in 2D determines the orientation of the pair of delta-functions in the frequency domain. Thus, rightward motion (panel B) is associated with power in the other two quadrants. For slower motion (panel C), the slope of the line through the points in the frequency domain is less, and the points fall lower on the TF axis. Likewise, for lower SF (panel D) the points are closer to the origin in the frequency domain.
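To check the two-point claim numerically, here is a small Python (NumPy) sketch that builds a drifting grating as a space-time array and lists the frequencies at which its power is concentrated. The function name and parameter values are assumptions made for illustration. Note also that which pair of quadrants goes with which direction depends on the sign convention of the Fourier transform and on how the axes are drawn, so the signs printed here follow NumPy's convention and need not match the layout of Figure 3.3.

```python
import numpy as np

def grating_peaks(kx=4, kt=2, n=64, direction=+1):
    """Return sorted (TF, SF) pairs, in cycles per window, holding the power.

    Hypothetical helper. direction=+1 moves the grating toward increasing x
    as time increases; direction=-1 moves it the other way.
    """
    x = np.arange(n)[None, :]   # space along columns
    t = np.arange(n)[:, None]   # time along rows
    stimulus = np.cos(2 * np.pi * (kx * x - direction * kt * t) / n)
    power = np.abs(np.fft.fft2(stimulus)) ** 2
    freqs = np.fft.fftfreq(n, d=1.0 / n)  # signed integer frequencies
    rows, cols = np.nonzero(power > 0.5 * power.max())
    return sorted((int(freqs[r]), int(freqs[c])) for r, c in zip(rows, cols))

print(grating_peaks(direction=+1))
print(grating_peaks(direction=-1))
```

Each call reports exactly two points that are mirror images of each other through the origin, and reversing the direction of drift moves the pair into the other two quadrants. Changing `kt` while holding `kx` fixed changes the slope of the line through the two points, as in panel C.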

Generalizing to any image (not just sinusoidal gratings), the power of an image that is moving at a fixed velocity falls along a line through the origin in the frequency domain, where the slope of that line is proportional to the velocity. Thus, Fourier motion stimuli are ones for which the direction of motion can be determined easily by examining how the power is concentrated in the frequency domain.

Tilted Gabor filters also have their power concentrated in regions of the frequency domain that correspond to particular directions of motion. See Figures 11 and 12 in Adelson and Bergen, 1985 for examples of how Gabor-like motion filters look in the frequency domain.

Finally, if we look at the power spectrum of the higher-order motion stimulus (panel E) that we had examined above, it is clear that the power is spread throughout the frequency domain, being in all four quadrants, and that this makes it difficult to determine the direction of motion from classical motion energy models.

This is why these types of stimuli are referred to as non-Fourier motion.
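The contrast between the two kinds of stimuli can also be explored numerically. The Python (NumPy) sketch below measures what fraction of the off-axis power falls in one pair of opposite quadrants, for a drifting grating and for a contrast-modulated noise pattern like the dynamic stripes above. All function names and parameter values are assumptions made for illustration, and the quadrant signs follow NumPy's Fourier convention rather than the layout of Figure 3.3.

```python
import numpy as np

def opposite_sign_fraction(stimulus):
    """Fraction of off-axis power in quadrants where TF and SF differ in sign."""
    power = np.abs(np.fft.fft2(stimulus)) ** 2
    tf = np.fft.fftfreq(stimulus.shape[0])[:, None]  # time along rows
    sf = np.fft.fftfreq(stimulus.shape[1])[None, :]  # space along columns
    sign = np.sign(tf * sf)                          # 0 on the two axes
    return power[sign < 0].sum() / power[sign != 0].sum()

def drifting_grating(n=64, kx=4, kt=2):
    """First-order stimulus: a sine grating drifting toward increasing x."""
    x = np.arange(n)[None, :]
    t = np.arange(n)[:, None]
    return np.cos(2 * np.pi * (kx * x - kt * t) / n)

def contrast_modulated_noise(n=64, period=16, step=1, seed=0):
    """Higher-order stimulus: a rightward-shifting 0/1 square wave times fresh noise."""
    rng = np.random.default_rng(seed)
    x = np.arange(n)[None, :]
    t = np.arange(n)[:, None]
    envelope = ((x - step * t) % period) < period // 2
    return envelope * rng.choice([-1, 1], size=(n, n))

print(opposite_sign_fraction(drifting_grating()))
print(opposite_sign_fraction(contrast_modulated_noise()))
```

For the grating, essentially all of the power sits in one pair of quadrants. For the modulated noise, the power is split roughly evenly between the two pairs even though the envelope moves consistently in one direction, so a measure of this kind carries little information about its direction.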

Here are some other examples of non-Fourier motion stimuli:



Question 9: Putting together some of the terms from above, do you think there is any relationship between Braddick's distinction between short-range motion vs. higher interpretive processes and the perception of first-order (Fourier) vs. higher-order (non-Fourier) motion? Keep in mind your answer to Question 8.

Question 10: When thinking about the reverse-phi demonstrations above, are there both first-order (Fourier) and higher-order (non-Fourier) cues? If so, describe the higher-order motion cues for some of the reverse-phi demonstrations above. Which cues dominate perception? In some of the demo stimuli, does this depend on whether you fixate away from, or allow your eyes to track closely, particular features in the stimulus?