A Maximum Expected Gain Model of Movement under Risk
L. T. Maloney, J. Trommershäuser, M. S. Landy, Psychology and Neural Science, New York University
VSS 2003
Sarasota, FL
We present a novel model for the planning of movements in environments where there are explicit gains and losses associated with the outcomes of actions. In our model, subjects choose movement trajectories to maximize expected gain. The model takes into account the consequences of accidental deviations that carry the movement into “dangerous regions.” We report the results of an experimental test of the model.
ILLUSTRATION
The subject sees a target on a computer screen that consists of overlapping reward and penalty regions. On each trial, she must rapidly touch the screen or incur a large ‘timeout’ penalty.

[Figure: example stimulus, 18 mm across. Reward region: 100 points; penalty region: 0, -100, or -500 points.]
MEGaMove* MODEL

Assumptions:
1. The outcome of movement planning is a visual-motor strategy, S.
2. When a strategy is executed, the result is a particular movement trajectory, τ.
3. A visual-motor strategy imposes a probability density, p(τ|S), on the space of possible movement trajectories.
4. The movement space contains possibly overlapping regions, R_i, i = 1, …, n.
5. If a trajectory τ passes through region R_i, a gain G_i is earned (gains may be negative).
6. The subject chooses the strategy, S, that maximizes overall expected gain:

   Γ(S) = Σ_{i=1}^{n} G_i P_i(S) + G_TO P_TO(S) + B(S)

where P_i(S) = ∫_{R_i} p(τ|S) dτ is the probability that the trajectory resulting from strategy S passes through region R_i, G_TO P_TO(S) is the ‘timeout’ penalty for failing to complete a movement within a specified time limit, and B(S) represents the biomechanical costs associated with strategy S.

Example (simulation): R = 9 mm; penalty: -500; subject’s variability: σ = 4.83 mm.

[Figure: different outcomes of the (simulated) experiment; points per trial and gain [$] vs. trial number.]
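The decision rule in assumption 6 can be made concrete with a short numerical sketch (not the authors’ code). It identifies a strategy S with its mean movement endpoint (the aim point), assumes isotropic Gaussian endpoint variability with σ = 4.83 mm and circular regions of radius R = 9 mm as in the example above, and treats the timeout term G_TO P_TO(S) and the biomechanical cost B(S) as constant across aim points, so that only Σ G_i P_i(S) varies; each P_i(S) is estimated by Monte Carlo integration.

```python
import numpy as np

def expected_gain(aim, centers, gains, R=9.0, sigma=4.83, n=100_000, seed=0):
    """Monte Carlo estimate of sum_i G_i P_i(S) for the strategy that
    aims at `aim`: endpoints scatter around the aim point with isotropic
    Gaussian noise (sd `sigma`, in mm); a touch inside region i earns
    G_i, and overlapping regions pay both gains."""
    # Fixed seed: common random numbers across aim points.
    rng = np.random.default_rng(seed)
    end_points = aim + rng.normal(0.0, sigma, size=(n, 2))
    total = np.zeros(n)
    for center, gain in zip(centers, gains):
        inside = np.linalg.norm(end_points - center, axis=1) <= R
        total += gain * inside
    return total.mean()

# Hypothetical configuration: reward circle (+100) at the origin,
# penalty circle (-500) displaced by one radius to the left.
centers = [np.array([0.0, 0.0]), np.array([-9.0, 0.0])]
gains = [100.0, -500.0]

# Grid search over horizontal aim points: the maximum-expected-gain
# strategy shifts the mean endpoint away from the penalty region.
xs = np.linspace(-5.0, 15.0, 81)
best = max(xs, key=lambda x: expected_gain(np.array([x, 0.0]), centers, gains))
print(f"optimal mean endpoint: x = {best:.2f} mm")
```

With these numbers the optimum lands a few millimeters to the right of the reward center, trading a slightly lower chance of hitting the reward circle against a much lower chance of hitting the penalty circle.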
TEST OF MODEL

• 6 stimulus configurations (varied within block)
• 3 penalty conditions: 0, -100, -500 (varied between blocks)
• Time limit: 700 ms
• Response after time limit: -700 points
• 100 trials, bonus: 25¢ per 1000 points
• One practice session: 300 trials, slowly decreasing time limit

Eighteen conditions (six spatial arrangements × three penalty values), 40 repetitions per condition. Movement endpoints (x, y) were recorded: 720 = 40 × 6 × 3 data points per subject.

[Figure: the six stimulus configurations (separations R, 1.5R, 2R).]
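For a rough sense of the payoff scale (a sketch with hypothetical settings, not the experimental code), a session can be simulated by drawing 100 endpoints at a fixed aim point and converting the accumulated points into the 25¢-per-1000-points bonus; different random seeds produce the different simulated outcomes illustrated above.

```python
import numpy as np

def simulate_session(aim, centers, gains, R=9.0, sigma=4.83,
                     trials=100, seed=1):
    """Points earned on each of `trials` simulated rapid touches
    aimed at `aim`, under the same hit rule as above."""
    rng = np.random.default_rng(seed)
    end_points = aim + rng.normal(0.0, sigma, size=(trials, 2))
    points = np.zeros(trials)
    for center, gain in zip(centers, gains):
        points += gain * (np.linalg.norm(end_points - center, axis=1) <= R)
    return points

points = simulate_session(np.array([7.0, 0.0]),   # hypothetical aim point
                          [np.array([0.0, 0.0]), np.array([-9.0, 0.0])],
                          [100.0, -500.0])
print(f"{points.sum():.0f} points -> bonus ${points.sum() / 1000 * 0.25:.2f}")
```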
Results:

[Figure: mean movement endpoints per condition.]

[Figure: trial-by-trial variation of movement endpoints; subjects are not learning from trial-by-trial error.]
PREDICTED MEAN END POINT

[Figure: predicted vs. observed mean movement endpoints for the penalty = 0, penalty = -100, and penalty = -500 conditions. Axes x, y: mean movement endpoint [mm], plotted relative to the center of the green reward region (data corrected for constant pointing bias).]
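Reusing `expected_gain`, `centers`, and `xs` from the model sketch above, the qualitative prediction behind these panels can be read off directly: as the penalty grows from 0 to -500, the optimal mean endpoint shifts farther from the penalty region. (Again a sketch; the poster’s predictions use each subject’s measured variability and the actual six spatial arrangements.)

```python
# Predicted optimal mean endpoint as a function of penalty value,
# for the hypothetical two-circle configuration used above.
for penalty in (0.0, -100.0, -500.0):
    best = max(xs, key=lambda x: expected_gain(np.array([x, 0.0]),
                                               centers, [100.0, penalty]))
    print(f"penalty {penalty:6.0f}: predicted mean endpoint x = {best:.2f} mm")
```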
CONCLUSIONS
Subjects shifted their mean movement endpoints in response to changes
in penalties and in the location of the penalty region. Overall, subjects
acted so as to maximize expected gain in a variety of stimulus
configurations, in good agreement with the predictions of the model.
We conclude that movement planning takes extrinsic costs and the
subject’s own motor uncertainty into account.
Trommershäuser, Maloney & Landy (2003), Spatial Vision, 16, 255-275.
Trommershäuser, Maloney & Landy (2003), JOSA A, in press.
* MEGaMove: Maximum Expected Gain Model of Movement planning
Supported by NIH EY08266 and HFSP RG0109/1999-B; J.T. funded by the DFG (Emmy-Noether Programm).