Activity-based Modelling:
An overview
(and some things we have been doing to advance state-of-the-art)
E. Zwerts
(With the cooperation of E. Moons and D. Janssens)
Transportation Research Institute
Data Analysis and Modelling Group,
Faculty of Applied Economic Sciences,
Limburgs Universitair Centrum, Diepenbeek, Belgium,
E-mail: enid.zwerts@luc.ac.be
Limburgs Universitair Centrum, Universitaire Campus, gebouw D, 3590 Diepenbeek, Belgium
Outline
• Why transportation modelling?
• Which kinds of transportation modelling?
• Why activity-based transportation modelling?
• Which activity-based transportation model?
• Model Selection: Albatross
– What is Albatross?
– What we have been doing, and are still going to do, with respect to Albatross
• Introduction of an alternative modelling approach based on sequential dependencies in data (short version)
Why transportation modelling?
• The transportation problem is multi-dimensional:
– Traffic jams
– CO2 emissions
– Impact on the economy
– Traffic accidents with a significant number of casualties in Belgium
•  The need for transportation infrastructure is high, due to:
– Globalization
– Urbanization
– Governments cannot afford transportation constraints to have a negative impact on future competitiveness, foreign investments, …
• However, changing the existing infrastructure:
– Is expensive and has significant long-term effects
– Offers no guarantee of success
– Is not trivial (existing spatial zones, restrictions from local and federal regulations, legislation, etc.)
Why transportation modelling?
• Therefore transportation models are often used. They can:
– Support management decision making
– Make predictions in uncertain circumstances:
  • Changing infrastructure, environment
  • Changing behaviour of people
  • Changing socio-demographic circumstances
  • ...
• The aim of these models is to portray reality as accurately as possible
• They are frequently used in different countries
Transportation modelling: Trip-based Approach
• Trips are modelled as independent and isolated, with no connections between the different trips
• No time component
• No direction
• No sequential information
[Figure, panel "Reality": At home → Work (7.30h, PT) → Play Squash (12h, by foot) → Work (12.50h, by foot) → At home (16.40h, PT) → Family visit (19h, car) → At home (22h, car)]
[Figure, panel "Trip-based model": At home / Work (PT, 2×); Work / Squash (by foot, 2×); At home / Family visit (car, 2×)]
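The information loss can be made concrete with a minimal Python sketch. It encodes the day from the "Reality" panel as an ordered list of trips (the order of the stops is read off the figure and is an assumption) and collapses it the way the "Trip-based model" panel does: location pair, mode and a count. The names `Trip` and `trip_based_view` are illustrative only.

```python
from collections import Counter
from typing import NamedTuple

class Trip(NamedTuple):
    origin: str
    destination: str
    depart: str   # clock time as shown on the slide
    mode: str

# One day in chronological order (ordering assumed from the figure).
day = [
    Trip("Home", "Work", "07:30", "PT"),
    Trip("Work", "Squash", "12:00", "Foot"),
    Trip("Squash", "Work", "12:50", "Foot"),
    Trip("Work", "Home", "16:40", "PT"),
    Trip("Home", "Family visit", "19:00", "Car"),
    Trip("Family visit", "Home", "22:00", "Car"),
]

def trip_based_view(trips):
    """Drop time, direction and order; keep only location pair, mode and count."""
    counts = Counter()
    for t in trips:
        pair = tuple(sorted((t.origin, t.destination)))  # direction is discarded
        counts[(pair, t.mode)] += 1
    return counts

view = trip_based_view(day)
for (pair, mode), n in sorted(view.items()):
    print(f"{pair[0]} / {pair[1]}: {mode}, {n}x")
```

Six ordered, timed trips shrink to three unordered entries, which is exactly what the bullets above describe: the departure times, the direction and the sequence are no longer recoverable.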
Transportation modelling: Tour-based Approach
• Trips that start and end at home, or at the same work location, are modelled independently
• Direction + (spatial) limitations
• No temporal dimension
• Independent tours; the model is not capable of integrating them
• Uses nested logit techniques
[Figure, panel "Reality": At home → Work (7.30h, PT) → Play Squash (12h, by foot) → Work (12.50h, by foot) → At home (16.40h, PT) → Family visit (19h, car) → At home (22h, car)]
[Figure, panel "Tour-based model": At home → Work → At home (PT, PT); Work → Play Squash → Work (by foot, by foot); At home → Family visit → At home (car, car)]
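As a small illustration of what "tour" means here, the Python sketch below cuts one day's ordered list of visited locations into closed tours around an anchor location. The stop sequence is read off the slide's figure (an assumption), and `split_tours` is a hypothetical helper, not part of any of the models discussed.

```python
def split_tours(stops, anchor):
    """Return every closed tour anchor -> ... -> anchor found in an ordered stop list."""
    tours, current = [], None
    for stop in stops:
        if stop == anchor:
            # Close the running tour if it visited at least one other place.
            if current is not None and len(current) > 1:
                tours.append(current + [anchor])
            current = [anchor]
        elif current is not None:
            current.append(stop)
    return tours

stops = ["Home", "Work", "Squash", "Work", "Home", "Family visit", "Home"]

# Home-based tours of the day.
home_tours = split_tours(stops, "Home")

# Work-based sub-tours found inside each home-based tour.
work_subtours = [t for tour in home_tours for t in split_tours(tour[1:-1], "Work")]

print(home_tours)
print(work_subtours)
```

Each resulting tour keeps its direction (outbound and return leg), but nothing links one tour to the next, which is the limitation noted in the bullets.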
Transport modelling: An activity-based approach
• Travel demand is derived from the activities that individuals need/wish
to perform
• Sequences or patterns of behaviour, not individual trips, are the unit of analysis
• Household and other social structures influence travel and activity
behaviour
• Spatial, temporal, transportation and interpersonal interdependencies
constrain activity/travel behaviour
• Activity-based approaches reflect the scheduling of activities in time and space.
→ Activity-based approaches aim at predicting which activities are conducted where, when, for how long, with whom, the transport mode involved and ideally also the implied route decisions.
Which Activity-based transportation model?
• Utility maximizing models
• Sequential models (computational process models)

Model                      Activity classes  Destination  Timing    Duration  With whom  Mode
ALBATROSS                  p(9)              p(23)        minutes   p         p(3)       p(5)
AMOS                       g(4)              n            minutes   minutes   n          n
DAILY ACTIVITY SCHEDULE    p(4)              n            sequence  n         n          p(3)
GISICAS                    g                 p(4)         sequence  g         n          n
PETRA                      p(3)              p(9)         n         n         n          p(6)
SCHEDULER                  p                 p            minutes   g         n          n
STARCHILD                  g(6)              p(110)       sequence  n         n          p
NO NAME (Wen & Koppelman)  g(4)              n            sequence  n         n          p(1)

p = predicted by the model; n = not treated in model; g = assumed given in model
ALBATROSS
• Albatross: A Learning-Based Transportation Oriented Simulation System
= activity-based model of activity-travel behavior, derived from theories of choice heuristics
• Developed in the Netherlands (Arentze & Timmermans, 2000)
• The model predicts which activities are conducted when, where, for how long, with whom, and with which transport mode
• A decision tree is proposed as the formalism to model the heuristic choices
This is a crucial component of the model: the better the learning algorithm, the better the prediction.
Constraints that have been taken into
account in Albatross
• Situational constraints: can’t be in two places at the same
time
• Institutional constraints: such as opening hours
• Household constraints: such as bringing children to school
• Spatial constraints: e.g. particular activities cannot be
performed at particular locations
• Time constraints: activities require some minimum duration
• Spatial-temporal constraints: an individual cannot be at a particular location at the right time to conduct a particular activity
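To show how such constraints prune the set of choice options, here is a minimal Python sketch covering three of the types listed above: institutional (opening hours), time (minimum duration) and situational (no overlap with episodes already scheduled). All names and numbers (`Option`, `OPENING_HOURS`, `MIN_DURATION`, the example schedule) are invented for illustration and are not taken from Albatross.

```python
from dataclasses import dataclass

@dataclass
class Option:
    activity: str
    location: str
    start: int      # minutes after midnight
    duration: int   # minutes

# Hypothetical illustration data.
OPENING_HOURS = {"shop": (9 * 60, 18 * 60), "sports hall": (8 * 60, 22 * 60)}
MIN_DURATION = {"grocery": 10, "squash": 30}

def feasible(opt, busy):
    """Check one candidate against institutional, time and situational constraints."""
    open_from, open_until = OPENING_HOURS[opt.location]
    end = opt.start + opt.duration
    if opt.start < open_from or end > open_until:    # institutional constraint
        return False
    if opt.duration < MIN_DURATION[opt.activity]:    # time constraint
        return False
    # Situational constraint: no overlap with an episode already in the schedule.
    return all(end <= b_start or opt.start >= b_end for b_start, b_end in busy)

busy = [(8 * 60, 12 * 60), (13 * 60, 17 * 60)]       # two fixed work episodes
options = [
    Option("squash", "sports hall", 12 * 60, 45),    # fits in the lunch break
    Option("grocery", "shop", 17 * 60 + 30, 45),     # shop closes too early
    Option("grocery", "shop", 12 * 60 + 15, 5),      # shorter than the minimum
    Option("squash", "sports hall", 16 * 60, 60),    # overlaps with work
]
allowed = [o for o in options if feasible(o, busy)]
print(allowed)
```

Only the lunch-break squash game survives; the other three candidates each violate one constraint type.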
Modelling Choice behavior
• Earlier models relied on utility maximization
• Albatross assumes that choice behavior is based on rules that are formed and continuously adapted through learning while the individual is interacting with the environment (reinforcement learning) or communicating with others (social learning).
• Rules are currently derived from decision trees
• Other rule-based learning algorithms can also be used
The scheduling model
Aim: Determine the schedule (= agenda) of activity-travel behaviour
Components:
1. a model of the sequential decision making process (a-priori defined)
2. models to compute dynamic constraints on choice options (a-priori defined)
3. a set of decision trees representing choice behavior of individuals related to each step in the process model (derived from observed choice behavior)
Skeleton: the fixed and given part of the schedule
Flexible activities: optional activities added to the skeleton
The sequential decision process (process model)
[Figure: flowchart of the process model; each oval represents a decision tree (DT)]
Example
The inference system in
Albatross
• For each decision, the model evaluates
dynamic constraints
• The implementation of situational, household
and temporal constraints is straightforward
• We will look at space–time constraints and
choice heuristics determining location choices
Albatross derives its decision trees (DT) with a CHAID-based learning algorithm and uses a probabilistic assignment rule. The probability of selecting the q-th response for each new case assigned to the k-th node is:
$$p_{kq} = \frac{f_{kq}}{N_k}$$
where $f_{kq}$ is the number of training cases of category q at leaf node k and $N_k$ is the total number of training cases at that node.
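The assignment rule can be sketched in a few lines of Python. The leaf contents below are hypothetical, and `leaf_probabilities` and `assign` are illustrative names, not Albatross functions; the point is only that a response is drawn in proportion to its frequency at the leaf rather than always taking the majority class.

```python
import random
from collections import Counter

def leaf_probabilities(leaf_cases):
    """p_kq = f_kq / N_k for every response category q seen at leaf k."""
    counts = Counter(leaf_cases)       # f_kq
    n_k = sum(counts.values())         # N_k
    return {q: f / n_k for q, f in counts.items()}

def assign(leaf_cases, rng):
    """Draw one response for a new case that falls into this leaf."""
    probs = leaf_probabilities(leaf_cases)
    responses = list(probs)
    weights = [probs[q] for q in responses]
    return rng.choices(responses, weights=weights, k=1)[0]

# Hypothetical training cases that ended up in one leaf of a mode-choice tree.
leaf_k = ["car"] * 6 + ["bike"] * 3 + ["PT"] * 1

probs = leaf_probabilities(leaf_k)
rng = random.Random(0)                 # fixed seed so the run is repeatable
draws = Counter(assign(leaf_k, rng) for _ in range(1000))
print(probs)
print(draws)
```

A deterministic rule would predict "car" for every case in this leaf; the probabilistic rule keeps the minority responses in the simulated population in roughly their observed shares.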
Testing the model
Results of inducing decision
trees
Branch of time-of-day tree
Performance of Albatross
• The eventual goodness-of-fit of the model can be
assessed only by a comparison at the level of complete
activity patterns
• The eventual output of Albatross is a set of OD trip matrices
• Conclusions so far:
– Using decision trees for choice heuristics yields a considerable, but varying, improvement over a null model
– A sample size of 2000 household-days suffices to develop a stable model
– Transferability of the model to a context other than the one in which it was developed remains to be studied
Advance the state of the art
Some things we have been doing in our research group with respect to Albatross:
• Two other rule-based techniques applied in the context of the
Albatross model:
– Integrate Decision tree techniques and feature selection: Identify
irrelevant attributes and build simple models
– Build advanced complex models by means of Bayesian networks
and try to improve accuracy
• Use (and adapt) Albatross towards the application area of Flanders
• Evaluate the performance of activity-based models versus trip-based
models
Application 1: Build simple models by
means of DT and feature selection
• General idea: Occam's razor: “Entities are not to be multiplied beyond necessity”
→ A large set of attributes:
- is likely to contain correlated attributes
- gives larger trees, but not necessarily better ones!
→ Use feature selection techniques to identify irrelevant attributes that do not significantly improve accuracy and can thus be omitted from the final model
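A minimal Python sketch of this idea follows. It uses greedy forward selection and, as a stand-in for a decision tree, a majority-class lookup table on the selected attributes; both are assumptions made for brevity and not necessarily the technique used in the study. The toy data set (attributes `has_car`, `weekday`, `noise`) is invented, and for brevity the same rows serve as training and validation set.

```python
from collections import Counter, defaultdict
from itertools import product

def fit_table(rows, labels, attrs):
    """Majority-class lookup keyed on the chosen attributes (stand-in for a tree)."""
    groups = defaultdict(Counter)
    for row, y in zip(rows, labels):
        groups[tuple(row[a] for a in attrs)][y] += 1
    default = Counter(labels).most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in groups.items()}
    return lambda row: table.get(tuple(row[a] for a in attrs), default)

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def forward_select(train, y_train, valid, y_valid, attrs, min_gain=0.01):
    """Add attributes one at a time while validation accuracy gains at least min_gain."""
    chosen = []
    best = accuracy(fit_table(train, y_train, chosen), valid, y_valid)
    while True:
        scored = [(accuracy(fit_table(train, y_train, chosen + [a]), valid, y_valid), a)
                  for a in attrs if a not in chosen]
        if not scored:
            break
        acc, attr = max(scored)
        if acc - best < min_gain:      # attribute does not improve accuracy enough
            break
        chosen.append(attr)
        best = acc
    return chosen, best

# Invented toy data: the mode depends on car ownership only.
rows = [{"has_car": c, "weekday": w, "noise": n} for c, w, n in product([0, 1], repeat=3)]
labels = ["car" if r["has_car"] else "PT" for r in rows]

chosen, acc = forward_select(rows, labels, rows, labels, ["noise", "weekday", "has_car"])
print(chosen, acc)
```

On this toy data the procedure keeps a single attribute and discards the two irrelevant ones without losing accuracy, which is the behaviour the slide argues for.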
Application 1: Empirical results
• Build a DT for every decision facet in the Albatross
model
• Example: “location”-facet
[Figure: accuracy (y-axis, ticks 30 to 46) versus number of attributes (x-axis, 1 to 28) for the “location” facet]
Application 1: Empirical results

                 Full approach                FS approach
Decision tree    # attrs  # leaves  e         # attrs  # leaves  e
Mode for work    32       8         0.598     2        6         0.595
Selection        40       35        0.686     1        1         0.669
With-whom        39       72        0.499     4        51        0.467
Duration         41       148       0.431     4        38        0.368
Start time       63       121       0.408     8        110       0.382
Trip chain       53       8         0.802     10       13        0.811
Mode other       35       63        0.524     11       60        0.508
Location 1       28       30        0.540     6        15        0.513
Location 2       28       47        0.372     8        14        0.312
Application 1: Empirical results
Model performance at activity pattern level

Measure               Mean distance (Full approach)  Mean distance (FS approach)
SAM (activity type)   2.929                          2.862
SAM (with)            3.205                          3.112
SAM (location)        3.188                          3.034
SAM (mode)            4.706                          4.559
UDSAM                 16.957                         16.43
MDSAM                 8.558                          8.257

→ Conclusion: There is no evidence of substantial loss in predictive power when trimmed decision trees are used to predict activity-travel patterns.
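SAM-type measures compare an observed and a predicted activity pattern by the edit operations needed to turn one sequence into the other, so a smaller distance means a better fit. The Python sketch below is a plain one-dimensional edit distance with unit costs; the unit costs are an assumption, and the operation weights and the multi-dimensional variants (UDSAM, MDSAM) reported in the table are not reproduced here. The two example patterns are invented.

```python
def edit_distance(a, b, ins_del=1.0, sub=1.0):
    """Minimum total cost of insertions, deletions and substitutions turning a into b."""
    prev = [j * ins_del for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * ins_del]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + ins_del,                     # delete x
                curr[j - 1] + ins_del,                 # insert y
                prev[j - 1] + (0 if x == y else sub),  # match or substitute
            ))
        prev = curr
    return prev[-1]

# Invented example: observed versus predicted activity-type sequence for one day.
observed = ["Home", "Work", "Squash", "Work", "Home"]
predicted = ["Home", "Work", "Home", "Shop", "Home"]

print(edit_distance(observed, predicted))
```

Averaging such distances over all observed/predicted pairs gives a "mean distance" of the kind tabulated above, one per attribute of the pattern (activity type, with-whom, location, mode).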
Application 2: Build complex models by
means of BN and try to improve accuracy
• General idea: Modelling travel behaviour is non-trivial, as it is multidimensional and complex in nature. Hidden, unknown relationships might have an impact on the final outcome
• Need for a technique that is able to deal with this: Bayesian networks
– Able to capture (complex) relationships between variables
– Able to be learned from data
– Visualize interdependences between variables
– Prior and posterior probability distributions per variable
– Well suited to conducting what-if scenarios and sensitivity analysis
– White box
• Case study on mode choice facet
Steps to follow:
(1) Build the network (structural learning)
(2) Choose a target variable and prune the network
(3) Calculate probability distributions (parameter learning)
(4) Perform what-if scenarios by entering evidence in the network
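Step (4) can be illustrated with a tiny hand-built network in Python. The structure (car availability and distance both pointing to mode) and every probability below are invented for illustration; in the case study the structure and parameters are learned from data in steps (1) to (3). Inference is done by brute-force enumeration, which is only practical for a network this small.

```python
from itertools import product

# Hypothetical network: car -> mode <- dist. All numbers are invented.
P_CAR = {True: 0.7, False: 0.3}
P_DIST = {"short": 0.4, "long": 0.6}
MODES = ("car", "PT", "slow")
P_MODE = {  # P(mode | car, dist)
    (True, "short"): {"car": 0.5, "PT": 0.1, "slow": 0.4},
    (True, "long"): {"car": 0.8, "PT": 0.2, "slow": 0.0},
    (False, "short"): {"car": 0.0, "PT": 0.3, "slow": 0.7},
    (False, "long"): {"car": 0.1, "PT": 0.8, "slow": 0.1},
}

def joint(car, dist, mode):
    """Joint probability of one full assignment, following the network factorisation."""
    return P_CAR[car] * P_DIST[dist] * P_MODE[(car, dist)][mode]

def posterior(target, evidence):
    """P(target | evidence), summing the joint over all consistent assignments."""
    scores = {}
    for car, dist, mode in product(P_CAR, P_DIST, MODES):
        world = {"car": car, "dist": dist, "mode": mode}
        if all(world[k] == v for k, v in evidence.items()):
            scores[world[target]] = scores.get(world[target], 0.0) + joint(car, dist, mode)
    total = sum(scores.values())
    return {value: s / total for value, s in scores.items()}

prior_mode = posterior("mode", {})               # no evidence entered
no_car = posterior("mode", {"car": False})       # what if no car is available?
who_walks = posterior("car", {"mode": "slow"})   # reasoning from effect to cause
print(prior_mode, no_car, who_walks)
```

Entering the evidence "no car available" shifts the mode distribution towards public transport, and the same function also answers diagnostic questions, which is what makes the technique convenient for what-if scenarios.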
Example of pruning a network

Application 2: Empirical results
Application 2: empirical results
• Conclusions:
– Better predictions
→ Reason: unlike decision trees (CHAID), variables are selected simultaneously, with no hierarchy of importance among the selected variables
– The selected variables are roughly the same in both approaches (→ the difference in performance is due more to the different nature of the techniques than to additional insights)
– Much larger number of decision rules in Albatross compared with CHAID; however, performance is also OK on the test data (→ additional research on other datasets is warranted)
– Interpretation is an issue: BNs link several variables in sometimes complex direct and indirect ways.
Application 3: Activity-based versus
trip-based
• Use (and adapt) Albatross towards the application area
of Flanders
• Evaluate the performance of activity-based models
versus trip-based models
• Transportation models: trip based
• Mobility Plan Flanders (2003)
– Predict, in a static way, reliable results for distribution, substitution and route effects
– Cannot handle generative and temporal shifts
• Need for a more dynamic and more complete model
Activity based models
• Travel demand is derived from activities
• 24 hour schedule with activities
• Household interaction
• Time and space constraints
Trip based models
• Just consider one-way trips
• Only during peak hour
• Individual trips
• Calibration is needed to fit the data to the real situation
But ...
• Trip based models take the outcomes (traffic flows, passenger numbers, ...) as input in the calibration
• As expected, the outcomes are robust and fit the actual situation perfectly
• The influence of the calibration is much stronger
than the influence of the input data
• Aim: the application of an activity based model in
Flanders
• Albatross ► developed for the Dutch situation
• First stage: use of the Dutch decision tables
• Comparison of the results of the two model types
and their performance on the same input data
• Data
• Travel behaviour study: urban region of Leuven (2001) +
trip-based model
– Trip schedules (no information on in-home activities)
– Locations: zip code ≠ statistical sector
– Assumptions:
– Overestimation of car and bike availability per household
– Standard values for work time
– Transport mode: longest distance in the trip
– Facility data: not yet available
• Assumption: trip based models predict the actual situation almost perfectly
• ALBATROSS:
  • Mean length of the schedules is shorter than in the Dutch example (reason: conversion of a trip schedule to an activity schedule)
  • SAM values (parameters for goodness-of-fit) are very high ► predictions are not good
• OD – matrices: match reasonably well
• Activity type:
– Good predictions for work and for bring-and-get
– Grocery and non-grocery are a problem
• Length of tours
– Too many short tours (< 2 km) are predicted
• Transport mode
– Too many public transport users and car passengers
– Too few car drivers
• Predictions are not good: fortunately!
– Refinement of input data / facility data
– Adaptation of the decision tables to the Flemish situation
– Run the trip based model without traffic flows and passenger numbers for a truly fair comparison
– Run the model on other Flemish regions
Some words on what else we have
been doing…
• An alternative approach to modelling activity-travel decisions is also under development in our research group
– This model assumes that each diary consists of correlated successive activities.
  • For instance, during the morning: Sleep, Having breakfast, Transportation to work
• Markov chains are often used to model this type of dependence:
– First-order Markov chain: transition matrix with entries $P(X_t = j \mid X_{t-1} = i)$
– Second-order Markov chain: transition matrix with entries $P(X_t = j \mid X_{t-1} = i, X_{t-2} = h)$
– Etc.
Artificial Example
Diary 1: TcFFFFFFFFFFFFFFFFE
Diary 2: TcEEFREREERFTcFTcFFTcFETcF
Diary 3: RREFEFEETcTcR
Diary 4: EEFFTcFTcFRRTcTcRTcRR
Diary 5: FFTcFFRE
Diary 6: EETcFRRE
With Tc = Transportation with car as transport mode, F = visit Family, E = Eat, R = Read

      Tc    E     R     F
Tc    0.11  0.03  0.16  0.70
E     0.23  0.40  0.08  0.29
R     0.10  0.53  0.30  0.07
F     0.21  0.20  0.28  0.31

These probabilities can be computed by means of Markov chains
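As a sketch of how a first-order transition matrix can be estimated, the Python code below counts consecutive activity pairs in the six artificial diaries and normalises each row. The plain relative-frequency estimator is an assumption: the slide does not say how its table was obtained, so the numbers printed here are not claimed to reproduce it. The function names are illustrative.

```python
import re
from collections import Counter, defaultdict

DIARIES = [
    "TcFFFFFFFFFFFFFFFFE",
    "TcEEFREREERFTcFTcFFTcFETcF",
    "RREFEFEETcTcR",
    "EEFFTcFTcFRRTcTcRTcRR",
    "FFTcFFRE",
    "EETcFRRE",
]
STATES = ["Tc", "E", "R", "F"]

def tokenize(diary):
    """Split a diary string into activity codes; 'Tc' is a single two-letter code."""
    return re.findall(r"Tc|[ERF]", diary)

def first_order_matrix(diaries):
    """Relative frequency of each next activity given the current activity."""
    counts = defaultdict(Counter)
    for diary in diaries:
        seq = tokenize(diary)
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {s: {t: counts[s][t] / sum(counts[s].values()) for t in STATES}
            for s in STATES}

P = first_order_matrix(DIARIES)
for s in STATES:
    print(s, [round(P[s][t], 2) for t in STATES])
```

Each row of `P` is the estimated distribution of the next activity given the current one, so every row sums to 1.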
Example derived from data
• Simulation procedure: simulate $X_t$ as a function of the values taken by $X_{t-1}$ and $X_{t-2}$ → repetitive procedure
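The repetitive procedure can be sketched in Python as follows. The code estimates second-order transition counts from the six artificial diaries of the earlier example (used here only as stand-in data) and then repeatedly draws the next activity given the two previous ones. Stopping early when a pair of activities was never observed with a successor is a simplification chosen for this sketch; the function names are illustrative.

```python
import random
import re
from collections import Counter, defaultdict

DIARIES = [
    "TcFFFFFFFFFFFFFFFFE",
    "TcEEFREREERFTcFTcFFTcFETcF",
    "RREFEFEETcTcR",
    "EEFFTcFTcFRRTcTcRTcRR",
    "FFTcFFRE",
    "EETcFRRE",
]

def tokenize(diary):
    """Split a diary string into activity codes; 'Tc' is a single two-letter code."""
    return re.findall(r"Tc|[ERF]", diary)

def second_order_counts(diaries):
    """Count how often activity c follows the ordered pair (a, b)."""
    counts = defaultdict(Counter)
    for diary in diaries:
        seq = tokenize(diary)
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][c] += 1
    return counts

def simulate(counts, start, length, rng):
    """Draw X_t given (X_{t-2}, X_{t-1}) until the diary reaches the requested length."""
    seq = list(start)
    while len(seq) < length:
        options = counts.get((seq[-2], seq[-1]))
        if not options:            # pair never seen with a successor: stop early
            break
        states = list(options)
        weights = [options[s] for s in states]
        seq.append(rng.choices(states, weights=weights, k=1)[0])
    return seq

counts = second_order_counts(DIARIES)
diary = simulate(counts, ("E", "E"), 12, random.Random(1))
print("".join(diary))
```

Every simulated step only uses transitions that occur in the data, so each triple of consecutive activities in the output was observed at least once in the diaries.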
• Some results…
Let's recap…
• Why transportation modelling?
• Which kinds of transportation modelling?
• Why activity-based transportation modelling?
• Which activity-based transportation model?
• Model Selection: Albatross
– What is Albatross?
– What we have been doing, and are still going to do, with respect to Albatross
• Introduction of an alternative modelling approach based on sequential dependencies in data (short version)
Questions?