Rail Public Transportation is a Leader

Analytics and Big Data — Rail Public Transportation is a
Leader
Lyndon Henry
Railway Age Magazine
Urban Rail Today Consulting
Austin, Texas
INTRODUCTION
MAJOR APPLICATIONS OF ANALYTICS
Two concepts currently at the leading edge of today's
information technology (IT) revolution are Analytics and
Big Data. Analytics is high technology applied to data
processing, complex calculations, and automation; Big
Data is the current term referring to significantly large
volumes of data, on virtually every facet of human
activities and characteristics, that can be rapidly processed
via Analytics, yielding a broad spectrum of highly useful
results. Recent technological advances have sparked what
amounts to a "revolution" in the application of these
cognitive and informational tools.
Apparently without realizing it, the public
transportation industry has, for many decades, been at the
forefront in utilizing and implementing Analytics and Big
Data, from ridership forecasting to transit operations.
Rail transit systems have been especially involved with
these IT concepts, and tend to be especially amenable to
the advantages of Analytics and Big Data because they
are generally "closed" systems that involve sophisticated
processing of large volumes of data. In virtually any
American city, on any normal weekday, one is likely to
see the results of analytics literally in motion — the
operation of transit buses and trains that are essential to
maintaining the mobility of the metro area.
The more that public transportation professionals and
decisionmakers understand the role of Analytics and Big
Data in their industry, the more effectively
they will be able to utilize their promise. Furthermore, it is
useful for both the public and the industry to realize how
significantly public transportation has been a leading
pioneer in the rich and extensive historic development of
these tools, the roots of which in some cases extend back
to 19th century rail technology.
Some of the most salient applications of Big Data and
Analytics in today's urban rail transit are summarized in
the following sections. [1] These range from urban
planning activities with computerized processing of
massive amounts of demographic and geographic data, to
complex signaling and train dispatching or control
systems, to communications, train tracking, and passenger
information operations using increasingly common
modern technologies like GPS, Wi-Fi, and cellular phone
systems. The intensive use of Analytics is particularly
underscored by the incorporation of "automatic" and
"automated" in so many of the common technical features
of modern rail operations: terms such as Automatic
Block Signals, Automatic Train Control, Automated
Passenger Counting, etc.
Travel Demand Modeling
Surely one of the biggest deployments of Big Data
has been in planning new public transportation services
and infrastructure. At least since the 1950s, this relatively
gigantic undertaking has involved modeling (projecting)
future travel demand in various urban areas.
Not only has the public at large generally not
realized the magnitude of this task, but transportation
planners themselves seem to have been unaware of the
extent to which systems modeling — involving
projections ranging from travel demand to modal split to
ridership — has, for more than half a century, represented
one of the most widespread and intensive deployments of
Analytics and Big Data.
The modeling process typically involves splintering
up all the census tracts in a multicounty metro region into
segments (each one often called a travel (or transportation
or traffic) analysis zone, or TAZ), tallying the total
households in each TAZ, then assigning some
demographic characteristics (e.g., income level) to
B – Partnering for Success
proportional household categories based on available
data. Next the model uses sophisticated algorithms to
project future growth in population, economic activity,
and perhaps other critical elements. Then (typically using
estimates of factors such as travel time and cost) more
algorithms project an estimate of all trips (for an average
weekday) among all the households and centers of
employment (and other activity centers, such as
educational facilities, retail centers, etc.) among all these
TAZs. [2]
From this complex process enormous volumes of
trips (Big Data) are projected, which are then assigned
(via the model Analytics) into travel corridors.
Ultimately, final projections are produced for
transportation facilities such as new road routes,
additional freeway lanes, and major new transit lines
(such as light rail).
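The trip-projection step described above can be illustrated with a gravity model, one commonly used form of trip distribution. The following is only a minimal sketch under stated assumptions: the zone data, the exponential impedance function, and the `beta` parameter are hypothetical and are not drawn from any particular agency's model.

```python
import math

def gravity_trips(productions, attractions, travel_time, beta=0.1):
    """Distribute each zone's trip productions across destination zones.

    productions[i]    : trips produced by TAZ i (e.g., derived from households)
    attractions[j]    : relative attractiveness of TAZ j (e.g., jobs)
    travel_time[i][j] : travel time in minutes between TAZ i and TAZ j
    beta              : hypothetical impedance parameter (assumed, uncalibrated)
    """
    n = len(productions)
    trips = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Destinations that are attractive and close receive more weight.
        weights = [attractions[j] * math.exp(-beta * travel_time[i][j])
                   for j in range(n)]
        total = sum(weights)
        for j in range(n):
            trips[i][j] = productions[i] * weights[j] / total if total else 0.0
    return trips

# Three hypothetical zones.
productions = [1000, 400, 250]
attractions = [200, 900, 500]
travel_time = [[5, 20, 35],
               [20, 5, 15],
               [35, 15, 5]]
trip_table = gravity_trips(productions, attractions, travel_time)
```

Each row of `trip_table` sums to that zone's productions, so the table can be passed on to later steps such as mode split and corridor assignment.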
This highly complex procedure, involving massive
amounts of data and networks of intricately interrelated
algorithms, has been repeated in metro areas across the
country, for many decades, well before the terms Big
Data and Analytics were fashioned. And planning for rail
transit systems has certainly been at the leading edge,
driving this effort and refining the methodology from city
to city, year after year. Furthermore, the incorporation of
global positioning systems (GPS) and geographic
information systems (GIS), with associated Big Data
Analytics, has facilitated major advancements in essential
planning tasks such as pinpointing locations, determining
land areas accurately, and overlaying and correlating large
volumes of demographic data with geographic areas (e.g.,
census tracts, transit service areas, etc.).
systems. On the other hand, where trains may run at
higher speeds in exclusive alignments, automatic block
signaling (ABS) systems are common, with lines
segmented into fixed blocks governed by automatically
operated signals that detect train occupancy and use red,
green, or yellow signal lights on the wayside of the track
to inform train operators whether their train should
proceed, stop, slow down, etc.
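The fixed-block logic just described can be sketched in a few lines. This is a simplified three-aspect illustration, assuming one direction of travel and one signal at the entrance to each block; real ABS installations are implemented in track circuits and vital relay or processor logic, and the function name here is hypothetical.

```python
def signal_aspects(occupied):
    """Return the aspect shown by the signal guarding each block.

    occupied[k] is True when block k holds a train. Trains are assumed to
    travel toward increasing block numbers, and track beyond the last block
    is assumed clear (an assumption made only to keep the sketch short).
    """
    n = len(occupied)
    aspects = []
    for k in range(n):
        if occupied[k]:
            aspects.append("red")      # block ahead is occupied: stop
        elif k + 1 < n and occupied[k + 1]:
            aspects.append("yellow")   # next block is occupied: slow down
        else:
            aspects.append("green")    # at least two clear blocks: proceed
    return aspects

# A train sits in block 2 of a four-block line.
aspects = signal_aspects([False, False, True, False])
```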
An additional
improvement is the cab signaling system (CSS), whereby
the current track block condition is displayed in the
operator's cab (usually with wayside signals as a backup).
At the core of most of today's urban rail systems is a
central dispatching operation, often highly automated. A
centralized traffic control (CTC) system, involving the
rapid and deft processing of train and track data via
Analytics, not only controls how signals and switches are
set, but usually also monitors the location of all the
system's trains, their directions and speeds, whether
they're on schedule, etc. — typically displayed
diagrammatically across several computer screens.
Simple signaling and control has evolved over many
decades, and has in fact long incorporated features such as
automatic train stop (ATS) to stop a train that proceeds
past a red signal or perhaps commits other dangerous
violations. Another advance, in use for many decades on
numerous systems, is automatic train control (ATC) to
control train movement according to signal and speed
authorizations. This takes train protection a step further
by implementing some form of speed control, usually
with CSS, in response to external inputs.
On the very high end of train control, some systems
are installing or upgrading to communications-based train
control (CBTC), typically eliminating physical fixed-block segmentation of track in favor of moving "virtual"
blocks with variable-length spacing between trains.
Needless to say, computer-based CBTC is especially
heavy on Analytics.
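The variable-length spacing of a moving-block scheme can be illustrated with a basic braking-distance calculation. The sketch below is an assumption-laden simplification: it treats the leading train as stationary (a "brick wall" stop), and the braking rate, reaction time, and safety margin are hypothetical values, not figures from any CBTC product.

```python
def safe_separation_m(speed_mps, brake_rate_mps2=1.0, reaction_s=2.0,
                      margin_m=25.0):
    """Minimum gap a following train needs at a given speed.

    distance = reaction distance + braking distance + fixed safety margin
             = v * t_reaction + v^2 / (2 * a) + margin
    All default parameter values are hypothetical.
    """
    reaction_distance = speed_mps * reaction_s
    braking_distance = speed_mps ** 2 / (2 * brake_rate_mps2)
    return reaction_distance + braking_distance + margin_m

def movement_authority_ok(follower_pos_m, follower_speed_mps,
                          leader_rear_pos_m, **params):
    """True if the follower could stop short of the leader's rear end."""
    gap = leader_rear_pos_m - follower_pos_m
    return gap >= safe_separation_m(follower_speed_mps, **params)

# At 20 m/s (about 45 mph): 40 m reaction + 200 m braking + 25 m margin.
required_gap = safe_separation_m(20.0)
```

Because the required gap shrinks as speed falls, trains can close up in slow sections, which is the capacity advantage usually claimed for moving blocks over fixed blocks.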
Some systems (such as the Port Authority Transit
Corporation's Speedline from Philadelphia into South
Jersey and Bay Area Rapid Transit) have deployed
operational Analytics to a very high degree indeed, even
automatically running trains via automatic train operation,
or ATO. (Airlines have had air traffic control and
autopilot capability for decades, but, in the Analytics race,
rail technology beat them to it!) Currently, the newest
installations of ATO appear to rely on CBTC technology.
In addition to these evolutionary technological
developments, a 2008 federal law mandates for many
transit operations the installation of a further signaling
technology, positive train control (PTC). Basically
incorporating a form of ATC, together with GPS, PTC
involves the installation of special devices in thousands of
locomotives and railcars, the construction of an extensive
Train Signal and Control Systems
By far one of the most intricate and critical aspects of
rail transit operations, the signaling and control (e.g.,
dispatching) component merits prominent discussion
because it represents an ancestor, of sorts, of Analytics in
public transit — originating in the late 19th century! At
first, this was primarily a system (based on the electrical
and mechanical technology of the period) to keep trains
from crashing into each other. But in the modern era, it
has been upgraded into a high-tech, complex process for
tracking the location of trains, estimating travel times, and
providing other valuable information. Intricate electrical
circuitry and electronics (a quantum leap from 19th-century technology) form the basis of today's systems. [3]
[4]
Contemporary urban rail systems use a wide range of
signal types to manage train movements and ensure safe,
efficient operation. In very simple operations, train
signals may be integrated with ordinary street traffic light
new wireless communications network (some using
satellite-based links), and installation of many thousands
of wayside devices (typically, transponders)
interconnected with signals, switches and other railway
hardware. By law, PTC must be functional by the end of
2015. [5]
passengers as to when their trains (or buses) are due to
arrive at their stations or stops. Passenger information
display (PID) monitors or digital signs in stations or even
available apps on smartphones keep passengers updated
on imminent arrivals or departures.
Automated Fare Collection (AFC)
While passengers still drop coins in onboard
fareboxes on most urban bus systems, fare collection on
almost all rail systems today is largely automated.
Automated fare collection (AFC) typically uses ticket
vending machine (TVM) devices in stations that can
receive cash or process credit card swipes, thus also
instantly updating a central database, often with
voluminous amounts of really Big Data.
Passes and discounted multi-tickets are encouraged,
but the hot trend is toward smartcards that provide access
to all types of transit services across multiple operating
agencies and jurisdictions. When a passenger uses a credit
card, the transit agency can correlate passenger travel
with other data available from the credit card. Slick new
analytics give transit agencies details of how passengers
are using their systems, identify trends, and help improve
service.
Route Planning and Scheduling
With the advent and proliferation of computer
technology, and advances in analytical processing of
complex data, laborious transit scheduling tasks have
been made considerably faster, less arduous, and more
efficient through the application of Analytics to process
large, complex volumes of data. Today's powerful
software accomplishes the tasks of routing, developing
timetables, blocking these into bus and train schedules,
then performing runcutting and other essential component
tasks such as rostering.
As Christopher MacKechnie explains in an
informative online summary, [6]
Scheduling software allows a transit agency to
design bus routes, create bus stops, schedule bus routes,
combine individual bus trips into blocks, cut blocks into
pieces that individual drivers will operate, on a daily
basis assign individual drivers into runs, and provide
customer information about the network. The
automation allows for schedulers and transit planners to
quickly develop many different scheduling scenarios
rather than rely on just one, which has significantly
[increased] the operational efficiency of today's transit
systems.
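The step of combining individual trips into blocks can be sketched with a simple greedy rule. This is an illustrative assumption rather than the method used by any commercial scheduling package: each trip is appended to the first vehicle block that ends at the trip's origin with enough layover, and the `Trip` fields and layover value are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Trip:
    trip_id: str
    origin: str
    destination: str
    depart: int   # minutes after midnight
    arrive: int   # minutes after midnight

def build_blocks(trips, min_layover=5):
    """Greedily chain trips into vehicle blocks.

    A trip joins the first existing block whose last trip ends at this
    trip's origin at least `min_layover` minutes before departure;
    otherwise it starts a new block (i.e., needs another vehicle).
    """
    blocks = []
    for trip in sorted(trips, key=lambda t: t.depart):
        for block in blocks:
            last = block[-1]
            if (last.destination == trip.origin
                    and last.arrive + min_layover <= trip.depart):
                block.append(trip)
                break
        else:
            blocks.append([trip])
    return blocks

# Hypothetical timetable between terminals A and B.
timetable = [
    Trip("T1", "A", "B", 360, 390),
    Trip("T2", "B", "A", 400, 430),
    Trip("T3", "A", "B", 375, 405),
    Trip("T4", "A", "B", 440, 470),
]
blocks = build_blocks(timetable)
```

In this example two vehicles cover the four trips. Runcutting would then divide each block into pieces of work for individual operators.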
Automated Passenger Counting (APC)
While AFC can tell a transit operator how many
people are purchasing tickets or passes, and how much
fare revenue is being taken in, transit agencies still need
to count how many passengers are actually boarding each
bus or train. An automatic passenger counting (APC)
system not only can inform the agency as to how many
passengers are boarding or deboarding each vehicle, but
precisely where this happens — and they can relay this
information online, continuously, to a central database
(typically generating Big Data in the process). Especially
by applying sophisticated Analytics, the transit agency
can then use this data to provide better service and project
evolving ridership trends.
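As a rough illustration of how boarding and alighting counts become planning information, the sketch below accumulates a passenger load profile along one trip and finds the maximum load point. The stop names and counts are hypothetical, and the record layout is an assumption, not an APC vendor format.

```python
def load_profile(stop_counts):
    """Compute onboard load leaving each stop.

    stop_counts: list of (stop, boardings, alightings) in stop order.
    Returns a list of (stop, passengers on board after the stop).
    """
    load = 0
    profile = []
    for stop, boardings, alightings in stop_counts:
        load = load - alightings + boardings
        profile.append((stop, load))
    return profile

def max_load_point(profile):
    """Return the (stop, load) pair with the highest onboard load."""
    return max(profile, key=lambda item: item[1])

# Hypothetical counts for a single trip.
counts = [("A", 30, 0), ("B", 25, 5), ("C", 10, 20), ("D", 0, 40)]
profile = load_profile(counts)
peak = max_load_point(profile)
```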
Automatic Vehicle Location (AVL)
Some of the most useful and popular of today's
applications of Analytics in public transit are automatic
vehicle location (AVL) and associated passenger
information systems (such as NextBus). Using GPS-based
data plus Analytics to track both buses and trains, AVL
has become an extremely reliable system to inform central
dispatching personnel (or an automated control center) as
to where trains are and whether they're on schedule —
information that can then be communicated to passengers
in stations. (While the data on the location of a single
train at any point in time is relatively small, in the
aggregate, with multiple trains moving constantly, the
volume quickly leaps to the category of Big Data.)
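A minimal sketch of how a GPS fix might be turned into an arrival estimate follows. It assumes straight-line (great-circle) distance and a constant average speed, both of which are simplifications: a production system would measure distance along the track and account for dwell times. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def eta_minutes(vehicle_fix, stop, avg_speed_mps):
    """Estimated minutes until a vehicle at `vehicle_fix` reaches `stop`.

    vehicle_fix and stop are (latitude, longitude) pairs; avg_speed_mps is
    an assumed average speed over the remaining distance.
    """
    distance = haversine_m(vehicle_fix[0], vehicle_fix[1], stop[0], stop[1])
    return distance / avg_speed_mps / 60
```

For example, `eta_minutes((40.7608, -111.8910), (40.7700, -111.8910), 10.0)` estimates the time to a stop about one kilometer due north at an assumed 10 m/s.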
Many transit agencies use AVL integrated with a
passenger-oriented information system (NextBus, the
brand name of the most widely deployed system, has
virtually become a generic term for this) to clue waiting
EXAMPLES — SELECTED SYSTEMS
To illustrate the diverse role of Analytics and Big
Data in a broad variety of tasks and scenarios, it's helpful
to summarize a selective sampling of useful and essential
applications involving Analytics in several U.S. rail
transit systems of various sizes.
can be integrated with data incoming from station fare
gates, thus enabling the team to also monitor the flow of
passengers through the system, using identification codes
from passenger tickets, cards, passes, etc. Altogether,
Analytics facilitates more accurate projections of service
needs in terms of schedules, train consists, and similar
essential features.
Thus, according to Roy Henrichs, reliability
engineering team manager, this application of Analytics
enables BART to address three critical needs passengers
face: "You arrive at the station, and the first question you
ask is, 'Where's my train?' Then you ask, 'Where's my
seat?' And finally, 'Will I be on time?'" Using operational
Analytics, BART has been able to resolve these issues
positively for passengers, thus ensuring high user
satisfaction.
Bay Area Rapid Transit (BART)
The Bay Area Rapid Transit (BART) system
provides a highly automated, relatively high-speed, urban-suburban rail rapid transit (RRT) system for the San
Francisco Bay Area, serving the counties of San
Francisco, Contra Costa, San Mateo, and Alameda. The
system consists of 105 miles of double-track RRT service
with 43 passenger stations; it includes a 4-mile-long
underwater tube connecting San Francisco with Oakland.
Average weekday ridership totals about 370,000.
From its Lake Merritt Operations Control Center,
BART maintains supervision over all phases of its
system, including train operations, passenger services,
power delivery, and wayside facilities. And Analytics,
especially with the processing and interpretation of Big
Data, is a key element within all of these functions. [7]
The critical role of some key aspects of BART's
operational analytics — aimed at ensuring schedule
reliability — is the focus of a 2012 article by Beth
Schultz, titled "Operational Analytics Keeps Bay Area
Trains on Track" and posted on the All Analytics website.
[8] On-time performance is cited as "the most important
issue" for BART's passengers, and it certainly is for the
system's management. Thus, "some rather sophisticated
operational analytics" are essential to enable the agency to
know "if its trains are running on time and patrons
arriving at their destinations as expected...."
Implementing this, a variety of operational analytics
performed by BART's reliability engineering team
includes delay analysis, passenger flow modeling (PFM),
and system performance analysis, as well as various other
types of modeling used for forecasting and similar tasks.
Notably, "the data required for the analytics is complex
and voluminous" — definitely Big Data. Using
infrastructure based on an IBM Maximo asset management
system and an Oracle database/Linux platform, the BART
team's operational analytics implements algorithmic code
developed by Analytics software vendor SAS (ported
from an original mainframe-based environment).
Critical to the top priority of ensuring on-time service
is the PFM application, which deploys time-series
analysis integrated with econometric data to render
ridership forecasting models with the objective of
optimizing train schedules to ensure high customer
satisfaction while constraining service operating costs.
The goal is to avoid running trains that are either under- or overloaded; PFM, in addition to other functions,
"captures or estimates train loadings for use in generating
the train schedules."
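The general idea of forecasting train loadings and sizing trains to match can be sketched as follows. This is not BART's PFM or its SAS code: simple exponential smoothing stands in for the time-series and econometric models described above, and the smoothing factor, seats per car, and consist limit are all hypothetical.

```python
import math

def exp_smooth_forecast(history, alpha=0.5):
    """One-step-ahead forecast by simple exponential smoothing.

    history: past observed loads for the same scheduled train, oldest first.
    alpha  : hypothetical smoothing factor between 0 and 1.
    """
    level = history[0]
    for observed in history[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level

def cars_needed(forecast_load, seats_per_car=60, max_cars=10):
    """Smallest consist that seats the forecast load, within fleet limits."""
    return min(max_cars, max(1, math.ceil(forecast_load / seats_per_car)))

# Hypothetical peak loads observed on three previous weekdays.
forecast = exp_smooth_forecast([400, 420, 440])
consist = cars_needed(forecast)
```

Here the forecast of 425 passengers rounds up to an eight-car train, which avoids both an overloaded seven-car train and an underused ten-car train.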
Through this application of Analytics, BART's
operational team can monitor and analyze train arrivals
and departures precisely. Furthermore, that information
Salt Lake City TRAX
Salt Lake City's TRAX light rail transit (LRT) system
illustrates how Analytics — in this case, tracking and
utilizing large volumes of real-time operational data —
plays an integral role in even a modest rail system in a
midsized American city. With the opening of the new
Airport line in mid-April 2013, the TRAX LRT system
stretches 41.3 miles over three lines (Red, Blue, and
Green) with 47 stations, carrying daily ridership of about
65,000. An extension to the suburb of Draper nearing
completion will expand the system to about 45 miles. [9]
With a fleet of 146 LRT cars that run in train consists
of up to three cars at maximum speeds ranging between
15 and 65 mph, TRAX's central operations must keep
track of up to 23 trains running at headways as close as
five minutes in peak periods or 20 minutes in off-peak
periods. Particularly complicated is operation in sections
where two or three lines share the same tracks.
TRAX also shares a section of its line to suburban
Sandy with freight trains of a short line operator. This
joint use is authorized via temporal separation by the
Federal Railroad Administration (FRA).
To coordinate all these trains, including LRT often at
close headways, a thoroughly reliable, efficient, safe, and
secure train control and traffic management system is
essential — and for TRAX, an ABS system with fixed
blocks provides this capability. Analytics is an especially
critical ingredient in the CTC train tracking and
dispatching system, which relies on a GPS network to
locate more than two dozen trains at peak times and
communicate their positions to the control center, located
at TRAX's Jordan River Service Center.
Since TRAX uses a relatively modest, less costly
dispatching system heavily incorporating the KISS
principle (Keep It Simple, Stupid), the system relies
mainly on human dispatchers (rather than automation) to
monitor train locations and operations. In turn,
dispatchers perform essential functions like permitting
trains to embark on each trip from a terminal station,
throwing switches when necessary, and ensuring that
trains adhere to schedule.
But the GPS-based network and associated Analytics
comprising the backbone of the dispatching system also
facilitate a convenient passenger information system, with
information presented via PIDs suspended beneath the
roofs of TRAX stations. Similarly, reliable
communications, transmitting large volumes of Big Data,
and Analytics are involved in the self-service AFC
system, including TVMs in all stations.
Data" that would enable the agency not merely to track all
its vehicles in service but, by applying Analytics to
scrutinize service performance, to fine-tune and improve
its operations. [11]
MetroRail cars are also equipped with APC. While
this is obviously valuable in gathering passenger statistics,
it's also an efficient tool for planning and operations (e.g.,
adjusting schedules to accommodate passenger flows and
changes in traffic demand by time of day). [10]
Philadelphia — SEPTA Regional Rail
Southeastern Pennsylvania Transportation Authority
(SEPTA) is a very large transit agency operating an
extensive system of rail rapid transit (subway and
elevated), regional rail (commuter), light rail (urban and
suburban lines, some with subway operation), motor
buses, and electric trolleybuses ("trackless trolleys").
SEPTA deploys Analytics and Big Data throughout its
system in a wide variety of functions, including
operations control, AVL, train signaling and dispatching,
AFC with TVMs, APC, passenger information with PIDs,
and other tasks. However, for examples of the role of
Analytics and Big Data, only selected applications in just
a couple of these modal categories will be discussed here
summarily.
SEPTA's Regional Rail system consists of 13
electrified lines operating FRA-compliant rolling stock
over about 280 miles of track, stretching as far north as
Trenton, New Jersey, and south to Wilmington, Delaware.
Daily ridership totals nearly 124,000.
Regional Rail operations deploy Analytics
intensively in the signaling/control/dispatching system,
mostly with a combination of CTC and ABS. On lines
shared with Amtrak (Paoli-Thorndale, Cynwyd, Chestnut
Hill West, Airport, Trenton and Marcus Hook-Newark)
Amtrak controls the dispatching and SEPTA trains are
equipped with cab signals and compatibility with
Amtrak's relatively high-tech Advanced Civil Speed
Enforcement System (ACSES) for PTC. On
Amtrak’s Keystone corridor (Philadelphia to Harrisburg)
cab signals are utilized, but the line is not 100% ATC.
[12]
The ACSES system is being extended to all of
SEPTA's Regional Rail lines. Working in unison with
SEPTA's existing signaling-control operations, these two
systems will provide the functionality of PTC in
compliance with the federal mandate. SEPTA's PTC
system will be able to enforce permanent and temporary
civil speed restrictions and train stops through a network
of transponders, while maintaining the continuous track
Austin — Capital Metro's MetroRail
Since March 2010, Austin, Texas's Capital
Metropolitan Transportation Authority (Capital Metro)
has been operating its MetroRail light railway using diesel
multiple-unit (DMU) rolling stock over a 32-mile line,
with 10 stations, from the city's lower downtown to a
northwestern suburban town. Despite the length, it's a
relatively small, bare-bones system, with a fleet of just six
DMUs, and daily ridership currently averaging about
2,200. In an arrangement similar to Salt Lake City's
TRAX system (with its line to Sandy), MetroRail also
shares its tracks, under temporal separation mandated by
the FRA, with freight trains of a short line operator.
Despite MetroRail's relatively small size, Analytics
plays a critical role in the line's operations, particularly in
its ABS system overseen by CTC. Communication
between blocks and to-from the CTC control center is
maintained via data radio as the primary system, and a
cellular phone system as a secondary backup. [10]
While trains (currently run as single cars) are
equipped with GPS, the geopositioning system is not
currently used for routine train location, but mainly as a
component of the passenger information system. (GPS for
train location is used as a temporary expedient for
emergency situations or unusual freight train movements.)
Thus, in what's in effect a limited AVL application,
GPS provides train schedule information (e.g., the next
arrival or departure) at stations via PIDs. The system,
originally installed by Orbital Sciences Corporation, is
now branded as ACS, under parent Xerox Corporation.
Capital Metro has plans for major expansion of GPS
and AVL in both its bus and MetroRail services. Possibly
to be developed within the next several years, according
to Todd Hemingson, Vice-President for Strategic
Planning, AVL would generate "a massive pool of Big
monitoring advantages of the current ATC System. The
installation of the ACSES system will also ensure
interoperability with Amtrak and various freight carriers.
On Regional Rail lines owned and maintained by
SEPTA there is a mixture of ABS and ATC (Rule 562:
Cab Signals with no wayside signals). While all trains
have cab signals, not all lines have been upgraded from
ABS. ATC is operational from Center City to
Doylestown, Jenkintown to Woodbourne, Glenside to
Warminster, Newtown Junction to Fox Chase and from
Wayne Junction to Chestnut Hill East. Projects are
currently under way to convert the Manayunk-Norristown
Line (16th Street Junction to Elm Street) and Chestnut
Hill West (North Philadelphia to Chestnut Hill) to ATC
(Rule 562). GPS is used for AVL, processed through the
Regional Rail control center.
APC is gradually being introduced into the Regional
Rail system; SEPTA's new Silverliner V cars are
equipped with APC detectors. APC data will be used to
adjust scheduled consists and to track trips that receive
external funding, such as services receiving federal Job
Access and Reverse Commute (JARC) funding.
Analytics plays a role in current planning and
scheduling. For schedule development, the Regional Rail
system uses Multi-Rail Passenger Edition, soon to be
upgraded to Enterprise Edition.
SEPTA's passenger information system provides
PIDs with train arrival/departure updates in some of the
system's larger stations. In addition, the system includes
an app that provides bus and train status information to
passengers' smartphones (Android and iPhone
platforms).
latitude/longitude, it will be possible to tally and analyze
passenger boarding and alighting at each station.
As with bus, trolleybus, and high-speed services, for
the schedule management and planning of the suburban
trolley services, SEPTA's Service Planning deploys the
automated Trapeze scheduling system. Furthermore, this
is integrated with Google Maps so that all route changes
mapped in Trapeze are communicated to Google for
automatic updating. And the smartphone-based passenger
information app described in the Regional Rail section
also serves passengers on these suburban trolley routes.
Seattle — Sound Transit's Link and Sounder
The Seattle-Puget Sound region's Sound Transit (ST)
agency provides several important rail transit services
reaching from the Central Business District into the
surrounding metro area. ST's Central Link is a 15.6-mile
(25.1-km) LRT line running between downtown Seattle
and Seattle-Tacoma International Airport, with 13
stations. Average weekday ridership is about 25,300.
ST's Sounder is a regional passenger rail (commuter
rail) service operated under contract by BNSF Railway.
From central Seattle, trains run north to Everett and south
to Lakewood, plus two daily round-trips to and from
Tacoma, over about 82 miles (132 km) of route, with 9
stations (and another 3 under construction). Average
weekday ridership is about 25,300.
As with other major rail transit operations, Analytics
and Big Data are intensively involved in signaling-control-dispatching; passenger information with online
and smartphone train status information and station PIDs;
APC; GPS and AVL capabilities; and AFC with TVMs in
stations.
By far one of the most interesting deployments of
Analytics and Big Data can be seen in a relatively recent
expansion of the AFC system with the regional, trans-agency ORCA payment card. A contactless, stored-value
"smartcard" containing a microprocessor, the ORCA (One
Regional Card for All) card is used for the payment of
public transportation fares on most of the region's major
bus and rail services, as well as Washington State Ferries,
thus providing virtually "seamless" fare payment (in
effect, a prepaid pass) across these multiple systems and
agencies. [13] [14]
The card medium itself must be purchased by the
user (currently the charge is $5.00 or less, depending on
the user's eligibility for discounts). Value (for fare
payments) must then be added to the card (typically, via a
credit card account, often as an online transaction).
Discounts are offered for multi-ride packages as well as
for passengers that are seniors, disabled, or in other
Philadelphia — SEPTA Suburban Trolley Lines
Another category of SEPTA's rail operations that
provides interesting examples of some aspects of the
deployment of Analytics, particularly in signaling-control
functions, is the suburban trolley lines, Routes 101
(Media) and 102 (Sharon Hill). Totaling 11.9 miles (19.2
km), with 52 stations, the two lines carry daily ridership
averaging over 6,500.
While these LRT services use ABS, dispatching is
very bare-bones — i.e., manual communication with the
control center, where dispatchers authorize train
departures by voice. However, conversion of signaling-control to CBTC is being planned. GPS is currently used
primarily to assess on-time performance (averaging about
92%). [12]
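An on-time performance figure of this kind can be computed from schedule deviations derived from GPS timestamps. The sketch below assumes a hypothetical on-time window (no more than 1 minute early or 5 minutes late); agencies define this window differently, and the source does not state SEPTA's definition.

```python
def on_time_performance(deviations_min, early_tol=1.0, late_tol=5.0):
    """Percentage of observations falling within the on-time window.

    deviations_min: actual minus scheduled time in minutes at each timepoint
                    (negative means early).
    early_tol, late_tol: hypothetical tolerances defining "on time".
    Returns None when there are no observations.
    """
    if not deviations_min:
        return None
    on_time = sum(1 for d in deviations_min if -early_tol <= d <= late_tol)
    return 100.0 * on_time / len(deviations_min)

# Hypothetical deviations for ten timepoint observations.
sample = [0.5, 2.0, 6.5, -0.5, 4.9, -2.0, 1.0, 0.0, 3.0, 5.0]
otp = on_time_performance(sample)
```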
Suburban trolleys do not currently have APCs
installed, but plans to install up to 10 units are awaiting
funding. Since each station is now geocoded with
qualified categories. Thus, the equivalent of multi-ride
passes can be purchased as well as single fares.
The card eliminates the inconvenience to passengers
of constantly having to find currency or change to pay
fares, especially when transferring from system to system.
Passengers can use the ORCA card somewhat like a debit
card. Entering a rail station or ferry terminal, or boarding
a bus, the passenger can just tap it against an electronic
reader.
But ORCA card information can also be a source of
significant Big Data for transit agencies, providing
information on individual passenger movements on transit
throughout the region, as well as broader data on
passenger flows at various locations and times of day.
Processed with good Analytics, the card data provide a
wealth of information for planning and scheduling,
leading to service improvements, as well as for marketing.
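One way such card data could be turned into passenger-flow information is an origin-destination count. The sketch below assumes a hypothetical record layout with explicit tap-in and tap-out events and anonymized card identifiers; it does not reflect the actual ORCA data format, and the station names are illustrative.

```python
from collections import Counter, defaultdict

def od_counts(taps):
    """Count trips by (origin, destination) from smartcard tap events.

    taps: iterable of (card_id, minute_of_day, station, kind), where kind
          is "in" or "out". Each tap-in is paired with the same card's
          next tap-out; unmatched taps are ignored.
    """
    by_card = defaultdict(list)
    for card_id, minute, station, kind in taps:
        by_card[card_id].append((minute, station, kind))

    counts = Counter()
    for events in by_card.values():
        events.sort()                  # chronological order per card
        entry = None
        for _minute, station, kind in events:
            if kind == "in":
                entry = station
            elif kind == "out" and entry is not None:
                counts[(entry, station)] += 1
                entry = None
    return counts

# Hypothetical tap records for three cards.
taps = [
    ("c1", 480, "Airport", "in"), ("c1", 515, "Downtown", "out"),
    ("c2", 485, "Airport", "in"), ("c2", 520, "Downtown", "out"),
    ("c1", 1020, "Downtown", "in"), ("c1", 1055, "Airport", "out"),
    ("c3", 500, "Stadium", "in"),      # no tap-out recorded
]
flows = od_counts(taps)
```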
typically, merely a computer-based process of discerning
patterns in sizable sets of data — for example,
discovering passenger flow and mobility patterns from
boarding-deboarding data at stations.
The basic aim of data mining is to glean information
from a set of data and transform it into an intelligible
structure for further useful analysis. Data mining is
tending to bring together innovations in statistical
analysis, database architecture, and machine learning
development.
Particularly with the maturing of technologies such as
AFC and APC, opportunities abound for the rail transit
industry to utilize data mining of the data flows from
these technologies to analyze operations, passenger
behavior, and other phenomena. This can then be utilized
to improve services and performance, thus better fulfilling
the basic missions of transit agencies.
Tacoma — Sound Transit's Tacoma Link Streetcar
Tacoma Link, operated in central-city Tacoma by
Sound Transit, is a very small 1.6-mile (2.6-km)
streetcar-type light rail transit line with 5 stations,
carrying daily ridership of roughly 3,800. Currently, the
service is provided free (no fare), so there is no
integration with ST's AFC system and the ORCA card.
While the system is currently extremely bare-bones
in overall design, with relatively simple operation, it does
integrate its APC system with onboard GPS using inputs
from the cars' door systems. Furthermore, the use of GPS
in operations is planned to expand. Currently, on-train
passenger announcements use wheel pulses from the cars'
propulsion sensors to gauge distance traveled. However,
the agency is in the process of replacing this older
passenger information system on the train with a new
digital system that will rely on GPS to identify train
position. [15]
Cloud Computing
Cloud computing commonly refers to the utilization
of computing resources (both hardware and software) that
are available over a network (typically the Internet). On a
small scale, using online software and servers (e.g., an
Email system or blogging software) is an example.
However, cloud computing has grown as a means of
providing the substantial computing resources, in terms
of both computational "firepower" and storage, needed
for the increasingly gigantic volumes of data (really Big
Data) many organizations now encounter. With cloud
computing, an external entity and remote services must
be entrusted with the user's data, and many
organizations understandably are reluctant to give
external parties access to such sensitive data. However,
such security concerns must be weighed against the need
for access to off-site computing and storage resources.
One resource commonly used for many Big Data
Analytics applications is Apache Hadoop, an open-source
software framework supporting distributed processing
applications, often needed for data-intensive tasks.
Derived from Google's MapReduce and Google File
System research, and written in the Java programming
language, Hadoop is designed to support running
applications on large "clusters" of commodity hardware
(i.e., affordable and easily procured).
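The MapReduce model from which Hadoop is derived can be sketched in a few lines of plain Python. This is only a single-machine illustration of the map, shuffle, and reduce steps, not Hadoop's actual API; the input format ("station,boardings" text lines) and all function names are assumed for the example. On a real cluster, the framework would distribute the map and reduce work across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: turn one 'station,boardings' line into a (key, value) pair."""
    station, boardings = line.strip().split(",")
    yield station, int(boardings)


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse each key's values into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical boarding counts, one record per vehicle trip.
    sample = ["Airport,10", "Downtown,5", "Airport,7"]
    print(run_job(sample))
```

Because each map call looks at only one record and each reduce call at only one key, the work can be split across machines without the pieces needing to communicate, which is what makes the model suited to very large data sets.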
Whether transit agencies will need to access cloud
computing resources to effectively handle future needs in
Analytics and Big Data remains to be seen. But these
resources merit monitoring in the event such needs do
eventually arise.
CURRENT ISSUES AND TRENDS
Where are Analytics and Big Data headed in public
transportation? Here's a brief overview of some of the
major current issues in Analytics and Big Data and the
implications for rail public transportation.
Data Mining
This application of Analytics and Big Data has
acquired a somewhat adverse public reputation, mainly
because of privacy issues raised by intrusive manipulation
of personal information. But data mining is, more
typically, merely a computer-based process of discerning
patterns in sizable sets of data.
Sentiment Analysis
Sentiment analysis, also known as opinion mining, is
actually a form of data mining applied to textual, verbal
information. By parsing human verbal communication
through natural language processing, applying
computational linguistics, and deploying text analytics,
subjective information in source materials, such as
attitudes, opinions, and even intentions, can be identified
and extracted for more intense processing and scrutiny.
In general, the objective of sentiment analysis is to
determine the attitude of individuals (the public, specific
business customers, transit passengers, etc.) with respect
to certain issues. Typically, specific issues are assessed
against the contextual polarity of each of the verbal
documents, which might be Emails, text messages,
postings to forums or Facebook, Twitter messages, and so
on. Evaluating aspects such as attitudes, judgments, or
emotional states is key to the process.
The prominence of social media (especially
Facebook and Twitter, as well as blogs and other social
networks) has accelerated interest in sentiment analysis.
The proliferation of verbal data such as consumer
reviews, subjective ratings, and personal
recommendations, together with other types of verbal
expression publicly available online, has tremendously
increased attention to this aspect of Analytics.
For public transit agencies, sentiment analysis is a
potentially valuable tool that merits consideration: for
example, to gauge public attitudes toward the agency's
services in general; to assess attitudes in regard to a new
service, or perhaps a political issue such as a ballot
measure; or simply to sift for transit-related issues
important to passengers or the public at large. In some
agencies, sentiment analysis is also used to monitor for
security threats to transit operations or passengers.
Privacy Concerns
Certainly one of the most hot-button issues with
respect to the general public's relationship to Big Data,
Analytics, and associated applications such as data
mining, is the issue of privacy. For transit agencies, the
potential exists to extract great volumes of Big Data from
fare transaction data, passenger counts, and even
surveillance of passengers in trains and stations. Yet the
opportunity for abuse is clear, and the public realizes this;
it is also a major issue of concern within the
professions themselves that are involved with Analytics
and Big Data.
An example is the ongoing controversy over Seattle's
ORCA card (see previous discussion), underscored by a
news reporter's ominous headline "Is Big Brother
watching your ORCA card?" [14] This arises from the
revelation that ORCA card sponsors and participants are
data-mining information from passengers' use of the card;
indeed, employers that provide the card to their own
employees can receive data as to how and where each
individual employee is using the card and traveling on the
various transit systems. As the article reports,
Whenever someone buys an employer-subsidized
fare card through one of 2,000 companies or institutions,
the employer has the right to see that person's travel
records. A boss could check to see, for example,
whether someone is abusing a subsidy by reselling
ORCA cards or find out if an employee called in sick
but rode the bus to the mall or the beach.
And if you register any ORCA card, as transit
officials suggest to protect against loss or theft, your
personal information goes into the transit-agency
database. Personal fare-card information is technically
available to news media and other groups, as well,
though it's unclear how forthcoming ORCA would be in
providing it.
In another application of data mining, a form of
sentiment analysis is used by some transit agencies not
merely to assess public attitudes toward the agency, but to
parse personal text and Email messages and evaluate
content to reveal possible intentions of threats against the
system's operations or passengers.
Despite the presumably benign objectives of agency security
personnel, the public may well perceive this practice as a
serious violation of the right to privacy.
There's no "magic potion panacea" to apply to this
issue. However, transit agencies would be well-advised
to exercise caution in encroaching on personal privacy,
and to keep monitoring this issue as it evolves in public
discourse.
Security Issues
Protecting critical data from theft, vandalism,
intrusion by unauthorized users, and other hostile or
destructive acts is obviously a major concern for transit
agencies. This concern has only grown greater with the
expansion of Big Data.
This issue has escalated even further recently with
the increasing publicity of "cyber-attacks" on the data and
cybernetic functioning of large institutions, from banks to
electric power installations to military facilities. Clearly,
transit agencies are vulnerable and need to maintain
vigilance against such threats.
further application of this promising technology to solve
problems and improve services in transit operations.
Predictive Analytics
Comprising an array of techniques from statistics,
modeling, machine learning, artificial intelligence, and
data mining, predictive analytics applies Analytics to
current and perhaps historical data (increasingly, Big
Data) to develop predictions about future (or perhaps
otherwise unknown) events or possible outcomes.
Exploiting patterns detected in historical and
transactional data, predictive models can be used to
identify risks and opportunities, capturing relationships
among a variety of elements to facilitate assessment of the
potential risk associated with a particular set of
conditions, thus helping to guide decisionmaking. But
certainly the most venerable and productive use of
predictive Analytics models in public transportation has
been to evaluate the future role of public transit systems,
forecast ridership, and suggest the need for new transit
systems and facilities (as detailed in the earlier discussion
on Travel Demand Modeling).
Predictive models may have further benefits for
public transportation, such as illustrated in BART's use of
Analytics for passenger flow modeling and related
operational projections and simulations. Another
possibility, meriting evaluation by transit agencies and
IT professionals, is applying predictive modeling to
behavioral data to evaluate the propensity of
transit passengers to exhibit specific behaviors. This
would be useful, for example, as a tool to help improve
the effectiveness of new marketing efforts.
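The ridership-forecasting use of predictive Analytics mentioned above can be illustrated, in a highly simplified form, by fitting a straight-line trend to past ridership and extending it one period ahead. This is only a sketch of the general idea; actual travel demand models are far more elaborate. The ridership figures and the function names fit_line and forecast are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, x_new):
    """Predict y at x_new by extending the fitted trend line."""
    slope, intercept = fit_line(xs, ys)
    return intercept + slope * x_new


if __name__ == "__main__":
    # Hypothetical average daily ridership for four consecutive years.
    years = [1, 2, 3, 4]
    ridership = [1000, 1100, 1200, 1300]
    print(f"Projected year-5 ridership: {forecast(years, ridership, 5):.0f}")
```

A straight-line trend is the simplest possible predictive model; it is shown here only because it makes the relationship between historical data and projection easy to see.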
NOTES
1. This section has been adapted and expanded from
Henry, Lyndon. Public Transportation Moves With
Analytics. All Analytics (online), 10 July 2012.
http://www.allanalytics.com/author.asp?section_id=2310&doc_id=247066
2. BMC staff. Travel Demand Forecasting Model.
Baltimore Metropolitan Council (BMC) website, 2013.
http://www.baltometro.org/regional-data-andforecasting/travel-demand-forecasting-model
3. RTWP editors. US Railroad Signalling. Railway
Technical Web Pages (RTWP). Site updated 3 April
2013.
http://www.railway-technical.com/US-sig.shtml
4. Burgett, Michael J. The Engineering Basics of CTC.
Control Train Components website. Accessed 11 April
2013.
http://www.ctcparts.com/aboutprint.htm
5. AAR editors. Positive train control. Association of
American Railroads (AAR) website. Accessed 13 April
2013.
https://www.aar.org/safety/Pages/Positive-TrainControl.aspx#.UWlKRZNthLU
6. MacKechnie, Christopher. Software Used in the Public
Transit Industry: Hastus by GIRO. About.com Guide.
Accessed 2 April 2013.
http://publictransport.about.com/od/Transit_Technology/a/Software-Used-In-The-Public-Transit-Industry-HastusBy-Giro.htm
Robotics
Robotics technology incorporates some of the most
advanced applications and developments of Analytics to
Big Data sets and challenges, addressing the design,
fabrication, operation, and application of automated
machines or devices that can replicate human activity or
behavior in situations ranging from dangerous
environments, manufacturing processes, or tediously
repetitive tasks, to simply ordinary, routine physical
functions such as housekeeping chores or operating a
vehicle. While some robots may be designed to resemble
humans in appearance, in most cases they are designed to
assume human behavior and even cognition.
With the development and deployment of automatic
train control and, increasingly, totally autonomous,
self-controlled and self-monitored transit operations (e.g.,
driverless metros), rail public transport systems have
certainly been in the forefront of robot technology for
decades. Transit professionals, and especially IT
personnel, should continue to monitor developments in
this area of Analytics, seeking opportunities for the
further application of this promising technology to solve
problems and improve services in transit operations.
7. Center for Urban Transportation Research (CUTR)
staff. Case Study — Bay Area Transit District (BART) —
San Francisco, California; CUTR, University of South
Florida (USF); document #FTA-FL-26-71054-03.
http://www3.cutr.usf.edu/security/documents/UCITSS/BART.pdf
8. Schultz, Beth (Editor in Chief, All Analytics website).
Operational Analytics Keeps Bay Area Trains on Track.
All Analytics (online), 15 May 2012.
http://www.allanalytics.com/author.asp?section_id=1411&doc_id=244062
9. This section has been adapted and expanded from
Henry, Lyndon. Analytics Keep SLC's Light Rail on
Track. All Analytics (online), 28 December 2012.
http://www.allanalytics.com/messages.asp?piddl_msgthreadid=260931
10. Clendennen, Mark (MetroRail, Capital Metro). Phone
conversation, 10 April 2013.
11. Hemingson, Todd (Vice-President for Strategic
Planning, Capital Metro). Phone conversation, 5 April
2013.
12. Calnan, John F. (Manager, Suburban Service Planning
& Schedules, SEPTA). Phone conversation, 10 April
2013. Email message, "Signals, APC, GPS etc. on SEPTA
Suburban LRT", 11 April 2013.
13. ORCAcard.com website editors. About ORCA.
Accessed 9 April 2013.
http://www.orcacard.com/ERG-Seattle/p3_001.do?m=3
14. Lindblom, Mike. Is Big Brother watching your ORCA
card? Seattle Times, 17 December 2009 (updated 18
December 2009).
http://seattletimes.com/html/localnews/2010537022_orcacard18m.html
15. Blackburn, Robert (Tacoma Link Light Rail Manager,
Sound Transit). Email message, 26 March 2013.
CONTACT INFORMATION:
Lyndon Henry
nawdry@gmail.com
Phone 512.441-3014