MineFleet®*: An Overview of a Widely Adopted Distributed
Vehicle Performance Data Mining System
Hillol Kargupta**, Kakali Sarkar, Michael Gilligan
Agnik, LLC
8840 Stanford Blvd.
Columbia, MD 21045, USA
hillol@agnik.com, kakali@agnik.com, mgilligan@agnik.com
possibility. Several years of research on distributed data mining
[1, 2, 3, 5, 7] and data stream mining have produced a reasonably
powerful collection of algorithms and system-architectures that
can be used for developing several interesting classes of
distributed applications for lightweight wireless applications. In
fact an increasing number of such systems [4, 6] are being
reported in the literature. Some commercial systems are also
starting to appear.
ABSTRACT
This paper describes the MineFleet® distributed vehicle
performance data mining system designed for commercial fleets.
MineFleet analyzes high throughput data streams onboard the
vehicle, generates the analytics, sends them to the remote server
over wide-area wireless networks, and offers them to the fleet
managers through stand-alone and web-based user interfaces. The
paper describes the overall architecture of the system, business
needs, and shares experience from successful large-scale
commercial deployments. MineFleet is probably one of the first
commercially successful distributed data stream mining systems.
This patented technology has been adopted, productized, and
commercially offered by many large companies in the mobile
resource management and GPS fleet tracking industry. This paper
offers an overview of the system and a detailed analysis of
what made it work.
This paper reports the development of MineFleet®, a novel
mobile and distributed data mining application for monitoring
vehicle data streams in real-time. MineFleet is designed for
monitoring commercial vehicle fleets using onboard embedded
data stream mining systems and other remote modules connected
through wireless networks in a distributed environment.
MineFleet is a powerful data stream mining software for
modeling, benchmarking, and monitoring of vehicle health,
emissions, driver behavior, fuel-consumption, and fleet
characteristics.
Categories and Subject Descriptors
H.1.0 [Models and Principles]: General; H.4.m [Information
Systems Applications]: Miscellaneous
Consider a nationwide grocery delivery system which operates a
large fleet of trucks. Regular maintenance of the vehicles in such
fleets is an important part of the supply chain management and
normally commercial fleet management companies get the
responsibility of maintaining the fleet. Fleet maintenance
companies usually spend a good deal of time and labor in
collecting vehicle performance data, studying the data offline, and
estimating the condition of the vehicle primarily through manual
efforts. Fleet management companies are also usually interested
in studying the driving characteristics for a variety of reasons (e.g.
policy enforcement, insurance, Department of Transportation
regulations). Monitoring fuel consumption and vehicle emissions,
and identifying how vehicle parameters can be tuned for better
fuel economy, are additional sources of return on investment
(ROI) for systems like MineFleet.
General Terms
Algorithms, Experimentation, Design, Performance.
Keywords
Vehicle data stream mining, distributed data mining, telematics.
1. INTRODUCTION
The wireless and mobile computing/communication industry
is producing a growing variety of devices that process different
types of data using limited computing and storage resources with
varying levels of connectivity through wireless communication
networks. The rich source of data from the ubiquitous components
of businesses, mechanical devices, and our daily lives offers the
exciting possibility of a new generation of data intensive
applications for distributed and mobile environments. Mining
distributed data streams in a ubiquitous environment is one such
MineFleet is widely adopted in the mobile resource
management and fleet management industry. Similar applications
also arise in monitoring the health of airplanes and space vehicles
[9, 10, 11], where there is a strong need for real-time onboard
monitoring and mining of data (e.g. flight systems performance
data, weather data, radar data about other planes).
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
KDD’10, July 25–28, 2010, Washington, DC, USA.
Copyright 2010 ACM 978-1-4503-0055-1/10/07...$10.00.
* Protected by patented technology. ** Also affiliated with the
CSEE department, University of Maryland, Baltimore County.
The main unique characteristics of the MineFleet system that
distinguish it from traditional data mining systems are as follows:
transmission of the results to the server over the wireless network.
This problem is even more serious for long-distance trucks and
off-road equipment that rely on satellite-based wireless
communication networks instead of land-based cellular
networks, because satellite-based networks are considerably
more expensive than their land-based counterparts.
1. Distributed mining of the multiple mobile data sources
with little centralization of the data.
2. Onboard data stream management and mining using
embedded computing devices.
3. Designed to pay careful attention to the following
important resource constraints:
a. Minimize data communication over the wide-area wireless network.
b. Minimize onboard data storage and the
footprint of the data stream mining software.
4. Process high throughput data streams using resource-constrained embedded computing environments.
5. Respect privacy constraints of the data, whenever
necessary.
Section 2 presents the business motivation. Section 3 compares
MineFleet with existing vehicle telematics systems. Section 4
offers an overview of the MineFleet system architecture. Section
5 describes the Predictive Health Monitoring capabilities of the
MineFleet system. Section 6 discusses the Fuel Consumption
Analysis module. Section 7 offers an overview of the Driver
Behavior Monitoring module. Section 8 discusses the Emissions
Monitoring capabilities of the MineFleet system. Section 9
discusses how MineFleet penetrated the market and achieved
wide-spread adoption. It also offers a perspective of how the
business ecosystem evolved and shares some of the experiences in
placing a new distributed data mining product in a new vertical
not quite familiar with the data mining technology. Finally,
Section 10 concludes the paper.
Commercial fleets usually comprise a large number of vehicles,
typically segmented into groups of vehicles of the same type.
Fleets also have drivers, and the overall efficiency of the fleet
depends on driver behavior. Therefore, analyzing the vehicle
performance data onboard each vehicle is not enough. Comparing
and contrasting the performance of different vehicles and drivers
is also very important.
Moreover, the embedded devices placed onboard the vehicles are
often inexpensive and resource-constrained. For example, typical
GPS tracking devices that are deployed in large commercial fleets
would be based on 8-bit microcontrollers or 32-bit processors
with limited storage. As a result, the onboard devices cannot be
used for long term storage of the analytics. Results of the
relatively short-term analysis should be sent and aggregated at the
server for long term modeling, trend-analysis, and outlier
detection.
MineFleet’s business case relies upon these observations. It is
based on the distributed data mining technology that is driven by
the following capabilities:
2. BUSINESS MOTIVATION
The success of MineFleet is fundamentally based on a strong
business case. There are approximately 25 million commercial
vehicles in North America alone and about 250 million passenger
vehicles in the US alone, so the total market size is substantial.
Vehicles generate ample data. However, accessing the data,
particularly the parameters specific to different manufacturers,
is a non-trivial issue from the know-how perspective since many
of these parameters are not publicly available. Once the data is
accessible, many useful things can be done with advanced data
mining techniques (e.g. detecting potential health problems,
optimizing fuel economy, improving driving performance, and
reducing emissions). This also implies that any investment to
achieve extensive access to the vehicle performance data from
different manufacturers is likely to create natural protection by
limiting quick market entry by competitors.
1. Access to a large number of vehicle subsystem
parameters that are not publicly available.
2. Onboard analysis of the vehicle data streams using
advanced data stream mining algorithms capable of
supporting high throughput streams (e.g. one tuple of
data every 10 ms).
3. Aggregation and comparative analytics of the vehicles
connected over a distributed bandwidth-constrained
wireless network.
The following section discusses related work.
3. RELATED BUSINESS PRACTICES
MineFleet is a real-time distributed vehicle performance
monitoring system. To the best of our knowledge this is the first
distributed data mining system for commercial fleet monitoring.
However, it builds on the existing work on vehicle telematics.
Existing vehicle telematics systems collect vehicle performance
data and offer them to the fleet managers or vehicle owners.
OnStar (footnote 1) for General Motors vehicles and Sync
(footnote 2) from Ford are examples of such telematics systems.
There are some major differences between MineFleet and
traditional telematics systems. Some of them are listed below:
There are many other challenges. First of all, vehicles generate
high throughput data streams. Monitoring hundreds of different
sub-system parameters over a couple of hours may easily generate
several megabytes of data, and transmitting this data over the
wireless network is a non-trivial challenge. One of the main
reasons is that most fleet owners do not appear to be willing to
pay for a wireless data plan of more than 5 MB per month or so.
Moreover, most of the fleets that opt for advanced vehicle
performance data mining capabilities also require tracking and
navigation related capabilities. As a result the data transmission is
further constrained. This requires exploring the other option: onboard analysis of the vehicle performance data and
1. Advanced data analytics: MineFleet is powered by
advanced distributed data mining and statistical analysis
algorithms. Most telematics systems are designed for
in-car infotainment and security applications based on
relatively simple data management operations.

1 www.onstar.com
2 http://www.fordvehicles.com/technology/sync/

moving vehicle using an on-board computing device, identifies
the emerging patterns, and if necessary reports these patterns to a
remote control center over low-bandwidth wireless network
connection.

2. Onboard data mining: MineFleet offers dramatic
reduction of wireless communication by performing
data analysis onboard the vehicle. Unlike most
conventional telematics systems, MineFleet sends the
results of the onboard analysis to the server over the
wireless network, not the raw data. As mentioned
earlier, a device monitoring hundreds of vehicle
performance parameters may easily collect about 10 MB
of raw data in a few hours. Sending this raw data over
the wireless network for advanced data mining at the
server is very expensive. Most MineFleet customers
would not pay for a data plan beyond 5 MB per month.
Therefore, analyzing data onboard the vehicle and
sending the resulting analytics instead of raw data is
imperative. One full MineFleet update takes about 1K.
If a vehicle runs for about 8 hours a day and sends an
update once an hour, then in 30 days the vehicle would
need about 240K of wireless data communication to send
the MineFleet analytics to the server. This dramatic
reduction in communication cost is a unique feature of
the MineFleet technology, which enables more powerful
data analysis and mining at a low cost.
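The monthly budget quoted above can be checked with a few lines of arithmetic. The sketch below only restates the figures from the text (1K per update, 8 hours a day, one update per hour, 30 days, a 5 MB plan); the variable and function names are illustrative, and treating 1 MB as 1024 KB is an assumption of this example.

```python
# Back-of-the-envelope check of the monthly wireless budget described above.
# All figures come from the text; the names are illustrative only.
UPDATE_SIZE_KB = 1        # one full MineFleet update is about 1K
HOURS_PER_DAY = 8         # the vehicle runs about 8 hours a day
UPDATES_PER_HOUR = 1      # one update per hour
DAYS_PER_MONTH = 30
PLAN_LIMIT_KB = 5 * 1024  # 5 MB plan, assuming 1 MB = 1024 KB


def monthly_usage_kb(update_kb, hours_per_day, updates_per_hour, days):
    """Return the total analytics traffic for one vehicle in a month, in KB."""
    return update_kb * hours_per_day * updates_per_hour * days


usage_kb = monthly_usage_kb(
    UPDATE_SIZE_KB, HOURS_PER_DAY, UPDATES_PER_HOUR, DAYS_PER_MONTH
)
share_of_plan = usage_kb / PLAN_LIMIT_KB

print(f"monthly analytics traffic: {usage_kb} KB")
print(f"share of a 5 MB plan: {share_of_plan:.1%}")
```

Under these assumptions the analytics traffic uses under 5% of a 5 MB plan, which leaves most of the budget for tracking and navigation data.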
MineFleet also offers different distributed data mining capabilities
for detecting fleet-level patterns across the different vehicles in
the fleet. This section presents a brief overview of the
architecture of the system and the functionalities of its different
modules.
The current implementation of MineFleet analyzes and monitors
only the data generated by the vehicle's on-board diagnostic
system and sometimes the Global Positioning System (GPS).
MineFleet Onboard is designed for embedded in-vehicle
computing devices, tablet PCs, and cell-phones.
The overall conceptual process diagram of the system is shown in
Figure 1. The MineFleet system is comprised of several important
components that are briefly described in the following sections.
4.1 Onboard Hardware
The MineFleet Onboard module comprises the computing device
that hosts the software to analyze the vehicle-performance data
and the interface that connects the computing device with the
vehicle data bus. Figure 2 shows the MineFleet Onboard Data
Mining platform (MF-DMP101) device that hosts the MineFleet
Onboard software. MineFleet also runs on many different types of
embedded devices, in-vehicle tablet PCs, laptops, cell-phones and
other types of handheld devices. Several other hardware platforms
(e.g. DMP-201 from Agnik and other third-party vendors) are also
currently available for running MineFleet Onboard.
3. Not a GPS-based tracking/navigation system: Unlike
most conventional telematics devices, MineFleet is
primarily focused on vehicle performance data analysis,
not tracking and navigation.
These unique aspects distinguish MineFleet from
conventional tracking/navigation and telematics services. The
following section offers an overview of the MineFleet
architecture.
4. MINEFLEET: AN OVERVIEW
Figure 2. MineFleet Data Mining Platform (MF-DMP101)
that hosts the MineFleet Onboard software. © Copyright,
Agnik, LLC.
4.2 Onboard Data Stream Mining Module
Figure 1. MineFleet architecture. © Copyright, Agnik, LLC.
This module manages the incoming data streams from the vehicle,
analyzes the data using various statistical and data stream mining
algorithms, and manages the transmission of the resulting
analytics to the remote server. This module also triggers actions
whenever unusual activities are observed. It connects to the
MineFleet Server located at a data center through a wireless
network. The system allows the fleet managers to monitor and
analyze vehicle performance, driver behavior, emissions quality,
and fuel consumption characteristics remotely without necessarily
downloading all the data to the remote central monitoring station
over the expensive wireless connection.

MineFleet® is a mobile and distributed data stream mining
environment where the resource-constrained "small" computing
devices need to perform various non-trivial data management and
mining tasks on-board a vehicle in real-time. MineFleet analyzes
the data produced by the various sensors present in most modern
vehicles. It continuously monitors data streams generated by a
4.6 Return on Investment
MineFleet offers ROI on many different fronts. For example, the
driver behavior monitoring analytics offer direct ROI by reducing
idling, which lowers emissions and fuel consumption, and by
reducing hard braking, which results in less frequent brake shoe
replacement. Wireless emissions monitoring eliminates the need
to send the vehicle to the smog test center, saving around $200 per
vehicle. Fuel consumption analysis improves gas mileage by
identifying sub-optimal conditions of vehicle systems such as the
O2 sensor. Several case studies have been generated based on the
data from many fleets that have been running MineFleet. It
appears that MineFleet offers a reduction of at least about 4-5% in
the monthly operating costs of a fleet. This is a significant ROI
for the commercial fleet monitoring and mobile workforce
management vertical. Detailed ROI analysis and ROI calculators
are also available.
4.3 MineFleet Server
4.7 Algorithmic Challenges
In order to monitor the vehicle data streams using the on-board
data management and mining module we need continuous
computation of several statistics. For example, the MineFleet Onboard system has a module that continuously monitors the
Figure 3. MineFleet Server. © Copyright, Agnik, LLC.
The MineFleet Server is in charge of receiving all the analytics
from different vehicles, managing those analytics, and further
processing them as appropriate. The MineFleet Server supports
the following main operations: (i) interacting with the on-board
module for remote management, monitoring, and mining of
vehicle data streams and (ii) managing interaction with the
MineFleet Web Services. It also offers a whole range of
fleet-management related services that are not directly related to
the main focus of this paper. The Server is connected with a
relational database management system where it stores the
analytics received from the vehicles in the fleet. All onboard
diagnostics, provisioning, and updates are performed over-the-air.
Members of the support team from Agnik and its resellers perform
these over-the-air operations through an easy-to-use web-based
interface.
4.4 MineFleet Web Services
This module offers a web-browser-based interface for the
MineFleet analytics. It also offers a rich class of API functions for
accessing the MineFleet analytics, which in turn can be integrated
with third-party applications. Figure 4 shows one of the interfaces
of the MineFleet Web Services. MineFleet is currently offered by
many vendors that have already integrated their web-based
mobile resource management products with the MineFleet web services.
Figure 4. User interface of the MineFleet Web Services.
©Copyright, Agnik, LLC.
spectral signature of the data which requires computation of
covariance and correlation matrices on a regular basis. The onboard driving behavior characterization module requires frequent
computation of similarity/distance matrices for data clustering and
monitoring the operating regimes. Since the data are usually high
dimensional, computation of the correlation matrices or distance
(e.g. inner product, Euclidean) matrices is difficult to perform
using their conventional algorithmic implementations.
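To make the cost of this bookkeeping concrete, the following is a minimal sketch of a one-pass tracker that folds each incoming tuple into running means and co-moments, from which covariances and correlations can be read at any time. It uses a standard Welford-style update and is not the MineFleet algorithm; the class name, method names, and the synthetic data are assumptions made for illustration. Each update costs O(n^2) for n features, which is exactly the per-tuple work that becomes hard to sustain when n is large and tuples arrive every few milliseconds.

```python
import math


class RunningCorrelation:
    """One-pass mean and co-moment tracker for an n-feature stream.

    Hypothetical helper for illustration; uses a Welford-style update so
    that no raw tuples need to be stored onboard.
    """

    def __init__(self, n_features):
        self.n = n_features
        self.count = 0
        self.mean = [0.0] * n_features
        # comoment[j][k] accumulates sum_i (x_ij - mean_j) * (x_ik - mean_k)
        self.comoment = [[0.0] * n_features for _ in range(n_features)]

    def update(self, values):
        """Fold one incoming tuple into the running statistics (O(n^2))."""
        self.count += 1
        delta_old = [x - m for x, m in zip(values, self.mean)]
        for j in range(self.n):
            self.mean[j] += delta_old[j] / self.count
        delta_new = [x - m for x, m in zip(values, self.mean)]
        for j in range(self.n):
            for k in range(self.n):
                self.comoment[j][k] += delta_old[j] * delta_new[k]

    def correlation(self, j, k):
        """Pearson correlation between features j and k seen so far."""
        denom = math.sqrt(self.comoment[j][j] * self.comoment[k][k])
        return self.comoment[j][k] / denom if denom > 0 else 0.0


# Synthetic stream: feature 1 is a linear function of feature 0,
# feature 2 alternates in sign and is uncorrelated with feature 0.
tracker = RunningCorrelation(3)
for i in range(1, 6):
    tracker.update((float(i), 2.0 * i + 1.0, (-1.0) ** i))

print(round(tracker.correlation(0, 1), 6))
print(round(tracker.correlation(0, 2), 6))
```

Only the means and the n-by-n co-moment table are kept in memory, so storage does not grow with the length of the stream.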
4.5 Privacy Management Module
This module plays an important role in the implementation of the
privacy policies. It manages the specific policies regarding what
can and cannot be monitored. It also allows the fleet manager to
create an environment where the MineFleet technology can be used
for saving money and sharing benefits without violating the
privacy of the drivers.
The incoming data sampling rate supported by the vehicle data
bus limits the amount of time available for processing the observed
data. This usually means that we have only a few seconds to
analyze the data using the onboard hardware (e.g. the
MF-DMP101 device). If our algorithms take more time than is
available, we cannot keep up with the incoming data rate.
collecting diagnostic trouble codes, malfunction indicator light data, and analyzing a large number of parameters available
through the diagnostic data port.
In order to handle this situation, we need to address the following
issues:
1. We need fast "light-weight" techniques for
computing and monitoring the correlation,
covariance, inner product, and distance matrices
that are frequently used in data stream mining
applications.
2. We need algorithms that will do something useful
when the running time is constrained. In other
words, we allow the data mining algorithm to run
for a fixed amount of time and expect it to return
some meaningful information. For example, we
give the correlation matrix computation algorithm
a certain number of CPU cycles for identifying the
coefficients with magnitude greater than 0.7. If that
time is not sufficient for computing all the
correlation coefficients in the matrix then the
algorithm should at least identify the portions of
the matrix that may contain significant
coefficients.

Typically, sensors in the vehicle subsystems generate two types of
data: the observed operating conditions, which are relatively
independent variables, and the dependent features, which change
behavior in response to changes in the operating condition
variables. Examples of operating condition variables in
conventional automobiles include the following: Barometric
Pressure, Calculated Engine Load (%), Engine Coolant
Temperature (°F), Engine Speed (RPM), Engine Torque, Intake
Air Temperature (IAT) (°F), Mass Air Flow Sensor 1 (MAF)
(lbs/min), Start Up Engine Coolant Temp. (°F), Start Up Intake
Air Temperature (°F), Throttle Position Sensor (%), Throttle
Position Sensor (degree), Vehicle Speed (Miles/Hour), and
Odometer (Miles).
There are also many other features that depend on the operating
conditions. Examples from the fuel sub-system include Air Fuel
Ratio, Fuel Level Sensor (%), Fuel System Status Bank 1
(categorical attribute), Oxygen Sensor Bank 1 Sensor 1 (mV),
Oxygen Sensor Bank 1 Sensor 2 (mV), Oxygen Sensor Bank 2
Sensor 1 (mV), Oxygen Sensor Bank 2 Sensor 2 (mV), Long Term
Fuel Trim Bank 1 (%), Short Term Fuel Trim Bank 1 (%), Idle Air
Control Motor Position, Injector Pulse Width #1 (msec), and
Manifold Absolute Pressure (Hg).
In order to illustrate the idea, consider the problem of monitoring
the correlation matrices a little more closely. Given an
$m \times n$ data matrix $U$ with $m$ observations and $n$
features, the correlation matrix is computed as $U^T U$, assuming
that the columns of $U$ are normalized to have zero mean and unit
variance. A straightforward approach that computes the
correlation matrix by matrix multiplication takes $O(mn^2)$
multiplications, which is computationally very expensive.
MineFleet deploys fast probabilistic algorithms to detect changes
in the correlation matrices. They are based on an observation
about the sum of squared values of the elements in the correlation
matrix that lie above the diagonal,

$$C = \sum_{1 \le j_1 < j_2 \le n} \mathrm{Corr}^2(j_1, j_2), \qquad \mathrm{Corr}(j_1, j_2) = \sum_{i=1}^{m} u_{i,j_1} u_{i,j_2},$$

where $\mathrm{Corr}(j_1, j_2)$ represents the correlation
coefficient between the $j_1$-th and $j_2$-th columns of the data
matrix $U$. Using this observation, one can design a
divide-and-conquer algorithm that searches the space of
correlation coefficients for those that have changed
significantly. More discussion of some of these algorithms can be
found elsewhere [5].
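For reference, the quantities in this paragraph can be computed directly. The sketch below is the straightforward O(mn^2) baseline, not the fast probabilistic algorithm discussed in [5]: it normalizes the columns, evaluates Corr(j1, j2) as an inner product, sums the squared coefficients above the diagonal to obtain C, and lists the pairs whose magnitude exceeds the 0.7 threshold mentioned earlier. The function names and the two synthetic data windows are assumptions made for illustration, and columns are scaled to unit Euclidean norm so that the inner products are exactly Pearson coefficients.

```python
import math


def normalize_columns(rows):
    """Center each column and scale it to unit Euclidean norm.

    With this scaling (an assumption of this sketch) the entries of
    U^T U are exactly the Pearson correlation coefficients.
    """
    m, n = len(rows), len(rows[0])
    columns = []
    for j in range(n):
        col = [row[j] for row in rows]
        mean = sum(col) / m
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        columns.append([v / norm if norm > 0 else 0.0 for v in centered])
    return columns


def corr(columns, j1, j2):
    """Corr(j1, j2) = sum_i u[i][j1] * u[i][j2] on normalized columns."""
    return sum(a * b for a, b in zip(columns[j1], columns[j2]))


def squared_corr_sum(columns):
    """C: sum of Corr^2(j1, j2) over all pairs above the diagonal."""
    n = len(columns)
    return sum(
        corr(columns, j1, j2) ** 2
        for j1 in range(n)
        for j2 in range(j1 + 1, n)
    )


def significant_pairs(columns, threshold=0.7):
    """Column pairs whose correlation magnitude exceeds the threshold."""
    n = len(columns)
    return [
        (j1, j2)
        for j1 in range(n)
        for j2 in range(j1 + 1, n)
        if abs(corr(columns, j1, j2)) > threshold
    ]


# Two synthetic windows of the same three features. In window_a the
# second feature tracks the first; in window_b that relationship is gone.
window_a = [(1, 2, 1), (2, 4, -1), (3, 6, 1), (4, 8, -1), (5, 10, 1)]
window_b = [(1, 2, 1), (2, 1, -1), (3, 0, 1), (4, 1, -1), (5, 2, 1)]

c_a = squared_corr_sum(normalize_columns(window_a))
c_b = squared_corr_sum(normalize_columns(window_b))
pairs_a = significant_pairs(normalize_columns(window_a))
pairs_b = significant_pairs(normalize_columns(window_b))

print(f"C for window A: {c_a:.3f}, significant pairs: {pairs_a}")
print(f"C for window B: {c_b:.3f}, significant pairs: {pairs_b}")
```

A large drop in C between two windows signals that some coefficient has changed without saying which one. The same statistic restricted to a subset of column pairs could then be used to narrow the search, which is the role a divide-and-conquer scheme would play.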
Since operating conditions for a complex vehicle can be diverse,
segmenting the distribution of values into operating regimes can
be effective. Once the data is segmented, a separate model should
be developed for each regime.
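As a rough illustration of regime-wise modeling, the sketch below splits samples by engine speed and fits one least-squares line per regime, relating engine load to mass air flow. Everything specific here is hypothetical: the regime boundaries, the regime names, the choice of a linear model, and the synthetic samples are assumptions made for this example and are not taken from MineFleet.

```python
from collections import defaultdict

# Hypothetical regime boundaries on engine speed (RPM); not from the paper.
REGIME_EDGES = [
    (0, 1000, "idle"),
    (1000, 2500, "cruise"),
    (2500, float("inf"), "high"),
]


def regime_of(rpm):
    """Map an engine speed reading to the name of its operating regime."""
    for low, high, name in REGIME_EDGES:
        if low <= rpm < high:
            return name
    raise ValueError(f"rpm out of range: {rpm}")


def fit_line(points):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x


def fit_regime_models(samples):
    """Fit one (slope, intercept) model per regime.

    samples: iterable of (rpm, engine_load, mass_air_flow) tuples.
    """
    by_regime = defaultdict(list)
    for rpm, load, maf in samples:
        by_regime[regime_of(rpm)].append((load, maf))
    return {name: fit_line(points) for name, points in by_regime.items()}


# Synthetic samples: (engine speed, engine load %, mass air flow).
samples = [
    (800, 10, 0.5), (850, 20, 0.7),
    (2000, 30, 2.0), (2100, 50, 3.0), (2200, 70, 4.0),
]
models = fit_regime_models(samples)
for name, (slope, intercept) in sorted(models.items()):
    print(f"{name}: maf = {slope:.3f} * load + {intercept:.3f}")
```

Keeping one small model per regime means only a few coefficients per regime need to be stored onboard and sent to the server.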
MineFleet is powered by many such advanced stream mining
algorithms designed to run in a resource-constrained environment.
MineFleet makes use of distributed data mining algorithms that
rely upon such advanced onboard data analysis techniques and
aggregation of the resulting analytics at the server. This paper
does not discuss the algorithmic issues in further detail. Rather,
it focuses on the functional capabilities and business case
analysis. The following section describes one of the key
capabilities of MineFleet: Predictive Health Monitoring.
5. PREDICTIVE HEALTH MONITORING
Figure 5. Example of a vehicle health test designed based on
the domain knowledge and statistical analysis of data. ©
Copyright, Agnik, LLC.
This section provides insight into the vehicle health monitoring
module of the MineFleet system. Predictive vehicle health
monitoring is very important in many fleets since the breakdown
of a vehicle on the road is often very expensive because of the
downtime, the unpredictability, and the frequently increased cost.
MineFleet performs a large number of health tests onboard the
vehicle, and if any of the tests fails MineFleet reports that to
the server along with its recommended severity level. Figure 5
shows one such example.
Predictive health monitoring in cars usually involves processing
a multitude of information available from the diagnostic data bus
and possibly correlating that with maintenance data. This includes
MineFleet also assigns a health score to each vehicle by
aggregating the results of the health tests performed over a certain
period of time. Figure 6 shows the interface for identifying the
vehicles with poor health scores using a heat-map-like interface.
Red zones represent vehicles with poor health scores. The user can
click on those regions to find more information about the affected
vehicles. A tabular view of the vehicle health scores is also
available in MineFleet.
6. FUEL CONSUMPTION ANALYSIS
The MineFleet fuel consumption analysis module offers many
unique capabilities to compute the fuel economy of a
vehicle/fleet, perform trend analysis of various kinds, and
correlate that with various vehicle and driver performance
parameters.
Figure 8. Variation of Mass Air Flow with respect to Engine
Speed and Engine Load.
Figure 6. The vehicle health score visualization interface in
MineFleet. The color coded heatmap interface allows the fleet
manager to quickly identify the vehicles with poor health
score and drill down to find out the reason behind it. ©
Copyright, Agnik, LLC.
Typical vehicle fuel subsystems are high dimensional, and
modeling the data onboard the vehicle requires feature selection
based on domain knowledge and representation construction
using various techniques such as eigen analysis and other
orthogonal transformations. For example, consider Figure 8, which
shows the variation of mass air flow with respect to engine speed
and engine load. The relationship is fairly non-linear. A
comprehensive analysis of the fuel subsystem would typically
require including many additional parameters. As a result,
near-orthogonal, similarity-preserving transformations
are often very useful. Figure 9 shows the transformation of the
data in Figure 8 into the eigenspace.
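A small sketch can show what projecting such data into an eigenspace involves. The code below builds the sample covariance matrix of a few features, finds its dominant eigenvector by power iteration, and projects the centered data onto that direction. Power iteration is used here only because it needs very little memory and code; the paper does not say which eigen analysis method MineFleet uses, and the function names and synthetic readings are assumptions made for this example.

```python
import math


def covariance_matrix(rows):
    """Return (sample covariance matrix, centered rows) for the data."""
    m, n = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / m for j in range(n)]
    centered = [[r[j] - means[j] for j in range(n)] for r in rows]
    cov = [
        [sum(c[j] * c[k] for c in centered) / (m - 1) for k in range(n)]
        for j in range(n)
    ]
    return cov, centered


def leading_eigenpair(matrix, iterations=200):
    """Power iteration for the dominant (eigenvalue, unit eigenvector).

    Assumes the start vector is not orthogonal to the dominant direction.
    """
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(matrix[j][k] * vec[k] for k in range(n)) for j in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(
        vec[j] * sum(matrix[j][k] * vec[k] for k in range(n))
        for j in range(n)
    )
    return eigenvalue, vec


def project(centered, direction):
    """Coordinates of each centered row along the given direction."""
    return [sum(c * d for c, d in zip(row, direction)) for row in centered]


# Synthetic, pre-scaled readings of three features that vary together
# (stand-ins for engine speed, engine load, and mass air flow).
readings = [(1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]

cov, centered = covariance_matrix(readings)
eigenvalue, direction = leading_eigenpair(cov)
scores = project(centered, direction)

print("dominant eigenvalue:", round(eigenvalue, 4))
print("direction:", [round(d, 4) for d in direction])
print("projected scores:", [round(s, 4) for s in scores])
```

Because the three synthetic features move together, one eigen direction captures all of their variance, so each reading can be summarized by a single score instead of three values.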
Figure 10 shows an example of how MineFleet offers an ROI by
linking vehicle health condition with performance parameters
such as fuel economy. The figure shows the user interface that
lists the top five ways to improve the fuel economy, as
identified by various onboard data mining techniques. It also
offers a fuel savings calculator driven by predictive models learnt
from the data collected from that vehicle. The resulting ROI is
direct, simple to understand, and simple to act on. This allows
the fleet manager to decide what to do when a particular vehicle
health condition arises. Using MineFleet the fleet manager can quantify
Figure 7. Predictive vehicle maintenance data analysis
module. © Copyright, Agnik, LLC.
Figure 7 shows the interface for a module that analyzes the
vehicle maintenance data and links that with the vehicle
diagnostic data. The goal is to detect unusual patterns in the
vehicle maintenance operations and identify their reasons.
Figure 9. Modeling through advanced engine analysis.
© Copyright, Agnik, LLC.
Figure 11. Predictive fuel consumption analysis module of
MineFleet. © Copyright, Agnik, LLC.
the main capabilities of the MineFleet driver behavior monitoring
module with short-term return on investment are listed below:
1) Identify the speeding, braking, idling characteristics of
the driver and use that for driver retraining policy
execution.
2) Assign performance measures to the drivers based on
various characteristics and identify outlier drivers.
3) Identify unusual maintenance operations caused by sub-optimal driver performance.
Figure 12 shows the MineFleet interface for quantifying the effect
of various driving characteristics on fuel economy. For example,
the fuel savings calculator shows the effect of idling on fuel
economy and quantifies the saving. The following section
discusses the emissions monitoring capabilities of MineFleet.
Figure 10. Correlation of vehicle health events with fuel
economy. The fuel savings calculator quantifies the effect
on fuel economy. © Copyright, Agnik, LLC.
how much money the organization is likely to save by fixing the
health condition. Similar analysis is also performed for driver
behavior which will be discussed in the following section.
Figure 11 shows the MineFleet fuel subsystem benchmarking
module where the distributions of a vehicle can be compared with
those of other vehicles. The module can also be used to optimize
the fuel economy by changing the policy parameters prior to
designing a policy. For example, one may vary the speeding
policy and find out the optimal fuel economy based on the
predictive models learnt from that vehicle.
7. DRIVER BEHAVIOR MONITORING
MineFleet allows the fleet owner to monitor both the short-term
and long-term behaviors of the drivers in a fleet. MineFleet
Onboard monitors the driving related data characterized by speed,
acceleration, braking, idling and several other parameters. It also
correlates the information with vehicle performance parameters
(e.g. fuel economy) and fleet maintenance parameters. Some of
Figure 12. Correlation of driver behavior with fuel economy.
© Copyright, Agnik, LLC.
bigger picture correlating emission data with data collected from
the different facets of fleet operations.
8. EMISSIONS MONITORING
Greenhouse gas (GHG) emissions that contribute to climate
change are a global problem. Although future concentrations,
damages and costs are unknown, it is widely recognized that
major emissions reduction efforts are needed. Of the four primary
GHGs under scrutiny, carbon dioxide (CO2) is of paramount concern,
as is the need to lower carbon emissions in general. It is
estimated that transportation activities are responsible for
approximately 25% to 30% of total U.S. GHG emissions, with the
on-highway commercial truck market accounting for over 45% of
transportation GHG.
However, the transportation sector
emissions remain almost entirely unaddressed with respect to
GHG and CO2 reduction.
The guidelines provided by the Intergovernmental Panel on Climate
Change (IPCC) for calculating carbon emissions offer estimates
only for certain common types of fuels; estimates are not even
available for novel fuel blends and gaseous fuels such as CNG and
LNG. Indeed, these and other references have documented the
uncertainty in model-based theoretical carbon emissions
calculations (footnote 3) and the need for a standardized,
consistent method of accurately characterizing CO2 emissions.
Moreover, correlating
various vehicle performance and traffic parameters may open up
new insights resulting in better techniques for controlling
emissions. For example, it is widely known that vehicle speed,
engine load and state of repair/maintenance play important roles
in governing emissions. Mining the emissions data along with the
traffic patterns in a metropolitan area, vehicle performance (load,
rpm, and vehicle oxygen sensor characteristics) and the driving
behavior may provide useful information to design speed limits,
traffic signals and fleet maintenance policies. Such advanced
analysis of emission data will be possible only when we can
directly and accurately measure emissions in the vehicle.
Figure 13. Emissions monitoring web-page in MineFleet. ©
Copyright, Agnik, LLC.
The emissions offset trading market and the demand for cleaner
transportation systems are driving several market incentives. Figure
14 shows the web portal of one such carbon offset trading
company. The MineFleet technology offers a verifiable
methodology to quantify the greenhouse gas and air-pollution
emissions of a vehicle in real-time. This allows accurate
computation of the carbon offsets and reductions in a
commercial fleet, which lays the foundation for the business of
carbon trading.
MineFleet offers some of these possibilities. For example,
MineFleet can be used for wireless emissions tests. It can measure
emissions data in real time and correlate it with the vehicle
performance and traffic data using advanced statistical and
machine learning-based techniques such as clustering, predictive
modeling, correlation analysis and eigen analysis. These analytics
can be used to offer a new generation of decision support tools to
develop fleet and greenhouse gas emissions management policies.
MineFleet computes emissions in real-time onboard the vehicle. It
also performs various other tests such as the wireless emissions
test required by motor vehicle administrations. Figure 13 shows
the emissions monitoring web-page of MineFleet Web Service.
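A minimal sketch of the correlation and eigen analysis mentioned above, run on synthetic readings. The variable names, the synthetic data model and the use of plain power iteration are assumptions made for illustration; the text does not describe MineFleet's actual algorithms at this level of detail.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlation matrix for a dict that maps parameter name to readings."""
    names = list(columns)
    return [[pearson(columns[a], columns[b]) for b in names] for a in names]


def multiply(matrix, vector):
    """Matrix-vector product for a square list-of-lists matrix."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dominant_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = multiply(matrix, vector)
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    # Rayleigh quotient of the converged unit vector.
    eigenvalue = sum(v * p for v, p in zip(vector, multiply(matrix, vector)))
    return eigenvalue, vector


def synthetic_readings(samples=200, seed=0):
    """Hypothetical readings in which load, rpm and CO2 all rise with speed."""
    rng = random.Random(seed)
    speed = [rng.uniform(20.0, 110.0) for _ in range(samples)]
    load = [0.6 * s + rng.gauss(0.0, 5.0) for s in speed]
    rpm = [25.0 * s + rng.gauss(0.0, 200.0) for s in speed]
    co2 = [1.5 * l + 0.01 * r + rng.gauss(0.0, 5.0) for l, r in zip(load, rpm)]
    return {"speed": speed, "load": load, "rpm": rpm, "co2": co2}


if __name__ == "__main__":
    corr = correlation_matrix(synthetic_readings())
    eigenvalue, eigenvector = dominant_eigenpair(corr)
    print("share of variance on the dominant direction:", eigenvalue / len(corr))
```

A large dominant eigenvalue indicates that most of the variation in the monitored parameters lies along a single direction, so a change in that direction over time could flag a shift in how emissions relate to driving conditions.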
Vehicle emission characteristics depend on different vehicle and
driver-related parameters. Vehicle health is often a function of the
type of the vehicle, maintenance policies and operating policies
(e.g. delivery schedule of supply truck). Driver behavior is also
correlated with traffic condition and driver training programs in a
commercial fleet. Therefore, the next generation of decision-support
tools for emissions management will have to look at the
Figure 14. Web portal of a carbon trading company.
3 EPA OTAQ Publication no. EPA 420-F-05-001, “Average Carbon
Dioxide Emissions Resulting from Gasoline and Diesel Fuel,” February
2005, notes the following: “These calculations and the supporting data
have associated variation and uncertainty. EPA may use other values in
certain circumstances, and in some cases it may be appropriate to use a
range of values.”
9. MINEFLEET IN BUSINESS: ENGINEERING COMPLEX ECO-SYSTEM
The basic tenets of the value proposition in any business often
depend upon the following NABC cornerstones:
1. N: What is the customer/market need?
2. A: What is our specific approach to satisfy that need?
3. B: What are the benefits that the customers and their
affiliates will get from the approach?
4. C: What is the competition or alternative to the
approach?
companies. MineFleet addressed this problem by going to market
only through its resellers and channel partners. Alliances with
fairly large companies that have large marketing infrastructures helped
MineFleet gain market share.
MineFleet product design choices also strongly influenced the
evolution of the business eco-system and its sustainability. Figure
15 shows the conceptual depiction of the MineFleet product in
2003. It had a PDA, a Bluetooth GPS module and the vehicle
diagnostic port adapter. This conceptual model evolved considerably over
time in order to support a sustainable relationship with MineFleet's
go-to-market channel partners and resellers. For example, the
PDA-based approach was not adopted because of its high cost.
The Bluetooth GPS module, on the other hand, was dropped
mainly to build relationships with the many other vendors that offer
a GPS tracking solution.
Note that the NABC (Need, Approach, Benefit, and Competition)
tenet depends upon the behavior of the customers, their affiliates,
the business offering the products/services, and the competition
(underscored terms in the itemized list). This essentially means
that the value proposition of any product or service depends on
the collective behavior of the entire business eco-system
comprised of the provider, consumer, competition and others.
Moreover, Agnik, as an early-stage company, focused not just on
value creation but on sustainable value creation, in which the
business relationships between Agnik and its go-to-market partners
for the MineFleet product could withstand the challenges
faced by many early-stage technology companies.
The MineFleet system penetrated the market by evolving rules of
engagement that fostered sustainable relationships among the
different players of the Mobile Resource Management vertical.
Initial MineFleet product placement faced several challenges.
Some of those are listed below:
1. Quantification of ROI.
2. Lack of familiarity with the data mining technology in
the target vertical.
3. Lack of large marketing infrastructure.
4. Lack of adequate support infrastructure.
Figure 15. Early conceptualization of the MineFleet Onboard
system. © Copyright, Agnik, LLC.
Each of these topics is discussed further below.
The above go-to-market approach also eased the support
burden. The need for a large support infrastructure was avoided by
training the support teams of the go-to-market channel partners and
resellers. This relieved the Agnik team of the need to build an
extensive on-the-ground installation and support team for
MineFleet.
Today, the MineFleet system offers many market-tested features
that deliver direct short-term ROI, and enough case studies exist to
back up the claims. However, this was not the case when
MineFleet was initially introduced to a select group of potential
clients in the early stage. Active collaboration between the MineFleet
team and other organizations that were willing to explore the
technology resulted in the development of many useful features in
MineFleet with immediate ROI.
MineFleet is widely adopted by many companies in the
Machine-to-Machine and GPS tracking verticals. Sample names of
such clients are listed on the Agnik web-site. MineFleet is already
integrated with hardware from several large vehicle-onboard hardware
manufacturers. MineFleet-powered third-party solutions are
currently being deployed through many of Agnik’s channel
partners, each with more than a hundred thousand vehicles in
their respective rosters. A detailed report (see footnote 4) analyzing MineFleet’s
technical and business approaches is available from Frost &
Sullivan; a copy is available upon request.
MineFleet is available in the software-as-a-service model. The
following section concludes this paper.
The initial experimental versions of the MineFleet system were full
of features that required advanced knowledge of data
analysis and modeling techniques. The interface looked like that of the
traditional data mining systems that are commercially available. This
approach did not work. Advanced visualization and analytic tools
often had to be either replaced or backed up by simple text-based
actionable intelligence. One of the main reasons was that typical
fleet management executives are usually not very familiar with
statistics and data mining technology. The user interface had
to be non-threatening and relatively easy to understand. Once the
vertical became familiar with the role of data mining technology
to some extent, advanced analysis and visualization techniques
could be introduced.
Another major challenge was the lack of large marketing
infrastructure, which is probably common for many early stage
4 http://finance.yahoo.com/news/Agnik-Enhances-Mobileprnews-3142016515.html?x=0&.v=1
10. CONCLUSIONS
This paper offered an overview of the MineFleet system and the
business case behind it. It described the architecture, main
functionalities, and how these features are useful in solving the
everyday problems in commercial fleet management. The paper
also shared some of the experiences in placing a new distributed
data mining technology-based product in a vertical that was not
very familiar with advanced decision support systems. The paper
identified some of the engagement rules that evolved over
time, resulting in a successful partnership between the
existing products from the mobile resource management
companies and the MineFleet system.
MineFleet is probably the first commercially successful, widely
adopted distributed data mining system for a new vertical where
data mining systems were not used before. The development of
MineFleet and its adoption in the mobile resource management
and fleet management industry came through long-term
interactions with the leading companies in that vertical. It
required adopting a different architecture for the data mining
system. Unlike the traditional centralized data mining systems
commonly used in most applications today, MineFleet
adopted the distributed data mining technology where data must
be analyzed in a distributed manner and then aggregated at the
server for comparative analysis.
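The analyze-onboard, aggregate-at-the-server pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the `Summary` class, the function names and the fuel-economy readings are hypothetical, and the merge step uses the standard pairwise formula for combining running means and variances rather than anything documented for MineFleet.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Compact statistics that a vehicle could send instead of raw data."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one reading into the summary (Welford's online update)."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def merge(self, other: "Summary") -> "Summary":
        """Combine two summaries without access to the raw readings."""
        total = self.count + other.count
        if total == 0:
            return Summary()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / total
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / total
        return Summary(total, mean, m2)

    @property
    def variance(self) -> float:
        """Sample variance; zero when fewer than two readings were seen."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def summarize_onboard(readings) -> Summary:
    """Runs on the vehicle: reduce a stream of readings to one summary."""
    summary = Summary()
    for value in readings:
        summary.update(value)
    return summary


def aggregate_at_server(summaries) -> Summary:
    """Runs on the server: merge per-vehicle summaries into a fleet view."""
    fleet = Summary()
    for summary in summaries:
        fleet = fleet.merge(summary)
    return fleet


if __name__ == "__main__":
    # Hypothetical fuel-economy readings (miles per gallon) from two trucks.
    truck_a = summarize_onboard([8.1, 7.9, 8.4])
    truck_b = summarize_onboard([6.2, 6.0, 6.5, 6.1])
    fleet = aggregate_at_server([truck_a, truck_b])
    # Comparative analysis: each truck's mean relative to the fleet mean.
    for name, truck in (("A", truck_a), ("B", truck_b)):
        print(name, round(truck.mean - fleet.mean, 2))
```

Only three numbers per vehicle cross the wireless link, yet the server recovers the same fleet mean and variance it would compute from the raw readings.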
11. ACKNOWLEDGMENTS
We thank Agnik for supporting the work and this
publication. We would also like to thank the large number of
developers involved with this project at Agnik. We particularly
thank the following individuals for their contributions to the
development of the MineFleet system: Nick Lenzi, Derek
Johnson, Subhash Paruchuru, Robert Gilligan, Barnali Sinha,
Parag Namjoshi, Thiraphat Pongsudhiraks, Jacob Graham,
Kamalika Das, Michael Beck, Padma Sethu, Brian Bende, Martin
D. Klein, James Dull and Patrick T. Joyce. We would also like to
thank all our channel partners for marketing the MineFleet
product.
12. REFERENCES
[11] H. Dutta, H. Kargupta, and A. Joshi. (2005). Orthogonal
Decision Trees for Resource-Constrained Physiological Data
Stream Monitoring using Mobile Devices. Proceedings of the
High Performance Computing Conference.
[1] S. Datta, K. Bhaduri, C. Giannella, R. Wolff, H. Kargupta.
(2006). Distributed Data Mining in Peer-to-Peer Networks.
(Invited submission to the IEEE Internet Computing special
issue on Distributed Data Mining), Volume 10, Number 4,
pp. 18--26.
[12] S. Pirttikangas, J. Riekki, J. Kaartinen, J. Miettinen, S.
Nissila, and J. Roning. (2001). Genie of the Net: A New
Approach for a Context-Aware Health Club. Workshop
Title: Ubiquitous Data Mining for Mobile and Distributed
Environments. Joint 12th European Conference on Machine
Learning (ECML'01) and 5th European Conference on
Principles and Practice of Knowledge Discovery in
Databases (PKDD'01). September 3-7, 2001, Freiburg,
Germany.
[2] H. Kargupta and K. Sivakumar, (2004) Existential Pleasures
of Distributed Data Mining. Data Mining: Next Generation
Challenges and Future Directions. Editors: H. Kargupta, A.
Joshi, K. Sivakumar, and Y. Yesha. AAAI/MIT Press.
[3] H. Kargupta, B. Park, D. Hershberger, and E. Johnson
(1999). Collective Data Mining: A New Perspective Toward
46
Download