FirstDIG
First Data Investigation on the Grid

Paul Graham, Terry Sloan, Adam Carter
EPCC
Ian Gregory, Darren Unwin
First South Yorkshire
tel: +44 (0)131 650 5155
email: t.sloan@epcc.ed.ac.uk

Description
- First plc – UK’s largest public transport operator
- Data sources
  - Huge range – mileage, revenue, fuel, maintenance, routes …
  - Collected – manually, ticket machines, GPS …
- Disparate DBMS
  - Acquisitions, historical, OS, physical location, representation …
  - Issues NOT unique to the bus industry
- Fine for day to day operations, but …
  - Business questions – data from >1 source
    - Complaints vs Lateness, Revenue vs Lost Miles …
  - Aggregation – by service, by day, weekdays only …
- Introduces challenges for data analysis

Description
- First South Yorkshire situation
  - No common interface
  - No common reporting process
- Statistics produced manually when required
  - Labour intensive
  - Not performed often or well
- Process to produce what is needed
  - Expensive
  - Impractical

Description and Aims
- Open Grid Services Architecture: Data Access and Integration (OGSA-DAI)
  - Assists with the access and integration of data from separate data sources via Grid Services
- Our remit: to evaluate the suitability of OGSA-DAI for use in a commercial environment
- If OGSA-DAI:
  - Is appropriate, secure, straightforward to deploy and use …
  - Does what we need!
- Provide feedback to OGSA-DAI team

Aims
1. Demonstrate deployment of OGSA-DAI within the First South Yorkshire bus operational environment and learn from it
2. Short data analysis using OGSA-DAI service enabled data sources to answer business questions posed by First South Yorkshire

Status: Workpackages
- WP 1: Data Source requirements capture (FINISHED)
  - D1.1 Data Source Requirements Capture & D1.2 Organisation Data Schema (COMMERCIAL-IN-CONFIDENCE)
- WP 2: Development of data interfaces (FINISHED)
  - OGSA-DAI Deployment
- WP 3: Deployment & refinement of OGSA-DAI (FINISHED)
  - First Data Service Browser User Guide
  - First Data Service Browser Software
- WP 4: Data mining requirements capture (FINISHED)
  - D4.1 Data Mining Requirements Capture (COMMERCIAL-IN-CONFIDENCE)
- WP 5: Initial data mining analysis (FINISHED)
  - D5.1 Initial Data Mining Report (COMMERCIAL-IN-CONFIDENCE)
- WP 6: Data mining detailed analysis (FINISHED)
  - D6.1 Final Data Mining Report (COMMERCIAL-IN-CONFIDENCE)

Technical Achievements 1
- Data Mining
  - Combined two databases to answer First’s business questions
    - The Customer Contact System (CCS)
      - Microsoft Access
      - Information on customer complaints, e.g. time, service, nature
    - The Mileage database
      - dBASE IV
      - Information on bus mileage, e.g. lost miles
  - Also investigated Revenue and Schedule Adherence suitability for data mining
  - Produced detailed data mining report

Technical Achievements 2
- OGSA-DAI deployment at First South Yorkshire
  - Created Grid Data Services (GDS) for DBMS previously unsupported by OGSA-DAI
    - MS Access – CCS, dBASE IV – Mileage
  - Investigated GDS for SQL Server and CSV-based DBMS
- Rigorously exercised use of OGSA-DAI in a commercial setting:
  - Identified numerous areas for improvement in OGSA-DAI
  - Identified new requirements for use of OGSA-DAI in business
  - Confirmed the relevance and potential of OGSA-DAI for business

Technical Achievements 3
- Data Service Browser
  - Identified need to aid ‘ease of use’ for OGSA-DAI Middleware
  - Developed a generic Grid Data Service Browser
    - Simple GUI – avoids XML etc
    - Allows SQL queries and updates to databases
    - Enables JOIN queries across databases
  - Will be included in future OGSA-DAI releases
  - … demo later

Achievements – First’s perspective
- Project has proven that:
  - There is a cost-effective solution that First South Yorkshire can utilise
  - First can get to its data and analyse it in a useful manner
    - With considerably reduced labour time
  - First can produce more accurate and more wide-ranging information for the business management

Achievements
“the results of this exercise will revolutionise the way we do things in the bus industry”
Darren Unwin, Divisional IT Manager

Dissemination
- Presentations
  - Ernst & Young, WestInfo Services, Strategy & Performance Associates, SingTel Optus, Executive Briefing Centre, Curtin Business School, Curtin University of Technology, Perth, Australia, February 24th and 26th, 2004.
  - Curtin Business School Information Systems Seminar, Curtin University of Technology, Perth, Australia, February 20th 2004
  - UK e-Science booth, Supercomputing 2003, Phoenix, USA, November 2003
- Flyers
  - UK e-Science All Hands Conference, Nottingham, UK, 2-4 September 2003
- Posters
  - UK e-Science All Hands Conference, Nottingham, UK, 2-4 September 2003
- Articles
  - T.M.Sloan, A.Carter, P.J.Graham, D.Unwin, I.Gregory, "First Data Investigation on the Grid: FirstDIG", Proceedings of the 2nd UK eScience All Hands Meeting, 2-4 September, 2003, Nottingham, UK

Exploitation
- First Data Service Browser is being used and extended in the INWA project with Curtin Business School, Perth, Australia
- First are keen to extend their deployment to other databases

Future Plans
- Project is finished, no effort remaining.
- Incorporation of First Data Service Browser into future releases of OGSA-DAI
- First South Yorkshire want to build management reporting applications based on OGSA-DAI

Demo
- Data Service Browser
  - Accessing three different DBMS
    - Mileage, CCS, MySQL
- A JOIN – similar to the queries required for the data mining
  - Easy within one DB, requires intermediary steps for distributed DB
  - Without OGSA-DAI would have been impractical
- Looking at Lost Miles and Customer Complaints
- Run the Demo

[Chart: Lost miles and Number of Complaints. Series: Lost miles, Complaints. x-axis: Date, weekly from 01/04/2002 to 29/04/2002. y-axis: 0 to 350.]

In Conclusion
- Successfully demonstrated the use of Grid middleware in a ‘real-world’ environment
- OGSA-DAI team:
  - Gained (in)valuable feedback
  - Incorporated Data Service Browser
- First
  - Discovered valuable information from their data which would have otherwise been practically unobtainable
  - Keen to extend to other DBMS
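
Appendix: sketch of the cross-database JOIN

The Demo slide notes that a JOIN is easy within one database but needs intermediary steps when the data lives in separate DBMS. The sketch below illustrates that general idea only. It does not use OGSA-DAI or the Data Service Browser, and it is not the project's implementation. Two in-memory SQLite databases stand in for the Mileage (dBASE IV) and CCS (MS Access) sources. The table names, column names and sample rows are all invented for illustration.

```python
import sqlite3


def make_mileage_db():
    """Stand-in for the Mileage database (hypothetical schema, invented rows)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE mileage (day TEXT, service TEXT, lost_miles REAL)")
    db.executemany(
        "INSERT INTO mileage VALUES (?, ?, ?)",
        [
            ("2002-04-01", "52", 12.0),
            ("2002-04-01", "120", 8.0),
            ("2002-04-08", "52", 30.0),
        ],
    )
    return db


def make_complaints_db():
    """Stand-in for the Customer Contact System (hypothetical schema, invented rows)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE complaints (day TEXT, service TEXT, nature TEXT)")
    db.executemany(
        "INSERT INTO complaints VALUES (?, ?, ?)",
        [
            ("2002-04-01", "52", "late"),
            ("2002-04-08", "52", "late"),
            ("2002-04-08", "52", "did not stop"),
        ],
    )
    return db


def lost_miles_vs_complaints(mileage_db, complaints_db):
    """Join two separate sources by date via an intermediary staging database."""
    # Step 1: aggregate inside each source, so only small result sets move.
    miles = mileage_db.execute(
        "SELECT day, SUM(lost_miles) FROM mileage GROUP BY day"
    ).fetchall()
    counts = complaints_db.execute(
        "SELECT day, COUNT(*) FROM complaints GROUP BY day"
    ).fetchall()

    # Step 2: copy both result sets into one intermediary database.
    staging = sqlite3.connect(":memory:")
    staging.execute("CREATE TABLE miles (day TEXT PRIMARY KEY, lost_miles REAL)")
    staging.execute("CREATE TABLE counts (day TEXT PRIMARY KEY, complaints INTEGER)")
    staging.executemany("INSERT INTO miles VALUES (?, ?)", miles)
    staging.executemany("INSERT INTO counts VALUES (?, ?)", counts)

    # Step 3: an ordinary single-database JOIN; days without complaints count as 0.
    rows = staging.execute(
        "SELECT m.day, m.lost_miles, COALESCE(c.complaints, 0) "
        "FROM miles m LEFT JOIN counts c ON c.day = m.day "
        "ORDER BY m.day"
    ).fetchall()
    staging.close()
    return rows


if __name__ == "__main__":
    for day, lost_miles, complaints in lost_miles_vs_complaints(
        make_mileage_db(), make_complaints_db()
    ):
        print(day, lost_miles, complaints)
```

Aggregating by day inside each source before staging keeps the data transferred small, and mirrors the "by service, by day" aggregation mentioned on the Description slide. A LEFT JOIN is used so that days with lost miles but no complaints still appear in the result.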