New PRS in RDBMS Platform and NGeT
Study of existing IT Systems of Indian Railways running in CRIS
Centre For Railway Information Systems

Outline of Presentation
• Introduction to PRS System & NGeT
• Achievements and Limitations of Present System
• Network Issues related to Present System
• RDBMS Platforms for New PRS
• Solutions other than RDBMS
  ◦ NoSQL
  ◦ NewSQL
• Conclusion

Introduction to PRS System and NGeT

Brief History

CONCERT
• CONCERT: Country-wide Network for Computerized Enhanced Reservation and Ticketing.
• CONCERT is a fully automated passenger reservation system.
• It allows passengers to book, across the counter at any location, a journey on any train, in any class, from any station to any station.
• It handles reservations, modifications, cancellations/refunds, 39 supervisory online functions, and 30 online enquiries.

CONCERT (cont.)
• It handles 265 concessions, 165 coach types and 40 types of quotas.
• The PRS network currently has more than 2600 POPs (points of presence, i.e. booking locations) spread across the country, with around 9100 POS (points of sale, i.e. terminals).
• Easy fare calculation and accounting
• Easy enquiry
• An MIS system is associated with PRS data to generate predefined MIS reports:
  ◦ Accounting MIS
  ◦ Charting MIS
  ◦ Data Warehouse

CONCERT (cont.)
• Based on a distributed architecture.
• 4 sites (Delhi, Mumbai, Chennai, Kolkata).
• RTR as middleware for transaction routing and reliability.
• Reservation based on the TDRC (train, date, route, class).
• ARP rules
• Different quotas and classes

PRS Application Architecture
[Figure: PRS application architecture. Labels: front-end systems (IRCTC Internet booking system, 139, touch screen, Indian Rail website, web server, Internet client; reservation, enquiry, charting; firewall); CONCERT interconnect network; back-end systems (Delhi, Mumbai, Chennai and Kolkata data centres; storage, front-end servers, back-end servers).]

Booking Flow
[Flowchart: booking flow. Steps: fill forms, validate, PNR generation, fare calculation, start transaction, allocation (seat/berth), PNR update, commit, display details (seat/berth, fare), accounting, print ticket. Branch labels: Proceed, Proceed/Flush, Flush (release berth, PNR), Success (yes/no), Non Issue, Return to Form.]

Introduction to NGeT
• Next Generation e-Ticketing System
• It is a client of PRS.
• Tickets can be booked from any location at any time.
• Easy to generate reports for NGeT transactions.
• Paperless ticket
• Complete interface software between the IRCTC front-end server and the back-end Alpha server.
• Complete e-reservation and enquiry back-end servers.
• Ticket printing and reset facilities in the existing client.
• Accounting reports for the IRCTC transactions.

The Technology Used
[Figure labels: PRS integration layer & centralised server; NGeT]

Achievements of Present System

Achievements of PRS System
• The PRS application can handle 200 booking transactions (tickets) per second per site.
• Overall TPS of 800 (based on performance testing done in 2011, during the implementation stage of the Itanium migration).
• This is equivalent to 28.8 lakh tickets per hour.
• Number of passengers booked per day: 15 lakh
• Transactions per second:
  ◦ a. Tickets booked per second (peak): 550
  ◦ b. Enquiries served per second (peak): 5600
• Points of sale: 9100
• Number of reserved trains: 1500

Transaction Volume
Metric | Unreserved | Reserved
Average number of tickets booked per day | 75 lakh | 10 lakh
Average number of passengers booked per day | 200 lakh | 15 lakh
Average earning per day | Rs. 50 crore | Rs. 77 crore
Total number of terminals | 10,408 | 9,100
Total number of booking locations | 5,676 | 2,639

Transaction type | Number of transactions per day
Bookings | 10 lakh
Cancellations | 2.5 lakh
Enquiries | 110 lakh
Special transactions | 0.5 lakh
Total | 123 lakh

Milestones Achieved by NGeT
• The maximum number of concurrent sessions on nget.irctc.co.in was 1,33,710, on 13th August 2014.
• Ticket booking statistics:
Statistic | Date | Value
Maximum bookings per day | 27th August 2014 | 6,24,855
Maximum bookings in the 08-09 hour | 27th August 2014 | 1,35,561
Maximum bookings in the 10-11 hour | 16th August 2014 | 1,27,489
Maximum bookings per minute | 27th August 2014 | 10,554
Maximum bookings per second | 18th August 2014 | 289
• The average transaction response time for a user to book a ticket on NGeT, excluding payment, is 40-55 seconds.

Issues
• Ad-hoc queries on the online database are not available in the PRS system.
• Data is maintained in flat-file format, so extracting any information from it requires developing dedicated routines.
• Peripheral systems (GUIDBA and the Data Warehouse) answer ad-hoc queries to some extent.
• Migration of the Accounting and Charting MIS to an RDBMS is required so that the system can respond to all types of ad-hoc queries.
• End of support for OpenVMS by HP.
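To illustrate the flat-file issue above, here is a minimal Python sketch of the kind of dedicated routine each new question requires. The fixed-width record layout, the field names and the sample rows are hypothetical and chosen only for illustration; they are not the actual PRS file format.

```python
from collections import Counter

# Hypothetical fixed-width record layout (not the actual PRS file format):
# train (5 chars), journey date DDMMYYYY (8), class (2), PNR (10).
FIELDS = (("train", 0, 5), ("date", 5, 13), ("cls", 13, 15), ("pnr", 15, 25))


def parse_record(line):
    """Slice one fixed-width line into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in FIELDS}


def bookings_per_train(lines, date):
    """Hand-written 'query': count bookings per train for one journey date."""
    counts = Counter()
    for line in lines:
        record = parse_record(line)
        if record["date"] == date:
            counts[record["train"]] += 1
    return dict(counts)


# Illustrative sample data, formatted to the hypothetical layout above.
rows = [("12345", "01012015", "1", "12345678"),
        ("23456", "01012015", "2", "23245664"),
        ("12345", "02012015", "1", "34567890")]
flat_file = ["{:<5}{:<8}{:<2}{:<10}".format(*row) for row in rows]

print(bookings_per_train(flat_file, "01012015"))
```

Any different question (for example, bookings per class) would need another function of this kind, which is the development overhead the slide refers to.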

Network Issues Related to Present System

PRS Network
• It consists of a backbone network and the UTN network.
• The backbone network interconnects all PRS datacenters and the CRIS datacenter.
• The UTN network is a multi-tier hierarchical network connected to the corresponding PRS DC.
• Datacenters: NDLS, MAS, NKG, CSTM and Chanakyapuri
• Centrally managed from the NOC at CRIS, Chanakyapuri, up to Tier-1 locations.
• Implemented in 1999 to support PRS (CONCERT) networked transactions.
• Full-mesh topology in the backbone.
• UTN network: inverted tree with partial mesh.

PRS Design Considerations
• High availability (>99.9% up to Tier-1, 99.8% overall)
• Scalability
• Lower round-trip time (<150 ms)
• Lower routing convergence time
• Route diversity / service provider diversity
• Better manageability

[Figure: PRS network map. Site labels: Delhi, Kolkata, Chennai, Mumbai, Secunderabad; NKG, MAS, CSTM, CRIS, RIDC, NGeT DC, DR DC at SC. Zonal railways shown: Eastern, North Eastern, Northern, South Eastern, East Coast, North Central, North Western, South East Central, East Central, Northeast Frontier, Southern, South Western, Central, South Central, West Central and Western Railway.]

UTN Network Architecture
[Figure: UTN network architecture. Labels: PRS servers; zonal router; Area-1, Area-2, Area-3; different service providers; links to other locations; Tier 2/3 location routers; Tier-1 locations with routers in 1:1 high availability (HA) mode.]

Network Failure in PRS
[Pie chart: causes of network failure in PRS. Categories as extracted: power failure, channel failure, packet loss, routing issues, hardware failure. Shares as extracted: 15%, 5%, 35%, 20%, 25%.]

RDBMS

Relational Model
• Primary data model for data storage and processing.
• Based on first-order predicate logic.
• Data is stored in tables, each representing some entity.
• A set of tuples (rows) represents a relation instance, with no duplicate rows.
• Columns represent attributes of the stored record.
• Tables can be related to each other; the relationships are depicted using an Entity-Relationship diagram.

Relational Model (cont.)
Train Info
Train | Date | Route | Class
12345 | 01-07-2015 | A-B | 1
23456 | 01-01-2015 | C-D | 2

Passenger Information
Passenger_Name | Age | Sex | PNR
ABCD | 20 | M | 12345678
PQRS | 22 | F | 23245664

Seat Allocation
Train | Date | Route | Class | Seat | PNR
12345 | 01-01-2015 | A-B | 1 | 10 | 12345678
23456 | 01-01-2015 | C-D | 2 | 12 | 23245664

DBMS
• A DBMS is a set of programs designed to store, manage and process a database.
• Convenient and efficient way to store and retrieve database information.
• Interface between the database and the application program.

DBMS Architecture

DBMS Characteristics
• Multiple views (data abstraction)
• Isolation of data and application (program-data independence)
  ◦ Logical data independence
  ◦ Physical data independence

DBMS Characteristics (cont.)
• Relation-based tables
• Less redundancy
  ◦ Normalization of data.
• Consistency
  ◦ A database transaction changes affected data only in allowed ways.
• Query language
• Transaction management
• Multiuser and concurrent access
• Recovery system
• Security

How an RDBMS can serve the needs of PRS
• Structured data
  ◦ High degree of organization
  ◦ Readily searchable by simple SQL engines on key value.
• Programming language support
  ◦ ODBC (Open Database Connectivity)
  ◦ JDBC (Java Database Connectivity)
• SQL (Structured Query Language)
  ◦ PRS generates several types of MIS reports, each with its own program.
  ◦ SQL allows the database to be queried for the specified conditions.
  ◦ No need for separate schedule creation.
  ◦ Reduced development cost.
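As a small illustration of the SQL point above, the sketch below loads the Seat Allocation rows from the example slide into an in-memory SQLite database (Python's built-in sqlite3 module) and answers an ad-hoc question with a single query. The table name, column names and the choice of SQLite are assumptions made for this example; the slides do not prescribe a schema or a product.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seat_allocation ("
    "train TEXT, jdate TEXT, route TEXT, cls TEXT, seat INTEGER, pnr TEXT)")
conn.executemany(
    "INSERT INTO seat_allocation VALUES (?, ?, ?, ?, ?, ?)",
    [("12345", "01-01-2015", "A-B", "1", 10, "12345678"),
     ("23456", "01-01-2015", "C-D", "2", 12, "23245664")])
conn.commit()


def seats_booked(conn, jdate):
    """Ad-hoc report: number of seats allocated per train on one date."""
    return conn.execute(
        "SELECT train, COUNT(*) FROM seat_allocation "
        "WHERE jdate = ? GROUP BY train ORDER BY train",
        (jdate,)).fetchall()


report = seats_booked(conn, "01-01-2015")
print(report)
```

A different report only needs a different SELECT statement, not a new extraction program.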

How an RDBMS can serve the needs of PRS (cont.)
• Transaction management
  ◦ A transaction must satisfy the ACID properties.
Database Transaction Management

How an RDBMS can serve the needs of PRS (cont.)
• Transaction management
  ◦ In PRS, a transaction updates the berth based on TDRC in one file and updates the berth-to-PNR relation in another.
  ◦ These two updates need to be done atomically.
  ◦ Atomicity and durability are achieved by keeping track of each update made by the transaction.
  ◦ In case of failure, the DBMS undoes all the updates made by the transaction, thus ensuring atomicity.
  ◦ If the transaction succeeds, the DBMS commits the updates.

How an RDBMS can serve the needs of PRS (cont.)
• The commit could be done using two-phase commit to ensure that both commits succeed.

How an RDBMS can serve the needs of PRS (cont.)
• Isolation among transactions is handled by locking.
  ◦ Locking is at record level (TDRC).
  ◦ It prevents other transactions from reading/updating the same record.
  ◦ The lock is released after completion of the two-phase commit.
• Consistency: any data written to the database must be valid according to all defined rules.
  ◦ The DBMS checks that the data being written (X) follows all specified rules.
  ◦ It verifies that referential integrity constraints are satisfied.

How an RDBMS can serve the needs of PRS (cont.)
• Indexing
  ◦ Access path for faster database record access.
  ◦ Indexed access of records on a key.
  ◦ RDBMSs support B/B+ tree indexes.
    ◦ The index reorganizes automatically as data changes.
    ◦ Search time is O(log_p N), where p is the node fan-out and N is the number of records.
  ◦ Reduced read/write latency.
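The atomic berth and berth-to-PNR update described under Transaction Management above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions: the table names, columns and the book() helper are hypothetical, and a single SQLite database stands in for the RDBMS, so two-phase commit across sites is not shown.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical PRS-like tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE berth (
            train TEXT, jdate TEXT, route TEXT, cls TEXT, seat INTEGER,
            status TEXT NOT NULL DEFAULT 'FREE',
            PRIMARY KEY (train, jdate, route, cls, seat)
        );
        CREATE TABLE berth_pnr (
            pnr TEXT PRIMARY KEY,
            train TEXT, jdate TEXT, route TEXT, cls TEXT, seat INTEGER
        );
        INSERT INTO berth (train, jdate, route, cls, seat) VALUES
            ('12345', '01-01-2015', 'A-B', '1', 10),
            ('12345', '01-01-2015', 'A-B', '1', 11);
    """)
    return conn


def book(conn, tdrc, seat, pnr):
    """Update the berth and the berth-to-PNR relation as one atomic unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            cur = conn.execute(
                "UPDATE berth SET status = 'BOOKED' "
                "WHERE train = ? AND jdate = ? AND route = ? AND cls = ? "
                "AND seat = ? AND status = 'FREE'",
                (*tdrc, seat))
            if cur.rowcount != 1:
                raise ValueError("berth not available")
            conn.execute(
                "INSERT INTO berth_pnr VALUES (?, ?, ?, ?, ?, ?)",
                (pnr, *tdrc, seat))
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False


def berth_status(conn, seat):
    """Return the status of one berth (illustrative helper)."""
    return conn.execute(
        "SELECT status FROM berth WHERE seat = ?", (seat,)).fetchone()[0]


conn = make_db()
tdrc = ("12345", "01-01-2015", "A-B", "1")
first = book(conn, tdrc, 10, "12345678")   # both updates are committed
second = book(conn, tdrc, 11, "12345678")  # duplicate PNR: the insert fails
print(first, second, berth_status(conn, 10), berth_status(conn, 11))
```

In the second call the berth update succeeds but the PNR insert violates the primary key, so the whole transaction is rolled back and berth 11 stays free. The composite primary key on the TDRC columns plus seat also gives the indexed access path mentioned under Indexing.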
PRS uses the RMS service in OpenVMS for record access.
Database Access Methods

How an RDBMS can serve the needs of PRS (cont.)
• B+ tree indexing

How an RDBMS can serve the needs of PRS (cont.)
• Hashing
  ◦ TDRC records are numbered for random access.
  ◦ Hashing in an RDBMS can provide random access to database records.

Database Partitioning/Sharding
• Database sharding/partitioning
  ◦ Database sharding can be done to improve performance when tables grow large.
  ◦ Sharding is partitioning a database horizontally.
  ◦ Index size reduces.
  ◦ Different shards can be deployed on different servers.
  ◦ Parallel access is managed in the application:
    ◦ Partition tables map keys to nodes.
    ◦ The application decides where to route storage or lookup requests.
  ◦ Scales well for both reads and writes.

Database Partitioning/Sharding (cont.)
• Limitations of database partitioning
  ◦ Not transparent: the application needs to be partition-aware.
  ◦ Increases database complexity.

How an RDBMS can serve the needs of PRS (cont.)
• Database compression
  ◦ Saves disk space, thus reducing storage cost significantly.
  ◦ Reduces memory use in the database buffer cache.
  ◦ Significantly speeds up query execution during reads.

How an RDBMS can serve the needs of PRS (cont.)
• Security mechanisms
  ◦ Identification and authentication
  ◦ Authorization and access control
  ◦ Encryption
  ◦ Auditing
  ◦ Role-based access control for multilevel security

Limitations of RDBMS
• Scalability
  ◦ The major constraint with an RDBMS is that relational databases are not highly scalable.
• Parallelism
  ◦ Not capable of providing a high level of parallelism.
  ◦ Increasing the number of nodes increases database complexity to unmanageable levels.
• Performance
  ◦ If tables are large, their size itself affects performance in responding to SQL queries.

Different RDBMS Solutions and Their Performance Comparison

Base of Different Solutions
• Each RDBMS has different features, such as:
  ◦ Interface (GUI, SQL, others)
  ◦ Language support (C, C#, C++, Java, Ruby, Objective-C, and many more)
  ◦ Operating systems (Windows, Linux, Solaris, HP-UX, OS X, z/OS, AIX, FreeBSD)
  ◦ Licensing (proprietary, open source)

Some Promising RDBMSs
RDBMS | Maintainer | First public release | Latest stable version | Latest release date | License
Oracle | Oracle Corporation | 11-1979 | 12c Release 1 | 25-06-2013 | Proprietary
MySQL | Oracle Corporation | 11-1995 | 5.6.26 | 24-07-2015 | GPL v2 or proprietary
MemSQL | MemSQL | 06-2012 | 1.8 | 12-2012 | Proprietary
SQL Server | Microsoft | 1989 | 2014 | 18-03-2014 | Proprietary
PostgreSQL | PostgreSQL Global Development Group | 06-1989 | 9.4.3 | 04-06-2015 | PostgreSQL License
DB2 | IBM | 1983 | 10.5 | 23-04-2013 | Proprietary

MySQL
• MySQL is the most popular of the large-scale database servers.
• Feature-rich
• Stand-alone database server
• Open-source
• Powers a lot of websites and applications online.

MySQL (cont.)
• Advantages
  ◦ Easy to work with
  ◦ Feature-rich
  ◦ Secure
  ◦ Scalable and powerful
  ◦ Speedy
• Disadvantages
  ◦ Known limitations
  ◦ Reliability issues
  ◦ Stagnated development

MySQL (cont.)
• When to use
  ◦ Distributed operations
  ◦ High security
  ◦ Websites and web applications
  ◦ Custom solutions
• When not to use
  ◦ SQL compliance
  ◦ Concurrency
  ◦ Lack of features

PostgreSQL
• Advanced, open-source [object]-RDBMS.
• Its main goal is to be standards-compliant and extensible.
• It tries to adopt the ANSI/ISO SQL standards together with their revisions.
• It supports highly required and integral object-oriented and/or relational database functionality, such as complete support for reliable transactions, i.e. ACID.

PostgreSQL (cont.)
• Extremely capable of handling many tasks very efficiently.
• Support for concurrency is achieved without read locks, thanks to Multiversion Concurrency Control (MVCC), which also ensures ACID compliance.
• Highly programmable, and therefore extensible, with custom procedures called "stored procedures".

PostgreSQL (cont.)
• Advantages
  ◦ An open-source, SQL-standard-compliant RDBMS
  ◦ Strong community
  ◦ Strong third-party support
  ◦ Extensible
  ◦ Objective
• Disadvantages
  ◦ Performance
  ◦ Popularity
  ◦ Hosting

PostgreSQL (cont.)
• When to use
  ◦ Data integrity
  ◦ Complex, custom procedures
  ◦ Integration
  ◦ Complex designs
• When not to use
  ◦ Speed
  ◦ Simple set-ups
  ◦ Replication

Operating System Support
RDBMS | Windows | OS X | Linux | BSD | UNIX
Oracle | Yes | Yes | Yes | No | Yes
MySQL | Yes | Yes | Yes | Yes | Yes
SQL Server | Yes | No | No | No | No
PostgreSQL | Yes | Yes | Yes | Yes | Yes
DB2 | Yes | Yes | Yes | No | Yes
MemSQL | | | | |

Fundamental Features
RDBMS | ACID | Referential integrity | Transactions | Fine-grained locking | Unicode | Interface
Oracle | Yes | Yes | Yes, except for DDL | Yes (row-level locking) | Yes | API & GUI & SQL
MySQL | Yes | Yes | Yes, except for DDL | Yes (row-level locking) | Yes | GUI & SQL
SQL Server | Yes | Yes | Yes | Yes (row-level locking) | Yes | GUI & SQL
PostgreSQL | Yes | Yes | Yes | Yes (row-level locking) | Yes | GUI & SQL & API
DB2 | Yes | Yes | Yes | Yes | Yes | GUI & SQL

Benchmarking Results
• According to a benchmark of PostgreSQL 8.3 and Oracle 10g, PostgreSQL handled roughly 16,000 transactions per second.
• Latency was 0.9 ms at the 95th percentile, and Oracle was comparable (Oracle licensing does not permit disclosing benchmark results).
• The benchmark was run on a pair of HP blade servers, each with 2x 2.4 GHz quad-core Opterons, 8 GB, and one 320 GB FusionIO PCIe SSD module.
• The transactions comprised a 50-50 mix of read and write operations, with roughly 10 queries per transaction.

Benchmarking Results (cont.)
• The test was run using Grinder, a multithreaded Java load-testing framework, on a machine as powerful as the DB servers, with about 200 simultaneous connections.
• Our current PRS system handles about 500 to 800 transactions per second (around 200 per site).
• Moreover, the performance of the application depends on many factors besides the database engine.

Solutions Other Than RDBMS

NoSQL means Not Only SQL
• It implies that, when designing a software solution or product, more than one storage mechanism could be used, depending on the needs.
• Not using the relational model
• Running well on clusters
• Mostly open-source
• Built for the 21st century web estates
• Schema-less
  ◦ No fixed schema (formally described structure)

NoSQL means Not Only SQL (contd.)
• No joins (typical in databases operated with SQL)
  ◦ A join is an expensive operation for combining records from two or more tables into one set.
  ◦ Joins require strong consistency and fixed schemas.
• Database scaling
  ◦ RDBMSs are "scaled up" by adding hardware processing power.
  ◦ NoSQL is "scaled out" by spreading the load.
  ◦ Partitioning (sharding) / replication

NoSQL in Present World
• Google (BigTable, LevelDB)
• LinkedIn (Voldemort)
• Facebook (Cassandra)
• Twitter (Hadoop/HBase, FlockDB, Cassandra)
• Netflix (SimpleDB, Hadoop/HBase, Cassandra)
• CERN (CouchDB)

Features of NoSQL
• Elastic scaling
• Bigger data handling capability
• Maintaining NoSQL servers is cheaper
• Lesser server cost
• No schema or fixed data model
• Integrated caching facility
• Improves programmer productivity by using a database that better matches an application's needs.
• Improves data access performance via some combination of handling larger data volumes, reducing latency, and improving throughput.

Comparison between SQL and NoSQL
S.No | RDBMS | NoSQL
1 | Table based | Document based, key-value pairs, graph databases
2 | Predefined schema | Dynamic schema for unstructured data
3 | Vertically scalable | Horizontally scalable
4 | Structured query language | Unstructured query language
5 | SQL databases can be classified as either open-source or closed-source from commercial vendors. | NoSQL databases can be classified by the way they store data: graph databases, key-value store databases, document store databases, column store databases and XML databases.
6 | Not the best fit for hierarchical data storage | Fits better for hierarchical data storage
7 | Excellent support is available | Relies on community support
8 | Emphasizes ACID properties | Emphasizes the CAP theorem
9 | Examples: MySQL, Oracle, SQLite, Postgres | Examples: MongoDB, BigTable, Redis, RavenDB, Cassandra, HBase, Neo4j and CouchDB
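The "scaled out" and "horizontally scalable" entries above rest on spreading keys across nodes. The following minimal Python sketch shows one simple way to do that, hash partitioning, using a toy in-process key-value store. The node names, the ShardedStore class and the choice of PNR as the key are hypothetical and for illustration only; they do not describe any particular NoSQL product.

```python
import hashlib

NODES = ["node-0", "node-1", "node-2", "node-3"]  # hypothetical shard servers


def shard_for(key, nodes=NODES):
    """Map a key to a node using a stable hash (same key, same node)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class ShardedStore:
    """Toy key-value store that spreads keys across per-node dictionaries."""

    def __init__(self, nodes=NODES):
        self.nodes = list(nodes)
        self.data = {node: {} for node in self.nodes}

    def put(self, key, value):
        # Route the write to the node that owns this key.
        self.data[shard_for(key, self.nodes)][key] = value

    def get(self, key):
        # Route the read to the same node; no other node is consulted.
        return self.data[shard_for(key, self.nodes)].get(key)


store = ShardedStore()
for pnr in ["12345678", "23245664", "34567890"]:
    store.put(pnr, {"pnr": pnr})
print(store.get("12345678"))
```

Each read or write touches exactly one node, which is why this approach scales with the number of nodes. Simple modulo hashing relocates most keys when the node count changes, so production systems often use consistent hashing or a partition map instead.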

Some Graphical Comparison

Performance

Insert Operation

Types of NoSQL Databases
• Key-value databases
• Document databases
• Column family stores
• Graph databases

Key-Value Databases
• Examples: Riak, Redis, Memcached and its flavors, Berkeley DB, HamsterDB (especially suited for embedded use), Amazon DynamoDB (not open-source), Project Voldemort and Couchbase.

Document Databases
• Examples: MongoDB, CouchDB, Terrastore, OrientDB, RavenDB.

Column Family Stores
• Examples: Cassandra, HBase, Hypertable, and Amazon DynamoDB.

Graph Databases
• Examples: Neo4J, Infinite Graph, OrientDB, or FlockDB.

Challenges of NoSQL
• Maturity
• Support
• Analytics and business intelligence
• Administration
• Expertise

NoSQL - Conclusion
• NoSQL databases are becoming an increasingly important part of the database landscape and, when used appropriately, can offer real benefits.
• However, enterprises should proceed with caution, with full awareness of the legitimate limitations and issues associated with these databases.

New SQL
• NewSQL is a class of modern relational database management systems that seek to provide the same scalable performance as NoSQL systems for online transaction processing (OLTP) read-write workloads while still maintaining the ACID guarantees of a traditional database system.

New SQL Categories
NewSQL systems can be loosely grouped into three categories:
• New architectures
  ◦ The first type of NewSQL systems are completely new database platforms.
    These are designed to operate in a distributed cluster of shared-nothing nodes, in which each node owns a subset of the data. These databases are often written from scratch with a distributed architecture in mind, and include components such as distributed concurrency control, flow control, and distributed query processing. Example systems in this category are Google Spanner, Clustrix, VoltDB, MemSQL, Pivotal's SQLFire and GemFire XD, SAP HANA, FoundationDB, NuoDB, and Trafodion.
• SQL engines
  ◦ The second category is highly optimized storage engines for SQL. These systems provide the same programming interface as SQL, but scale better than built-in engines such as InnoDB. Examples of these new storage engines include MySQL Cluster, Infobright, TokuDB and the now defunct InfiniDB.
• Transparent sharding
  ◦ These systems provide a sharding middleware layer to automatically split databases across multiple nodes. Examples of this type of system include dbShards and ScaleBase.

Comparison between the Three Database Strategies

A typical NewSQL variant: MemSQL
MemSQL is a high-performance, in-memory database that combines the horizontal scalability of distributed systems with the familiarity of SQL.
• In-memory performance
  ◦ Using memory, MemSQL concurrently reads and writes data on a distributed system, enabling access to billions of records in seconds. MemSQL also includes a disk-based column store.
• Horizontal scalability
  ◦ By scaling horizontally on commodity hardware, MemSQL is easy to set up, maintain and scale, either on premises or in the cloud, reducing both up-front investment and long-term maintenance costs.
• Advanced SQL analytics
  ◦ Anything that can be expressed in SQL statements becomes available as quickly as data is captured. This enables easy integration with existing applications without costly query rewrites or custom connectors.

Features
• ANSI SQL support
• In-memory tables
• Multi-statement transactions
• Full durability to disk
• Compiled queries
• Fully distributed JOINs

Features (cont.)
• On-disk tables
• Cluster management
• Massively parallel execution
• JSON support
• Lock-free data structures
• Geospatial support

Conclusion
• The present PRS system (CONCERT) is very well designed and comprehensively developed for its problem requirements.
• Upgrading it is not an urgent need today, but to keep its technology up to date and to offer Indian Railways passengers a quality service with faster, smarter, better software, it might be redesigned using one or more of these next-generation database solutions, whichever suits it best.
• However, this upgrade may help to improve performance when communicating with the RDBMS-based PRS wings, NGeT and DW.