New PRS in RDBMS Platform and NGeT

Study of existing IT Systems of Indian
Railways running in CRIS
Centre For Railway Information Systems
Outline of Presentation
• Introduction to PRS System & NGeT
• Achievements and Limitations of Present System
• Network Issues related to Present System
• RDBMS Platforms for New PRS
• Solutions other than RDBMS
  ◦ NoSQL
  ◦ NewSQL
• Conclusion
Introduction to PRS
System And NGeT
Brief History
CONCERT




CONCERT - COUNTRY-WIDE NETWORK FOR
COMPUTERIZED ENHANCED RESERVATION
AND TICKETING
CONCERT is a fully automated passenger
reservation system.
It allows a passenger to book, across the counter
at any location, a journey on any train in any
class between any two stations.
It handles reservations, modifications,
cancellations/refunds, 39 supervisory on-line
functions, and 30 on-line enquiries.
CONCERT(cont.)








It handles 265 concessions, 165 coach types and 40
types of quotas
Currently PRS network has more than 2600 POPs
(points of presence i.e. booking locations) spread
across the country with around 9100 POS (points of
sale i.e. terminals).
Easy fare calculation and accounting
Easy Enquiry
MIS system is associated with PRS data to generate
predefined MIS reports
Accounting MIS
Charting MIS
Data Warehouse
CONCERT(cont.)

Based on the distributed Architecture.

4 Sites (Delhi, Mumbai, Chennai, Kolkata)

RTR as Middleware for transaction routing
and reliability.

Reservation based on the TDRC (Train, date,
Route, Class)

ARP Rules

Different Quota & Classes
PRS APPLICATION ARCHITECTURE
[Figure: PRS application architecture. Front-end systems shown: IRCTC Internet Booking System, 139, touch screen, Indian Rail website, web server, Internet client and firewall, with enquiry, charting and reservation functions. These connect over the CONCERT Interconnect Network to the back-end systems at the Delhi, Chennai, Mumbai and Kolkata data centres (front-end servers, back-end servers, storage).]
Booking Flow
[Flowchart: booking flow. Steps shown: Fill forms, Validate, Start Txn, Allocation (seat/berth), PNR generation, Fare calculation, PNR update, Commit, Display details (seat/berth, fare), Accounting, Print ticket. Decision branches (Proceed / Flush, yes / No, Success / Non Issue) lead to Release berth, PNR flush and Return to form.]
Introduction to NGeT
Next Generation e-Ticketing System
• It is one of the clients of PRS
• Tickets can be booked from any location at any time
• Easy to generate reports for NGeT transactions
• Paperless ticket
• Complete interface software between the IRCTC front-end server and the back-end Alpha server
• Complete e-reservation and enquiry back-end servers
• Ticket printing and reset facilities in the existing client
• Accounting reports for the IRCTC transactions
The Technology used
[Figure: technology used. Labels shown: NGeT; PRS integration layer & centralised server.]
Achievements of
Present System
Achievements of PRS System
• The PRS application is capable of handling 200 booking transactions (tickets) per second per site.
• Overall TPS of 800 (based on performance testing done in 2011, during the implementation stage of the Itanium migration).
• This is equivalent to 28.8 lakh tickets per hour.
• Number of passengers booked per day: 15 lakh
• Transactions per second:
  ◦ a. Tickets booked per second (peak): 550
  ◦ b. Enquiries served per second (peak): 5600
• Points of sale: 9100
• Number of reserved trains: 1500
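The hourly figure quoted in the achievements can be reproduced from the per-site rate. A minimal Python check, assuming the four sites mentioned earlier each sustain 200 tickets per second for a full hour (variable names are illustrative only):

```python
# Convert the tested overall throughput into tickets per hour.
LAKH = 100_000  # Indian numbering unit: 1 lakh = 100,000

def tickets_per_hour(tps: int) -> int:
    """Tickets booked in one hour at a sustained rate of `tps` per second."""
    return tps * 3600

sites = 4                      # Delhi, Mumbai, Chennai, Kolkata
tps_per_site = 200             # figure quoted on the slide
overall_tps = sites * tps_per_site
per_hour = tickets_per_hour(overall_tps)

print(overall_tps)             # overall transactions per second
print(per_hour / LAKH)         # lakh tickets per hour
```

This is a peak-capacity figure from performance testing, not the observed daily load.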
Transaction volume
Metric | Unreserved | Reserved
Average number of tickets booked per day | 75 lakh | 10 lakh
Average number of passengers booked per day | 200 lakh | 15 lakh
Average earning per day | Rs. 50 crore | Rs. 77 crore
Total number of terminals | 10,408 | 9,100
Total number of booking locations | 5,676 | 2,639

Transaction type | Number of transactions per day
Bookings | 10 lakh
Cancellations | 2.5 lakh
Enquiries | 110 lakh
Special transactions | 0.5 lakh
Total | 123 lakh
MILESTONES ACHIEVED BY NGeT
• The maximum number of concurrent sessions on nget.irctc.co.in was 1,33,710, on 13th August 2014.
• Ticket booking statistics:
  ◦ Maximum bookings per day: 6,24,855 (27th August '14)
  ◦ Maximum bookings in 08-09: 1,35,561 (27th August '14)
  ◦ Maximum bookings in 10-11: 1,27,489 (16th August '14)
  ◦ Maximum bookings per minute: 10,554 (27th August '14)
  ◦ Maximum bookings per second: 289 (18th August '14)
• The average transaction response time for a user to book a ticket on NGeT, excluding payment, is 40-55 sec.
Issues
• Ad-hoc queries on the online database are not available in the PRS system.
  ◦ Data is maintained in flat-file format, so extracting any information from the database requires dedicated routines to be developed.
  ◦ Peripheral systems (GUIDBA and the Data Warehouse) answer ad-hoc queries to some extent.
  ◦ Migration of Accounting and Charting MIS to RDBMS is required so that the system can respond to all types of ad-hoc queries.
• End of support for OpenVMS by HP.
Network Issues Related
to Present System
PRS Network








• It consists of a backbone network and a UTN network
• The backbone network interconnects all PRS datacenters and the CRIS datacenter
• The UTN network is a multi-tier hierarchical network connected to the corresponding PRS DC
• Datacenters: NDLS, MAS, NKG, CSTM & Chanakyapuri
• Centrally managed from the NOC in CRIS, Chanakyapuri, up to Tier-1 locations
• Implemented in 1999 to support PRS (CONCERT) networked transactions
• Full-mesh topology in the backbone
• UTN network: inverted tree with partial mesh
PRS Design Considerations
• High availability (>99.9% up to Tier-1, 99.8% overall)
• Scalability
• Lower round-trip time (<150 ms)
• Lower routing convergence time
• Route diversity / service provider diversity
• Better manageability
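To make the availability targets in the design considerations concrete, they can be converted into a yearly downtime budget. A small sketch, assuming a 365-day year and treating the quoted percentages as exact bounds (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

def max_downtime_hours(availability_pct: float) -> float:
    """Yearly downtime allowed by a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for label, pct in [("up to Tier-1", 99.9), ("overall", 99.8)]:
    print(f"{label}: {max_downtime_hours(pct):.2f} hours/year")
```

Under these assumptions, 99.9% allows about 8.76 hours of downtime per year and 99.8% about 17.52 hours.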
[Figure: map of the PRS network. Sites shown: Delhi, Kolkata, Chennai, Mumbai, Secunderabad (DR DC at SC), NKG, MAS, CSTM, CRIS, RIDC, NGeT DC. Zonal railways shown: Eastern, North Eastern, Northern, South Eastern, East Coast, North Central, North Western, South East Central, East Central, North Frontier, Southern, South Western, Central, South Central, West Central and Western.]
UTN Network architecture
[Figure: UTN network architecture. Labels shown: PRS servers; zonal router; Area-1, Area-2, Area-3; links from different service providers; links to other locations; Tier 2/3 location routers; Tier-1 locations with routers in 1:1 high-availability (HA) mode.]
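Network Failure in PRS
[Pie chart: causes of network failure in PRS. Categories shown: power failure, channel failure, packet loss, routing issues, hardware failure. Shares shown: 35%, 25%, 20%, 15%, 5%; the extracted text does not preserve which share belongs to which category.]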
RDBMS
Relational Model






• Primary data model for data storage and processing.
• Based on first-order predicate logic.
• Data is stored in tables, each representing some entity.
• A set of tuples (rows) forms a relation instance; there are no duplicate rows.
• Columns represent attributes of the stored record.
• Tables can be related to each other; the relationships are depicted using an Entity-Relationship diagram.
Relational Model
Train Info
Train | Date | Route | Class
12345 | 01-07-2015 | A-B | 1
23456 | 01-01-2015 | C-D | 2

Passenger Information
Passenger_Name | Age | Sex | PNR
ABCD | 20 | M | 12345678
PQRS | 22 | F | 23245664

Seat Allocation
Train | Date | Route | Class | Seat | PNR
12345 | 01-01-2015 | A-B | 1 | 10 | 12345678
23456 | 01-01-2015 | C-D | 2 | 12 | 23245664
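The example tables above can be tried out directly. The following is only a sketch using Python's built-in sqlite3 module as a stand-in RDBMS; the table and column names (passenger, seat_allocation, travel_date, coach_class) are assumptions made for illustration, not the PRS schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE passenger (
        pnr INTEGER PRIMARY KEY, name TEXT, age INTEGER, sex TEXT);
    CREATE TABLE seat_allocation (
        train INTEGER, travel_date TEXT, route TEXT, coach_class INTEGER,
        seat INTEGER, pnr INTEGER REFERENCES passenger(pnr));
""")
conn.executemany(
    "INSERT INTO passenger VALUES (?, ?, ?, ?)",
    [(12345678, "ABCD", 20, "M"), (23245664, "PQRS", 22, "F")],
)
conn.executemany(
    "INSERT INTO seat_allocation VALUES (?, ?, ?, ?, ?, ?)",
    [(12345, "01-01-2015", "A-B", 1, 10, 12345678),
     (23456, "01-01-2015", "C-D", 2, 12, 23245664)],
)

# Relate the two tables through the shared PNR column.
rows = conn.execute("""
    SELECT p.name, s.train, s.seat
    FROM passenger AS p
    JOIN seat_allocation AS s ON s.pnr = p.pnr
    ORDER BY p.name
""").fetchall()
print(rows)
```

The join on PNR is what "tables can relate to each other" means in practice: no passenger detail is duplicated in the seat table.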
DBMS
• DBMS: a set of programs designed to store, manage and process a database.
• A convenient and efficient way to store and retrieve database information.
• An interface between the database and the application program.
DBMS Architecture
DBMS Characteristics
◦ Multiple views (data abstraction)
◦ Isolation of data and application (program-data independence)
  • Logical data independence
  • Physical data independence
DBMS Characteristics Cont.
◦ Relation-based tables
◦ Less redundancy
  • Normalization of data.
◦ Consistency
  • A database transaction changes the affected data only in allowed ways.
◦ Query language
◦ Transaction management
◦ Multiuser and concurrent access
◦ Recovery system
◦ Security
How RDBMS can serve the needs of PRS



Structured Data
◦ High Degree of organization
◦ Readily searchable by simple SQL engines on key
value.
Programming language support.
◦ ODBC(Open Database Connectivity)
◦ JDBC(Java Database Connectivity)
SQL (Structured Query Language)
◦ PRS generates several types of MIS reports, each one having its own program.
◦ SQL allows the database to be queried directly for any specified conditions.
◦ No need for separate schedule creation.
◦ Reduced development cost.
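As an illustration of the SQL point above, one ad-hoc query can replace a dedicated report program. This sketch uses Python's sqlite3 module; the booking table, its columns and the sample rows are invented for the example and are not PRS data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking ("
    "pnr INTEGER PRIMARY KEY, train INTEGER, coach_class TEXT, fare REAL)"
)
conn.executemany(
    "INSERT INTO booking VALUES (?, ?, ?, ?)",
    [(1, 12345, "SL", 400.0), (2, 12345, "3A", 1100.0), (3, 23456, "SL", 350.0)],
)

# Ad-hoc MIS-style report: tickets and revenue per class, no custom program.
report = conn.execute("""
    SELECT coach_class, COUNT(*) AS tickets, SUM(fare) AS revenue
    FROM booking
    GROUP BY coach_class
    ORDER BY coach_class
""").fetchall()
print(report)
```

Changing the report (for example, grouping by train instead of class) means editing the query text only.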
How RDBMS can serve the needs of PRS

Transaction Management
◦ Transaction must satisfy ACID properties.
Database Transaction Management
How RDBMS can serve the needs of PRS

Transaction Management
◦ In PRS, a transaction updates the berth (keyed by TDRC) in one file and updates the berth-to-PNR relation in another.
◦ These two updates need to be done atomically.
◦ Atomicity and durability are achieved by keeping track of each update made by the transaction.
◦ In case of failure, the DBMS undoes all the updates made by the transaction, thus ensuring atomicity.
◦ If the transaction succeeds, the DBMS commits the updates.
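The all-or-nothing behaviour described above can be sketched with Python's sqlite3 module. The berth and berth_pnr tables, the key format and the book helper are hypothetical stand-ins for the two PRS files, not the actual design:

```python
import sqlite3

KEY = "12345/01-01-2015/A-B/1"  # illustrative TDRC key

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE berth (
        tdrc TEXT PRIMARY KEY,
        available INTEGER CHECK (available >= 0));
    CREATE TABLE berth_pnr (pnr INTEGER PRIMARY KEY, tdrc TEXT);
    INSERT INTO berth VALUES ('12345/01-01-2015/A-B/1', 2);
""")

def book(conn, tdrc, pnr):
    """Apply both updates as one transaction; undo both if either fails."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE berth SET available = available - 1 WHERE tdrc = ?",
                (tdrc,))
            conn.execute("INSERT INTO berth_pnr VALUES (?, ?)", (pnr, tdrc))
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = book(conn, KEY, 111)    # both updates succeed
second_ok = book(conn, KEY, 111)   # duplicate PNR: the berth update is undone
remaining = conn.execute("SELECT available FROM berth").fetchone()[0]
print(first_ok, second_ok, remaining)
```

In the second call the berth count is decremented and then restored by the rollback, so the two tables never disagree.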
Database Transaction Management
How RDBMS can serve the needs of PRS

The commit can be performed as a two-phase
commit, so that either both updates are
committed or neither is.
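A toy version of the two-phase commit idea is sketched below. The Participant class and its method names are invented for illustration; a real DBMS adds logging, timeouts and recovery that this sketch leaves out:

```python
class Participant:
    """One resource taking part in the commit (e.g. one file or database)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the update can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes in phase 1."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:   # Phase 2a: everyone commits
            p.commit()
        return True
    for p in participants:       # Phase 2b: everyone rolls back
        p.rollback()
    return False


berth, pnr = Participant("berth"), Participant("berth_pnr")
ok = two_phase_commit([berth, pnr])

berth2, pnr2 = Participant("berth"), Participant("berth_pnr", can_commit=False)
failed = two_phase_commit([berth2, pnr2])
print(ok, failed, berth2.state, pnr2.state)
```

The second run shows the key property: one "no" vote leaves every participant aborted, including the one that was ready to commit.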
Database Transaction Management
How RDBMS can serve the needs of PRS

Isolation among transactions is handled by
locking.
◦ Locking is record level (TDRC).
◦ Prevents other transactions from reading/updating
the same record.
◦ Releases lock after completion of two phase
commit.

Consistency: any data written to the database
must be valid according to all defined rules.
◦ The DBMS checks that each write follows all the specified rules.
◦ It verifies that referential integrity constraints are
satisfied.
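The referential integrity check mentioned above can be demonstrated with a foreign key. This is a sketch with Python's sqlite3 module (where foreign keys must be switched on per connection); the table names and keys are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE berth (tdrc TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE berth_pnr ("
    "pnr INTEGER PRIMARY KEY, "
    "tdrc TEXT NOT NULL REFERENCES berth(tdrc))"
)
conn.execute("INSERT INTO berth VALUES ('T1')")
conn.execute("INSERT INTO berth_pnr VALUES (1, 'T1')")  # valid reference

try:
    # This row points at a berth record that does not exist.
    conn.execute("INSERT INTO berth_pnr VALUES (2, 'NO-SUCH-TDRC')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The database itself refuses the inconsistent row, so the rule does not have to be re-implemented in every application program.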
Database Transaction Management
How RDBMS can serve the needs of PRS

Indexing
◦ Access path for faster database record
access.
◦ Indexed access of record on key.
◦ RDBMSs support B/B+ tree indexes
  • The index reorganizes itself automatically as data changes.
  • Search time is of the order of log_p N (N keys, p the fan-out of each node).
◦ Reduced read/write latency.
 PRS uses RMS service in OpenVMS for record
access
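To give a feel for the log_p N search cost of a B/B+ tree index, the number of levels a lookup has to traverse can be estimated. The fan-out and record counts below are assumed values chosen for illustration, not PRS measurements:

```python
def index_levels(n_keys: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= n_keys."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels

for n in (1_000_000, 10_000_000):
    print(n, index_levels(n, fanout=100))
```

With a fan-out of 100, a million keys fit in 3 levels and ten million need 4, which is why indexed access stays fast as tables grow.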
Database Access Methods
How RDBMS can serve the needs of PRS

B+ tree indexing
Database Access Methods
How RDBMS can serve the needs of PRS
• Hashing

TDRC records are numbered for random access.

Hashing in RDBMS can provide random access to
database records.
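Hash-based random access can be sketched as follows: a key is hashed to a bucket number, and only that bucket is searched. The bucket count, the key format and the helper names are assumptions for illustration; CRC-32 is used simply because it is a stable hash available in the standard library:

```python
import zlib

N_BUCKETS = 8  # assumed bucket count, for illustration only

def bucket_of(tdrc_key: str) -> int:
    """Map a TDRC key to a bucket number with a stable hash (CRC-32)."""
    return zlib.crc32(tdrc_key.encode("utf-8")) % N_BUCKETS

buckets = [dict() for _ in range(N_BUCKETS)]

def put(tdrc_key: str, record: dict) -> None:
    buckets[bucket_of(tdrc_key)][tdrc_key] = record

def get(tdrc_key: str):
    # Only one bucket is examined, regardless of the total number of records.
    return buckets[bucket_of(tdrc_key)].get(tdrc_key)

put("12345/01-01-2015/A-B/1", {"seat": 10})
put("23456/01-01-2015/C-D/2", {"seat": 12})
found = get("12345/01-01-2015/A-B/1")
missing = get("99999/01-01-2015/X-Y/1")
print(found, missing)
```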
Database Access Methods
Database Partitioning/Sharding

Database Sharding/Partitioning
◦ Database Sharding can be done to improve
performance when tables get large in size.
◦ Sharding is partitioning database horizontally.
◦ Index size reduces.
◦ Different shards can be deployed on different servers.
◦ Parallel access is managed in the application
  • Partition tables map keys to nodes
  • The application decides where to route storage or lookup requests
◦ Scales well for both reads and writes
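The routing role that sharding gives to the application can be sketched like this. The node names, the modulo rule on the train number and the in-memory stores are all hypothetical, chosen only to show a partition table in action:

```python
# Partition table: maps a partition number to the node that owns it.
PARTITION_TABLE = {0: "node-a", 1: "node-b", 2: "node-c", 3: "node-d"}

# One in-memory dictionary per node stands in for a separate database server.
stores = {node: {} for node in PARTITION_TABLE.values()}

def node_for(train_no: int) -> str:
    """Application-side routing: pick the shard that owns this train."""
    return PARTITION_TABLE[train_no % len(PARTITION_TABLE)]

def save(train_no: int, record: dict) -> None:
    stores[node_for(train_no)][train_no] = record

def load(train_no: int):
    return stores[node_for(train_no)].get(train_no)

save(12345, {"route": "A-B"})
save(23456, {"route": "C-D"})
print(node_for(12345), node_for(23456), load(12345))
```

Every read and write must pass through node_for, which is exactly the "application needs to be partition-aware" cost of this approach.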
Database Partitioning/Sharding
• Limitations of Database Partitioning
  ◦ Not transparent: the application needs to be partition-aware
  ◦ Increases database complexity
How RDBMS can serve the needs of PRS

Database Compression
◦ Saves disk space, thus reducing storage cost significantly.
◦ Reduces memory use in the database buffer cache.
◦ Can significantly speed up query execution during reads.
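The space saving can be illustrated with general-purpose compression from the Python standard library. This is only an analogy: database engines compress at the page or column level with their own algorithms, and the repetitive sample records here are invented:

```python
import zlib

# Highly repetitive records, as reservation data often is (sample data only).
records = ("12345|01-01-2015|A-B|1|CNF\n" * 1000).encode("ascii")

compressed = zlib.compress(records)
restored = zlib.decompress(compressed)

print(len(records), len(compressed))
```

Fewer bytes on disk also means fewer bytes to read into the buffer cache, which is where the read-time speed-up comes from.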
How RDBMS can serve the needs of PRS

Security mechanism
◦ Identification and authentication
◦ Authorization and access control
◦ Encryption
◦ Auditing
◦ Role-based access control for multilevel security
Limitations of RDBMS

Scalability
◦ The major constraint with RDBMS is that relational databases are not highly scalable.

Parallelism
◦ Not capable of providing a high level of parallelism.
◦ Increasing the number of nodes increases database complexity to unmanageable levels.

Performance
◦ When tables grow large, their size itself affects performance in responding to SQL queries.
Different RDBMS
Solutions and Their
Performance Comparison
Base of different solutions

Each RDBMS has different features, such as:
◦ Interface (GUI, SQL, others)
◦ Language support (C, C#, C++, Java, Ruby, Objective-C, and many more)
◦ Operating systems (Windows, Linux, Solaris, HP-UX, OS X, z/OS, AIX, FreeBSD)
◦ Licensing (proprietary, open source)
Some Promising RDBMSs
Name | Maintainer | First Public Release Date | Latest Stable Version | Latest Release Date | License
Oracle | Oracle Corporation | 11-1979 | 12c Release 1 | 25-06-2013 | Proprietary
MySQL | Oracle Corporation | 11-1995 | 5.6.26 | 24-07-2015 | GPL v2 or Proprietary
MemSQL | MemSQL | 06-2012 | 1.8 | 12-2012 | Proprietary
SQL Server | Microsoft | 1989 | 2014 | 18-03-2014 | Proprietary
PostgreSQL | PostgreSQL Global Development Group | 06-1989 | 9.4.3 | 04-06-2015 | PostgreSQL License
DB2 | IBM | 1983 | 10.5 | 23-04-2013 | Proprietary
MySQL
• MySQL is the most popular of the large-scale database servers
• Feature-rich
• Stand-alone database server
• Open-source
• Powers a lot of web-sites and applications online
MySQL
Advantages
• Easy to work with
• Feature rich
• Secure
• Scalable and powerful
• Speedy
Disadvantages
• Known limitations
• Reliability issues
• Stagnated development
MySQL
When to use
• Distributed operations
• High security
• Web-sites and web applications
• Custom solutions
When not to use
• SQL compliance
• Concurrency
• Lack of features
PostgreSQL
• Advanced, open-source [object]-RDBMS
• Its main goal is to be standards-compliant and extensible
• It tries to adopt the ANSI/ISO SQL standards together with their revisions
• It supports essential object-oriented and/or relational database functionality, such as complete support for reliable transactions, i.e. ACID
PostgreSQL

• Extremely capable of handling many tasks very efficiently
• Support for concurrency is achieved without read locks, thanks to the implementation of Multiversion Concurrency Control (MVCC), which also ensures ACID compliance
• Highly programmable, and therefore extensible, with custom procedures called "stored procedures"
PostgreSQL
Advantages
• An open-source, SQL-standard-compliant RDBMS
• Strong community
• Strong third-party support
• Extensible
• Objective
Disadvantages
• Performance
• Popularity
• Hosting
PostgreSQL
When to use
• Data integrity
• Complex, custom procedures
• Integration
• Complex designs
When not to use
• Speed
• Simple set-ups
• Replication
Operating System Support
Database | Windows | OS X | Linux | BSD | UNIX
Oracle | Yes | Yes | Yes | No | Yes
MySQL | Yes | Yes | Yes | Yes | Yes
SQL Server | Yes | No | No | No | No
PostgreSQL | Yes | Yes | Yes | Yes | Yes
DB2 | Yes | Yes | Yes | No | Yes
MemSQL | (no values given)
Fundamental Features
Database | ACID | Referential Integrity | Transactions | Fine-grained Locking | Unicode | Interface
Oracle | Yes | Yes | Yes, except for DDL | Yes (row-level locking) | Yes | API & GUI & SQL
MySQL | Yes | Yes | Yes, except for DDL | Yes (row-level locking) | Yes | GUI & SQL
SQL Server | Yes | Yes | Yes | Yes (row-level locking) | Yes | GUI & SQL
PostgreSQL | Yes | Yes | Yes | Yes (row-level locking) | Yes | GUI & SQL & API
DB2 | Yes | Yes | Yes | Yes | Yes | GUI & SQL
Benchmarking results
• In a benchmark of PostgreSQL 8.3 against Oracle 10g, PostgreSQL handled roughly 16,000 transactions per second.
• Latency was 0.9 ms at the 95th percentile, and Oracle was comparable (Oracle licensing does not permit disclosing benchmark results).
• The benchmark was run on a pair of HP blade servers, each with 2x 2.4 GHz quad-core Opterons, 8 GB and one 320 GB FusionIO PCIe SSD module.
• The transactions were a 50-50 mix of read and write operations, with roughly 10 queries per transaction.
Benchmarking results



The test was made using Grinder, a
multithreaded Java load-testing framework
on a machine as powerful as the DB servers,
with about 200 simultaneous connections.
Our current PRS system handles about 500
to 800 transactions per second (around 200
per site).
Moreover, the performance of the application
depends on many other factors besides the
database engine.
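The two throughput figures quoted in these benchmarking slides can be put side by side with a little arithmetic. This is a rough comparison only: the benchmark workload and the PRS workload are not like-for-like, and the variable names are illustrative:

```python
benchmark_tps = 16_000      # PostgreSQL 8.3 benchmark, transactions per second
queries_per_txn = 10        # roughly 10 queries per transaction in the benchmark
prs_peak_tps = 800          # upper end of the current PRS range

queries_per_second = benchmark_tps * queries_per_txn
headroom = benchmark_tps / prs_peak_tps

print(queries_per_second)   # queries per second implied by the benchmark
print(headroom)             # benchmark rate relative to PRS peak rate
```

Under these figures the benchmarked rate is about 20 times the PRS peak, though, as the slide notes, application performance depends on much more than the database engine.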
Solutions Other Than
RDBMS
NoSQL means Not Only SQL
Implying that when designing a software solution or product, there is more than one storage mechanism that could be used, based on the needs.
• Not using the relational model
• Running well on clusters
• Mostly open-source
• Built for 21st-century web estates
• Schema-less: no fixed schema (formally described structure)
NoSQL means Not Only SQL (Contd.)
• No joins (typical in databases operated with SQL)
  ◦ A join is an expensive operation that combines records from two or more tables into one set
  ◦ Joins require strong consistency and fixed schemas
• Database scaling
  ◦ RDBMSs are "scaled up" by adding hardware processing power
  ◦ NoSQL is "scaled out" by spreading the load
  ◦ Partitioning (sharding) / replication
NoSQL in Present world
• Google (BigTable, LevelDB)
• LinkedIn (Voldemort)
• Facebook (Cassandra)
• Twitter (Hadoop/HBase, FlockDB, Cassandra)
• Netflix (SimpleDB, Hadoop/HBase, Cassandra)
• CERN (CouchDB)
Features of NoSQL
• Elastic scaling
• Bigger data handling capability
• Maintaining NoSQL servers is cheaper
• Lower server cost
• No schema or fixed data model
• Integrated caching facility
• Improves programmer productivity by using a database that better matches an application's needs
• Improves data access performance via some combination of handling larger data volumes, reducing latency, and improving throughput
Comparison between SQL and NoSQL
S.No | RDBMS | NoSQL
1 | Table based | Document based, key-value pairs, graph databases
2 | Predefined schema | Dynamic schema for unstructured data
3 | Vertically scalable | Horizontally scalable
4 | Structured query language | Unstructured query language
5 | SQL databases can be classified as either open-source or closed-source from commercial vendors | NoSQL databases can be classified by the way they store data: graph databases, key-value store databases, document store databases, column store databases and XML databases
6 | Not the best fit for hierarchical data storage | Fits better for hierarchical data storage
7 | Excellent support is available | Relies on community support
8 | Emphasizes ACID properties | Emphasizes the CAP theorem
9 | Examples: MySQL, Oracle, SQLite, Postgres | Examples: MongoDB, BigTable, Redis, RavenDB, Cassandra, HBase, Neo4j and CouchDB
Some graphical comparison
[Figures: graphical comparisons of performance and of the insert operation.]
Types of NoSQL Databases:
• Key-value databases
• Document databases
• Column family stores
• Graph databases
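The four NoSQL families listed above differ mainly in how a record is shaped. The sketch below models one booking in each style using plain Python structures; it imitates the data models only, not any particular product, and the field names are invented for illustration:

```python
import json

# Key-value: an opaque value fetched by a single key.
key_value = {
    "pnr:12345678": json.dumps({"name": "ABCD", "train": 12345, "seat": 10}),
}

# Document: a nested, self-describing record that can be queried by field.
document = {
    "pnr": 12345678,
    "passenger": {"name": "ABCD", "age": 20},
    "journey": {"train": 12345, "route": "A-B", "seat": 10},
}

# Column family: row key -> column families -> columns.
column_family = {
    "12345678": {
        "passenger": {"name": "ABCD", "age": "20"},
        "journey": {"train": "12345", "seat": "10"},
    },
}

# Graph: nodes plus labelled edges that carry their own properties.
nodes = {
    "p1": {"type": "passenger", "name": "ABCD"},
    "t1": {"type": "train", "number": 12345},
}
edges = [("p1", "BOOKED_ON", "t1", {"seat": 10})]

seat_from_kv = json.loads(key_value["pnr:12345678"])["seat"]
seat_from_doc = document["journey"]["seat"]
seat_from_cf = int(column_family["12345678"]["journey"]["seat"])
seat_from_graph = edges[0][3]["seat"]
print(seat_from_kv, seat_from_doc, seat_from_cf, seat_from_graph)
```

The same fact (seat 10) is reachable in all four shapes; what changes is which access patterns are cheap.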
Key-Value databases
Examples:
Riak, Redis, Memcached and its flavors, Berkeley DB, HamsterDB (especially suited for embedded use), Amazon DynamoDB (not open-source), Project Voldemort and Couchbase.
Document databases
Examples:
MongoDB, CouchDB, Terrastore, OrientDB, RavenDB
Column family stores
Examples: Cassandra, HBase, Hypertable, and
Amazon DynamoDB
Graph Databases
Examples: Neo4J, Infinite Graph, OrientDB, or FlockDB
Challenges of NoSQL
Maturity
Support
Analytics and business intelligence
Administration
Expertise
NoSQL - Conclusion
• NoSQL databases are becoming an increasingly important part of the database landscape and, when used appropriately, can offer real benefits.
• However, enterprises should proceed with caution, with full awareness of the legitimate limitations and issues that are associated with these databases.
NewSQL

NewSQL is a class of modern relational database management systems that seek to provide the same scalable performance as NoSQL systems for online transaction processing (OLTP) read-write workloads while still maintaining the ACID guarantees of a traditional database system.
NewSQL Categories


NewSQL systems can be loosely grouped into three categories:
New architectures
◦ The first type of NewSQL systems are completely new database platforms. These
are designed to operate in a distributed cluster of shared-nothing nodes, in
which each node owns a subset of the data. These databases are often written
from scratch with a distributed architecture in mind, and include components
such as distributed concurrency control, flow control, and distributed query
processing. Example systems in this category are Google
Spanner, Clustrix, VoltDB, MemSQL, Pivotal's SQLFire and GemFire XD, SAP
HANA, FoundationDB, NuoDB, and Trafodion.

SQL engines
◦ The second category consists of highly optimized storage engines for SQL. These
systems provide the same programming interface as SQL, but scale better than
built-in engines, such as InnoDB. Examples of these new storage engines
include MySQL Cluster, Infobright, TokuDB and the now defunct InfiniDB.

Transparent sharding
◦ These systems provide a sharding middleware layer to automatically split
databases across multiple nodes. Examples of this type of system
include dbShards and ScaleBase.
Comparison between the three database strategies
A typical NewSQL variant:

MemSQL is a high-performance, in-memory database that
combines the horizontal scalability of distributed systems with the
familiarity of SQL.
In-Memory Performance
• Using memory, MemSQL concurrently reads and writes data on a distributed system, enabling access to billions of records in seconds. MemSQL also includes a disk-based column store.

Horizontal Scalability
• By scaling horizontally on commodity hardware, MemSQL is easy to set up, maintain and scale, either on premises or in the cloud, reducing both up-front investment and long-term maintenance costs.

Advanced SQL Analytics
• Anything that can be expressed in SQL statements becomes available as quickly as data is captured. This enables easy integration with existing applications without costly query rewrites or custom connectors.
Features:
• ANSI SQL support
• In-memory tables
• Multi-statement transactions
• Full durability to disk
• Compiled queries
• Fully distributed JOINs
Features (cont.):
• On-disk tables
• Cluster management
• Massively parallel execution
• JSON support
• Lock-free data structures
• Geospatial support
Conclusion
• The present PRS system (CONCERT) is very well designed and comprehensively developed for its problem requirements.
• Upgrading it is not an urgent need today, but to keep its technology up to date and to offer Indian Railways passengers a quality service through faster, smarter, better software, it might be redesigned using one or more of these next-generation database solutions, whichever suits it best.
• Such an upgrade may also help to improve performance when communicating with the RDBMS-based PRS wings, NGeT and DW.