The Next Generation Application Server – How Event-Based Processing Yields Scalability
Guy Korland
R&D Team Leader
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
About me…
• Core Team Leader – GigaSpaces since 2005
• MSc – Technion (Prof. Roy Friedman)
• PhD candidate – Tel Aviv University (Prof. Nir Shavit)
• Lead of Deuce STM (www.deucestm.org) – a Java Software Transactional Memory
GigaSpaces XAP – Designed For:
• Performance
• Scalability
• Latency
About GigaSpaces eXtreme Application Platform (XAP)
A middleware platform that enables applications to run on a distributed cluster as if it were a single machine.
2,000+ deployments • 100+ direct customers • Among the top 50 cloud vendors
“GigaSpaces has saved us significant time and cost” – Phil Ruhlman, CIO, Gallup
“GigaSpaces exceeded our performance requirements and enabled us to build a flexible, cost-effective infrastructure” – Julian Browne, Virgin Mobile
“GigaSpaces has allowed us to greatly improve the scalability and performance of our trading platform” – Geoff Buhn, Options Trading Technology Manager, SIG
GigaSpaces Evolution
[Timeline, 2000–2009: from a single space (2000), through partitioning & replication, load balancing, SLA containers and event containers, to a next-generation application server and PaaS/cloud (2009).]
Not going to talk about…
• Jini (Java SOA)
• Data Grid implementation
• Map/Reduce
• JDBC/JMS/JPA
• Cloud computing
• Batch processing
• Mule ESB
• WAN vs. LAN
• Interoperability between different languages
• TupleSpace model extensions
• JDK improvements (RMI, Reflection, Serialization, Classloading…)
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Today’s Reality – Tier-Based Architecture
[Diagram: each tier is a separate technology implementation, with bottlenecks between the tiers.]
Bottlenecks exist in all areas where state is stored, so the architecture can’t scale linearly!
Traditional Architecture – the path to complexity… (marktplaats.nl)
[Diagram: a bidder places a bid via the Auction Service (A); the Bid Service (B) validates and processes it; once the bid is accepted, the Trade Service (T) processes the trade; the Info Service (I) returns bid results to the bidder and the auction owner; a Timer Service (T) drives timed events.]
Traditional Architecture – the path to complexity…
[Diagram: the same services (Auction, Bid, Trade, Info, Timer) deployed across a business tier, each with its own back-up.]
• Separate failover strategy and implementation for each tier
• Redundancy doubles network traffic
• Bottlenecks are created
• Latency is increased
Do you see the Problem?
[Diagram: the business tier with a back-up per service.]
• Scalability is not linear
• Scalability management nightmare
There is a huge gap between peak and average loads
[Chart: monthly load from January 2004 to September 2007, ranging from 0 to 1.3 billion.]
Bottlenecks, performance, scalability and high-availability headaches lead to:
• Bad publicity
• Revenue loss
• Customer dissatisfaction
• Regulatory penalties
Tier-Based Architecture (TBA) – Summary
• Historically, the following has been done…
– Tune, tune and tune configuration and code
• Once one bottleneck is resolved, the next one looms
– Hardware over-provisioning
• To make sure that response times were still acceptable at peak times
– Hardware upgrades
• To get rid of bottlenecks whose origin was impossible to track down
– Alternative patterns
• Avoiding two-phase commit, using patterns like ‘compensating transactions’
• Using active/passive failover to make response times faster, risking, and in fact accepting, potential data loss
• Partitioning the database, though not for size reasons
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Event Containers
Based on JavaSpaces (Java, C++)
The Master Worker Pattern
GigaSpaces – Based on Shared Transactional Memory
• Write – writes a data object
• Notify – generates an event on data updates
• Read – reads a copy of a data object
• Take – reads a data object and deletes it
Combining the operations yields higher-level patterns:
• Write + Read → Data Caching
• Write + Take → Master/Worker
• Write + Notify → Messaging (Pub/Sub)
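To make the four primitives concrete, here is a minimal sketch of using them through the OpenSpaces GigaSpace proxy. The Data class and an already-configured gigaSpace proxy are assumed (both appear in the example section later in the deck):

Data data = new Data();
data.setProcessed(false);
gigaSpace.write(data);                   // Write – store the object in the space
Data copy = gigaSpace.read(new Data());  // Read – returns a copy, the original stays
Data taken = gigaSpace.take(new Data()); // Take – reads and removes atomically
// Notify – register interest in matching updates; in practice this is done
// through a notify event container, shown later in the deck.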
Event Containers
Step 1 – Create a Processing Unit
[Diagram: the Auction (A), Bid (B), Trade (T) and Info (I) services, plus the Timer service, move from the business tier into a single Processing Unit serving the bidder and the auction owner.]
• Collapse the tiers
• Collocate the services
• Manage data in memory
• Single model for design, deployment and management
• No integration effort
Step 2 – Async Persistency
[Diagram: the full bid flow (place bid → validate → process bid → process trade → get bid results → process results) runs inside the Processing Unit.]
Collocation of data, messaging and services in memory gives:
• Minimum latency (no network hops)
• Maximum throughput
Persist asynchronously for compliance & reporting purposes: storing state, registering orders, etc.
Step 3 – Resiliency
[Diagram: an SLA-driven container keeps a backup of the Processing Unit alongside the primary.]
• Single, built-in failover/redundancy strategy – one investment
• Fewer points of failure
• Automated, SLA-driven failover/redundancy mechanism
• Continuous high availability
Step 3 – Resiliency (cont.)
[Diagram: the SLA-driven container manages one primary and two backups of the Processing Unit.]
• Single, built-in failover/redundancy strategy – one investment
• Fewer integration points mean fewer chances for failure
• Automated, SLA-driven failover/redundancy mechanism
• Continuous availability
• Self-healing capability
Step 4 – Scale
[Diagram: the Processing Unit is partitioned into several instances (each running the B, T, I services), each partition with its own backup.]
Write once, scale anywhere:
• Linear scalability
• Single monitoring and management engine
• Automated, SLA-driven deployment and management (scaling policy, system requirements, space cluster topology)
Event Containers
Step 5 – Auto Scale-Out
Processing Unit – the Unit of Scalability
[Diagram: a single Processing Unit scaled out into several partitions.]
Scaling involves only a configuration change – no code changes!
Processing Unit – the Unit of High Availability
[Diagram: the primary Processing Unit (business logic in active mode) replicates synchronously to a backup Processing Unit (business logic in standby mode).]
Database Integration – Async Persistency
[Diagram: the primary Processing Unit replicates synchronously to its backup and asynchronously to a Mirror process, which persists changes to the database through an ORM layer; on startup the space performs an initial load from the database.]
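As an illustration of how this wiring might look in the processing unit's Spring configuration, here is a sketch using the Hibernate-based external data source. The class and attribute names follow the OpenSpaces schema of that era; treat the exact configuration as an assumption rather than a verbatim recipe:

<!-- Hibernate-backed external data source, used for the initial load and by the mirror -->
<bean id="hibernateDataSource"
      class="org.openspaces.persistency.hibernate.DefaultHibernateExternalDataSource">
    <property name="sessionFactory" ref="sessionFactory"/>
</bean>
<!-- a persistent space that replicates asynchronously to the mirror service -->
<os-core:space id="space" url="/./space" schema="persistent" mirror="true"
               external-data-source="hibernateDataSource"/>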
XAP = Enterprise-Grade Middleware
• Scale-out application server
– End-to-end scale-out middleware for web, data, messaging and business logic
– Space-Based Architecture – designed for scaling stateful applications in memory
• Proven performance, scalability, low latency and reliability
• SLA-driven
• A unique database scaling solution that fits cloud environments
– In-Memory Data Grid
– O/R mapping support
• Supports the major enterprise languages: Java, .NET, C++
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Built-in Event Containers
• Polling Container
• Notify Container
[Diagram: inside a Processing Unit, Data written to the space is taken by the polling event container and delivered to a service bean, while the notify event container pushes notifications and messaging events to another service bean.]
Polling Container
• Used for point-to-point messaging
• The container polls the space for events – comparable to the way Ajax polling works
[Diagram: the polling event container takes matching objects written to the space and hands them to the service bean.]
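Besides the XML wiring shown later in the deck, a polling container can also be declared with the OpenSpaces annotations – a minimal sketch, assuming annotation support is enabled in the processing unit:

@EventDriven @Polling
public class DataProcessor {
    // the event template: match only unprocessed Data objects
    @EventTemplate
    Data unprocessedData() {
        Data template = new Data();
        template.setProcessed(false);
        return template;
    }
    // invoked with each taken object; the returned object is written back
    @SpaceDataEvent
    public Data processData(Data data) {
        data.setProcessed(true);
        return data;
    }
}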
Notify Container
• Used for publish-subscribe messaging
• The space notifies the container when a matching object is written or updated
[Diagram: the notify event container delivers notifications from the space to the service bean.]
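The annotation-based equivalent for a notify container – again a minimal sketch under the same assumptions (the listener class and its logic are illustrative):

@EventDriven @Notify
public class ProcessedDataListener {
    // subscribe to Data objects that have been processed
    @EventTemplate
    Data processedData() {
        Data template = new Data();
        template.setProcessed(true);
        return template;
    }
    // called on each matching notification
    @SpaceDataEvent
    public void onProcessed(Data data) {
        System.out.println("Processed: " + data.getId());
    }
}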
Typical Application
Service Grid Summary
A powerful universal container:
• Java / .NET / C++
• Distributed
• Fault tolerant
• Object based
• Transactional
• Publish/subscribe
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Event Containers
The POJO-Based Data Domain Model

@SpaceClass(fifo = true)
public class Data {
    …
    @SpaceId(autoGenerate = true)
    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }
    public void setProcessed(boolean processed) {
        this.processed = processed;
    }
    public boolean isProcessed() {
        return processed;
    }
}

@SpaceClass marks the POJO as a space entry and carries class-level attributes such as fifo and persistent.
@SpaceId defines the key for the entry.
Data Processor Service Bean

public class DataProcessor {
    @SpaceDataEvent
    public Data processData(Data data) {
        …
        data.setProcessed(true);
        // returning the object updates it in the space
        return data;
    }
}

The @SpaceDataEvent annotation marks the method to be called when an event is triggered; the returned object is written back to the space.
Wiring the Data Processor Service Bean through Spring

<bean id="dataProcessor" class="com.gigaspaces.pu.example1.processor.DataProcessor" />

<os-events:polling-container id="dataProcessorPollingEventContainer" giga-space="gigaSpace">
    <os-events:tx-support tx-manager="transactionManager"/>
    <!-- the event template: match unprocessed Data objects -->
    <os-core:template>
        <bean class="org.openspaces.example.data.common.Data">
            <property name="processed" value="false"/>
        </bean>
    </os-core:template>
    <!-- the event listener: delegate to the annotated DataProcessor bean -->
    <os-events:listener>
        <os-events:annotation-adapter>
            <os-events:delegate ref="dataProcessor"/>
        </os-events:annotation-adapter>
    </os-events:listener>
</os-events:polling-container>
Data Feeder

public class DataFeeder {
    public void feed() {
        Data data = new Data(counter++);
        data.setProcessed(false);
        // feed the data into the space
        gigaSpace.write(data);
    }
}
Remoting – Taking One Step Forward
[Diagrams: the remote call is delivered as an event to the service; with multiple partitions, a reducer aggregates the partial results.]
Remoting – IDataProcessor Service API

public interface IDataProcessor {
    // process a given Data object
    Data processData(Data data);
}
Remoting – DataProcessor Service

@RemotingService
public class DataProcessor implements IDataProcessor {
    public Data processData(Data data) {
        …
        data.setProcessed(true);
        return data;
    }
}
Remoting – Data Feeder

public class DataFeeder {
    private IDataProcessor dataProcessor;
    public void setDataProcessor(IDataProcessor dataProcessor) {
        this.dataProcessor = dataProcessor;
    }
    public Data feed() {
        Data data = new Data(counter++);
        // remoting call – executed where the data is routed
        return dataProcessor.processData(data);
    }
}
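For completeness, the dataProcessor field is typically injected with a remoting proxy. A hedged sketch of building such a proxy programmatically with the OpenSpaces event-driven remoting configurer (the class name is from the OpenSpaces remoting package; verify it against your XAP version):

IDataProcessor dataProcessor =
    new EventDrivenRemotingProxyConfigurer<IDataProcessor>(gigaSpace, IDataProcessor.class)
        .proxy(); // invocations are written to the space and executed where the service runs
Data result = dataProcessor.processData(new Data());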
Summary
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Scale-Up Throughput Benchmark – Physical Deployment Topology
• Embedded (one machine, one process): client and GigaSpaces (8 spaces) running together on a Sun X4450
• Remote (multiple machines, multiple processes): a white-box client connected over a switched Ethernet LAN to two X4450s, each running GigaSpaces with 4 spaces, one per GSC
Scale-Up Throughput Benchmark – Embedded Mode
[Chart: x4450, embedded space, throughput (operations/sec) vs. number of client threads (1–30), 8 partitions. Peaks: 1.8 million reads/sec and 1.1 million writes/takes per second.]
Scale-Up Throughput Benchmark – Remote Mode
[Chart: x4450, remote space, throughput (operations/sec) vs. number of client threads (1–58), 4 partitions. Peaks: ~90,000 reads/sec and ~45,000 writes/takes per second.]
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Event Containers
Web Container on the Grid
Web Application – Pet Clinic
Classic Architecture – Step 1 – Request Submission
[Diagram: 1. a user click submits a request; 2. an Apache load balancer routes it to one of the Web PUs, whose proxy writes a task into the partitioned data grid (primaries replicated to backups, async mirroring to the database); 3. a service bean in the owning Processing Unit gets the request and invokes the service.]
Classic Architecture – Step 2 – Retrieve Results
[Diagram: 1. results are returned from the Processing Units to the Web PU, where a reducer aggregates them and generates the page; 2. the Apache load balancer routes the response; 3. the user gets the page.]
Web Application Benchmark Results – Capacity
[Chart: Pet Clinic web benchmark, latency (ms) vs. number of users, for 1, 2 and 3 servers.]
Web Application Benchmark Results – Capacity
[Chart: the same Pet Clinic benchmark at higher load – latency (up to ~3,500 ms) vs. number of users (50 to 5,000), for 1, 2 and 3 servers.]
Game Server
Space-Based Architecture – Game Server
[Diagram: game servers scale out around a partitioned space holding GameTable entries; the game servers intercept update events via notify and query the space.]
• Table Feeder – loads game tables into the partitioned spaces
• Publisher (lobby) and Publisher (II) – randomly update the game tables
• Game-table directory – game-table search, player search, pub/sub messaging
Space-Based Architecture – Game Server
[Diagram: game servers and publisher servers deployed on the GigaSpaces Service Grid (Java runtime); GameTable partitioned spaces, each with a physical backup, accessed via notify/query.]
• A continuous query runs per user
• 30,000 players uploaded across 6,000 tables
• Game tables updated randomly
Dynamic Repartitioning and Load Sharing
[Diagrams: (I) indexed notify/query templates over partitioned spaces in a single SLA-driven container; (II) the partitions redistributed across two SLA-driven containers.]
Scaling
• 2,000 tables / 10,000 players – throughput ~6K/sec
• 4,000 tables / 20,000 players – throughput ~12K/sec
• 6,000 tables / 30,000 players – throughput ~18K/sec
[Diagram: three partitioned spaces, each with a backup space, across three SLA-driven containers.]
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Challenges
• Distributed queries (joins, subqueries…)
– SELECT * FROM Person p WHERE p.name IN (SELECT * FROM Managers)
– SELECT * FROM Person p, Address a WHERE p.addressId = a.addressId AND a.street = 'MyStreet'
• Dynamic partitioning
– Consistent hashing, buckets
– Updating routing tables (proxy)
– Live queries
• Distributed transactions
• Cluster-of-clusters data integration over the WAN
• Integration with an external data source – e.g. a database (bottleneck)
Challenges (cont.)
• Integration with an external data source – e.g. a database (bottleneck)
• Indexing for complex event queries / blocking queries (notify)
• Cluster-status consensus (who is alive?)
• Even distribution of data
• Technical: how do you maintain 100K TCP connections?
• Cloud computing?
• LRU/LFU caching when the data set is too big?
• A scalable distributed lookup service
• Network split-brain
Agenda
• Preview
• The Problem
• The Solution
• Event Containers
• Example
• Benchmarks
• Customer Use Cases
• Challenges
• Summary / Q&A
Thank You!
Q&A
Appendix
SLA-Driven Deployment
SLA:
• Failover policy
• Scaling policy
• System requirements
• Space cluster topology
PU services bean definitions
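As an illustration, an SLA of this kind is typically declared in the processing unit's sla.xml. A minimal sketch using the OpenSpaces os-sla namespace (attribute names from the OpenSpaces SLA schema; treat the exact values as an assumption):

<os-sla:sla cluster-schema="partitioned-sync2backup"
            number-of-instances="2"
            number-of-backups="1"
            max-instances-per-vm="1">
    <!-- scaling/failover policies and system requirements would be declared here -->
</os-sla:sla>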
Continuous High Availability
[Diagram: on failure, the system fails over to the backup.]
Dynamic Partitioning = Dynamic Capacity Growth
[Diagram: three partitions, each with a primary (P) and a backup (B), deployed in GSCs across VMs; capacity grows from 2G to 4G, with a max capacity of 6G.]
• At some point VM 1’s free memory drops below 20% – it is about time to increase capacity: move partition 1 to another GSC and recover its data from the running backup.
• Later, partition 2 needs to move as well; after the move, its data is recovered from the backup.
Executors
Task Executors – Task Execution
Executing a task is done using the execute method:

AsyncFuture<Integer> future = gigaSpace.execute(new MyTask(2));
int result = future.get();

[Diagram: 1. the client proxy routes the task to a Processing Unit; 2. the task executes there; 3. the result is returned; 4. the client obtains it from the future.]
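MyTask itself is not shown on the slide; a minimal sketch of what it might look like, implementing the org.openspaces.core.executor.Task interface (the doubling logic is purely illustrative):

public class MyTask implements Task<Integer> {
    private final int value;
    public MyTask(int value) {
        this.value = value;
    }
    // executed inside the Processing Unit the task is routed to
    public Integer execute() throws Exception {
        return value * 2;
    }
}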
Task Executors – Task Routing
Routing a task can be done in three ways (see the sketch below):
1. Using the task itself
2. Passing a POJO to the execute method
3. Specifying a routing parameter in the execute method
[Diagram: the proxy routes the task to exactly one of the Processing Unit partitions; the result flows back to the client.]
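For option 1, the task can carry its own routing value via the @SpaceRouting annotation – a sketch extending the hypothetical MyTask above:

public class MyTask implements Task<Integer> {
    private final int value;
    public MyTask(int value) {
        this.value = value;
    }
    // the routing value determines which partition executes the task
    @SpaceRouting
    public Integer routing() {
        return value;
    }
    public Integer execute() throws Exception {
        return value * 2;
    }
}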
Task Executors – DistributedTask Execution
Executing a distributed task is done using the same execute method:

AsyncFuture<Integer> future = gigaSpace.execute(new MyDistTask());
int result = future.get();

[Diagram: the task is broadcast to all Processing Unit partitions; each returns a partial result, and a reducer aggregates them into the final result for the client.]
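MyDistTask is likewise not shown; a sketch implementing the DistributedTask interface, whose reduce method plays the reducer role from the diagram (the count-per-partition logic is an illustrative assumption):

public class MyDistTask implements DistributedTask<Integer, Integer> {
    // executed on every partition
    public Integer execute() throws Exception {
        return 1;
    }
    // aggregates the partial results on the client side
    public Integer reduce(List<AsyncResult<Integer>> results) throws Exception {
        int sum = 0;
        for (AsyncResult<Integer> result : results) {
            if (result.getException() != null) {
                throw result.getException();
            }
            sum += result.getResult();
        }
        return sum;
    }
}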
Task Executors – DistributedTask Routing
Routing a distributed task can be done:
1. In the same ways as with the plain Task interface
2. By broadcasting to all partitions
3. By specifying a number of routing parameters in the execute method
[Diagram: the task is sent to several partitions; the reducer aggregates the partial results for the client.]
Service Executors
IMDG Operations
IMDG Basic Operations
[Diagram: an application performs the following operations against the space:]
• Write / WriteMultiple
• Read / ReadMultiple
• Take / TakeMultiple
• Notify
• Execute
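The operation examples on the following slides assume a gigaSpace proxy. One way to obtain one programmatically, using the OpenSpaces configurers (the space URL is an assumption):

// connect to (or embed) a space and wrap it with the GigaSpace API
IJSpace space = new UrlSpaceConfigurer("jini://*/*/mySpace").space();
GigaSpace gigaSpace = new GigaSpaceConfigurer(space).gigaSpace();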
IMDG Access – Space Operations – Write
The write operation writes a new object to the space:
• Instantiate an object
• Set fields as necessary
• Write the object to the space

Auction auction = new Auction();
auction.setType("Bicycle");
gigaSpace.write(auction);
IMDG Access – Space Operations – Read
The read operation reads an object from the space:
• A copy of the object is returned; the original remains in the space
• Build a template/query (more on this later)
• Read a matching object from the space

Auction template = new Auction();
Auction returnedAuction = gigaSpace.read(template);
Object SQL Query Support
Supported options and queries:
• Operators: =, <>, <, >, >=, <=, [NOT] LIKE, IS [NOT] NULL, IN
• GROUP BY – performs DISTINCT on the POJO properties
• ORDER BY (ASC | DESC)

SQLQuery rquery = new SQLQuery(MyPojo.class,
    "firstName rlike '(a|c).*' or ago > 0 and lastName rlike '(d|k).*'");
Object[] result = space.readMultiple(rquery);

Dynamic query support:

SQLQuery query = new SQLQuery(MyClass.class, "firstName = ? or lastName = ? and ago > ?");
query.setParameters("david", "lee", 50);

Supported via the JDBC API: COUNT, MAX, MIN, SUM, AVG, DISTINCT, Blob and Clob, rownum, sysdate, table aliases, joins of 2 tables.
Not supported: HAVING, VIEW, TRIGGERS, EXISTS, BETWEEN, NOT, CREATE USER, GRANT, REVOKE, SET PASSWORD, CONNECT USER, ON; NOT NULL, IDENTITY, UNIQUE, PRIMARY KEY, Foreign Key/REFERENCES, NO ACTION, CASCADE, SET NULL, SET DEFAULT, CHECK; UNION, MINUS, UNION ALL; STDEV, STDEVP, VAR, VARP, FIRST, LAST; LEFT, RIGHT [INNER] or [OUTER] JOIN.
IMDG Access – Space Operations – Take
The take operation takes an object from the space:
• The matched object is removed from the space
• Build a template/query (more on this later)
• Take a matching object from the space

Auction template = new Auction();
Auction removedAuction = gigaSpace.take(template);
IMDG Access – Space Operations – Update
The update operation is equivalent to performing take and write, executed in a single atomic call:

AuctionItem item = new AuctionItem();
item.setType("Bicycle");
gigaSpace.write(item);
item = gigaSpace.read(item);
item.setType("Motorbike");
Object returnedObject = space.update(item, null,
        Lease.FOREVER, 2000L,
        UpdateModifiers.UPDATE_OR_WRITE);
IMDG Access – Space Operations – Batch API
Apart from the single-object methods, GigaSpaces also provides batch methods:
• writeMultiple – writes multiple objects
• readMultiple – reads multiple objects
• updateMultiple – updates multiple objects
• takeMultiple – reads multiple objects and deletes them
Notes:
• Performance of the batch operations is generally higher – each requires only one call to the space
• Can be used with template matching or SQLQuery
IMDG Access – Space Operations – Batch API
• writeMultiple writes the specified objects to the space:

Auction[] auctions = new Auction[] { new Auction(10), new Auction(20) };
auctions = gigaSpace.writeMultiple(auctions, 100);

• readMultiple reads all the objects matching the specified template from the space:

Auction auction = new Auction();
Auction[] auctions = gigaSpace.readMultiple(auction, 100);

• takeMultiple takes all the objects matching the specified template from the space:

Auction auction = new Auction();
Auction[] auctions = gigaSpace.takeMultiple(auction, 100);

• updateMultiple updates a group of specified objects:

Auction[] auctions = new Auction[] { new Auction(10), new Auction(20) };
auctions = gigaSpace.updateMultiple(auctions, 100);
IMDG Summary
A powerful shared-memory service:
• Distributed
• Fault tolerant
• Object based
• Single and batch operations
• Transactional