Team 1: Box Office
17-654: Analysis of Software Artifacts
18-846: Dependability Analysis of Middleware
JunSuk Oh, YounBok Lee, KwangChun Lee, SoYoung Kim, JungHee Jo
Electrical & Computer Engineering
Team Members
JunSuk Oh
YounBok Lee
KwangChun Lee
SoYoung Kim
JungHee Jo
http://www.ece.cmu.edu/~ece846/team1/index.html
2
Baseline Application
• System description
– Box Office is a system that lets users search for movies and reserve tickets
• Base Features
– A user can login
– A user can search movies
– A user can reserve tickets
• Configuration
– Operating System
• Server: Windows 2000 Server, Windows XP Professional
• Client: Windows XP Professional
– Language
• Java SDK 1.4.2
– Middleware
• Enterprise Java Beans
– Third-party Software
• Database: MySQL
• Web Application Server: JBoss
• Java IDE: Eclipse, NetBeans
• J2EE Eclipse Plug-in: Lomboz
3
Baseline Application - Configuration Selection Criteria
• Operating System
– Easier to set up the development environment than on a Linux cluster
– Easier to manage by ourselves
• JBoss
– Environment is supported by teaching assistants
• EJB
– Popular technology in the industry, members’ preference
• MySQL
– Easy to install and use
– Easy to find development documentation
• Eclipse
– All team members have experience in this technology
• Lomboz
– Lets Java developers build, test, and deploy J2EE applications
4
Baseline Architecture
[Architecture diagram: a three-tier design. The Client Tier holds a client pool that locates the Session Bean via JNDI lookup and calls it over RPC. The Middle Tier contains the Session Bean and the Entity Beans (cardinfo, login, movie, reserv, session, user). The DB Tier is the database; each Entity Bean maps to a table and performs the DB access.]
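To make the client-to-middle-tier path concrete, a minimal client-side sketch of the JNDI lookup and remote call is shown below. The JNDI name "ejb/MovieSession" and the MovieSession/MovieSessionHome interfaces are illustrative assumptions, not the project's actual bean names.

```java
import javax.naming.InitialContext;
import javax.rmi.PortableRemoteObject;

public class BoxOfficeClient {
    public static void main(String[] args) throws Exception {
        // Look up the session bean's home interface through JNDI
        // (the JNDI name and the MovieSession interfaces are hypothetical).
        InitialContext ctx = new InitialContext();
        Object ref = ctx.lookup("ejb/MovieSession");
        MovieSessionHome home = (MovieSessionHome)
                PortableRemoteObject.narrow(ref, MovieSessionHome.class);

        // Create a stateless session bean instance and invoke it remotely.
        MovieSession session = home.create();
        System.out.println(session.searchMovies("The Matrix"));
        session.remove();
    }
}
```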
5
Fault-Tolerance Goals
• Replication Style
– Passive Replication
• Approach
– Replication
• 2 replicas are located on separate machines
– Sacred Components
• Replication manager
• Database
• Client
– Fault Detector
• Client
– State
• All beans are stateless
• State is stored in the database
6
FT-Baseline Architecture
[FT architecture diagram: the sacred client side holds Clients 1..n, the Replication Manager, and the Database. The fault-tolerant server side consists of Machine 1 running the primary replica and Machine 2 running the backup replica, each with its own JNDI service and a Factory used to (re)launch the replica.]
7
Mechanisms for Fail-Over (1)
• Fault Injector
  – Periodically kills each replica in turn (every 1 minute)
• Replication Manager
  – 10 seconds after a server fails, the Replication Manager invokes the factory to relaunch the failed replica
• Fail-over mechanism
  – Fault detection
  – Replica location
  – Connection establishment
  – Retry
[Fail-over mechanism diagram: the Fault Injector injects a fault into the primary replica; the Replication Manager invokes the factories to relaunch the replica. Message flow: 1. the client sends a request, 2. the server fails, 3. a connection is established to the backup replica, 4. the client retries the request.]
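A minimal sketch of the Replication Manager's relaunch loop described above, assuming hypothetical Factory and Replica interfaces (the project's actual factory API is not preserved on the slide):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/**
 * Sketch of the Replication Manager: poll each replica, and 10 seconds
 * after a failure ask the factory on that machine to relaunch it.
 * The Factory/Replica interfaces are assumptions for illustration.
 */
public class ReplicationManager {
    interface Factory { void relaunchReplica(); }
    interface Replica { boolean isAlive(); }

    private final Map replicas = new HashMap(); // Replica -> Factory

    void monitor() throws InterruptedException {
        while (true) {
            for (Iterator it = replicas.entrySet().iterator(); it.hasNext();) {
                Map.Entry e = (Map.Entry) it.next();
                Replica r = (Replica) e.getKey();
                if (!r.isAlive()) {
                    // Wait 10 seconds after the failure, then relaunch.
                    Thread.sleep(10 * 1000);
                    ((Factory) e.getValue()).relaunchReplica();
                }
            }
            Thread.sleep(1000); // polling interval (assumed)
        }
    }
}
```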
8
Mechanisms for Fail-Over (2)
• Fault Detection
– Exception handling by Client
• RemoteException – NoSuchObjectException, ConnectException (RMI)
• NameNotFoundException (JNDI Failure)
• Replica location
– The client knows which servers it can request service from
• Connection establishment
– Get a connection to the new replica
– The server reference must be looked up:
  • When the client requests the service for the first time
  • When the client detects a server failure and retries the request on the other server
– The client retries the request on the backup replica until the service becomes available (a client-side sketch follows this list)
• Retry
– Request service again
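The sketch below shows this fail-over loop on the client side: detect the fault via the listed exceptions, look up the other replica, and retry until the call succeeds. The provider URLs, the JNDI name, and the MovieSession interfaces are illustrative assumptions.

```java
import java.rmi.RemoteException;
import java.util.Properties;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NameNotFoundException;
import javax.rmi.PortableRemoteObject;

public class FailoverClient {
    // JNDI provider URLs of the primary and backup replicas (hypothetical hosts).
    private static final String[] REPLICAS =
            { "jnp://machine1:1099", "jnp://machine2:1099" };
    private int current = 0;

    public String reserveWithFailover(String movie) throws Exception {
        while (true) {
            try {
                return lookup(REPLICAS[current]).reserveTicket(movie);
            } catch (RemoteException e) {         // covers NoSuchObject/Connect
                current = (current + 1) % REPLICAS.length;   // fail over
            } catch (NameNotFoundException e) {   // JNDI failure
                current = (current + 1) % REPLICAS.length;   // fail over
            }
            // Loop: retry against the other replica until the service responds.
        }
    }

    private MovieSession lookup(String providerUrl) throws Exception {
        Properties env = new Properties();
        env.put(Context.PROVIDER_URL, providerUrl);
        Object ref = new InitialContext(env).lookup("ejb/MovieSession");
        MovieSessionHome home = (MovieSessionHome)
                PortableRemoteObject.narrow(ref, MovieSessionHome.class);
        return home.create();
    }
}
```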
9
Failover Mechanism (3) - Avoid Duplicate Transaction
• Target case
  – The transaction is stored in the DB, but the result cannot be returned to the client
[Diagram of the normal flow: 1. the client sends a service request, 2. the server stores the transaction in the DB, 3. the result is returned, 4. the server informs the client. In the target case the server fails between steps 2 and 4.]
• Mechanism
[Diagram: the client, Replica 1, Replica 2, and the shared database.]
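The slide does not preserve the details of the mechanism, so the following is only a hedged sketch of one common way to avoid the duplicate: the client attaches an ID to each reservation request and the server checks the database for that ID before inserting, so a retried request after fail-over is not stored twice. The column names and helper are hypothetical, not the project's actual schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Sketch of duplicate-transaction avoidance via a client-supplied request ID. */
public class ReservationDao {
    public void reserveOnce(Connection con, String requestId,
                            String userId, int movieId) throws SQLException {
        // If a retried request already reached the DB, do nothing.
        PreparedStatement check = con.prepareStatement(
                "SELECT 1 FROM reserv WHERE request_id = ?");
        check.setString(1, requestId);
        ResultSet rs = check.executeQuery();
        if (rs.next()) {
            return; // duplicate retry after fail-over; already stored
        }
        PreparedStatement insert = con.prepareStatement(
                "INSERT INTO reserv (request_id, user_id, movie_id) VALUES (?, ?, ?)");
        insert.setString(1, requestId);
        insert.setString(2, userId);
        insert.setInt(3, movieId);
        insert.executeUpdate();
    }
}
```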
10
Fail-Over Measurements
• Round Trip Time in Failover (14 Fault Injections)
  – High peaks: RemoteException
  – Low peaks: NameNotFoundException
[Plot: RTT (ms, log scale) versus # of Invocations (0–100), showing the high and low fail-over peaks.]
11
Fail-Over Measurements
Decomposition of RTT in Failover (Low Peaks): FD 16 ms (7%), CE 82 ms (38%), Retry 116 ms (55%)
Decomposition of RTT in Failover (High Peaks): FD 7661 ms (97%), CE 93 ms (1%), Retry 123 ms (2%)
FD: Fault Detection, CE: Connection Establishment
12
RT-FT-Baseline Architecture
• Two steps to the optimization
  – Step 1: Reduce the connection establishment time
    • The client needs to reconnect to an available replica after fault detection
    • Pre-established connection: a Connector on the client side maintains a connection to each replica in the background
    ► The reconnection time disappeared, but the graph still shows spikes due to the time spent catching the connection exception
  – Step 2: Reduce the fault detection time
    • Reduce the time spent catching exceptions (RemoteException – NoSuchObjectException, ConnectException)
    • A fault detector on the client side updates the status of the replicas periodically
    • Clients therefore know the status of the replicas beforehand (a sketch follows this list)
    ► Eliminates the fault detection time as well as the spikes
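A minimal sketch of the client-side local fault detector described in step 2. It assumes that "pinging" a replica can be done by opening a socket to its JNDI port; the host names are placeholders.

```java
import java.net.Socket;

/**
 * Background thread that periodically checks each replica and records its
 * status, so the client can pick a live replica without waiting for an
 * exception. Hosts, port, and ping method are assumptions.
 */
public class LocalFaultDetector extends Thread {
    private final String[] hosts = { "machine1", "machine2" };
    private final boolean[] alive = { false, false };

    public void run() {
        while (true) {
            for (int i = 0; i < hosts.length; i++) {
                try {
                    Socket s = new Socket(hosts[i], 1099); // JNDI port (assumed)
                    s.close();
                    alive[i] = true;
                } catch (Exception e) {
                    alive[i] = false;   // replica considered failed
                }
            }
            try { Thread.sleep(500); } catch (InterruptedException e) { return; }
        }
    }

    /** Index of a replica currently believed to be alive, or -1 if none. */
    public int liveReplica() {
        for (int i = 0; i < alive.length; i++) {
            if (alive[i]) return i;
        }
        return -1;
    }
}
```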
13
RT-FT-Baseline Architecture
[Diagram: the client contains a Connector and a local fault detector (Local FD). The Connector establishes connections to Replica 1 and Replica 2 in the background; the Local FD pings both replicas to check their status and updates statusServer1/statusServer2, which the client checks before sending a request.]
14
Bounded “Real-Time” Fail-Over Measurements
• Fail-over graphs after optimization step 1
[Plot: fail-over RTT (ms) per invocation after optimization step 1.]
15
Bounded “Real-Time” Fail-Over Measurements
• Fail-over graphs after optimization step 2
[Plot: fail-over RTT (ms) per invocation after optimization step 2.]
16
Analysis on Fail-over Optimization
[Pie charts: decomposition of fail-over RTT for low peaks and high peaks, with the reduced parts highlighted.]

Low Peaks              FD       CE      Retry
Before optimization    16 ms    82 ms   116 ms
After optimization     0 ms     0 ms    104 ms
Reduction              100%     100%    10.34%

High Peaks             FD       CE      Retry
Before optimization    7661 ms  93 ms   123 ms
After optimization     0 ms     0 ms    104 ms
Reduction              100%     100%    15.45%

FD: Fault Detection, CE: Connection Establishment
17
High Performance: Load Balancing
• Distribute clients' requests among multiple servers
• A separate load balancer controls access to the servers
• Strategy
  – Static load balancing
    • Round robin: assign the servers to clients in turn
  – Dynamic load balancing
    • The load balancer periodically checks the current number of clients on each server
    • It dynamically assigns a server to each client
  – Simulation strategy (a sketch of both balancing policies follows this list)
    • Measure the RTT of the actual servers A and B
    • Move to the simulation environment
    • Find a working load balancing strategy
    • Confirm the load balancing strategy in the actual environment
    • Find alternative load balancing strategies
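A minimal sketch of the two balancing policies named above: static round robin and a dynamic policy that picks the server with the fewest clients. The server names and the way client counts are obtained are assumptions; in the project the counts would come from periodically polling each server.

```java
/** Sketch of the static (round-robin) and dynamic (least-clients) policies. */
public class LoadBalancer {
    private final String[] servers = { "ServerA", "ServerB" }; // placeholders
    private final int[] clientCount = new int[servers.length];
    private int next = 0;

    /** Static policy: assign the servers in turn. */
    public String roundRobin() {
        String s = servers[next];
        next = (next + 1) % servers.length;
        return s;
    }

    /** Dynamic policy: assign the server currently holding the fewest clients. */
    public String leastClients() {
        int best = 0;
        for (int i = 1; i < servers.length; i++) {
            if (clientCount[i] < clientCount[best]) best = i;
        }
        clientCount[best]++;
        return servers[best];
    }
}
```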
18
Load Balancing Strategy
[Diagram, Strategy 1 (Round Robin): the load balancer assigns Clients 1..N to Replica A and Replica B in turn.]
[Diagram, Strategy 2 (check for # of clients): the load balancer asks each replica how many clients it currently serves (one answers ten, the other two); when a client asks which server to use, the balancer answers Server B, the less loaded replica.]
19
Performance Measurements
[Plot "Load Balance Test – RTT of a Client": RTT (ms) versus # of Clients (0–50), comparing a single server, Load Balance 1 (Round Robin), and Load Balance 2.]
20
Load Balancing Strategy
• Develop load balancing strategies using historical data and a simulation system
• Test load balancing strategies in the simulation environment
• Predict load balancing strategy performance
[Diagram: RTT samples (Sample i,1 … Sample i,n) collected from Server A and Server B for Clients 1–50 feed data collection, load balancing algorithm development (Min-Max load balancing, random load balancing, round-robin algorithm), and algorithm performance prediction; histograms of the collected samples are shown.]
21
More on Strategy
• Consider X clients, each with measured RTT samples (Sample i,1 … Sample i,n) against Server A and Server B
• Min-Max algorithm: allocate Y clients to Server A and X−Y clients to Server B, drawing random samples from each server's measurements
• Repeat the allocation 1000 times and compute the average RTT of the Y clients, the average RTT of the X−Y clients, and the overall average RTT (a simulation sketch follows)
[Plot "Comparison of Load Balancing Strategy": average RTT (ms) versus client count (10–50).]
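A hedged sketch of the simulation loop implied above: for each split of X clients between the two servers, random RTT samples are drawn from each server's measured data and averaged over 1000 repetitions. The sample arrays and the choice of the lowest-average split are illustrative assumptions.

```java
import java.util.Random;

/** Monte Carlo sketch of the load-balancing allocation simulation. */
public class LoadBalanceSimulation {
    private static final Random RNG = new Random();

    /** Average simulated RTT when y of x clients go to Server A. */
    static double averageRtt(double[] samplesA, double[] samplesB, int x, int y) {
        double total = 0;
        for (int rep = 0; rep < 1000; rep++) {
            double sum = 0;
            for (int c = 0; c < y; c++)       // clients served by Server A
                sum += samplesA[RNG.nextInt(samplesA.length)];
            for (int c = 0; c < x - y; c++)   // clients served by Server B
                sum += samplesB[RNG.nextInt(samplesB.length)];
            total += sum / x;
        }
        return total / 1000;
    }

    /** Pick the split y with the lowest simulated average RTT (assumed criterion). */
    static int bestSplit(double[] samplesA, double[] samplesB, int x) {
        int best = 0;
        double bestRtt = Double.MAX_VALUE;
        for (int y = 0; y <= x; y++) {
            double rtt = averageRtt(samplesA, samplesB, x, y);
            if (rtt < bestRtt) { bestRtt = rtt; best = y; }
        }
        return best;
    }
}
```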
22
Server A & B Performance Measurements (RTT)
[Plots: measured RTT (ms) of Server A and Server B per client (Client # 1–49), each ranging from about 0 to 2000 ms.]
23
Performance Measurements (II)
[Plots: Server A and Server B RTT (ms) per client, and "Comparison of Load Balancing Strategy" curves of average RTT (ms) versus client count (10–50) for Min-Max load balancing, random load balancing, and LP load balancing.]
24
Other Features
[Diagram: experimental data from Server A and Server B feed algorithm testing with empirical data and parameter updates, which drives a load balancer intelligence update; Min-Max and random load balancing results are shown with histograms of the collected samples.]
25
Insights from Measurements
• FT
  – Two different types of peaks were measured, corresponding to different exceptions
• RT-FT
  – Connection establishment time was removed
    • Pre-established connections before failover
    • But a high peak still remained
  – Fault detection time was removed
    • A watchdog checks the servers before an exception has to be caught
• RT-FT Performance
  – Round robin is good for our situation
    • The servers have similar capacity
  – The load balancing algorithm can be selected according to the running environment
• Test Environment
  – Keep the environment clean to reduce jitter
26
What we learned & accomplished
• What we learned
  – How to handle JBoss
    • A first experience for the majority of team members
  – Careful analysis of the test results definitely saves time
  – How to control the factors to get better data
• What we accomplished
  – FT
    • Passive replication strategy
    • Duplicate transaction avoidance
  – RT-FT
    • Pre-established connection strategy
    • Local fault detector for checking server status beforehand
  – Performance
    • Implemented static load balancing
    • Implemented dynamic load balancing
    • Simulated several load balancing strategies
27
Open Issues & Future Challenge
• Open Issues
  – FindAll() doesn't work on JBoss on Linux
    • It works well on Windows
  – Implementing several load balancing strategies
    • Min-Max, LP (Linear Programming) algorithms
• Future Challenges
  – Separate the JNDI service
  – Get the server list from the Replication Manager dynamically
  – Try active replication
  – Try development without an IDE tool
28