Message Driven Architecture
for Massive Service
Elastic Scalability, High Availability
2011.11.18
박혜웅
Massive Service
Think different
No good solution for all cases
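Good | Bad
The design looks nice. | It feels heavy on the ears.
There is no cord, so it is convenient. | The connection sometimes drops.
It keeps the ears warm in winter. | It makes the ears sweat in summer.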
Cloud Architecture
• Elastic Scalability
  – The system must be able to scale out and scale in quickly according to its load.
• High Availability
  – 99% and 99.999% availability are very different.
    • Availability = service uptime / total time
    • 99.999% (non-stop system)
      – downtime: 26 seconds/month (about 5 minutes/year)
      – e.g. a nuclear power plant
    • Scheduled service maintenance also counts as downtime.
    • Removing every Single Point Of Failure (SPOF) is important.
• Automatic Resource Management
  – The architecture must be able to scale according to the type of load.
    • Resources: CPU, MEM, Disk...
• Self-healing (automatic recovery)
Excerpted from p. 66 of "클라우드 컴퓨팅 구현 기술" (김형준 et al.)
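The downtime figures follow directly from the availability formula. A minimal sketch of the arithmetic, assuming a 30-day month and a 365-day year (the class and method names are illustrative only):

```java
final class Availability {
    /** Allowed downtime in seconds for an availability given in percent over a period of periodSeconds. */
    static double downtimeSeconds(double availabilityPercent, double periodSeconds) {
        return periodSeconds * (1.0 - availabilityPercent / 100.0);
    }

    public static void main(String[] args) {
        double month = 30 * 24 * 3600.0;  // assumption: 30-day month
        double year = 365 * 24 * 3600.0;  // assumption: 365-day year
        for (double a : new double[] {99.0, 99.9, 99.999}) {
            System.out.printf("%.3f%% -> %.1f s/month, %.1f min/year%n",
                    a, downtimeSeconds(a, month), downtimeSeconds(a, year) / 60.0);
        }
    }
}
```

At 99.999% this gives roughly 26 seconds per month and a little over 5 minutes per year, while 99% allows about 7.2 hours per month.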
What we need for Massive Service?
• coupled vs decoupled architecture
  – decoupled architecture
    • distributed coordinator
    • distributed data cache
    • distributed message queue
• systems for removing SPOF
  – for all systems
    • health-checking script
  – for RDBMS/NoSQL
    • Hadoop/HBase dual namenode (next version, 0.23)
    • MySQL cluster, MySQL replication (+ heartbeat), or MySQL multiple-master
  – for load balancer
• blocking vs non-blocking (synchronous vs asynchronous)
  – blocking (synchronous): easy to code, uses more resources
  – non-blocking (asynchronous): hard to code, uses fewer resources
• multi-thread (single-port) vs single-thread (multi-port)
  – advantage of single thread (cheap server)
    • no locking, no synchronization
    • easy to code
What we need for Massive Service?
• low cost
  – money
    • hardware based vs software based
    • commercial software vs free software
  – time
    • development & debugging
    • management
  – human resources
• performance tuning
  – Linux options
    • ulimit, ...
  – JVM options
    • Xms, Xmx, GC option
  – socket options
    • TCP_NODELAY, SEND/RECV_BUFFERSIZE...
  – stress test
    • the number of processes, threads (each system)
  – RDBMS/NoSQL options
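As an illustration of the socket options above, here is a minimal sketch using the standard java.net.Socket setters. The class name, helper name, and the 64 KB buffer size are assumptions for the example, not recommended values.

```java
import java.io.IOException;
import java.net.Socket;

final class SocketTuning {
    /** Applies the socket options named on the slide; bufferBytes is an arbitrary example value. */
    static Socket tune(Socket socket, int bufferBytes) throws IOException {
        socket.setTcpNoDelay(true);               // TCP_NODELAY: disable Nagle's algorithm
        socket.setSendBufferSize(bufferBytes);    // SO_SNDBUF: a hint, the OS may adjust it
        socket.setReceiveBufferSize(bufferBytes); // SO_RCVBUF: a hint, the OS may adjust it
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (Socket s = tune(new Socket(), 64 * 1024)) {
            System.out.println("tcpNoDelay=" + s.getTcpNoDelay()
                    + " sndbuf=" + s.getSendBufferSize()
                    + " rcvbuf=" + s.getReceiveBufferSize());
        }
    }
}
```

The buffer sizes that are actually applied depend on the operating system, so they are worth reading back and checking under a stress test rather than assuming.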
What we need for leaving work on time (칼퇴근)?
• experts for each technical area = DRI (Directly Responsible Individual, as at Apple Inc.)
  – coding & interface
    • code convention, design pattern, UML
  – DB & storage
    • RDBMS (MySQL, MyBatis)
    • NoSQL (HBase)
    • storage (DAS, NAS, HDFS, Haystack ...)
  – network & threading
    • Java NIO, Netty
  – util software
    • Google Protocol Buffer, Guice, Log4j, SLF4J, XStream, Jackson, JavaMail, ....
  – distributed system software
    • coordinator (ZooKeeper)
    • cache server (Redis, Memcached, Ehcache)
    • queue server (RabbitMQ, ZeroMQ)
  – data analysis
    • MapReduce, machine learning
  – system management
    • Linux, monitoring tools, JMX
  – hardware
    • L4 switch
What we need for leaving work on time (칼퇴근)?
• fast & easy development/debugging
  – good architecture
    • system architecture
    • design pattern
    • code convention
  – common util classes
    • Apache Commons, Google Guava,...
  – Test Driven Development (TDD)
    • JUnit
  – well-known system or not?
    • RDBMS vs NoSQL
    • JSON vs Google Protocol Buffer
    • JUnit vs Guice
• easy management
  – logging system
    • logging, collecting, parsing, log visualization
  – JMX
  – Admin/Monitoring tools or web pages
many Kinds of Decoupling
• decoupling (removing) of SPOF and our system
  – Distributed Coordinator
  [Figure: processes, SPOFs, Coordinator]
• decoupling of business logic and data
  – Distributed Cache
  [Figure: processes (logic, data), DB, Cache (data)]
• decoupling of function and control (message)
  – Message Queue
  [Figure: processes (function), Queue (message)]
the steps of Decoupling (step1)
• Distributed Coordinator
  – registry: important data (small size)
    • server status
    • server configuration
    • common data
  – removing SPOF from our system
[Figure: Coordinator (registry); processes (function, data, registry); DB]
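The registry idea can be sketched with a small in-memory stand-in. This is a hypothetical illustration and not the ZooKeeper API: the class, its methods, and the heartbeat timeout are all assumptions. A real coordinator is itself replicated so that it does not become a new SPOF.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a coordinator registry (not the ZooKeeper API). */
final class Registry {
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>(); // server status
    private final Map<String, String> config = new ConcurrentHashMap<>();      // server configuration, common data
    private final long timeoutMillis;

    Registry(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void heartbeat(String serverId, long nowMillis) { lastHeartbeat.put(serverId, nowMillis); }

    void putConfig(String key, String value) { config.put(key, value); }

    String getConfig(String key) { return config.get(key); }

    /** A server counts as alive while its last heartbeat is within the timeout. */
    List<String> aliveServers(long nowMillis) {
        List<String> alive = new ArrayList<>();
        lastHeartbeat.forEach((id, t) -> { if (nowMillis - t <= timeoutMillis) alive.add(id); });
        Collections.sort(alive);
        return alive;
    }

    public static void main(String[] args) {
        Registry registry = new Registry(3_000);
        registry.putConfig("chat.queue", "queue-1");
        registry.heartbeat("server-a", 0);
        registry.heartbeat("server-b", 0);
        registry.heartbeat("server-a", 4_000);            // server-b stops reporting
        System.out.println(registry.aliveServers(5_000)); // only server-a is within the timeout
    }
}
```

Because every process reads status and configuration from the registry, no process needs a local configuration file or a direct status table in the DB.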
the steps of Decoupling (step2)
• Distributed Data Cache
  – fast read/write in memory
    • 10~100 times faster than a DB query.
  – alleviate DB overload
    • read query: read the cache instead of the DB.
    • write query: update the cache first, then lazily update the DB through a write-through queue.
  – remove duplicated data
  – remove the overhead of data synchronization among processes.
  – fault tolerant system
    • the service keeps running no matter which process in the same cluster terminates.
[Figure: Coordinator (registry); cluster of processes (function); Cache (data); DB (data)]
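The read and write paths above can be sketched as follows. This is a single-threaded, in-memory illustration under assumed names: a plain Map stands in for the DB, and the deferred DB update (a style often called write-behind) is drained by an explicit flush() call rather than a background worker.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch: reads are served from memory, writes reach the "DB" lazily. */
final class LazyWriteCache {
    private final Map<String, String> db;                       // stand-in for the real DB
    private final Map<String, String> cache = new HashMap<>();
    private final Queue<String> dirtyKeys = new ArrayDeque<>(); // pending DB updates
    int dbReads = 0;

    LazyWriteCache(Map<String, String> db) { this.db = db; }

    String get(String key) {
        if (!cache.containsKey(key)) { // cache miss: load once from the DB
            dbReads++;
            cache.put(key, db.get(key));
        }
        return cache.get(key);
    }

    void put(String key, String value) {
        cache.put(key, value);         // visible to readers immediately
        dirtyKeys.add(key);            // the DB update is deferred
    }

    /** Drains the pending updates; a real system would do this in a background worker. */
    void flush() {
        String key;
        while ((key = dirtyKeys.poll()) != null) {
            db.put(key, cache.get(key));
        }
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("room:1", "lobby");
        LazyWriteCache c = new LazyWriteCache(db);
        c.get("room:1");
        c.get("room:1");
        System.out.println("DB reads: " + c.dbReads);               // the second read hits the cache
        c.put("room:1", "lounge");
        System.out.println("DB before flush: " + db.get("room:1")); // still the old value
        c.flush();
        System.out.println("DB after flush: " + db.get("room:1"));
    }
}
```

The trade-off is that the DB lags behind the cache until the queue is drained, so the queue must survive a crash if no write may be lost.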
the steps of Decoupling (step3)
• Distributed Message Queue
  – scale out (elastic scalability)
    • auto scaling by fan-out exchange rule.
    • light-weight processes (daemons).
  – fault tolerant system
    • even when all processes have terminated, the message queue server preserves the messages.
  – prevent server overload or failure.
    • but processing is delayed (lazy processing).
  – system monitoring
    • just monitor the queue status.
[Figure: Coordinator (registry); Cache (data); cluster of processes (function) connected through Queues (message); DB (data)]
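Two of the properties above, messages surviving while no consumer is running and scaling out by adding consumers, can be illustrated with an in-process queue. This is only a sketch under assumptions: java.util.concurrent.BlockingQueue stands in for a real queue server, and the class and method names are made up for the example.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

final class WorkQueueDemo {
    /** Publishes messages before any worker exists, then drains them with the given number of workers. */
    static Set<Integer> run(int messages, int workers) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < messages; i++) {
            queue.put(i);                               // no consumer yet: the queue holds the messages
        }
        Set<Integer> processed = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(messages);
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        processed.add(queue.take());    // each message is taken by exactly one worker
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // worker shut down
                }
            });
            t.setDaemon(true);                          // light-weight daemon worker
            t.start();
        }
        done.await();                                   // wait until every message has been handled
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed " + run(100, 4).size() + " messages with 4 workers");
    }
}
```

Adding capacity only means starting more workers against the same queue, and the queue length is the single number to monitor.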
Scale Out
[Figure: clusters of nodes for the Coordinator (registry), Cache (data), Queue (message) and DB (data); processes (function) for task #1 and task #2 are attached to the Queues (work queue) with n connections]
SEDA vs Message Driven Architecture
[Figure, SEDA: one process with a data/heap area (global variable); threads (function) linked by Queues (event); DB (data)]
[Figure, MDA: a service made of nodes; processes (function) linked by Queues (message); Cache (data); DB (data); Coordinator (registry)]
code of Message Driven Architecture
• simple chatting service (simple client-server based model vs MDA)

/** Simple Client-Server Model: inter-server networking (p2p) **/
/* Send Thread */
myInfo = xml.getInfo(xmlFile);        // from local file
db.setAlive(myInfo);                  // updates server status
servers = connectAll(relayServers);   // connects to other servers
while ((input = client.getInput()) != null) {
    roomInfo = localData.getRoomInfo(client.userId);
    for (userId : roomInfo.getUserIds()) {
        for (server : servers) {
            if (server.hasUser(userId))
                server.send(userId, input);
        }
    }
}
/* Receive Thread */
while (true) {
    message = socket.receive();                  // from other server
    user = localData.getUser(message.userId);    // from local
    client = getClient(message.userId);
    client.send(user.name + ":" + message.input);
}

/** Message Driven Architecture: queueing/dequeuing (work queue) **/
/* Send Thread (Process) */
myInfo = Zookeeper.getInfo(zookeeperList, myIp, myPort);
Zookeeper.setAlive(myInfo);
queue = Queue.getQueue(myInfo.queue);
cache = Cache.getCache(myInfo.cache);
while ((input = client.getInput()) != null) {
    roomInfo = cache.getRoomInfo(client.userId);
    for (userId : cache.getUserIds(roomInfo.no)) {
        queue.publish(new Message(userId, input));
    }
}
/* Receive Thread (Process) */
while (true) {
    message = queue.consume();                   // from queue
    user = cache.getUser(message.userId);        // from cache
    client = getClient(message.userId);
    client.send(user.name + ":" + message.input);
}
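The MDA half of the pseudocode can be turned into a small runnable sketch. Everything here is an assumption made for illustration: plain maps stand in for the shared cache, a BlockingQueue stands in for the queue server, and the class and method names are hypothetical. One deliberate deviation: the message also carries the sender's id so that the receiver can prefix the sender's name, whereas the Message on the slide holds only the recipient id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class ChatSketch {
    static final class Message {
        final String toUserId, fromUserId, input;
        Message(String toUserId, String fromUserId, String input) {
            this.toUserId = toUserId; this.fromUserId = fromUserId; this.input = input;
        }
    }

    // Hypothetical stand-ins for the shared cache and the queue server.
    final Map<String, Integer> roomOfUser = new HashMap<>();
    final Map<Integer, List<String>> usersInRoom = new HashMap<>();
    final Map<String, String> userNames = new HashMap<>();
    final BlockingQueue<Message> queue = new LinkedBlockingQueue<>();

    void join(String userId, String name, int roomNo) {
        userNames.put(userId, name);
        roomOfUser.put(userId, roomNo);
        usersInRoom.computeIfAbsent(roomNo, k -> new ArrayList<>()).add(userId);
    }

    /** Send side: one message per room member, without knowing which server hosts whom. */
    void send(String senderId, String input) throws InterruptedException {
        int roomNo = roomOfUser.get(senderId);
        for (String userId : usersInRoom.get(roomNo)) {
            queue.put(new Message(userId, senderId, input));
        }
    }

    /** Receive side: take one message from the queue and render the line for its recipient. */
    String receiveOne() throws InterruptedException {
        Message m = queue.take();
        return "to " + m.toUserId + " <- " + userNames.get(m.fromUserId) + ":" + m.input;
    }

    public static void main(String[] args) throws InterruptedException {
        ChatSketch chat = new ChatSketch();
        chat.join("u1", "Alice", 1);
        chat.join("u2", "Bob", 1);
        chat.send("u1", "hi");
        System.out.println(chat.receiveOne());
        System.out.println(chat.receiveOne());
    }
}
```

The send side never opens a connection to another server; it only writes to the queue, which is the difference from the p2p loop in the client-server version.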
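Summary
Developer perspective
Aspect | Client-Server Based | Message Driven
System/role division | per service | per function (e.g. API, file, DB, logging, ....)
Individual expertise | business logic (service flow) | technical knowledge
Service development | individual | collaborative
Preliminary development documents (required) | development can start without them | inter-process integration documents are required: inter-process interface (queue), shared data schema (cache)
Communication within the team | weak (each person runs their own project) | close (most team members must agree for a single service)
Coordination with other departments (PM) | all developers | a few designated people
[Figure: planning/marketing team, design team and client team talk to the PM (service manager); API Part, Logic Part and DB Part are linked by inter-process interfaces (queue) and a shared data schema (cache)]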
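Summary
System perspective
Aspect | Client-Server Based | Message Driven
Inter-server complexity | very complex (every server must connect to every other server) | less complex (connect only to the coordinator, cache and queue)
Scalability/efficiency | low (unneeded logic also runs) | high (only processes with simple logic run)
Server update | hard (only a full patch is possible) | easy (the queue server can hold tasks temporarily; processes for the newer version can be started in advance)
Service failure | the whole service fails | partial failure (depends on the size of the logic)
Protocol change | easy (redefine a function) | hard (the message schema must be shared)
Server status/logging | per service (per individual) | centralized (only the queue server needs monitoring/logging)
Business logic | any business logic is possible | business logic that needs loops or rollback is difficult
Code complexity | complex | simple (small units of logic)
Coding style | differs per service | differs per function
Thread/Worker model | various models coexist within one process | each type of process uses its own thread model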
Appendix
Think deeply
Single-thread vs Multi-thread
• Multi-thread
  – I/O intensive task (blocked task)
  [Figure: one process with several threads; DB (data)]
• Single-thread
  – CPU/Mem intensive task (non-blocked task)
  [Figure: several processes with one thread each; MEM (data)]
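Why blocked tasks favor several threads can be shown with a small timing sketch. It assumes Thread.sleep as a stand-in for a DB or network wait, and the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadModels {
    /** Runs blocking tasks (simulated with sleep) on the given number of threads; returns elapsed millis. */
    static long runBlockingTasks(int tasks, int threads, long blockMillis) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Callable<Void>> work = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            work.add(() -> { Thread.sleep(blockMillis); return null; }); // stands in for a DB or network wait
        }
        long start = System.nanoTime();
        pool.invokeAll(work);                                            // returns once every task has finished
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        pool.shutdown();
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 thread:  " + runBlockingTasks(4, 1, 100) + " ms");
        System.out.println("4 threads: " + runBlockingTasks(4, 4, 100) + " ms");
    }
}
```

With one thread the waits add up, while with four threads they overlap. For CPU- or memory-bound work there is no waiting to overlap, so one thread per process avoids locking without losing throughput.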