THUCloud: Learn Cloud Computing by Doing

THUCloudDisk (清云网盘):
Learn Cloud Computing by Doing
Zhenhua Li
GreenOrbs Cloud Computing and Future Networks Group
http://www.greenorbs.org/people/lzh/
Dec. 26th, 2013
Cloud Storage Service
- Enabled by Cloud Computing & Internet Broadband
- Extremely popular in recent years





SkyDrive: 200 M users
Dropbox: 100 M users
Google Drive: numerous …
Apple iCloud: countless …
Box.com: 14 M users
Our Cloud Storage Research (1)
Started at the end of 2011
- Focus on the most representative service, i.e., Dropbox
- Black-box measurement: the Traffic Overuse Problem and the Computation Overuse Problem
- The resource cost users intuitively expect << the resource cost the system actually incurs
Our Cloud Storage Research (2)
- Solve the problems by developing middleware
  - Significantly reduce the traffic/computation overuse
- Modify the Linux kernel to thoroughly address this problem
- Zhenhua Li, et al. Efficient Batched Synchronization in Dropbox-like Cloud Storage Services. The 14th ACM/IFIP/USENIX International Middleware Conference (Middleware), Dec. 9-13, 2013, Beijing, China. (acceptance ratio: 24/128 = 18.8%)
- Zhenhua Li, et al. Is the Cloud Storage Service Traffic All Necessary? Understanding the Data Sync Traffic Usage Effectiveness. In submission.
Drawback of Our Research
Black-box measurement and middleware solutions are far from sufficient
What happens after the data packet dives into the cloud?
“Google Drive, SkyDrive and Dropbox do have problems. But have you considered the problems from a system design/tradeoff perspective?”
So the THUCloudDisk project started …
We are re-developing a small-scale Dropbox from scratch
- White-box measurement
- Full knowledge of the system
- Add any function we like
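Openstack
- The open-source equivalent of Amazon's cloud computing
- Known as “the Linux of cloud computing”
- Sina SAE (Weibo)
- Apart from Keystone, all the other components are independent
Be sure to use the official tutorial, even though it is very long ……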
http://www.thucloud.com
Four Potential Problems
1. Does RAID Conflict with the Cloud?
2. How to Properly Configure Openstack Parameters?
3. When Moving Servers to the Real Data Center
4. Numerous Smartphones Aggregate Data into the Cloud
Problem 1
Our HP servers have internal RAID cards, and the RAID function CANNOT be disabled
“RAID on the storage drives is not required and not recommended. Swift's disk usage pattern is the worst case possible for RAID, and performance degrades very quickly using RAID 5 or 6.”
(Openstack Swift official deployment manual)
Our findings:
1) A RAID penalty does exist for Swift
2) Only for some patterns of data streams
3) Only for some kinds of RAID
Goal: bound where RAID and Swift complement each other and where they conflict
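
Why a penalty can depend on both the RAID level and the data-stream pattern can be sketched with the textbook write-penalty model. This is an illustration only, not the measurement behind the findings above: the penalty factors are the usual rule-of-thumb values for small random writes, and the disk count and per-disk IOPS are hypothetical.

```python
# Textbook RAID write-penalty model (illustrative; not the talk's measurement).
# Values are back-end disk I/Os per small random front-end write.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}


def effective_iops(raw_iops: float, write_fraction: float, level: str) -> float:
    """Front-end IOPS an array can sustain for a given read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    penalty = WRITE_PENALTY[level]
    read_fraction = 1.0 - write_fraction
    # Each read costs one back-end I/O, each write costs `penalty` I/Os.
    return raw_iops / (read_fraction + write_fraction * penalty)


if __name__ == "__main__":
    raw = 8 * 150.0  # hypothetical array: 8 disks at 150 IOPS each
    for level in ("raid0", "raid10", "raid5", "raid6"):
        for wf in (0.0, 0.5, 1.0):
            iops = effective_iops(raw, wf, level)
            print(f"{level:6s} writes={wf:4.0%} -> {iops:7.1f} IOPS")
```

In this model a read-only stream sees no penalty at any RAID level, while a write-heavy stream on RAID 5 or 6 loses most of its throughput. Large sequential writes that fill whole stripes avoid the read-modify-write cycle, so the model overstates the penalty for them.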
Problem 2
How to properly configure Openstack parameters?
swift-ring-builder account.builder create 18 3 1
swift-ring-builder container.builder create 18 3 1
swift-ring-builder object.builder create 18 3 1
(Openstack Swift official deployment manual)
Thierry’s findings:
1) Some people discuss these parameters in their blogs
2) Sometimes the parameters are chosen very badly
3) But we do not know the rules …
Finding the rules of Openstack parameters
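
To make the three numbers in `create 18 3 1` concrete: they are the partition power, the replica count, and the minimum number of hours before a partition may be moved again. The sketch below only does the arithmetic. The helper names are made up for illustration, the drive counts are hypothetical, and the "about 100 partitions per drive at maximum cluster size" figure is the rule of thumb from the Swift deployment guide, not a rule derived in this talk.

```python
def describe_ring(part_power: int, replicas: int, min_part_hours: int) -> dict:
    """Spell out what `swift-ring-builder <file> create P R H` asks for."""
    partitions = 2 ** part_power
    return {
        "partitions": partitions,
        "replicas": replicas,
        "partition_replicas": partitions * replicas,
        "min_part_hours": min_part_hours,
    }


def partition_replicas_per_drive(part_power: int, replicas: int, drives: int) -> float:
    """Average number of partition replicas each drive has to hold."""
    return (2 ** part_power) * replicas / drives


def suggest_part_power(max_drives: int, parts_per_drive: int = 100) -> int:
    """Smallest power with 2**power >= max_drives * parts_per_drive."""
    target = max_drives * parts_per_drive
    return (target - 1).bit_length()


if __name__ == "__main__":
    print(describe_ring(18, 3, 1))
    # Hypothetical small cluster with 12 drives in total.
    print(partition_replicas_per_drive(18, 3, 12))
    print(suggest_part_power(12))
```

For a small cluster, a partition power of 18 puts tens of thousands of partition replicas on each drive, which is one way a copied parameter can turn out to be a poor fit.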
Problem 3
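When moving servers to the CERNET data center
1. The data center blocks all ports of hosted servers by default and opens only the ports you request
2. Servers generally cannot be rebooted, otherwise ……
3. After logging in, every hosted server in the data center can see only its LAN address
The Openstack official tutorial and the vast majority of online guides use public addresses
The Openstack official tutorial contains errors in several places, yet it is the only reliable tutorial
Our countermeasures:
1) Learn firewalls, port scanning, and NAT (network address translation) on the spot
2) Study every sentence and every command in the official tutorial over and over
3) When a problem cannot be solved remotely, we still have to go into the data center in person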
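
When ports are blocked by default, a quick reachability check is often the first debugging step. The following is a minimal sketch using only the Python standard library. The host and the port list are placeholders, not the actual THUCloudDisk configuration.

```python
import socket


def open_tcp_ports(host: str, ports, timeout: float = 0.5) -> list:
    """Return the ports on `host` that accept a TCP connection."""
    reachable = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            try:
                # connect_ex returns 0 on success and an errno otherwise.
                if sock.connect_ex((host, port)) == 0:
                    reachable.append(port)
            except OSError:
                # Name resolution failures and similar errors count as closed.
                pass
    return reachable


if __name__ == "__main__":
    # Placeholder target and ports; replace with the services you expect.
    print(open_tcp_ports("127.0.0.1", [22, 80, 443, 8080]))
```

Running the check once from inside the LAN and once from outside shows whether a failure comes from the service itself or from the firewall/NAT in between.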
Problem 4
Numerous smartphones aggregating data into Openstack Swift
- We find that almost all the data are appended to certain files
- But Swift can only create or delete files
As Dropbox does, we implemented an rsync layer between the clients and Swift
- However, we find that most of the traffic and computation overhead is unnecessary
- This is why we are now implementing an additional “virtual” APPEND API for Openstack Swift
“An integrated-service cloud platform suited to both static and dynamic data-change patterns”
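
One possible way to picture a “virtual” append on top of a create/delete-only store is to write each appended chunk as its own segment object and concatenate the segments on read. The sketch below is an assumption-laden illustration, not the THUCloudDisk implementation: `CreateDeleteStore`, `naive_append`, and `VirtualAppendFile` are hypothetical names, and the store is an in-memory stand-in rather than Swift itself.

```python
class CreateDeleteStore:
    """In-memory stand-in for a store that only creates, reads, or deletes whole objects."""

    def __init__(self):
        self.objects = {}
        self.bytes_uploaded = 0  # total payload sent to the store

    def put(self, name: str, data: bytes) -> None:
        self.objects[name] = bytes(data)
        self.bytes_uploaded += len(data)

    def get(self, name: str) -> bytes:
        return self.objects[name]

    def delete(self, name: str) -> None:
        del self.objects[name]


def naive_append(store: CreateDeleteStore, name: str, extra: bytes) -> None:
    """Emulate append by re-creating the whole object with the new tail."""
    old = store.objects.get(name, b"")
    store.put(name, old + extra)


class VirtualAppendFile:
    """Append by storing every new chunk as a separate segment object."""

    def __init__(self, store: CreateDeleteStore, name: str):
        self.store = store
        self.name = name
        self.segments = 0

    def _segment_name(self, index: int) -> str:
        return f"{self.name}/seg-{index:08d}"

    def append(self, extra: bytes) -> None:
        self.store.put(self._segment_name(self.segments), extra)
        self.segments += 1

    def read(self) -> bytes:
        parts = (self.store.get(self._segment_name(i)) for i in range(self.segments))
        return b"".join(parts)


if __name__ == "__main__":
    chunk = b"x" * 100
    naive, segmented = CreateDeleteStore(), CreateDeleteStore()
    log = VirtualAppendFile(segmented, "phone.log")
    for _ in range(3):
        naive_append(naive, "phone.log", chunk)
        log.append(chunk)
    print(naive.bytes_uploaded, segmented.bytes_uploaded)
```

With re-creation, the uploaded volume grows with the full file size on every append, whereas the segment approach uploads only the new bytes. The price is more objects to track and a concatenation step on read.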
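What exactly is cloud computing?
Cloud computing is actually nothing in particular
Leaf through the hundreds of pages of official Openstack documentation and the word “Cloud” simply does not appear, but there is a definition of Openstack:
Openstack is a collection of Linux APIs and tools that makes life easier for server administrators and the experience smoother for users
Openstack = Linux + ssh + ssl + MySQL + Apache + rsync + scp + xfs
- Glued together with Python scripts (plus a little C/C++)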
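What are the real benefits of cloud computing?
Skipping the thousand-and-one abstract arguments ……
Here is just one practical example
- In October we built the cloud platform locally
- In early November we moved it into the CERNET data center, and the service has run stably ever since
- In early December we suddenly discovered that some servers had been down for a long time
- Yet during that month our users noticed nothing
- After replacing some of the disks in the failed servers, a dozen or so commands brought them back to work, and the cloud service kept running stably the whole time
Building an equally stable, fault-tolerant cloud system with plain Linux alone would take several top-tier Linux experts months of hard work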
Web and client demo (Esc)
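Words for doomsday
If doomsday came and you could leave one sentence for your children
- Today's cloud computing has been over-mythologized by too many people ……
The greatest pain in this world lies not in being incomplete, but in being distorted.
The main progress in life is to keep breaking free of the distortion that language imposes on one's character.
- When we truly need cloud computing, “walk on the ground, do not dance in the clouds.” For real, pervasive system problems, give the academic community's own scientific, quantitative, and solid answers.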
Join us & Learn by doing!
Dr. Zhenhua Li
Dr. Jian Li
PhD students:
Linsong Cheng
Zhen Lu (potential)
Master students:
He Xiao
Xin Zhong
Yinlong Wang
Thierry