Advanced data management

Jiaheng Lu
Department of Computer Science
Renmin University of China
www.jiahenglu.net
Course purpose
 Taught in English
 The objective is to expose
graduate students to exciting
data management topics
Course contents
 Cloud computing and cloud data
management
 XML data management
 Column-store database
 Data processing in bioinformatics
Lecturer's academic experience

2006.9 ~ 2008.6 University of California, Irvine, postdoctoral researcher

2002.8 ~ 2006.8 National University of Singapore, PhD candidate

1998.9 ~ 2001.1 Shanghai Jiao Tong University, Master's candidate
University of California, Irvine
Postdoctoral research
Data integration in medical systems [US patent]
Approximate string search [ICDE08]
National University of Singapore
Course grading
 Report 30%
 Google App Engine 30%
 In-class attendance and quiz 40%
Any questions or comments?
2016/6/28
Cloud computing
Why do we use cloud computing?
Case 1:
Write a file
Save
On a local computer: the computer goes down and the file is lost
In the cloud: files are always stored remotely, so a local failure does not lose them
Why do we use cloud computing?
Case 2:
Use IE --- download, install, use
Use QQ --- download, install, use
Use C++ --- download, install, use
……
Get the service from the cloud
What are the cloud and cloud computing?
Cloud
On-demand resources or services delivered over the Internet, with the scale and reliability of a data center.
What are the cloud and cloud computing?
Cloud computing is a style of computing
in which dynamically scalable and often
virtualized resources are provided as a
service over the Internet.
Users need not have knowledge of,
expertise in, or control over the technology
infrastructure in the "cloud" that supports
them.
The architecture of a cloud computing system
Characteristics of cloud computing

Virtual.
Software, databases, Web servers, operating systems, storage, and networking are provided as virtual servers.

On demand.
Processors, memory, network bandwidth, and storage can be added or removed as needed.
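
The "on demand" characteristic can be illustrated with a small sizing calculation. This is only a sketch: the function name, the load figures, and the per-server capacity are assumptions made up for illustration, not values from any real provider.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance, min_instances=1):
    """Return how many identical virtual servers cover the given load."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Keep at least min_instances running even when the load is zero.
    return max(min_instances, needed)


# Hypothetical load samples (requests per second) taken over one day.
load_samples = [20, 50, 400, 1200, 300, 40]

# Assumed capacity: each virtual server handles 100 requests per second.
plan = [instances_needed(load, capacity_per_instance=100) for load in load_samples]
print(plan)
```

With fixed hardware, capacity would have to be bought for the peak sample. With on-demand resources, the number of servers can follow the load up and back down.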
Types of cloud service
SaaS
Software as a Service
PaaS
Platform as a Service
IaaS
Infrastructure as a Service
SaaS
Software delivery model
 No hardware or software to manage
 Service delivered through a browser
 Customers use the service on demand
 Instant scalability
SaaS
Examples

Your current CRM package is not
managing the load or you simply don’t
want to host it in-house. Use a SaaS
provider such as Salesforce.com

Your email is hosted on an Exchange
server in your office and it is very slow.
Outsource it to a Hosted Exchange provider.
PaaS
Platform delivery model
 Platforms are built upon infrastructure, which is expensive
 Estimating demand is not a science!
 Platform management is not fun!
PaaS
Examples

You need to host a large file (5 MB) on your
website and make it available to 35,000 users for
only two months. Use CloudFront from
Amazon.

You want to start storage services on your network
for a large number of files and you do not have the
storage capacity. Use Amazon S3.
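
A rough back-of-the-envelope calculation for the first example, as a sketch only. It assumes the file size is 5 megabytes, that each of the 35,000 users downloads the file exactly once, decimal units (1 GB = 1000 MB), and a 60-day period; the one-hour burst is a made-up scenario for comparison.

```python
FILE_SIZE_MB = 5          # assumed: megabytes
USERS = 35_000            # assumed: one download per user
PERIOD_DAYS = 60          # assumed: "two months" taken as 60 days

total_mb = FILE_SIZE_MB * USERS
total_gb = total_mb / 1000            # 175.0 GB in total

# Average rate if downloads are spread evenly over the whole period.
average_mb_per_s = total_mb / (PERIOD_DAYS * 24 * 3600)

# Hypothetical worst case: every user downloads within the same hour.
burst_mb_per_s = total_mb / 3600

print(total_gb, average_mb_per_s, burst_mb_per_s)
```

The total volume is modest, but the gap between the average rate and the burst rate shows why buying fixed infrastructure for a short-lived, unpredictable demand is hard to justify.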
IaaS
Computer infrastructure delivery model
 A platform virtualization environment
 Computing resources, such as storage and processing capacity
 Virtualization taken a step further
IaaS
Examples

You want to run a batch job but you don’t
have the infrastructure necessary to run it
in a timely manner. Use Amazon EC2.

You want to host a website, but only for a
few days. Use Flexiscale.
Cloud computing and other computing
techniques
An Industry Transformed
Delgo www.delgo.com
http://www.boxofficemojo.com/
Shrek, Delgo, and Others
• Why did DreamWorks use this?
• Upsides?
• Downsides?
Grid Computing & Cloud Computing

They share a lot in common:
intention, architecture, and technology
They differ in:
programming model, business model,
compute model, applications, and
virtualization.
Grid Computing & Cloud Computing

The problems are mostly the same:
 manage large facilities;
 define methods by which consumers discover, request, and use resources provided by the central facilities;
 implement the often highly parallel computations that execute on those resources.
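
The last point, highly parallel computation, usually follows a split, compute, combine pattern. The sketch below shows only that structure: local threads stand in for remote grid or cloud nodes, and the function names are hypothetical. It is not meant to show a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done independently on one node: sum of squares of its chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    # Split the input into roughly equal chunks, one or more per worker.
    size = max(1, len(values) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    # Compute each chunk independently, then combine the partial results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


result = parallel_sum_of_squares(list(range(1000)))
print(result)
```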
Grid Computing & Cloud Computing

Virtualization
 Grid
Grids do not rely on virtualization as much as Clouds do; each individual organization maintains full control of its resources.
 Cloud
Virtualization is an indispensable ingredient for almost every Cloud.
Any questions or comments?
Google App Engine
 Does one thing well: running web apps
 Simple app configuration
 Scalable
 Secure
App Engine Does One Thing Well
 App Engine handles HTTP(S) requests, nothing else
Think RPC: request in, processing, response out
Works well for the web and AJAX; also for other services
 App configuration is dead simple
No performance tuning needed
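
The "request in, processing, response out" model can be sketched with a plain WSGI application from the Python standard library. This is a generic illustration, not App Engine's own API: the handler, the greeting logic, and the call_app helper are assumptions made for this example.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Request in, processing, response out. No state is kept between calls."""
    name = environ.get("QUERY_STRING", "") or "world"
    body = ("Hello, %s!" % name).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(query_string=""):
    """Hypothetical helper: invoke the app once without starting a server."""
    environ = {}
    setup_testing_defaults(environ)   # fills in the required WSGI keys
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


print(call_app("cloud"))
```

Because the handler keeps nothing between calls, any number of copies can serve requests side by side, which is what makes this style easy for a platform to scale.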
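App Engine Architecture
[Figure: requests and responses (req/resp) enter a Python VM process that runs the app with the stdlib and a read-only file system (R/O FS). Stateless APIs: urlfetch, mail, images. Stateful APIs: memcache, datastore.]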
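
One common way to combine the two stateful services is to read through the cache first and fall back to the datastore on a miss. The sketch below shows that pattern with plain Python objects as stand-ins; SlowStore, cached_get, and the sample key are hypothetical and are not the App Engine memcache or datastore APIs.

```python
class SlowStore:
    """Stand-in for a durable datastore; counts how often it is read."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._rows.get(key)


def cached_get(key, cache, store):
    """Return the value for key, consulting the cache before the store."""
    if key in cache:
        return cache[key]
    value = store.get(key)
    if value is not None:
        cache[key] = value   # remember the value for later requests
    return value


store = SlowStore({"user:1": "Alice"})
cache = {}                   # a dict stands in for the cache service
first = cached_get("user:1", cache, store)
second = cached_get("user:1", cache, store)
print(first, second, store.reads)
```

The request handler itself stays stateless; all state that must survive between requests lives in the cache or the store.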
How to use Google App Engine

Download Java 6

Download Eclipse and the Google plugin

Register a user account with Google

Create an application (Python or Java) and
upload the code
In-class quiz

Please answer all questions

You may be asked to answer a question
later. Your performance will affect your final
score.
Study Google App Engine
http://code.google.com/intl/en/appengine/docs/java/gettingstarted/