EECS 221: INFORMATION STORAGE LECTURE NOTES

Instructor: Zhiying Wang
Winter Quarter
Week 1
1/5/2016 - 1/7/2016
LECTURE: 1/5/2016
From 2012 until 2020, data will double every two years
Every person will have an average of 5,200 GB of data, which results in 40 zettabytes (1 ZB = 10^21 bytes) of data in 2020.
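As a rough sanity check of the two figures above, the sketch below divides the 2020 total by the per-person average and runs the doubling rule backwards to 2012. The function name and the decimal units (1 GB = 10^9 bytes, 1 ZB = 10^21 bytes) are assumptions made for illustration, not part of the lecture.

```python
GB = 10**9    # bytes, decimal units (assumption)
ZB = 10**21   # bytes

def data_volume(start_volume, start_year, year, doubling_period=2):
    """Volume after doubling once every `doubling_period` years."""
    return start_volume * 2 ** ((year - start_year) / doubling_period)

total_2020 = 40 * ZB
per_person = 5200 * GB

# Population implied by the two lecture figures.
implied_people = total_2020 / per_person

# Volume in 2012 implied by working the doubling rule backwards from 2020.
implied_2012 = data_volume(total_2020, 2020, 2012)

print(f"implied population:  {implied_people:.2e}")
print(f"implied 2012 volume: {implied_2012 / ZB:.1f} ZB")
```

Four doublings between 2012 and 2020 correspond to a factor of 16 in total volume.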
Challenges for Storage
Cloud, Public & Private
• subscription based storage
• storage policies change faster
In-memory computing
• necessary for new storage technology
• example: server systems
• computing from CPU to memory (storage part)
SW/HW Integration
• Need optimized platform storage interconnect
• Need optimized software storage access methods
Object Storage
• file systems (manage data as file hierarchy)
• block storage (manage data as blocks)
• object storage (content that is stored)
• distinguish content requested
Virtualization
• Pooling physical storage from multiple network storage devices into what appears to be a single storage device
• Challenges of shared storage structure
Big data and various data sources
New applications (mobiles, wearables)
Resource allocation and networking strategies
How to measure a storage system?
List of Considerations:
• $/GB (density)
• Throughput
– (i) IOPS # of I/O operations supported in system per second
– (ii) transfer rate: amount of data transferred per second
How much money to spend to achieve certain IOPS?
How much energy does it cost to achieve certain IOPS?
Latency
• (i) response time: time elapsed from when a command is issued until it has been serviced (command completed)
• (ii) service time: time elapsed from the start of command service until the command is completed
Typically, latency and throughput are inversely proportional
IOPS x Average Block Size = Transfer rate
IOPS is also rate of service request (arrival rate)
Utilization = arrival rate x service time
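The two relations above (transfer rate and utilization) can be sketched as follows. The workload numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def transfer_rate(iops, avg_block_size_bytes):
    """Transfer rate in bytes/s = IOPS x average block size."""
    return iops * avg_block_size_bytes

def utilization(arrival_rate, service_time_s):
    """Utilization = arrival rate (requests/s) x service time (s)."""
    return arrival_rate * service_time_s

# Hypothetical workload: 10,000 IOPS of 4 KiB blocks, 50 microseconds per request.
iops = 10_000
rate = transfer_rate(iops, 4096)
util = utilization(iops, 50e-6)

print(f"transfer rate: {rate / 1e6:.2f} MB/s")
print(f"utilization:   {util:.0%}")
```

A utilization at or above 1 would mean requests arrive faster than the device can serve them.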
M/M/1 Queue: Poisson arrival process, exponentially distributed service times
Response time = Service time / (1 - Utilization)
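The M/M/1 relation above (response time = service time / (1 - utilization)) shows how quickly latency grows as the device approaches saturation. A minimal sketch, assuming a hypothetical 1 ms service time:

```python
def mm1_response_time(service_time_s, utilization):
    """Mean response time of an M/M/1 queue: service time / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return service_time_s / (1 - utilization)

service_time = 1e-3  # hypothetical 1 ms service time
for rho in (0.1, 0.5, 0.9, 0.99):
    r = mm1_response_time(service_time, rho)
    print(f"utilization {rho:.2f} -> response time {r * 1e3:.2f} ms")
```

At 50% utilization the response time is twice the service time; at 99% it is roughly a hundred times the service time.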
Data Storage Hierarchy
• Tier 0: Ultra Low latency (Cache)
• Tier 1: Main Memory, Low Latency (In-memory computing, Virtual Server)
• Tier 2: Mixed used, read intensive (Image Retrieval, Web Search)
• Tier 3: Nearline Storage. High Capacity, Low Power, Higher Latency (Tapes)
High Frequency Trading
1ms improvement = $100M profit/year
$/transaction: Virtual Desktop
Server provides display data through terminals
How much does it cost me to complete one transaction?
Storage Media Considerations
SCM: Memristor, PCM, STT-RAM, Flash, HD (non-volatile)
Consider technology maturity
Important Figures of Merit: Density, energy/bit, read time, write time, retention, endurance, 3D Capability
Replace SSD every two years (new tech leads to new densities)
Increase density in 3D space
HDD: First Disk Drive, IBM RAMAC (Random Access Method of Accounting and Control) 1955
Before HDD: Magnetic tapes developed by IBM (1953)
1950: First Magnetic Drum invented for United States Navy (1m bits storage space)
IBM RAMAC: fifty 24-inch disks, 5 million characters (1 character = 7 bits)
Average access = 1 second
Better than magnetic tapes which could only implement one dimensional search
Magnetic Disk: Two Dimensional, allows for Random Access (vs. Magnetic Tapes)
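For scale, the RAMAC figures quoted above convert to modern units as follows. This is only arithmetic on the numbers in the notes; the conversion to 8-bit bytes and decimal megabytes is an assumption made for comparison.

```python
CHARACTERS = 5_000_000      # total capacity in characters (from the notes)
BITS_PER_CHARACTER = 7      # from the notes
DISKS = 50                  # from the notes

total_bits = CHARACTERS * BITS_PER_CHARACTER
total_megabytes = total_bits / 8 / 1e6   # 8-bit bytes, decimal MB (assumption)
chars_per_disk = CHARACTERS // DISKS

print(f"total bits:      {total_bits:,}")
print(f"capacity:        {total_megabytes:.3f} MB")
print(f"characters/disk: {chars_per_disk:,}")
```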
Developments: miniaturization
1962: every disk has its own head (easy access to one surface of the disk)
1963: Removable disks, not used anymore
These led to Floppy disks
HDD Diameter: 24 inches in 1955 -> 1 inch in 2000
Size decrease = Density increase
Flash Storage
• NAND (feature size in nm has decreased over the years, improving areal density)
• Logic
Flash Storage is enabled by software
Storage vs. Communication Systems: communication systems can retransmit on error,
but a storage device cannot be asked to write again.
Need very small error probability (achieved by signal processing, error correction, and calibration)
Class Topics
HD
Flash Memory and other non-volatile memory
Storage Architecture in Data Centers
Networking in Data Centers
Power Consumption in Data Centers
Object Storage
Virtualization, Consistent Storage
Grading
Scribing: 5%
HW: 75%
Final Project: 20%
Hard Disks: Physical Layer
Angular and Radial Coordinates
How to find information on disk?
To locate point on disk:
• angle: angular position θ
• distance from center to point r
Seek: head moves to radial position desired
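The addressing scheme above can be sketched with two small helpers: one that converts (r, θ) to Cartesian coordinates, and one that estimates how long the head waits for the target angle to rotate underneath it once the seek has finished. The rotational wait, the 7200 rpm spindle speed, and the assumption that the angle under the head increases as the disk turns are illustrative additions, not taken from the lecture.

```python
import math

def polar_to_cartesian(r, theta):
    """Convert a disk position (radius r, angle theta in radians) to (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

def seek_distance(current_r, target_r):
    """Radial distance the head must travel during a seek."""
    return abs(target_r - current_r)

def rotation_wait(current_angle, target_angle, rpm):
    """Seconds until target_angle passes under the head (assumed rotation direction)."""
    delta = (target_angle - current_angle) % (2 * math.pi)
    return delta / (2 * math.pi) * 60.0 / rpm

# Hypothetical example: head at r = 2.0 cm, target at r = 3.5 cm, half a turn away.
print(polar_to_cartesian(3.5, math.pi))
print(f"seek distance: {seek_distance(2.0, 3.5):.1f} cm")
print(f"rotation wait: {rotation_wait(0.0, math.pi, 7200) * 1e3:.2f} ms")
```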
Ferromagnetic materials
• separated into grains
• electron spin alignment (aligned magnetic field, magnetizing = writing info)
Hysteresis Loop
Applying Current will magnetize material to a saturation point
Mr × Hc = strength of material (Mr: remanent magnetization, Hc: coercivity)
Track = circular area on the disk
Write (applying magnetization)
Direction of magnetization could indicate 0 or 1: correct in theory, but not used in practice
Instead, look at transitions of the magnetic field
transition (magnetic reversal) = 1
Write has to be in one block (to account for transitions, block is known as a sector)
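The transition-based scheme above can be sketched as a small encoder and decoder. The notes only state that a transition represents 1; treating "no transition" as 0 and starting from a known initial direction are assumptions made here to complete the example.

```python
def encode_transitions(bits, initial_direction=1):
    """Map bits to magnetization directions (+1/-1): a 1 flips the direction, a 0 keeps it."""
    directions = []
    current = initial_direction
    for bit in bits:
        if bit == 1:
            current = -current
        directions.append(current)
    return directions

def decode_transitions(directions, initial_direction=1):
    """Recover bits by comparing each cell with the previous one."""
    bits = []
    previous = initial_direction
    for direction in directions:
        bits.append(1 if direction != previous else 0)
        previous = direction
    return bits

sector = [1, 0, 1, 1, 0, 0, 1]
written = encode_transitions(sector)
print(written)
print(decode_transitions(written) == sector)
```

Because each decoded bit depends on the direction of the preceding cell, a bit cannot be rewritten in isolation, which is one way to see why writes are done a whole sector at a time.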
Write Head
Components of the Write Head
Core, Coil, Poles
Driving a current through the coil produces a magnetic field at the disk, magnetizing the material in a direction that depends on the direction of the current
Properties of Write Head
Core material must be easy to magnetize
Head must be strong and durable
Flux must be strong enough to magnetize material
Read Head
Magnetoresistive (MR) Heads
2 directions
Easy and hard direction
Change of magnetic field changes resistance
Transitions cannot be close together or else the resulting signal from the transition pulses will be lower than the
intended signal and may cause errors
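The amplitude loss described above can be illustrated numerically. The sketch below assumes each isolated transition produces a Lorentzian readback pulse, a common textbook model that is not stated in the lecture, and that neighbouring transitions have opposite polarity, so their pulses subtract from one another. The function names and the unit pulse width are hypothetical.

```python
def lorentzian(t, pw50):
    """Isolated transition pulse with unit peak; pw50 is its width at half amplitude."""
    return 1.0 / (1.0 + (2.0 * t / pw50) ** 2)

def amplitude_at_transition(spacing, pw50=1.0):
    """Readback amplitude at a transition with one neighbour on each side.

    Adjacent transitions alternate in polarity, so both neighbours subtract.
    """
    return lorentzian(0.0, pw50) - 2.0 * lorentzian(spacing, pw50)

for spacing in (5.0, 2.0, 1.0, 0.5):
    amp = amplitude_at_transition(spacing)
    print(f"spacing {spacing:.1f} x PW50 -> amplitude {amp:.3f}")
```

In this model the amplitude falls from nearly 1 for widely spaced transitions to 0.6 at a spacing of one pulse width, and vanishes at half a pulse width.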