Enterprise Application of SSD
曹庆玲
[email protected]
• Towards SSD-Ready Enterprise Platforms
• Building Large Storage Based On Flash Disks
Towards SSD-Ready Enterprise Platforms
Outline
• Motivation
• Platform and methodology
• Platform bottleneck analysis
   • Platform latency bottlenecks
   • I/O processing bottlenecks
   • Performance scaling bottlenecks
• Conclusion
Motivation
• SSDs deliver 2-3 orders of magnitude more IOPS than HDDs
• Platforms have long been optimized for HDDs
• Are they ready for SSDs?
Platform and methodology
• Use Linux as the reference OS for the experiments
• Focus on fixed-size 4KB random reads.
  Random reads avoid the effects of I/O merging policies, and the premise is that if the platform is ready for reads, it is also ready for writes.
Platform bottleneck analysis
• Platform latency bottlenecks: determine which component dominates I/O latency
• I/O processing bottlenecks: determine which software components contribute the most CPU overhead to I/O processing
• Performance scaling bottlenecks: determine which component limits performance scaling
Platform bottleneck analysis
—Platform latency
Total I/O latency is the time from when the application issues an I/O until it receives the completion. It has two parts:
• Time due to media
• Time due to platform
Platform bottleneck analysis
—Platform latency
The platform contributes only 26% of the total latency, so the media dominates.
Optimizing the media is necessary.
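The 26% figure also bounds what platform tuning alone could achieve. The sketch below is an Amdahl-style calculation: only the 26% share comes from the slide, and the function name is illustrative. Even if the platform part dropped to zero, latency would improve by about 1.35x at best.

```python
def max_speedup_from_platform(platform_share: float) -> float:
    """Upper bound on latency speedup if the platform part dropped to zero."""
    if not 0.0 <= platform_share < 1.0:
        raise ValueError("platform_share must be in [0, 1)")
    media_share = 1.0 - platform_share  # the part that platform tuning cannot touch
    return 1.0 / media_share


PLATFORM_SHARE = 0.26  # from the slide
bound = max_speedup_from_platform(PLATFORM_SHARE)
print(f"media share: {1 - PLATFORM_SHARE:.0%}, best-case speedup: {bound:.2f}x")
```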
Platform bottleneck analysis
—I/O processing cost
[Figure: I/O processing cost per I/O; the only surviving label reads 35000]
Platform bottleneck analysis
—I/O processing cost
• ahci_interrupt() and ahci_scr_read() execute uncacheable (UC) reads, which cost an average of 2,100 clocks per UC read.
• Device interfaces that adopt message signaled interrupts (MSI), with added intelligence to push status to drivers, can eliminate such UC reads.
• This can reduce overhead by about 8,400 clocks/IO.
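Taken together, the two numbers imply how many UC reads each I/O performs. The count below is derived here and is not stated on the slide; it is a sanity check of the arithmetic, not a measurement.

```python
CLOCKS_PER_UC_READ = 2_100  # average cost per UC read, from the slide
SAVINGS_PER_IO = 8_400      # clocks/IO saved once UC reads are eliminated, from the slide

# Derived value (an inference, not a slide figure): UC reads per I/O.
uc_reads_per_io = SAVINGS_PER_IO / CLOCKS_PER_UC_READ
print(f"implied UC reads per I/O: {uc_reads_per_io:.0f}")
```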
Platform bottleneck analysis
—I/O processing cost
• I/O processing done through an MSI-based interface such as LSI’s incurs 25,000 clocks/IO.
Platform bottleneck analysis
—I/O processing cost
• The LSI driver’s return path (5,250 clocks/IO) is still substantial.
  Interrupt coalescing can reduce it: only 650 clocks then remain in the driver return path, bringing the total to about 20,000 clocks/IO.
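The "about 20,000" total follows from subtracting the return-path savings from the 25,000 clocks/IO measured for the MSI-based interface. A minimal sketch of that arithmetic, using only the slide figures (the helper name is illustrative):

```python
MSI_BASELINE = 25_000       # clocks/IO through the MSI-based interface
RETURN_PATH_BEFORE = 5_250  # driver return path without coalescing
RETURN_PATH_AFTER = 650     # driver return path with interrupt coalescing


def clocks_after_coalescing(baseline: int, before: int, after: int) -> int:
    """Total clocks/IO once the return path shrinks from `before` to `after`."""
    return baseline - (before - after)


total = clocks_after_coalescing(MSI_BASELINE, RETURN_PATH_BEFORE, RETURN_PATH_AFTER)
print(f"{total} clocks/IO")  # the slide rounds this to about 20,000
```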
Platform bottleneck analysis
—Performance scaling
Goal: ensure that I/O processing scales with cores and SSDs.
A single core is fully saturated by 3 SSDs, so more cores are required.
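Why one core saturates can be estimated from the per-I/O processing cost: a core that spends N clocks on each I/O can complete at most clock_rate / N I/Os per second. The 3 GHz clock below is an assumption for illustration only (the slides do not give the CPU frequency); the clocks/IO values are the ones quoted on the I/O processing cost slides.

```python
def max_iops_per_core(clock_hz: float, clocks_per_io: float) -> float:
    """Upper bound on IOPS for one core that does nothing but I/O processing."""
    if clock_hz <= 0 or clocks_per_io <= 0:
        raise ValueError("clock_hz and clocks_per_io must be positive")
    return clock_hz / clocks_per_io


ASSUMED_CLOCK_HZ = 3.0e9  # assumption: not taken from the slides

for label, cost in [("MSI-based interface", 25_000),
                    ("with interrupt coalescing", 20_000)]:
    iops = max_iops_per_core(ASSUMED_CLOCK_HZ, cost)
    print(f"{label}: at most {iops:,.0f} IOPS per core")
```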
Platform bottleneck analysis
—Performance scaling
One adapter enables 177K IOPS.
With more adapters, throughput scales up to 445K IOPS.
Conclusion
• Existing platforms still need optimization to be ready for SSDs
• Scalability of the file system
• I/O behavior of real applications
• Implementation of RAID
Building Large Storage Based On Flash Disks
Outline
• Introduction
• SSD RAID configuration
• Scalability
• Solution alternatives
• Conclusion
RAID0
[Diagram: the RAID controller stripes the input data stream across SSD1 to SSD5 in parallel]
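The striping in the RAID0 diagram can be sketched as an address mapping. This is a generic illustration of RAID 0, not the controller's actual layout; the function name and the stripe-unit parameter are assumptions.

```python
def raid0_locate(logical_block: int, num_disks: int,
                 blocks_per_stripe_unit: int = 1) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    if logical_block < 0 or num_disks < 1 or blocks_per_stripe_unit < 1:
        raise ValueError("invalid RAID 0 parameters")
    stripe_unit = logical_block // blocks_per_stripe_unit  # which chunk overall
    within = logical_block % blocks_per_stripe_unit        # offset inside the chunk
    disk = stripe_unit % num_disks                         # chunks rotate over disks
    row = stripe_unit // num_disks                         # full stripes before this one
    return disk, row * blocks_per_stripe_unit + within


# Five disks, as in the diagram: consecutive blocks land on consecutive SSDs.
for block in range(7):
    print(block, raid0_locate(block, num_disks=5))
```

Because neighbouring blocks land on different disks, a large request is served by several SSDs at once, which is where the parallelism in the diagram comes from.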
RAID1
[Diagram: the RAID controller writes the input data stream in parallel to a work disk and a mirror disk. Group 1: SSD1 and SSD2. Group 2: SSD3 and SSD4.]
RAID Levels: RAID 10
Two RAID 1 sets, striped
RAID5
[Diagram: the RAID controller stripes the input data stream across the disks; each stripe carries one parity block, and the parity blocks are distributed across the disks]
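The parity in the RAID5 diagram is the byte-wise XOR of the data blocks in a stripe, which is what lets any single lost block be rebuilt. A minimal sketch with made-up two-byte blocks (the helper name and the sample data are illustrative):

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same size")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # one stripe of data blocks
parity = xor_blocks(data)                        # stored on the parity disk

# Simulate losing the second data block and rebuilding it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same XOR has to be recomputed on every small write, which is one reason parity RAID puts extra work on the controller.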
Introduction
SSD RAID shows a performance loss relative to what the individual SSDs can deliver.
Test setup and workload
Test setup:
• 16-core server with 64GB RAM
• 3 RAID controllers with 512MB cache
• Intel 64GB SSDs
Workloads:
• Workload light: one worker, queue depth 32
• Workload heavy: ten workers, queue depth 16
• Workload latency: single request, one worker, queue depth 1
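The three workloads differ mainly in how many I/Os they keep in flight. The sketch below records them as data and computes that number, assuming the queue depth applies per worker (the slides do not say whether it is per worker or total); the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    workers: int
    queue_depth: int  # assumption: queue depth is per worker

    @property
    def outstanding_ios(self) -> int:
        """Maximum number of I/Os in flight at once."""
        return self.workers * self.queue_depth


WORKLOADS = [
    Workload("light", workers=1, queue_depth=32),
    Workload("heavy", workers=10, queue_depth=16),
    Workload("latency", workers=1, queue_depth=1),
]

for w in WORKLOADS:
    print(f"{w.name}: up to {w.outstanding_ios} outstanding I/Os")
```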
SSD RAID Configurations
—throughput(workload heavy)
RAID 0, 5, 10 with 8 SSDs on a single controller
SSD RAID Configurations
—throughput(workload light)
Volume = 240GB
Single-SSD data is shown for comparison
[Figure annotation: throughput saturates]
Scalability
The experimental data above indicate a bottleneck somewhere along the I/O chain.
Is it the RAID controller or the PCIe bus?
Scalability
At the best throughput, PCIe bus utilization is below 50%.
The RAID controller is therefore the bottleneck.
Scalability
Two SSDs are enough to saturate the controller!
Scalability
With read-ahead
With write cache
Scalability
Without write cache
Solution alternatives
Combine hardware and software:
A. No controller: devices are connected directly, with software RAID on top
B. Use the controller only as a simple device aggregator, with software RAID on top
C. Use a simple RAID level on multiple RAID controllers, with software RAID on top
Solution alternatives
Comparing options A and B
RAID with 2 SSDs
Solution alternatives
Comparing options B and C
A second controller has a profound effect on performance.
Conclusions
• Software RAID approaches
• Multiple block sizes
• RAID controllers are not designed for the characteristics of SSDs
Thank you~