LocalAndGlobal_Poster_final

advertisement
Service-Oriented Local And Global Visualization
with Sorting On-demand for Climate Data
Zhe Zhang
Ye Jin
Xusheng Xiao
zzhang13@ncsu.edu
yjin6@ncsu.edu
xxiao2@ncsu.edu
Solution 1: Service-Oriented Histogram [2]
Motivation
15
10
5
0
10
12
14
16
18
20
x-value
y-counts
0
50
10
20
30
40
100
y-counts
150
y-counts
20
25
200
Huge amount of climate simulation data are collected from different Locally And Global Visualization
Locally compute min, max, and count
areas (e.g., cities, countries).
Transmitting the local min, max and count to compute global min,
Climate scientists keep trying to predict the trends of the variation of
max and count
climate both locally and globally.
Each data sources compute the histogram based on the global min,
Exploring visualization of data mining (e.g., histogram) has been used
max and count
more and more frequently to get a general view ahead of predicting.
Only transferring the computed histogram data, which is much
Climate experts would like to analyze data by navigating among levels
smaller compared to all the climate data
of data ranging from the most summarized (drill-up) to the most detailed
Merge the transmitted histograms to show the global histograms
(drill-down) (e.g., drill-down shown in Figure 1).
-40
-35
-30
-25
-20
-15
-10
6
5
35
40
45
80
30
50
15
-30
-20
-10
0
10
20
30
40
10
0
y-counts
20
25
0
30
20
40
y-counts
60
x-value
20
x-value
3
4
3
y-counts
2
2
1
1
0
0
-1
40
-2
y-counts
-3
60
0
80
x-value
0
5
x-value
-10
-8
-6
-4
-2
0
x-value
-1.0
-0.5
0.0
0.5
1.0
15
x-value
0
5
y-counts
10
Figure 1: Drill-down to interval [-1,1]
-20
-18
-16
-14
-12
-10
10
x-value
6
4
0
2
y-counts
8
Challenge
-35
-30
-25
-20
-15
-10
x-value
Globally transferring caused problems:
Time-consuming (see Table 1)
Package Lost during data transfer (see Table 1)
Frequently drill-up and drill-down navigation of data consumes
computation resources. (e.g., scanning same data set multiple times see
Table 2)
Table 1 [1]
Here are the raw data in multiple domains have already collected, we
can see the latest data sets are all for year 2008.
Data Domain
Single data
set size
Number of
data sets
Total Size
Collecting Time
In Best Case
VOCALS 2008
~70000 KB
56
~3920 MB
~10 Hrs
ASCOS 2008
~140000 KB
25
~3500 MB
~10 Hrs
AEROSE 2008
~80000 KB
36
~2880 MB
~7 Hrs
STRATUS 2007
~70000KB
21
~1470 MB
~5 Hrs
Figure 2: System Framework
Solution 2: On-demand Sorting [3]
Cache data and parameters (min, max, count) locally
Index data with break number (e.g., 0.5 is in the break [0, 1] )
Check whether the data in the requested breaks are sorted or not
If sorted, transfer data directly
If data is not sorted, sort only the data in the corresponding break and
mark the break as sorted
Transfer local histogram data (min, max, count) for global computation
Merge data from different sources
Result
Table 2
Total time needed to discovery meaningful or user specified parameters
visualization results, we need to speed up those visualization algorithms.
Data Size
Run Once
Histogram
Discovery Histogram
Run log(n) Times
User specified 30
Times
~1500 MB
2 Mins
~17 * 2 = 34 Mins
60 Mins
~3000 MB
4 Mins
~18 * 2 = 36 Mins
120 Mins
~4500 MB
6 Mins
~19 * 2 =38 Mins
180 Mins
http://csc.ncsu.edu/ NCSU Computer Science
References
1. http://www.esrl.noaa.gov/psd/psd3/cruises/
2. Felix Halim, Panagiotis Karras, and Roland H.C. Yap. 2009. Fast and effective
histogram construction. ACM, New York, NY, USA, 1167-1176.
3. C. A. R. Hoare. Quicksort. The Computer Journal, 5(1):10–16, January 1962.
Download