IT 241 Information Discovery Fall 2012 Exam 1
Name _____________________________
Thursday, Sept. 20, 2012
1. Below is one of the visualization pipelines from the text. [18 pts]
a. Excluding the Internet, describe 3 sources where the raw data may reside. [4]
b. Describe three data transformations that could be used to generate a useful data table for input to a visualization tool. [6]
c. Give an example of a visual mapping (color, line, dot, position, etc.) you might use to represent each of the following attribute types in a scatterplot: [8]
Nominal data of 4 distinct categories:
Nominal data of 10 categories that can be ranked:
Ordinal data:
Relational data (connections among data points):
2. Why is Minard’s map of Napoleon’s march to Moscow and back considered a good visualization example? [6 pts]
Is Minard’s map an exploratory visualization, an explanatory visualization, or an example of visual art?
3. Why is Nightingale’s rose petal visualization of causes of death a worthy visualization example? [6 pts]
Was this visualization informative, persuasive, or an example of visual art? Explain your choice.
4. Data coding [20 pts]
a. The value 0110 1101 in binary is _______________ in decimal
and its corresponding hexadecimal digits are _____.
Converting decimal 55 to binary gives ______________ .
If the 8-bit ASCII codes in hex for “A” and “a” are 41 and 61, respectively, then the hex codes for the string “Bad” are _____________________________________.
If we want to store 366 unique values, we would need a minimum of _______ bits to represent those values.
b. A 1250 x 800 pixel color image coded in RGB (+alpha) format requires _________ Mbytes.
c. Why is a .gif file considered a “lossy” image compression?
d. A 30-second mono sound clip sampled at 48000 samples per second with a 16-bit depth will result in storing __________________ bytes.
e. What are the colors for these RGB hexadecimal encodings?
FFFFFF = __________________ 888888 = ______________________
00FF00 = __________________ FFFF00 = ______________________
f. If your data is simply a table of data with rows and columns, then what editable file structure is appropriate?
___________ (choose from XML, CSV, BMP, XLS)
If your data contains hierarchical relationships, what editable data file structure would be appropriate?
_______________ (XML, CSV, BMP, XLS)
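The arithmetic behind the data coding blanks in Question 4 can be checked with a short Python sketch. The helper names (bits_needed, image_bytes, audio_bytes) are hypothetical, and the sketch assumes uncompressed storage, 8 bits per channel for RGB plus alpha, and 1 Mbyte = 1,000,000 bytes; a course that defines a Mbyte as 2^20 bytes would get a slightly smaller figure for the image.

```python
def bits_needed(n_values):
    """Smallest number of bits that can give n_values distinct codes."""
    # (n - 1).bit_length() is the exact integer form of ceil(log2(n)).
    return max(1, (n_values - 1).bit_length())


def image_bytes(width, height, channels=4, bits_per_channel=8):
    """Uncompressed size of a raster image (assumed 8 bits per channel)."""
    return width * height * channels * bits_per_channel // 8


def audio_bytes(seconds, sample_rate, bit_depth, channels=1):
    """Uncompressed size of a PCM sound clip."""
    return seconds * sample_rate * channels * bit_depth // 8


value = int("01101101", 2)               # binary string -> integer
print(value, format(value, "X"))         # decimal and hexadecimal forms
print(format(55, "08b"))                 # decimal 55 as an 8-bit binary string
print(" ".join(format(ord(ch), "02X") for ch in "Bad"))  # ASCII codes in hex
print(bits_needed(366))                  # bits for 366 distinct values
print(image_bytes(1250, 800) / 1_000_000, "Mbytes")
print(audio_bytes(30, 48000, 16), "bytes")
```

The integer bit_length form is used for the bit count instead of a floating-point logarithm so that exact powers of two are never rounded the wrong way.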
5. Plot this set of 15 univariate numbers {65, 85, 93, 77, 48, 65, 50, 63, 44, 80, 55, 87, 47, 92, 73}, then superimpose a Tukey box plot representing the median and the 25th and 75th percentiles. [12 pts]
┼────┼────┼────┼────┼────┼────┼────┼────┼────┼────┼────┼
40   45   50   55   60   65   70   75   80   85   90   95
Without calculating, do you expect the mean of this set of numbers to be greater or smaller than the median? _______
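The box plot landmarks for the Question 5 data can be computed with Python's standard statistics module, as in the minimal sketch below. The function name box_summary is hypothetical. Quartile conventions differ between textbooks and tools; this sketch uses the default "exclusive" method of statistics.quantiles (Python 3.8 or later), so a hand calculation under another convention may give slightly different 25th and 75th percentiles.

```python
import statistics

data = [65, 85, 93, 77, 48, 65, 50, 63, 44, 80, 55, 87, 47, 92, 73]


def box_summary(values):
    """Return (25th percentile, median, 75th percentile) of values."""
    # n=4 splits the data into quartiles; the default method is "exclusive".
    q1, median, q3 = statistics.quantiles(values, n=4)
    return q1, median, q3


q1, median, q3 = box_summary(data)
print("sorted:", sorted(data))
print("Q1 =", q1, "median =", median, "Q3 =", q3)   # 50.0, 65.0, 85.0
print("mean =", round(statistics.mean(data), 2))    # 68.27
```

Comparing the printed mean with the median gives a way to check the intuition asked for in the last part of the question.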
6. We saw the following relational database SQL query in class. Fill in the blanks below regarding the query. [4 pts]
There are _____ (number) tables participating in the query. The combining of the tables is called a(n) _______ operation. There are ______ (number) attributes in the result. The result will have ______ (more/fewer) tuples than the Student relation has.
7. Describe three possible ways to handle missing data in a data set. [6 pts]
a.
b.
c.
8. What is normalizing data in an attribute? Give a concrete example of normalizing. [5 pts]
9. The following attributes are found in a daily weather data set in some state. [14 pts]
a. Associate the best descriptor with each attribute. If you do not understand the semantics of an attribute, please ask for clarification. Not every descriptor may be used.
Choices: Nominal-Categorical (NC), Nominal-Ranked (NR), Nominal-Arbitrary (NA)
Ordinal Continuous (OC), Ordinal Discrete (OD), Ordinal Statistical (OS)
Spatial/geometric (Sp), Temporal (T)
Date
Day of Week
Latitude-Longitude
County
Low temperature
High temperature
Rainfall
Snowfall
Number of highway fatalities
Prominent cloud type
b. Give two attributes that are independent: _________________ and ______________________
c. Give two attributes that are dependent: ___________________ and ________________________
10. Data mining concept views. Refer to the above weather data set attributes. [9 pts]
a. Give a classical concept view from the data set above. That is, come up with a property in an IF condition THEN assertion ELSE assertion pattern. Be specific using attributes.
Do something more creative than IF cloudy THEN rainfall>0 ELSE rainfall=0 !
b. Recast your classical concept as probabilistic concept views. E.g., when there was rainfall, it was cloudy.
c. Give two exemplar views of your classical concept (examples of the concept).