IT 241 Information Discovery Fall 2012 Exam 1

Name _____________________________

1.

Below is one of the visualization pipelines from the text.

Thursday, Sept. 20, 2012

[18 pts]

a. Excluding the Internet, describe three sources where the raw data may reside. [4]

b. Describe three data transformations that might be used to generate a useful data table for input to a visualization tool. [6]

c. Give an example of a visual mapping (color, line, dot, position, etc.) you might use to represent each of the following attribute types in a scatterplot: [8]

Nominal data of 4 distinct categories:

Nominal data of 10 categories that can be ranked:

Ordinal data:

Relational data (connections among data points):

2.

Why is Minard’s map of Napoleon’s march to Moscow and back considered a good visualization example?

[6 pts]

Is Minard’s map an exploratory visualization, an explanatory visualization, or an example of visual art?

3.

Why is Nightingale’s rose petal visualization of causes of death a worthy visualization example?

[6 pts]

Was this visualization informative or persuasive, or an example of visual art? Explain your choice.

4.

Data coding

[20 pts]

a. The value 0110 1101 in binary is _______________ in decimal

and its corresponding hexadecimal digits are _____.

Converting decimal 55 to binary gives ______________.

If the 8-bit ASCII codes in hex for “A” and “a” are 41 and 61, respectively, then the hex codes for the string “Bad” are _____________________________________.

If we want to store 366 unique values, we would need a minimum of _______ bits to represent those values.

b. A 1250 x 800 pixel color image coded in RGB (+alpha) format requires _________ Mbytes.

c. Why is a .gif file considered a “lossy” form of image compression?

d. A 30-second mono sound clip sampled at 48000 samples per second with a 16-bit depth will result in storing __________________ bytes.

e. What are the colors for these RGB hexadecimal encodings?

FFFFFF = __________________ 888888 = ______________________

00FF00 = __________________ FFFF00 = ______________________

f. If your data is simply a table of data with rows and columns, then what editable file structure is appropriate?

___________ (choose from XML, CSV, BMP, XLS)

If your data contains hierarchical relationships, what editable data file structure would be appropriate?

_______________ (XML, CSV, BMP, XLS)
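The arithmetic behind the data coding items can be checked with a short script once the blanks have been attempted by hand. This is only a sketch: the helper names (`bits_needed`, `rgb_from_hex`) are made up for illustration, and it assumes 4 bytes per pixel for RGB plus alpha and 1 Mbyte = 1,000,000 bytes, which may differ from the convention used in class.

```python
# Binary to decimal and hexadecimal (the space in the bit string is cosmetic).
bits = "0110 1101".replace(" ", "")
value = int(bits, 2)
hex_digits = format(value, "X")

# Decimal to binary, padded to one byte.
binary_55 = format(55, "08b")

# Hex ASCII code of each character in a string.
ascii_hex = " ".join(format(ord(ch), "02X") for ch in "Bad")


def bits_needed(n):
    """Smallest number of bits that can distinguish n unique values (n >= 1)."""
    return (n - 1).bit_length()


# Uncompressed image size: one byte per channel, 4 channels (assumption).
image_bytes = 1250 * 800 * 4
image_mbytes = image_bytes / 1_000_000  # assumes 1 Mbyte = 10**6 bytes

# Uncompressed audio size: seconds * samples/second * bytes/sample * channels.
audio_bytes = 30 * 48000 * (16 // 8) * 1


def rgb_from_hex(code):
    """Split a 6-digit hex color into (red, green, blue) values from 0 to 255."""
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


print(value, hex_digits, binary_55, ascii_hex)
print(bits_needed(366), image_mbytes, audio_bytes)
for code in ("FFFFFF", "888888", "00FF00", "FFFF00"):
    print(code, rgb_from_hex(code))
```

Equal red, green, and blue values give a shade of gray, and the channel with the larger value dominates the hue, which is enough to name each color from the printed triples.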

5.

Plot this set of 15 univariate numbers {65, 85, 93, 77, 48, 65, 50, 63, 44, 80, 55, 87, 47, 92, 73}, then superimpose a Tukey box plot representing the median and the 25th and 75th percentiles.

[12 pts]

┼────┼────┼────┼────┼────┼────┼────┼────┼────┼────┼────┼

40 45 50 55 60 65 70 75 80 85 90 95

Without calculating do you expect the mean of this set of numbers to be greater or smaller than the median? _______
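After sketching the plot by hand, the summary values can be checked with Python's standard `statistics` module (3.8 or later). This is a sketch under one assumption: quartile conventions differ between textbooks and tools, and the `"inclusive"` method used here is just one common choice, not necessarily the one taught in the course.

```python
import statistics

data = [65, 85, 93, 77, 48, 65, 50, 63, 44, 80, 55, 87, 47, 92, 73]

median = statistics.median(data)

# Quartiles under the "inclusive" convention (an assumption; other
# conventions, such as the default "exclusive", give slightly different values).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

mean = statistics.mean(data)

print("sorted:", sorted(data))
print("median:", median)
print("25th and 75th percentiles:", q1, q3)
print("mean greater than median:", mean > median)
```

Comparing the printed mean with the median is a way to confirm the guess made in the last part of the question.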

6.

We saw the following relational database SQL query in class. Fill in the blanks below regarding the query.

SELECT S.lastName, S.firstName, S.major

FROM Student S, Enroll E, Class C, Faculty F

WHERE F.name='Byrne' AND F.facId=C.facId

AND C.classNumber=E.classNumber AND E.stuId=S.stuId

[4 pts]

There are _____ (number) tables participating in the query. The combining of the tables is called a(n) _______ operation. There are ______ (number) attributes in the result. The result will have ______ (more/fewer) tuples than the Student relation has.
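To see how the query behaves, it can be run against a small in-memory SQLite database. This is only a sketch: the column types and every row of sample data are hypothetical, invented for illustration, and only the table names, column names, and the query itself come from the exam.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (stuId TEXT, lastName TEXT, firstName TEXT, major TEXT);
CREATE TABLE Faculty (facId TEXT, name TEXT);
CREATE TABLE Class   (classNumber TEXT, facId TEXT);
CREATE TABLE Enroll  (stuId TEXT, classNumber TEXT);

-- Hypothetical sample rows, invented for this sketch.
INSERT INTO Student VALUES ('S1', 'Lee', 'Ana', 'IT');
INSERT INTO Student VALUES ('S2', 'Cruz', 'Ben', 'Math');
INSERT INTO Student VALUES ('S3', 'Diaz', 'Cy', 'Art');
INSERT INTO Faculty VALUES ('F1', 'Byrne');
INSERT INTO Faculty VALUES ('F2', 'Tanaka');
INSERT INTO Class VALUES ('C1', 'F1');
INSERT INTO Class VALUES ('C2', 'F2');
INSERT INTO Enroll VALUES ('S1', 'C1');
INSERT INTO Enroll VALUES ('S2', 'C1');
INSERT INTO Enroll VALUES ('S3', 'C2');
""")

# The query from the exam, unchanged apart from line wrapping.
query = """
SELECT S.lastName, S.firstName, S.major
FROM Student S, Enroll E, Class C, Faculty F
WHERE F.name='Byrne' AND F.facId=C.facId
  AND C.classNumber=E.classNumber AND E.stuId=S.stuId
"""
rows = conn.execute(query).fetchall()

for row in sorted(rows):
    print(row)
print("columns per row:", len(rows[0]))
print("rows returned:", len(rows), "of", 3, "students")
```

With this sample data, only the students enrolled in a class taught by Byrne come back; different data would of course give a different count.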

7.

Describe three possible ways to handle missing data in a data set.

[6 pts]

a.

b.

c.

8.

What is normalizing data in an attribute? Give a concrete example of normalizing.

[5 pts]
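One common reading of normalizing an attribute is min-max scaling, which maps the smallest value to 0 and the largest to 1. The sketch below assumes that reading, which may not be the definition used in the course text. The function name `min_max_normalize` and the temperature values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale by.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily high temperatures, invented for this sketch.
high_temps = [50, 60, 75, 100]
normalized = min_max_normalize(high_temps)
print(normalized)
```

Scaling each attribute to the same range lets attributes with very different units be compared or plotted on a common axis.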

9.

The following attributes are found in a daily weather data set in some state.

[14 pts]

a. Associate the best descriptor with each attribute. If you do not understand the semantics of the attribute, please ask for clarification. Not every descriptor may be used.

Choices: Nominal-Categorical (NC), Nominal-Ranked (NR), Nominal-Arbitrary (NA)

Ordinal Continuous (OC), Ordinal Discrete (OD), Ordinal Statistical (OS)

Spatial/geometric (Sp), Temporal (T)

Date

Day of Week

Latitude-Longitude

County

Low temperature

High temperature

Rainfall

Snowfall

Number of highway fatalities

Prominent cloud type

b. Give two attributes that are independent: _________________ and ______________________

c. Give two attributes that are dependent: ___________________ and ________________________

10.

Data mining concept views. Refer to the above weather data set attributes.

[9 pts]

a. Give a classical concept view from the data set above. That is, come up with a property in an IF condition THEN assertion ELSE assertion pattern. Be specific using attributes.

Do something more creative than IF cloudy THEN rainfall>0 ELSE rainfall=0!

b. Recast your classical concept as probabilistic concept views, e.g., when there was rainfall, it was cloudy.

c. Give two exemplar views of your classical concept (examples of the concept).
