Big Data Analytics Spring 2014 Presentation (ppt)

Big Data Analytics
for Institutional Effectiveness
Alex Rudniy, Ph.D.
Raymond Calluori, Ph.D.
Perry Deess, Ph.D.
Presented at NJAIR 20th Annual Conference, Institutional Effectiveness: IE and IR.
St. Peter’s University, Jersey City, NJ, 4/4/2014.
Goals
 Simplify and automate IR activities
 Produce graphical and tabular reports in a user-friendly way
 For users with fewer technical skills within IR
 Self-serve report generation for other departments
 Simpler than Cognos
 The big picture for senior management
 Audit operational database data
 Overcome restrictions
 FERPA protects student data
 Many stakeholders don’t have access privileges
 Dynamic report generation overcomes complexity of large
static reports
Technology
 Backend Microsoft SQL Server database
 Frontend designed in Microsoft Visual Studio
 Hosted on Windows Server
 Accessible from:
 Any platform via an internet browser
 Standalone desktop application
NJIT IRP Factbook
Past Reports as Adobe PDF
Current Reports as Excel Pivot Tables
01 Enrollment by Year, Class, AttSt, Ethn, and Gender, 2001-2013
[Excel pivot table: Sum of CountOfSID. Rows: Year, Level, Class Level, Attendance Status (FT/PT). Columns: Ethnicity (Am.Ind./Al.Nat., Asian, Black, Hispanic, International, Multirace, Nat.Haw./Pac.Isl., Unknown, White) by Gender (F, M, n/a), plus Grand Total. The slide shows the 2013 rows.]

2013 enrollment, Grand Total column
Level  Class Level               FT     PT   Total
Ugrad  1 Freshman              1372    129    1501
Ugrad  2 Sophomore             1190    118    1308
Ugrad  3 Junior                1554    190    1744
Ugrad  4 Senior                1565    333    1898
Ugrad  5 Ugrad Unclassified      28    807     835
Ugrad  Ugrad Total                            7286
Grad   5 Ugrad Unclassified       -     31      31
Grad   Grad Unclassified         11     88      99
Grad   Graduate                1585   1129    2714
Grad   Grad Total                             2844
2013 Total                                   10130

2013 enrollment totals by ethnicity and gender
Ethnicity            Ugrad F  Ugrad M   Grad F   Grad M  Total F  Total M
Am.Ind./Al.Nat.            2        2        2        1        4        3
Asian                    331     1066      103      201      434     1267
Black                    156      487       52      148      208      635
Hispanic                 298     1088       73      167      371     1255
International            101      214      400      820      501     1034
Multirace                 51      107        9       15       60      122
Nat.Haw./Pac.Isl.          3        3        -        -        3        3
Unknown                  365      694       84      216      449      910
White                    415     1903      138      415      553     2318
Big Data Analytics
 In high demand in industry and academia
 Scale differs by industry, e.g. bioinformatics vs. academics
 Features:
 Large scale of data
 Powerful servers are required for processing
 Components of a dashboard:
 Backend database
 Tabular representation
 Graphic representation
ETL Complications
 ETL = Extract, Transform, Load process
 ETL is needed to build a backend database
 Historical data is spread among multiple databases
 Data specifications lost/unknown
 Data need to be unified
 Attributes missing
 Attributes coded differently
 Attributes spread among multiple tables within the same
database
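The unification step listed above can be sketched as follows. This is a minimal illustration rather than the presenters' ETL code: the two source layouts (`legacy_a`, `legacy_b`), the field names, and the gender code table are all hypothetical, chosen only to show differently coded and missing attributes being mapped onto one schema.

```python
from typing import Optional

# Hypothetical code table: each legacy source coded gender differently.
GENDER_CODES = {"F": "F", "FEMALE": "F", "2": "F",
                "M": "M", "MALE": "M", "1": "M"}


def normalize_gender(raw: Optional[str]) -> str:
    """Map a source-specific gender code to F, M, or n/a (missing or unknown)."""
    if raw is None:
        return "n/a"
    return GENDER_CODES.get(raw.strip().upper(), "n/a")


def transform(record: dict, source: str) -> dict:
    """Convert one extracted record into the unified schema."""
    if source == "legacy_a":      # hypothetical layout: 'sex' and 'yr'
        gender, year = record.get("sex"), record.get("yr")
    elif source == "legacy_b":    # hypothetical layout: 'gender_cd' and 'year'
        gender, year = record.get("gender_cd"), record.get("year")
    else:
        raise ValueError(f"unknown source: {source}")
    return {"year": int(year) if year is not None else None,
            "gender": normalize_gender(gender)}


def load(extracted: list) -> list:
    """Run the transform step over (source, record) pairs."""
    return [transform(rec, src) for src, rec in extracted]


if __name__ == "__main__":
    rows = load([("legacy_a", {"sex": "female", "yr": "1985"}),
                 ("legacy_b", {"gender_cd": "1", "year": 2010}),
                 ("legacy_b", {"year": 2011})])   # gender attribute missing
    for row in rows:
        print(row)
```

Keeping the code table in one place means a newly discovered coding variant only needs a new dictionary entry, and anything unrecognized falls into the same n/a bucket the pivot tables already use.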
The Dashboard
 More than 30 years of data
 Accurate: from 1982
 Partial: 1957-1981
 Multiple dimensions
 Tabular & graphical representation
 Overcomes FERPA restrictions by aggregating data
 Impossible to identify a person
 Privacy concerns
 Does not contain: names, SSNs, emails, etc.
 Access limited to secured user accounts
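The aggregation idea above can be illustrated with a short sketch. It assumes student-level rows keyed by a student ID (`sid`) and reduces them to distinct-student counts per combination of dimensions, so that names and IDs never reach the output. The sample rows and field names are hypothetical, not taken from the NJIT database.

```python
from collections import defaultdict

# Hypothetical student-level rows, standing in for operational data.
STUDENTS = [
    {"sid": "S1", "name": "A", "year": 2013, "level": "Ugrad", "gender": "F"},
    {"sid": "S2", "name": "B", "year": 2013, "level": "Ugrad", "gender": "M"},
    {"sid": "S3", "name": "C", "year": 2013, "level": "Ugrad", "gender": "M"},
    {"sid": "S3", "name": "C", "year": 2013, "level": "Ugrad", "gender": "M"},  # duplicate row
    {"sid": "S4", "name": "D", "year": 2013, "level": "Grad", "gender": "F"},
]


def aggregate(records, dimensions):
    """Return one row per combination of dimensions with a distinct-student count.

    Only the requested dimensions and the count are copied to the result;
    identifiers such as 'sid' and 'name' are used for counting and then dropped.
    """
    groups = defaultdict(set)
    for rec in records:
        groups[tuple(rec[d] for d in dimensions)].add(rec["sid"])
    return [dict(zip(dimensions, key), count=len(sids))
            for key, sids in sorted(groups.items())]


if __name__ == "__main__":
    for row in aggregate(STUDENTS, ["year", "level", "gender"]):
        print(row)
```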
Dashboard Main Screen
Dashboard Structure
 Consists of multiple tabs on the top
 Each tab contains a pivot table and a linked chart
 Pivot table has several areas: filter area, column headers, row
headers, and data area
 Attributes can be moved between areas
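The pivot areas described above can be mimicked in a few lines. The sketch below is an illustration only, not the dashboard's implementation: the `pivot` function, its parameters, and the sample records are hypothetical. Moving an attribute between areas corresponds to moving its name between the `rows`, `cols`, and `filters` arguments.

```python
from collections import defaultdict

# Hypothetical enrollment records: one row per student per semester.
RECORDS = [
    {"sid": 1, "semester": "Fall", "year": 2012, "level": "U"},
    {"sid": 2, "semester": "Fall", "year": 2012, "level": "G"},
    {"sid": 3, "semester": "Fall", "year": 2013, "level": "U"},
    {"sid": 4, "semester": "Fall", "year": 2013, "level": "U"},
    {"sid": 5, "semester": "Spring", "year": 2013, "level": "U"},
]


def pivot(records, rows, cols, filters=None):
    """Count records in the data area, keyed by row headers and column headers.

    rows / cols: attribute names placed in the row and column header areas.
    filters:     {attribute: required value} for the filter area.
    """
    filters = filters or {}
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        if any(rec[attr] != value for attr, value in filters.items()):
            continue  # excluded by the filter area
        row_key = tuple(rec[a] for a in rows)
        col_key = tuple(rec[a] for a in cols)
        table[row_key][col_key] += 1
    return {r: dict(c) for r, c in table.items()}


if __name__ == "__main__":
    # Year in the row headers, student level in the column headers.
    print(pivot(RECORDS, rows=["year"], cols=["level"],
                filters={"semester": "Fall"}))
    # Same data with level moved from the column area to the row area.
    print(pivot(RECORDS, rows=["year", "level"], cols=[],
                filters={"semester": "Fall"}))
```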
Enrollment Tab, 1983-2013
 This view of enrollment contains
 Student IDs in the data area
 Semester type and year in the row header area
Enrollment Tab, 1983-2013 (cont.)
 Added student level (U/G)
Enrollment Tab (cont.)
 Past 5 years enrollment by level and attendance status
Bachelors Retention, 1988-2012 Cohorts
 Total retention by year
Bachelors Retention, 1988-2012 (cont.)
 Female vs. male retention
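As a rough illustration of how such retention figures might be computed, the sketch below treats a cohort member as retained if the same student ID appears in the following fall's enrollment. That definition is an assumption, since the slides do not state one, and the function names and sample IDs are hypothetical.

```python
def retention_rate(cohort_ids, next_fall_ids):
    """Share of cohort members who appear in the next fall's enrollment."""
    cohort = set(cohort_ids)
    if not cohort:
        return None  # rate is undefined for an empty cohort
    return len(cohort & set(next_fall_ids)) / len(cohort)


def retention_by_group(cohort, next_fall_ids, attribute):
    """Retention rate per value of one attribute (for example, gender).

    cohort: list of dicts that each carry 'sid' and the grouping attribute.
    """
    groups = {}
    for student in cohort:
        groups.setdefault(student[attribute], []).append(student["sid"])
    return {value: retention_rate(ids, next_fall_ids)
            for value, ids in groups.items()}


if __name__ == "__main__":
    # Hypothetical entering cohort and next-fall enrollment list.
    cohort = [{"sid": 1, "gender": "F"}, {"sid": 2, "gender": "F"},
              {"sid": 3, "gender": "M"}, {"sid": 4, "gender": "M"}]
    next_fall = [1, 2, 3, 9]
    print(retention_rate([s["sid"] for s in cohort], next_fall))
    print(retention_by_group(cohort, next_fall, "gender"))
```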
Bachelors Graduation, 1988-2007 Cohorts
 Six-year graduation rates for first-time, full-time undergraduates
 By Ethnicity
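A six-year rate of this kind can be sketched as below. The record layout (`cohort_year`, `ethnicity`, `grad_year`) is hypothetical, and comparing calendar years is a simplifying assumption; an actual definition would normally be tied to specific terms and census dates.

```python
from collections import defaultdict


def six_year_graduation_rates(cohort, window=6):
    """Graduation rate per (cohort_year, ethnicity) within `window` years.

    cohort: list of dicts with 'cohort_year', 'ethnicity', and 'grad_year'
            (None when the student has not graduated).
    """
    totals = defaultdict(int)
    graduates = defaultdict(int)
    for student in cohort:
        key = (student["cohort_year"], student["ethnicity"])
        totals[key] += 1
        grad_year = student["grad_year"]
        if grad_year is not None and grad_year - student["cohort_year"] <= window:
            graduates[key] += 1
    return {key: graduates[key] / totals[key] for key in totals}


if __name__ == "__main__":
    # Hypothetical 2007 cohort of first-time, full-time students.
    cohort = [
        {"cohort_year": 2007, "ethnicity": "Asian", "grad_year": 2011},
        {"cohort_year": 2007, "ethnicity": "Asian", "grad_year": 2014},  # outside window
        {"cohort_year": 2007, "ethnicity": "White", "grad_year": None},  # did not graduate
        {"cohort_year": 2007, "ethnicity": "White", "grad_year": 2013},
    ]
    print(six_year_graduation_rates(cohort))
```

The most recent cohort on the slide is 2007 because a six-year window for later cohorts had not yet closed when the data were compiled.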
Dashboard Screens
 Main dashboard (static)
 Dynamic:
 Applications
 Enrollment
 Bachelors Retention
 Masters Retention
 Bachelors Graduation
 Masters Graduation
 Awarded Degrees
 Ph.D. Retention
 Ph.D. Graduation