Big Data
IS 101Y/CMSC 104Y
First Year IT
Marie desJardins
University of Maryland Baltimore County
How Much Data Is There?
 “IBM has estimated that ‘Every day, we create 2.5
quintillion bytes of data — so much that 90% of the
data in the world today has been created in the last two
years alone.’”
http://www.weeklyramble.com/tag/how-much-data-is-there-in-the-world
 “A report from Stanford University found that the whole
of humanity produces around 1,200 exabytes of data
every year.”
http://www.weeklyramble.com/tag/how-much-data-is-there-in-the-world
 80.53 billion 50G iPhones would circle the earth 100
times if laid end to end
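As a rough consistency check, IBM's daily figure can be scaled up to a year and set beside the 1,200-exabyte annual estimate. The sketch below assumes decimal (SI) units, where one exabyte is 10^18 bytes, and a 365-day year. Neither assumption comes from the slides, and the variable names are illustrative only.

```python
# Rough comparison of the two estimates quoted above.
# Assumptions (not from the slides): SI units (1 exabyte = 10**18 bytes)
# and a 365-day year.

BYTES_PER_EXABYTE = 10**18

daily_bytes = 25 * 10**17          # 2.5 quintillion bytes per day (IBM estimate)
yearly_bytes = daily_bytes * 365   # scale the daily figure to one year

yearly_exabytes = yearly_bytes / BYTES_PER_EXABYTE
print(f"IBM daily rate over one year: {yearly_exabytes:,.1f} exabytes")
print("Annual estimate quoted above: 1,200 exabytes")
```

Under these assumptions the two figures land in the same order of magnitude, which is about as much agreement as estimates of this kind allow.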
1,200 Exabytes
 1.2 zettabytes ==
 1,200 exabytes ==
 1,200,000 petabytes ==
 1,200,000,000 terabytes ==
     There was about a terabyte of Internet traffic per month in 1990 (the first year of significant commercial activity)
     Now there is about a terabyte every tenth of a second
 1,200,000,000,000 gigabytes ==
     A gigabyte is about how much space a movie takes
 1,200,000,000,000,000 megabytes ==
     A megabyte is about one minute of a song
 1,200,000,000,000,000,000 kilobytes ==
     A kilobyte is about one page of ASCII text
 1,200,000,000,000,000,000,000 bytes [that’s 1.2 sextillion!]
     A byte is one character
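The conversion chain on this slide can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units, so each step down multiplies by 1,000, and a 30-day month for the traffic comparison. The function name `unit_ladder` and the constants are hypothetical, chosen for illustration.

```python
# Express 1,200 exabytes in every unit from zettabytes down to bytes.
# Assumption: decimal (SI) units, so neighboring units differ by a factor of 1,000.

UNITS = ["zettabytes", "exabytes", "petabytes", "terabytes",
         "gigabytes", "megabytes", "kilobytes", "bytes"]

def unit_ladder(total_bytes):
    """Return (amount, unit) pairs from the largest unit down to bytes."""
    ladder = []
    for power, unit in zip(range(7, -1, -1), UNITS):
        ladder.append((total_bytes / 1000**power, unit))
    return ladder

total = 1200 * 1000**6   # 1,200 exabytes expressed in bytes

ladder = unit_ladder(total)
for amount, unit in ladder:
    print(f"{amount:,.1f} {unit}")

# Internet traffic comparison from the slide: about one terabyte per month
# in 1990 versus about one terabyte every tenth of a second now.
seconds_per_month = 30 * 24 * 60 * 60        # assuming a 30-day month
growth_factor = seconds_per_month * 10       # tenths of a second in a month
print(f"Traffic growth: roughly {growth_factor:,} times the 1990 rate")
```

Taking the slide's two traffic figures at face value, the last line gives the implied growth in Internet traffic since 1990.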
Important Problems
 Health
     Cure disease, solve the obesity epidemic, eliminate world hunger, increase health coverage, educate people, eliminate drug abuse
 Prosperity
     End poverty, restore world economy, end world hunger, improve safety and cybersecurity, spread education, create job opportunities, decrease gas prices
 Environment
     Manage natural resources, reduce pollution, solve the energy crisis, stop global warming, reduce parking and traffic
 Scientific/technical discovery
     Develop hovercars, explore space
 Freedom/justice
     Increase equality, stand against oppression, reduce partisanship, establish world peace, fix foreign policy, ensure personal privacy
 Personal fulfillment
     Be happy, build good relationships, learn from mistakes, express self, be financially independent, take risks/chances, spread love, improve time management, get enough sleep, find your keys
Questions From Reading
 How did Sergey Brin and Larry Page meet each other?
 What does the Google article refer to as “the largest graph ever
created”?
 What did Google do that was so different?
 Name some sources of “big data” that you might encounter in your
daily life or read about in the newspaper
 What is “cloud computing”?
 Are you worried about your personal privacy because of social media
or government surveillance?
 Where does Deb Roy work, and what word did he document?
 Do you think it would be cool, or creepy, to have your home wired like
Roy’s?
Data Topics in Courses
 Business Technology Administration (BTA)
 Information Systems (IS)
 Computer Science (CMSC)
 Computer Engineering (CMPE)
Data Topics in Courses
 Business Technology Administration (BTA)
 Required
 IS 300: Management Information Systems
 IS 320: Advanced Business Applications
 ECON 121: Principles of Accounting I
 ECON 122: Principles of Accounting II
 Electives
 IS 317: Accounting Information Systems
 IS 387: Information Architecture for the Web
 IS 460: Health Care Informatics
 Certificates
 Auditing for Information Management (IAS)
Data Topics in Courses
 Information Systems (IS)
 Required
 IS 300: Management Information Systems
 IS 410: Introduction to Database Design
 IS 420: Database Application Development
 ECON 121: Principles of Accounting I
 ECON 122: Principles of Accounting II
 Electives
 IS 460: Health Care Informatics
 Certificates
 Auditing for Information Management (IAS)
Courses in Data Topics
 Computer Science (CMSC)
 Required
 STAT 355: Probability and Statistics
 CMSC 313: Assembly Language and Computer Organization
 CMSC 341: Data Structures
 Electives
 CMSC 436: Data Visualization
 CMSC 442: Information and Coding Theory
 CMSC 461: Databases
 CMSC 471: Artificial Intelligence
 CMSC 473: Machine Learning
 CMSC 476: Information Retrieval
 CMSC 491: Clinical Informatics
 Track
Courses in Data Topics
 Computer Engineering (CMPE)
 Required
 ENES 101: Introduction to Engineering Science
 CMPE 306: Circuits
 CMPE 314: Microelectronics
 CMPE 320: Probability and Random Processes
 CMSC 341: Data Structures
 CMPE 450/451: Capstone
 Electives
 CMPE 323: Signals and Systems
 CMSC 471: Artificial Intelligence
 CMSC 473: Machine Learning
Careers in Data
Database Administrator/Architect
 Description:
 Administer, test, and implement computer databases.
Coordinate changes to computer databases. May plan,
coordinate, and implement security measures to safeguard
computer databases.
 Skills:
 Database usage, computational thinking, systems analysis
 Majors: Information Systems (Administrator), Computer Science
(Architect)
 Other names:
 DBA, SQL Architect, Server Database Administrator
 Companies:
 AT&T, Matrix Resources, L3 Communications, etc.
Intelligence Analyst
 Description:
 Intelligence refers to discrete information with currency and
relevance, and the abstraction, evaluation, and understanding
of such information for its accuracy and value.
 An intelligence analyst reviews data and presents significant
patterns to an audience in an understandable way
 Skills:
 Data Visualization, inductive/deductive reasoning
 Majors: Information Systems, Computer Science
 Other names:
 Malware Analyst
 Companies:
 NSA, DoD, etc
Business Intelligence Analyst
 Description:
 A business intelligence analyst reviews data and presents
significant patterns to an audience in an understandable way,
looking for financial patterns and good investments
 Skills:
 Data visualization, inductive/deductive reasoning
 Majors: Information Systems, BTA
 Other names:
 Business Data Analyst
 Companies:
 Bloomberg Financial
Data Scientist
 Description:
 A data scientist helps an organization to collect, manage, and
analyze “big data,” spotting trends and enabling rapid
information access.
 Skills:
 Statistics, machine learning/data mining, data analysis,
understanding business needs and trends
 Majors: Computer Science, Information Systems
 Companies:
 IBM, Facebook, Google, Twitter, PayPal, CIA, NIH, BAH,
Lockheed Martin, Walmart, Target...
Administrative Services Manager
 Description:
 Plan, direct, or coordinate one or more administrative services
of an organization, such as records and information
management, mail distribution, and other office support
services.
 Skills:
 Clerical, organization, human resources
 Majors: Information Systems, BTA
 Other names:
 Student Information Management
 Companies:
 NSA, DoD, etc.
Archivist
 Description:
 Appraise, edit, and direct safekeeping of permanent records
and historically valuable documents. Participate in research
activities based on archival materials.
 Skills:
 Clerical, organization, history, knowledge of antiques
 Majors: Information Systems, BTA
 Other names:
 Archivist, Registrar, Archives Director, Manuscripts Curator,
Collections Manager, Museum Archivist, Records Manager,
University Archivist, Archival Records Clerk, Collections
Director
 Companies:
 Walters Art Gallery, Smithsonian
Health Informatics
 Description:
 Apply knowledge of nursing and informatics to assist in the
design, development, and ongoing modification of
computerized health care systems. May educate staff and
assist in problem solving to promote the implementation of the
health care system.
 Skills:
 Clerical, organization, biological/medical knowledge
 Majors: Information Systems, BTA, Bioinformatics
 Other names:
 Clinical Informatics Director, Clinical Information Systems
Director, Clinical Applications Specialist, Nursing Information
Systems Coordinator
 Companies:
 BlueCross BlueShield, American Red Cross
Scientific Data Management
 Description:
 Provide researchers with sophisticated query tools for fast data
analysis.
 Skills:
 Geology, science
 Majors: Information Systems, Bioinformatics
 Other names:
 Environmental Scientist/Specialist, Information Research
Scientist, Geospatial Information Scientist, other science
disciplines
 Companies:
 NASA, Lincoln Labs, MIT
Software Developer
 Description:
 Develop, create, and modify general computer applications
software or specialized utility programs. Analyze user needs
and develop software solutions. May analyze and design
databases within an application area, working individually or
coordinating database development as part of a team.
 Skills:
 Application design, database design, data parsing
 Majors: Computer Science
 Other names:
 Software Engineer, Software Architect
 Companies:
 Everywhere! (specifically Lockheed Martin, Northrop
Grumman, Microsoft, Bloomberg Financial, Google, etc.)
Network Architect/Analyst
 Description:
 Design and implement computer and information networks,
such as local area networks (LAN), wide area networks
(WAN), intranets, extranets, and other data communications
networks. Perform network modeling, analysis, and planning.
 Skills:
 Network comprehension, electronics knowledge
 Majors: Computer Science, Computer Engineering
 Other names:
 System Architect/Analyst, Network Manager
 Companies:
 Sourcefire, BlueCross, etc.
Cartographer
 Description:
 Collect, analyze, and interpret geographic information
provided by geodetic surveys, aerial photographs, and
satellite data. Research, study, and prepare maps and other
spatial data in digital or graphic form for legal, social, political,
educational, and design purposes. May design and evaluate
algorithms, data structures, and user interfaces for GIS and
mapping systems.
 Skills:
 Graphics, coordination/synchronization
 Majors: Computer Science, Geography and Environmental
Systems
 Other names:
 Photogrammetrist, GIS Specialist, Stereo Compiler
 Companies:
 DoE, BAE Systems, SAIC, CIA, etc.
Remote Sensing Scientists and Technologists
 Description:
 Apply remote sensing principles and methods to analyze data and solve
problems in areas such as natural resource management, urban planning, or
homeland security. May develop new sensor systems, analytical techniques,
or new applications for existing systems.
 Skills:
 Embedded software, autonomous (unmanned) vehicle design, radar
technology
 Majors:
 Computer Engineering, Computer Science
 Other names:
 Remote Sensing Analyst, Remote Sensing Program Manager, Remote
Sensing Scientist, Research Scientist, Sensor Specialist
 Companies:
 DoE, APL, Exxon, etc.