Big Data
IS 101Y/CMSC 104Y
First Year IT
Marie desJardins
University of Maryland Baltimore County

How Much Data Is There?
  “IBM has estimated that ‘Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone.’”
  http://www.weeklyramble.com/tag/how-much-data-is-there-in-the-world
  “A report from Stanford University found that the whole of humanity produces around 1,200 exabytes of data every year.”
  http://www.weeklyramble.com/tag/how-much-data-is-there-in-the-world
  80.53 billion 50G iPhones would circle the earth 100 times if laid end to end

1,200 Exabytes
  1.2 zettabytes ==
  1,200 exabytes ==
  1,200,000 petabytes ==
  1,200,000,000 terabytes ==
    There was about a terabyte of Internet traffic per month in 1990 (the first year of significant commercial activity). Now there is about a terabyte every tenth of a second.
  1,200,000,000,000 gigabytes ==
    A gigabyte is about how much space a movie takes.
  1,200,000,000,000,000 megabytes ==
    A megabyte is about one minute of a song.
  1,200,000,000,000,000,000 kilobytes ==
    A kilobyte is about one page of ASCII text.
  1,200,000,000,000,000,000,000 bytes [that’s 1.2 sextillion!]
    A byte is one character.

Important Problems
  Health
    Cure disease, solve the obesity epidemic, eliminate world hunger, increase health coverage, educate people, eliminate drug abuse
  Prosperity
    End poverty, restore world economy, end world hunger, improve safety and cybersecurity, spread education, create job opportunities, decrease gas prices
  Environment
    Manage natural resources, reduce pollution, solve the energy crisis, stop global warming, reduce parking and traffic
  Scientific/technical discovery
    Develop hovercars, explore space
  Freedom/justice
    Increase equality, stand against oppression, reduce partisanship, establish world peace, fix foreign policy, ensure personal privacy
  Personal fulfillment
    Be happy, build good relationships, learn from mistakes, express self, be financially independent, take risks/chances, spread love, improve time management, get enough sleep, find your keys

Questions From Reading
  How did Sergey Brin and Larry Page meet each other?
  What does the Google article refer to as “the largest graph ever created”?
  What did Google do that was so different?
  Name some sources of “big data” that you might encounter in your daily life or read about in the newspaper.
  What is “cloud computing”?
  Are you worried about your personal privacy because of social media or government surveillance?
  Where does Deb Roy work, and what word did he document?
  Do you think it would be cool, or creepy, to have your home wired like Roy’s?
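
Worked Example: The Unit Ladder
  The unit ladder on the 1,200 Exabytes slide, and the two Internet traffic rates quoted there, can be checked with a few lines of arithmetic. The Python sketch below is only an illustration, not part of the original slides. It assumes decimal (SI) units, where each step is a factor of 1,000, which is what the slide's figures imply, and it assumes a 30-day month. The names UNITS, unit_ladder, and SECONDS_PER_MONTH are made up for this example.

```python
# Assumption: decimal (SI) units, so each step up the ladder is a factor of 1,000.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes",
         "terabytes", "petabytes", "exabytes"]


def unit_ladder(total_bytes):
    """Express total_bytes as a whole number of each unit in UNITS."""
    return {unit: total_bytes // 1000**power
            for power, unit in enumerate(UNITS)}


# 1,200 exabytes, the yearly figure quoted on the slide.
TOTAL_BYTES = 1200 * 1000**6

ladder = unit_ladder(TOTAL_BYTES)
print(TOTAL_BYTES / 1000**7, "zettabytes")
for unit in reversed(UNITS):
    print(f"{ladder[unit]:,} {unit}")

# Traffic comparison: one terabyte per month (1990) versus one terabyte
# every tenth of a second, which is 10 terabytes per second.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month
speedup = SECONDS_PER_MONTH * 10
print(f"Traffic is about {speedup:,} times higher than in 1990")
```

  Under these assumptions, the two traffic figures on the slide differ by a factor of roughly 26 million.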

Data Topics in Courses
  Business Technology Administration (BTA)
  Information Systems (IS)
  Computer Science (CMSC)
  Computer Engineering (CMPE)

Data Topics in Courses
Business Technology Administration (BTA)
  Required
    IS 300: Management Information Systems
    IS 320: Advanced Business Applications
    ECON 121: Principles of Accounting I
    ECON 122: Principles of Accounting II
  Electives
    IS 317: Accounting Information Systems
    IS 387: Information Architecture for the Web
    IS 460: Health Care Informatics
  Certificates
    Auditing for Information Management (IAS)

Data Topics in Courses
Information Systems (IS)
  Required
    IS 300: Management Information Systems
    IS 410: Introduction to Database Design
    IS 420: Database Application Development
    ECON 121: Principles of Accounting I
    ECON 122: Principles of Accounting II
  Electives
    IS 460: Health Care Informatics
  Certificates
    Auditing for Information Management (IAS)

Courses in Data Topics
Computer Science (CMSC)
  Required
    STAT 355: Probability and Statistics
    CMSC 313: Assembly Language and Computer Organization
    CMSC 341: Data Structures
  Electives
    CMSC 436: Data Visualization
    CMSC 442: Information and Coding Theory
    CMSC 461: Databases
    CMSC 471: Artificial Intelligence
    CMSC 473: Machine Learning
    CMSC 476: Information Retrieval
    CMSC 491: Clinical Informatics Track

Courses in Data Topics
Computer Engineering (CMPE)
  Required
    ENES 101: Introduction to Engineering Science
    CMPE 306: Circuits
    CMPE 314: Microelectronics
    CMPE 320: Probability and Random Processes
    CMSC 341: Data Structures
    CMPE 450/451: Capstone
  Electives
    CMPE 323: Signals and Systems
    CMSC 471: Artificial Intelligence
    CMSC 473: Machine Learning

Careers in Data

Database Administrator/Architect
  Description: Administer, test, and implement computer databases. Coordinate changes to computer databases. May plan, coordinate, and implement security measures to safeguard computer databases.
  Skills: Database usage, computational thinking, system analyst skills
  Majors: Information Systems (Administrator), Computer Science (Architect)
  Other names: DBA, SQL Architect, Server Database Administrator
  Companies: AT&T, Matrix Resources, L3 Communications, etc.

Intelligence Analyst
  Description: Intelligence refers to discrete information with currency and relevance, and the abstraction, evaluation, and understanding of such information for its accuracy and value. An intelligence analyst reviews data and presents significant patterns to an audience in an understandable way.
  Skills: Data visualization, inductive/deductive reasoning
  Majors: Information Systems, Computer Science
  Other names: Malware Analyst
  Companies: NSA, DoD, etc.

Business Intelligence Analyst
  Description: A business intelligence analyst reviews data and presents significant patterns to an audience in an understandable way, looking for financial patterns and good investments.
  Skills: Data visualization, inductive/deductive reasoning
  Majors: Information Systems, BTA
  Other names: Business Data Analyst
  Companies: Bloomberg Financial

Data Scientist
  Description: A data scientist helps an organization to collect, manage, and analyze “big data,” spotting trends and enabling rapid information access.
  Skills: Statistics, machine learning/data mining, data analysis, understanding business needs and trends
  Majors: Computer Science, Information Systems
  Companies: IBM, Facebook, Google, Twitter, PayPal, CIA, NIH, BAH, Lockheed Martin, Walmart, Target...

Administrative Services Manager
  Description: Plan, direct, or coordinate one or more administrative services of an organization, such as records and information management, mail distribution, and other office support services.
  Skills: Clerical, organization, human resources
  Majors: Information Systems, BTA
  Other names: Student Information Management
  Companies: NSA, DoD, etc.

Archivist
  Description: Appraise, edit, and direct safekeeping of permanent records and historically valuable documents. Participate in research activities based on archival materials.
  Skills: Clerical, organization, history, antique knowledge
  Majors: Information Systems, BTA
  Other names: Archivist, Registrar, Archives Director, Manuscripts Curator, Collections Manager, Museum Archivist, Records Manager, University Archivist, Archival Records Clerk, Collections Director
  Companies: Walters Art Gallery, Smithsonian

Health Informatics
  Description: Apply knowledge of nursing and informatics to assist in the design, development, and ongoing modification of computerized health care systems. May educate staff and assist in problem solving to promote the implementation of the health care system.
  Skills: Clerical, organization, biological/medical knowledge
  Majors: Information Systems, BTA, Bioinformatics
  Other names: Clinical Informatics Director, Clinical Information Systems Director, Clinical Applications Specialist, Nursing Information Systems Coordinator
  Companies: BlueCross BlueShield, American Red Cross

Scientific Data Management
  Description: Provide researchers with sophisticated query tools for fast data analysis.
  Skills: Geology, science
  Majors: Information Systems, Bioinformatics
  Other names: Environmental Scientist/Specialist, Information Research Scientist, Geospatial Information Scientist, other science disciplines
  Companies: NASA, Lincoln Labs, MIT

Software Developer
  Description: Develop, create, and modify general computer applications software or specialized utility programs. Analyze user needs and develop software solutions.
  May analyze and design databases within an application area, working individually or coordinating database development as part of a team.
  Skills: Application design, database design, data parsing
  Majors: Computer Science
  Other names: Software Engineer, Software Architect
  Companies: Everywhere! (specifically Lockheed Martin, Northrop Grumman, Microsoft, Bloomberg Financial, Google, etc.)

Network Architect/Analyst
  Description: Design and implement computer and information networks, such as local area networks (LAN), wide area networks (WAN), intranets, extranets, and other data communications networks. Perform network modeling, analysis, and planning.
  Skills: Network comprehension, electronics knowledge
  Majors: Computer Science, Computer Engineering
  Other names: System Architect/Analyst, Network Manager
  Companies: Sourcefire, BlueCross, etc.

Cartographer
  Description: Collect, analyze, and interpret geographic information provided by geodetic surveys, aerial photographs, and satellite data. Research, study, and prepare maps and other spatial data in digital or graphic form for legal, social, political, educational, and design purposes. May design and evaluate algorithms, data structures, and user interfaces for GIS and mapping systems.
  Skills: Graphics, coordination/synchronization
  Majors: Computer Science, Geography and Environmental Systems
  Other names: Photogrammetrist, GIS Specialist, Stereo Compiler
  Companies: DoE, BAE Systems, SAIC, CIA, etc.

Remote Sensing Scientists and Technologists
  Description: Apply remote sensing principles and methods to analyze data and solve problems in areas such as natural resource management, urban planning, or homeland security. May develop new sensor systems, analytical techniques, or new applications for existing systems.
  Skills: Embedded software, autonomous (unmanned) vehicle design, radar technology
  Majors: Computer Engineering, Computer Science
  Other names: Remote Sensing Analyst, Remote Sensing Program Manager, Remote Sensing Scientist, Research Scientist, Sensor Specialist
  Companies: DoE, APL, Exxon, etc.