EEL 6935 Big Data Ecosystems

1. Catalog Description – (3 credits) Data mining, statistics, conventional software tools, big data analytics software stack, big system software stack, large-scale machine learning algorithms, recommendation systems, and applications in science, engineering, business, and health.
2. Pre-requisites – EEL 3834 or equivalent
3. Course Objectives – The student will gain an understanding of big data generated from natural systems, engineered systems, and human activities, and of the challenges such data present. The student will learn a holistic methodology for the design of big data ecosystems and compare it to real-world case studies in the areas of science, engineering, business, and health.
4. Contribution of course to meeting the professional component (ABET only – undergraduate courses)
5. Relationship of course to program outcomes: Skills student will develop in this course (ABET only – undergraduate courses)
6. Instructor – Dr. Xiaolin (Andy) Li
   a. Office location: 433 NEB
   b. Telephone: 352-392-2651
   c. E-mail address: andyli@ece.ufl.edu
   d. Class Web site: http://www.andyli.ece.ufl.edu/
   e. Office hours: TBD
7. Teaching Assistant - TBD
   a. Office location:
   b. Telephone:
   c. E-mail address:
   d. Office hours:
8. Meeting Times and Location - TBD
9. Class/laboratory schedule - 3 class periods of 50 minutes each
10. Material and Supply Fees - None
11. Textbooks and Software Required
   a. Title: Mining of Massive Datasets
   b. Author: Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman
   c. Publication date and edition: Cambridge University Press, 2014 (Free PDF book is available at http://i.stanford.edu/~ullman/mmds.html)
   d. ISBN number:
12. Recommended Reading
   a. Recent conference papers and online resources/documents
   b. Hadoop: The Definitive Guide, Tom White, O'Reilly Media, 3rd Edition, 2012.
   c. Data-Intensive Text Processing with MapReduce, Jimmy Lin and Chris Dyer, c. 2010.
   d. The Fourth Paradigm: Data-Intensive Scientific Discovery, Tony Hey, Stewart Tansley, and Kristine Tolle, Microsoft Research, 2009.
   e. Artificial Intelligence: A Modern Approach, Stuart Russell and Peter Norvig, Prentice Hall, 3rd Edition, 2009.
   f. Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer, 2007.
   g. Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, Morgan Kaufmann, 3rd Edition, 2011.
   h. Machine Learning, Tom M. Mitchell, McGraw-Hill, 1997.
   i. Programming in Scala: A Comprehensive Step-by-Step Guide, Martin Odersky, Lex Spoon, and Bill Venners, 2nd Edition, 2011.
   j. The Way To Go: A Thorough Introduction To The Go Programming Language, Ivo Balbaert, iUniverse, 2012.
13. Course Outline (provide topics covered by week or by class period) –
   Introduction
      o Data Mining and Statistics
      o Algorithms
      o Tools: R, Weka, RapidMiner, Julia
   Big Data Stack
      o Data Path: Messaging, Online Processing/Query, Nearline/Stream Processing, Offline Processing
      o Data Store: Databases, Distributed File Systems, Storage
      o Analytics: Machine Learning, Graph, Search
      o Control Plane: Coordination and Management
      o Tools: Kafka, Spark, Storm, GraphLab, MLbase, Hadoop, Cassandra
   Large-scale Machine Learning
      o Dimension Reduction
      o Recommendation Systems
      o Clustering and Classification
      o Deep Learning
      o Mining Data Streams
      o Mining the Web
   Data-driven Software-defined Ecosystems
      o Mesos, YARN, SuperStack
      o Software-defined Networking
   Case Studies: Science, Engineering, Business and Health
14. Attendance and Expectations - Students are expected to attend class in order to follow class progress properly. There are no explicit penalties for absence. Cell phones and other electronic devices are to be silenced. No text messaging during class or exams. Additional class policy guidelines are provided in a separate class policies document. Homework and programming assignments are due by 11:55pm on the due date (unless announced otherwise in class).
Late homework will NOT be accepted. The late penalty for programming assignments is 10% per day, according to the timestamp of your online submission. Requirements for class attendance and make-up exams, assignments, and other work in this course are consistent with university policies that can be found in the online catalog at: https://catalog.ufl.edu/ugrad/current/regulations/info/attendance.aspx
15. Grading – Homework, reports, and projects = 80%, exams = 20%
16. Grading Scale (e.g., 90-100 A, 85-89 B+, 80-84 B, etc.) If grades are to be curved, so state. Values should not overlap and the full grade to percentage/points map must be included. –
   Grade   Range
   A       90-100
   A-      85-89
   B+      80-84
   B       75-79
   B-      xx-xx
   C+      70-74
   C       xx-xx
   C-      65-69
   D+      xx-xx
   D       60-64
   D-      xx-xx
   E       0-59
This statement must be included in every grade scale for undergraduate level 1000-4000 syllabi: “A C- will not be a qualifying grade for critical tracking courses. In order to graduate, students must have an overall GPA and an upper-division GPA of 2.0 or better (C or better).” Note: a C- average is equivalent to a GPA of 1.67, and therefore, it does not satisfy this graduation requirement. For more information on grades and grading policies, please visit: https://catalog.ufl.edu/ugrad/current/regulations/info/grades.aspx
This statement must be included in every grade scale for 5000 level graduate syllabi: “Undergraduate students, in order to graduate, must have an overall GPA and an upper-division GPA of 2.0 or better (C or better). Note: a C- average is equivalent to a GPA of 1.67, and therefore, it does not satisfy this graduation requirement. Graduate students, in order to graduate, must have an overall GPA of 3.0 or better (B or better).” Note: a B- average is equivalent to a GPA of 2.67, and therefore, it does not satisfy this graduation requirement.
For more information on grades and grading policies, please visit: https://catalog.ufl.edu/ugrad/current/regulations/info/grades.aspx
This statement must be included in every grade scale for 6000 level graduate syllabi: “In order to graduate, graduate students must have an overall GPA and an upper-division GPA of 3.0 or better (B or better).” Note: a B- average is equivalent to a GPA of 2.67, and therefore, it does not satisfy this graduation requirement. For more information on grades and grading policies, please visit: http://gradschool.ufl.edu/catalog/current-catalog/cataloggeneral-regulations.html#grades
17. Make-Up Exam Policy – Make-up exams or extended assignment due dates will be considered only when verifiable extenuating circumstances can be demonstrated. Verifiable extenuating circumstances must be reasons beyond the control of the student, such as illness or accidental injury. Poor performance in class is not an extenuating circumstance. Advise your instructor of the verifiable extenuating circumstances in advance or as soon as possible. In such situations, the date and nature of the make-up exams and the extended due dates for the assignments will be decided by the instructor.
If you have a University-approved excuse and arrange for it in advance, or in case of documented emergency, a make-up exam will be allowed and arrangements can be made for making up missed work. University attendance policies can be found at: https://catalog.ufl.edu/ugrad/current/regulations/info/attendance.aspx
Otherwise, make-up exams will be considered only in extraordinary cases, and must be taken before the scheduled exam. The student must submit a written petition to the instructor two weeks prior to the scheduled exam and the instructor must approve the petition.
18. Honesty Policy – Discussion of techniques and ideas covered in class is encouraged. However, every line of all assignments must be your own.
A statement required by the university: "Care must be taken that exam answers are not seen by others, that term papers or projects are not plagiarized by others or otherwise misused by others, etc. Even passive cooperation in a dishonest enterprise is unacceptable."
In programming assignments, discussion of techniques in a natural language (such as English) is allowed, but discussion in a computer or algorithmic language is not allowed. (Computer language discussions and questions are to be limited to the language itself and should not concern the assignment.) Stealing, giving, or receiving any code, drawings, diagrams, texts, or designs (from others or from the Internet) is not allowed. Project reports should be written in your own words; copying even ONE sentence without quoting it is treated as plagiarism.
In examinations, no discussion of any kind (except with the instructor) is allowed. No access to any type of written material is allowed.
Students who do not comply with the collaboration policy described above will receive a grade of F in the course. Furthermore, the case will be reported to University officials.
UF students are bound by The Honor Pledge, which states, “We, the members of the University of Florida community, pledge to hold ourselves and our peers to the highest standards of honor and integrity by abiding by the Honor Code. On all work submitted for credit by students at the University of Florida, the following pledge is either required or implied: ‘On my honor, I have neither given nor received unauthorized aid in doing this assignment.’” The Honor Code (http://www.dso.ufl.edu/sccr/process/student-conduct-honor-code/) specifies a number of behaviors that are in violation of this code and the possible sanctions. Furthermore, you are obligated to report any condition that facilitates academic misconduct to appropriate personnel. If you have any questions or concerns, please consult with the instructor or TAs in this class.
19. Accommodation for Students with Disabilities – Students requesting classroom accommodation must first register with the Dean of Students Office. That office will provide documentation to the student, who must then provide this documentation to the course instructor when requesting accommodation.
20. UF Counseling Services – Resources are available on campus for students having personal problems or lacking clear career and academic goals. The resources include:
   · UF Counseling & Wellness Center, psychological and psychiatric services, 3190 Radio Rd, 392-1575, online: http://www.counseling.ufl.edu/cwc/Default.aspx
   · Career Resource Center, Reitz Union, career and job search services, 392-1601
   · University Police Department, 392-1111 or 911 for emergencies
21. Software Use – All faculty, staff, and students of the University are required and expected to obey the laws and legal agreements governing software use. Failure to do so can lead to monetary damages and/or criminal penalties for the individual violator. Because such violations are also against University policies and rules, disciplinary action will be taken as appropriate. We, the members of the University of Florida community, pledge to uphold ourselves and our peers to the highest standards of honesty and integrity.
22. Course Evaluation – Students are expected to provide feedback on the quality of instruction in this course based on 10 criteria. These evaluations are conducted online at: https://evaluations.ufl.edu. Evaluations are typically open during the last two or three weeks of the semester, but students will be given specific times when they are open. Summary results of these assessments are available to students at: https://evaluations.ufl.edu/results.