Associate Analytics

Objectives:
To introduce the terminology, technology and its applications.
To introduce the concept of Analytics for Business.
To introduce the tools, technologies and programming languages that are used in the day-to-day analytics cycle.

Introduction to Analytics (Associate Analytics – I)

Unit I: Introduction to Analytics and R programming (NOS 2101)
Introduction to R and RStudio (GUI): R Windows Environment, introduction to various data types (Numeric, Character, Date, Data frame, Array, Matrix etc.), Reading Datasets, Working with different file types (.txt, .csv etc.), Outliers, Combining Datasets, R Functions and loops.
Manage your work to meet requirements (NOS 9001): Understanding Learning objectives, Introduction to work & meeting requirements, Time Management, Work management & prioritization, Quality & Standards Adherence.

Unit II: Summarizing Data & Revisiting Probability (NOS 2101)
Summary Statistics: Summarizing data with R, Probability, Expected value, Random variables, Bivariate Random variables, Probability distributions, Central Limit Theorem etc.
Work effectively with Colleagues (NOS 9002): Introduction to working effectively, Team Work, Professionalism, Effective Communication skills etc.

Unit III: SQL using R
Introduction to NoSQL, Connecting R to NoSQL databases, Excel and R integration with R connector.

Unit IV: Correlation and Regression Analysis (NOS 9001)
Regression Analysis, Assumptions of OLS Regression, Regression Modelling, Correlation, ANOVA, Forecasting, Heteroscedasticity, Autocorrelation, Introduction to Multiple Regression etc.

Unit V: Understand the Verticals - Engineering, Financial and others (NOS 9002)
Understanding systems, viz. Engineering Design, Manufacturing, Smart Utilities, Production lines, Automotive, Technology etc. Understanding Business problems related to various businesses. Requirements Gathering: gathering all the data related to the Business objective.

Text Books:
1. Student’s Handbook for Associate Analytics.

Reference Books:
1. Introduction to Probability and Statistics Using R, ISBN: 978-0-557-24979-4. A textbook written for an undergraduate course in probability and statistics.
2. An Introduction to R, by Venables and Smith and the R Development Core Team. This may be downloaded for free from the R Project website (http://www.r-project.org/, see Manuals). There are plenty of other free references available from the R Project website.
3. Montgomery, Douglas C., and George C. Runger, Applied Statistics and Probability for Engineers. John Wiley & Sons, 2010.
4. The Basic Concepts of Time Series Analysis. http://anson.ucdavis.edu/~azari/sta137/AuNotes.pdf
5. Time Series Analysis and Mining with R, Yanchang Zhao.

Big Data Analytics (Associate Analytics – II)

Unit I: Data Management (NOS 2101)
Design Data Architecture and manage the data for analysis, understand various sources of Data such as Sensors/signal/GPS etc., Data Management, Data Quality (noise, outliers, missing values, duplicate data) and Data Preprocessing. Export all the data onto the Cloud, e.g. AWS/Rackspace etc.
Maintain Healthy, Safe & Secure Working Environment (NOS 9003): Introduction, workplace safety, Report Accidents & Emergencies, Protect health & safety as you work, course conclusion, assessment.

Unit II: Big Data Tools (NOS 2101)
Introduction to Big Data tools like Hadoop, Spark, Impala etc., Data ETL process, Identify gaps in the data and follow up for decision making.
Provide Data/Information in Standard Formats (NOS 9004): Introduction, Knowledge Management, Standardized reporting & compliances, Decision Models, course conclusion, assessment.

Unit III: Big Data Analytics
Run descriptives to understand the nature of the available data, collate all the data sources to suffice the business requirement, Run descriptive statistics for all the variables and observe the data ranges, Outlier detection and elimination.
Unit IV: Machine Learning Algorithms (NOS 9003)
Hypothesis testing and determining the multiple analytical methodologies, Train Model on 2/3 sample data using various Statistical/Machine learning algorithms, Test model on 1/3 sample for prediction etc.

Unit V (NOS 9004): Data Visualization (NOS 2101)
Prepare the data for Visualization, Use tools like Tableau, QlikView and D3, Draw insights out of the Visualization tool. Product Implementation.

Text Books:
1. Student’s Handbook for Associate Analytics.

Reference Books:
1. Introduction to Data Mining, Tan, Steinbach and Kumar, Addison Wesley, 2006.
2. Data Mining Analysis and Concepts, M. Zaki and W. Meira (the authors have kindly made an online version available): http://www.dataminingbook.info/uploads/book.pdf
3. Mining of Massive Datasets, Jure Leskovec (Stanford Univ.), Anand Rajaraman (Milliway Labs), Jeffrey D. Ullman (Stanford Univ.) (http://www.vistrails.org/index.php/Course:_Big_Data_Analysis)

Predictive Analytics (Associate Analytics – III)

Unit I: Introduction to Predictive Analytics & Linear Regression (NOS 2101)
What and Why Analytics, Introduction to Tools and Environment, Application of Modelling in Business, Databases & Types of data and variables, Data Modelling Techniques, Missing value imputation etc. Need for Business Modelling, Regression – Concepts, BLUE property assumptions, Least Squares Estimation, Variable Rationalization, Model Building etc.

Unit II: Logistic Regression (NOS 2101)
Model Theory, Model fit Statistics, Model Conclusion, Analytics applications to various Business Domains etc.

Unit III: Objective Segmentation (NOS 2101)
Regression Vs Segmentation – Supervised and Unsupervised Learning, Tree Building – Regression, Classification, Overfitting, Pruning and complexity, Multiple Decision Trees etc.
Develop Knowledge, Skill and Competences (NOS 9005): Introduction to Knowledge, skills & competences, Training & Development, Learning & Development, Policies and Record keeping etc.

Unit IV: Time Series Methods / Forecasting, Feature Extraction (NOS 2101)
ARIMA, Measures of Forecast Accuracy, STL approach, Extract features from the generated model such as Height, Average, Energy etc. and analyze them for prediction.
Project

Unit V: Working with Documents (NOS 0703)
Standard Operating Procedures for documentation and knowledge sharing, Defining purpose and scope of documents, Understanding structure of documents (case studies, articles, white papers, technical reports, minutes of meeting etc.), Style and format, Intellectual Property and Copyright, Document preparation tools (Visio, PowerPoint, Word, Excel etc.), Version Control, Accessing and updating the corporate knowledge base, Peer review and feedback.

Text Books:
1. Student’s Handbook for Associate Analytics-III.

Reference Books and websites:
1. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. An Introduction to Statistical Learning with Applications in R.