DATA FOR EXPLORATION

A SUMMER TRAINING REPORT

Submitted by
SAHIL VERMA
20BCS3791

in partial fulfilment of summer training for the award of the degree of

BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
BIG DATA ANALYTICS

APEX INSTITUTE OF TECHNOLOGY
CHANDIGARH UNIVERSITY, GHARUAN, PUNJAB
JULY 2022

ABOUT THE COMPANY–

Google: Google, in full Google LLC, formerly Google Inc. (1998–2017), is an American search engine company founded in 1998 by Sergey Brin and Larry Page. It is a subsidiary of the holding company Alphabet Inc. More than 70 percent of worldwide online search requests are handled by Google, placing it at the heart of most Internet users’ experience. Its headquarters are in Mountain View, California.

Google began as an online search firm, but it now offers more than 50 Internet services and products, from e-mail and online document creation to software for mobile phones and tablet computers. In addition, its 2012 acquisition of Motorola Mobility put it in a position to sell hardware in the form of mobile phones. Google’s broad product portfolio and size make it one of the top four influential companies in the high-tech marketplace, along with Apple, IBM, and Microsoft. Despite this myriad of products, its original search tool remains the core of its success. In 2016 Alphabet earned nearly all of its revenue from Google advertising based on users’ search requests.

Major working areas:
• Google Video and YouTube
• Gmail
• Google Books
• Google Earth
• Google Apps and Chrome
• Android operating system
• Social networks and Google+

Coursera: Coursera is an online learning platform founded by two Stanford University computer science professors. It offers thousands of online courses in partnership with over 200 of the world's leading universities and companies, including Yale, Princeton, UPenn, Google, IBM, Amazon, Facebook, and more. The site offers individual courses as well as bachelor's and master's degree programs that reduce barriers to higher education.
There are also professional certificate programs designed to help workers secure new roles or promotions.

History: Coursera was founded in 2012 by Stanford University computer science professors Andrew Ng and Daphne Koller. Ng and Koller started offering their Stanford courses online in fall 2011, and soon after left Stanford to launch Coursera. Princeton, Stanford, the University of Michigan and the University of Pennsylvania were the first universities to offer content on the platform. In 2014 Coursera received both the Webby Winner (Websites and Mobile Sites Education 2014) and the People's Voice Winner (Websites and Mobile Sites Education) awards.

CERTIFICATE –

ACKNOWLEDGEMENT –

It gives us immense pleasure to express our deepest gratitude and sincere thanks to our respected guide, Mr. Pramod Vishwakarma, for providing the right guidance and advice at crucial junctures and for showing us the right way to complete this work. His useful suggestions and co-operative behaviour throughout this work are sincerely acknowledged. Special thanks go to Chandigarh University for sponsoring summer training for its students. We also wish to express our indebtedness to our parents and family members, whose blessings and support have always helped us to face the challenges ahead.

PLACE: Gharuan, Mohali
DATE: July 2022
Candidate: Sahil Verma
UID: 20BCS3791

ABSTRACT–

Google Analytics is more than just numbers and stats. It tells the story of how people are interacting with your brand online: what they have questions about, what they’re most interested in, who they are, and more. The course opens with an overview of the data ecosystem, starting with the practices and processes of a junior data analyst in their day-to-day job, the skills needed for the role, and an explanation of the terms and concepts (like the data life cycle) relevant to a data analyst.
One of the hardest parts of data science is asking the right questions. This class covers effective questioning techniques that can help provide a framework for analysis with stakeholders. Data comes in all shapes, forms and types. It is essential for an analyst to be familiar with the methods used to deal with different data types and formats. This includes the methods used to access, extract, filter and sort the data within databases.

In Google Analytics, data collection works as follows:
• A piece of tracking code (with your unique analytics account number) gets added to every page of your site
• The code collects data about visitors that land on these pages
• This information is sent to your Analytics account (analytics.google.com) for viewing
• Within your Analytics dashboard, you can view, filter, and sort this data to better understand how people get to your site, what they’re doing while they’re there, and what content works (and what doesn’t)

CONTENTS

Title Page
Certificate
Acknowledgement
ABSTRACT
CONTENTS
COURSE 1 Foundations: Data, Data, Everywhere
    1.1 Definition
    1.2 Key points of the course
    1.3 Details and Features
    1.4 Time to complete every week
COURSE 2 Ask Questions to Make Data-Driven Decisions
    2.1 Definition
    2.2 Key points of the course
    2.3 Details and Features
    2.4 Time to complete every week
COURSE 3 Prepare Data for Exploration
    3.1 Definition
    3.2 Key points of the course
    3.3 Details and Features
    3.4 Time to complete every week
LEARNING OUTCOMES
CONCLUSION
REFERENCES

COURSE 1 Foundations: Data, Data, Everywhere

This is the first course in the Google Data Analytics Certificate. These courses will equip you with the skills you need to apply to introductory-level data analyst jobs. Organizations of all kinds need data analysts to help them improve their processes, identify opportunities and trends, launch new products, and make thoughtful decisions. In this course, you’ll be introduced to the world of data analytics through a hands-on curriculum developed by Google.
Key Points of the course:

Introducing data analytics
Data helps us make decisions in everyday life and in business. In this first part of the course, you’ll learn how data analysts use data analytics and the tools of their trade to inform those decisions. You’ll also discover more about this course and the overall program expectations.

All about analytical thinking
Data analysts balance many different roles in their work. In this part of the course, you’ll learn about some of these roles and the key skills used by analysts. You’ll also explore analytical thinking and how it relates to data-driven decision-making.

The wonderful world of data
Data has its own life cycle, and the work of data analysts often intersects with that cycle. In this part of the course, you’ll learn how the data life cycle and data analysts' work both relate to your progress through this program. You’ll also be introduced to applications used in the data analysis process.

Set up your toolbox
Spreadsheets, query languages, and data visualization tools are all a big part of a data analyst’s job. In this part of the course, you’ll learn more about the basic concepts involved and explore some examples of how these tools work.

Endless career possibilities
Businesses of all kinds value the work done by data analysts. In this part of the course, you’ll find out about these businesses and the specific jobs and tasks that analysts perform for them. You’ll also learn how your data analyst certificate will help you meet many of the requirements for a position with these businesses.

Details and Features:

By the end of this course, you’ll be able to define and explain what data analytics is about, what data analysts do, and what tools they use to carry out everyday activities. The material is divided into five weeks, and completing the course takes around 21 hours.
Time:

Week     Topic                              Time to complete
Week 1   Introducing data analytics         5 hours
Week 2   All about analytical thinking      3 hours
Week 3   The wonderful world of data        4 hours
Week 4   Set up your toolbox                4 hours
Week 5   Endless career possibilities       5 hours

This course includes 6 graded assignments. All exams are in the form of a multiple-choice quiz.

COURSE 2 Ask Questions to Make Data-Driven Decisions

This is the second course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. You’ll build on your understanding of the topics that were introduced in the first Google Data Analytics Certificate course. The material will help you learn how to ask effective questions to make data-driven decisions, while connecting with stakeholders’ needs. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Key Points of the course:

Effective questions
To do the job of a data analyst, you need to ask questions and problem-solve. In this part of the course, you’ll check out some common analysis challenges and how analysts address them. You'll also learn about effective questioning techniques that can help guide your analysis.

Data-driven decisions
In analytics, data drives decision-making. In this part of the course, you’ll explore data of all kinds and its impact on real-life choices and strategies. You’ll also learn how to share your data through reports and dashboards.

More spreadsheet basics
Spreadsheets are a very important data analytics tool. In this part of the course, you will learn about how data analysts use spreadsheets in their work every day. You will also explore why structured thinking helps analysts better understand problems and come up with solutions.

Always remember the stakeholder
Successful data analysts learn to balance needs and expectations.
In this part of the course, you’ll learn strategies for managing stakeholder expectations while establishing clear communication with your team.

Details and Features:

Time:

Week     Topic                              Time to complete
Week 1   Effective questions                5 hours
Week 2   Data-driven decisions              3 hours
Week 3   More spreadsheet basics            7 hours
Week 4   Always remember the stakeholder    4 hours

This course includes 5 graded assignments. All exams are in the form of a multiple-choice quiz.

COURSE 3 Prepare Data for Exploration

This is the third course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. As you continue to build on your understanding of the topics from the first two courses, you’ll also be introduced to new topics that will help you gain practical data analytics skills. You’ll learn how to use tools like spreadsheets and SQL to extract and make use of the right data for your objectives and how to organize and protect your data. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Key Points of the course:

Data types and structures
We all generate lots of data in our daily lives. In this part of the course, you’ll check out how we generate data and how analysts decide which data to collect for analysis. You’ll also learn about structured and unstructured data, data types, and data formats as you start thinking about how to prepare your data for exploration.

Bias, credibility, privacy, ethics, and access
When data analysts work with data, they always check that the data is unbiased and credible. In this part of the course, you’ll learn how to identify different types of bias in data and how to ensure credibility in your data. You’ll also explore open data and the relationship between and importance of data ethics and data privacy.
Databases: Where data lives
When you’re analysing data, you’ll access much of the data from a database. It’s where data lives. In this part of the course, you’ll learn all about databases, including how to access them and extract, filter, and sort the data they contain. You’ll also check out metadata to discover the different types and how analysts use them.

Organizing and protecting your data
Good organization skills are a big part of most types of work, and data analytics is no different. In this part of the course, you’ll learn the best practices for organizing data and keeping it secure. You’ll also learn how analysts use file naming conventions to help them keep their work organized.

Optional: Engaging in the data community
Having a strong online presence can be a big help for job seekers of all kinds. In this part of the course, you’ll explore how to manage your online presence. You’ll also discover the benefits of networking with other data analytics professionals.

Course challenge
Prepare for the course challenge by reviewing terms and definitions in the glossary. Then, demonstrate your knowledge of data collection, ethics and privacy, and bias during the quiz. You will also have an opportunity to apply your skill with spreadsheet and SQL functions, as well as filtering and sorting. Finally, secure and organize data with data analytics best practices.

Details and Features:

Time: This is a five-week course.

Week     Topic                                             Time to complete
Week 1   Data types and structures                         7 hours
Week 2   Bias, credibility, privacy, ethics, and access    4 hours
Week 3   Databases: Where data lives                       9 hours
Week 4   Organizing and protecting your data               3 hours
Week 5   Optional: Engaging in the data community          1 hour

This course includes 5 graded assignments and one course challenge. All exams are in the form of a multiple-choice quiz.

Learning Outcomes:

I have learnt about data and how data helps us make decisions in everyday life and in business.
I have learnt how data analysts use data analytics and the tools of their trade to inform those decisions.
I have learnt how to structure unstructured data and uncover hidden information for the benefit of the organization.
I have learnt how to access databases and extract, filter, and sort the data they contain.
I have learnt many skills that are fundamentally important to my journey in data analytics.
I have established a professional social media presence and learnt the basics of SQL and data structures.

Conclusion:

In conclusion, these courses are well structured and very helpful for data science students as well as students from other fields. I have learnt a lot from them. The main strength of this professional certification is that it is a series of beginner-level courses requiring no prior knowledge, and it provides a robust introduction to working with databases using Google Sheets and SQL. It covers bias and data privacy as well. The material gives clear definitions and is well taught and well paced. I have enjoyed the content so far, and I liked the instructors.

REFERENCES:

1. https://www.coursera.org/programs/summer-training-2022-of-cse7dur7?authProvider=chandigarhuniversity&currentTab=MY_COURSES&productId=kvb6uMbTEeqZOA5eKDHLw&productType=course&showMiniModal=true
2. https://www.coursera.org/learn/foundations-data/home/week/1
3. https://www.coursera.org/learn/ask-questions-make-decisions/home/week/1
4. https://www.coursera.org/learn/data-preparation/home/week/1
5. https://www.oracle.com/in/database/what-is-database/#:~:text=A%20database%20is%20an%20organized,database%20management%20system%20(DBMS).
6. https://www.guru99.com/introduction-to-microsoft-excel.html
7. https://support.microsoft.com/en-us/office/excel-video-training-9bc05390-e94c-46af-a5b3-d7c22f6990bb
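
Illustrative sketch: the learning outcomes above mention accessing a database and extracting, filtering, and sorting the data it contains. The short Python example below shows what those three steps can look like in SQL, using Python's built-in sqlite3 module and an in-memory database. It is only a minimal sketch and not material from the course itself: the table name course_weeks, the function name filter_and_sort_weeks, and the choice of SQLite are assumptions made for illustration. The sample rows are the Course 3 weekly hours listed in this report.

```python
import sqlite3

# Weekly schedule of Course 3 as listed in this report: (week, topic, hours).
COURSE_3_WEEKS = [
    (1, "Data types and structures", 7),
    (2, "Bias, credibility, privacy, ethics, and access", 4),
    (3, "Databases: Where data lives", 9),
    (4, "Organizing and protecting your data", 3),
    (5, "Optional: Engaging in the data community", 1),
]


def filter_and_sort_weeks(rows, min_hours):
    """Return (topic, hours) pairs with hours >= min_hours, longest first.

    The table name and this helper are hypothetical, chosen for illustration.
    """
    conn = sqlite3.connect(":memory:")  # temporary database held in memory
    try:
        conn.execute(
            "CREATE TABLE course_weeks (week INTEGER, topic TEXT, hours INTEGER)"
        )
        conn.executemany("INSERT INTO course_weeks VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            # SELECT extracts columns, WHERE filters rows, ORDER BY sorts them.
            "SELECT topic, hours FROM course_weeks "
            "WHERE hours >= ? ORDER BY hours DESC, week ASC",
            (min_hours,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for topic, hours in filter_and_sort_weeks(COURSE_3_WEEKS, 4):
        print(f"{hours} hours  {topic}")
```

With a threshold of 4 hours, the query keeps the three longest weeks and lists the databases week first. The same SELECT, WHERE, and ORDER BY pattern carries over to other SQL databases, although connection details differ from tool to tool.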