Python Case Study (Batch1Group5)

Studying the behaviour of students who have taken particular courses across the country.

College vs Average Score

The x-axis shows the colleges and the y-axis shows the average percentage scored by each college's students in the ‘IT Foundation Skills’ course. The graph therefore shows the average performance of students from each college.

Code

    import pandas as pd

    df = pd.read_excel('/content/Learning Dash Board.xlsx')
    # drop the row with index 0 before converting the score column
    df = df.drop([0])
    # 'Unnamed: 15' is the 'IT Foundation Skills' score column
    df = df.astype({'Unnamed: 15': 'float'})
    # 'Unnamed: 2' is used here as the college column
    graphing = df.groupby(['Unnamed: 2']).agg({'Unnamed: 15': 'mean'})
    graphing.plot(kind='bar')

Branch vs Average Score

The x-axis shows the branches the students study in and the y-axis shows the average percentage scored by each branch's students in the ‘IT Foundation Skills’ course. The graph shows the average performance of students from each branch.

Code

    import pandas as pd

    df = pd.read_excel('/content/Learning Dash Board.xlsx')
    # drop the row with index 0 before converting the score column
    df = df.drop([0])
    df = df.astype({'Unnamed: 15': 'float'})
    # 'Unnamed: 6' is used here as the branch column
    graphing = df.groupby(['Unnamed: 6']).agg({'Unnamed: 15': 'mean'})
    graphing.plot(kind='bar')

College vs Average Score (Web Technology)

The x-axis represents the colleges and the y-axis represents the students' average score in the ‘Web Technology’ course.

Inference

NIT has the best performance and SRM has the lowest average score in this course. AMBS and BTIS exhibit almost equal performance. Hence, this plot can aid in choosing an appropriate college for someone who wants to pursue a career in ‘Web Technology’.

Code

[Code not included in the source text]

Branch vs Average Score (Web Technology)

The x-axis represents the students' engineering backgrounds under consideration and the y-axis represents the students' average score in the ‘Web Technology’ course.

Inference

Surprisingly, Civil has the best performance and EEE has the lowest average score in this course.
Mech and CSC performed equally well, followed by BIO and IP with almost equivalent performance.

Code

[Code not included in the source text]

College vs Performance in Angular JS

It is evident from the graph that BTIS and SRM do very well in Angular JS, while NIT takes the last place. This could be due to a high absentee percentage at the other colleges, which reduces their mean performance.

Code Snippet for Plotting the graph

[Code not included in the source text]

Code Snippet for aggregation

[Code not included in the source text]

Engineering Background vs Performance (Angular JS)

The inference we can make here is that CSC, EEE and IP students have a better grasp of Angular JS than Civil or Bio students. Surprisingly, Mech students are well versed in it despite not usually having it in their curriculum.

Code for aggregation

[Code not included in the source text]

Code Snippet-JAVA SKILLS BY COLLEGE

    # Question - JAVA SKILLS BY COLLEGE
    # `data` is the DataFrame loaded from the dataset beforehand
    from matplotlib import pyplot as plt

    # object to float
    data['QUIZ+ASSIGNMENT'] = data['QUIZ+ASSIGNMENT'].astype('float')
    # converting percentages to values
    data['QUIZ+ASSIGNMENT'] = data['QUIZ+ASSIGNMENT'] / 100
    java_skills_by_college = data.groupby(['College'])['QUIZ+ASSIGNMENT'].mean()
    print(java_skills_by_college)

    # bar chart of the mean score per college
    plt.bar(java_skills_by_college.index, java_skills_by_college, color='b')
    plt.title('JAVA SKILLS BY COLLEGE')
    plt.ylabel('Mean performance')
    plt.xlabel('College')
    plt.show()

JAVA SKILLS BY COLLEGE

Inferences

The analysis covers a total of 296 students from different colleges. It clearly shows that none of the colleges has exceptional Java skills; the average performance of all the colleges is nearly the same.
* BTIS and SRM have the highest averages.
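To make a "nearly the same" comparison concrete, the per-college means can be reported together with the group sizes and the gap between the highest and lowest mean. The following is only a minimal sketch on made-up scores, not the case-study data. It reuses the column names College and QUIZ+ASSIGNMENT from the snippet above; the names summary and spread are illustrative.

```python
import pandas as pd

# Made-up scores for illustration only; these are NOT the case-study values.
data = pd.DataFrame({
    "College": ["AMBS", "AMBS", "BTIS", "BTIS", "SRM", "SRM"],
    "QUIZ+ASSIGNMENT": [20.0, 10.0, 18.0, 16.0, 14.0, 18.0],
})

# Mean score and number of students per college, best college first.
summary = (
    data.groupby("College")["QUIZ+ASSIGNMENT"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)

# Gap between the best and the worst college average.
spread = summary["mean"].max() - summary["mean"].min()

print(summary)
print(f"Gap between highest and lowest college mean: {spread:.2f}")
```

A small gap relative to the scores themselves supports the reading that no college stands out; the count column shows how many students each average is based on.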
Code Snippet-JAVA SKILLS BY BRANCH

    # Question - JAVA SKILLS BY STREAMS
    # column name is spelled as in the dataset
    java_skills_by_branches = data.groupby('Engineering backgroud')['QUIZ+ASSIGNMENT'].mean()
    print(java_skills_by_branches)

    # bar chart of the mean score per branch
    plt.bar(java_skills_by_branches.index, java_skills_by_branches, color='b')
    plt.title('JAVA SKILLS BY BRANCHES')
    plt.ylabel('Mean performance')
    plt.xlabel('Engineering background')
    plt.show()

JAVA SKILLS BY BRANCH

Inference

The highest performance is by Mechanical, although it is a non-IT branch, closely followed by EEE and IP. Surprisingly, CSC has the lowest average.

College Vs CGPA

After analysing the given dataset, it can be deduced that:
1. The average CGPA of each of the 5 colleges in question lies between 7.5 and 7.9.
2. The maximum difference in CGPA between any 2 colleges in this list is < 0.5.
3. NIT has the highest average CGPA, 7.89.
4. IIT has the lowest average CGPA, 7.63.

Branch Vs CGPA

1. The maximum difference in CGPA between any two branches is around 0.4.
2. The Bio stream has the highest average CGPA, 7.8.
3. Following closely behind Bio are the Civil, IP and IT branches, all of which have a CGPA > 7.75.
4. The lowest average CGPA among all the branches is recorded for EEE, whose CGPA stands at 7.45.

College Vs CGPA and Branch VS CGPA

    College    CGPA
    AMBS       7.705128
    BTIS       7.793651
    IIT        7.632812
    NIT        7.892157
    SRM        7.666667

Code for College Vs CGPA and Branch Vs CGPA

    from matplotlib import pyplot as plt
    import pandas as pd
    from matplotlib import style as s

    # converting csv to dataframe
    data = pd.read_csv('/content/sample_data/FinalCollege Report.csv')
    cgpa = data.iloc[0:300, 5].values
    college = data.iloc[0:300, 2].values
    eng_background = data.iloc[0:300, 6]

    # average CGPA per college
    df = data[1:300][['College', 'CGPA']].groupby('College').mean()
    print(df)
    fig = plt.figure(dpi=128, figsize=(10, 6))
    plt.plot(df)
    plt.savefig("CGPA College Wise.png")

    # average CGPA per branch
    df = data[1:300][['Background', 'CGPA']].groupby('Background').mean()
    print(df)
    fig = plt.figure(dpi=128, figsize=(10, 6))
    plt.plot(df)
    plt.savefig("CGPA Branch Wise.png")

    # number of students per branch
    df = data[1:300][['College', 'Background']].groupby('Background').count()
    print(df)
    fig = plt.figure(dpi=128, figsize=(10, 6))
    plt.plot(df)

Code

    import pandas as pd

    data = pd.read_excel('/content/Learning Dash Board.xlsx')
    # keep the year and the two score columns; copy so the edits below do not touch `data`
    new_data = data[['Unnamed: 4', 'Unnamed: 15', 'Unnamed: 23']].copy()
    new_data = new_data.iloc[1:]
    new_data['Unnamed: 15'] = new_data['Unnamed: 15'].astype(float)
    new_data['Unnamed: 23'] = new_data['Unnamed: 23'].astype(float)
    new_data = new_data.rename(columns={'Unnamed: 4': 'year',
                                        'Unnamed: 15': 'IT Foundation Skills',
                                        'Unnamed: 23': 'Web Technologies'})
    # mean score per year for both courses
    grouped = new_data.groupby('year')
    new_grouped = grouped.agg('mean')

Code

    IT_foundation_df = pd.DataFrame({'year': [2018, 2019, 2020],
                                     'IT Foundation Skills': new_grouped['IT Foundation Skills']})
    IT_Web_Technologies_df = pd.DataFrame({'year': [2018, 2019, 2020],
                                           'Web Technologies': new_grouped['Web Technologies']})

    # one bar chart per course, saved to disk
    IT_foundation_graph = IT_foundation_df.plot.bar(x='year', y='IT Foundation Skills', rot=0)
    IT_foundation_graph.figure.savefig('year_vs_IT_foundation.jpg')
    IT_Web_Technologies_dfgraph = IT_Web_Technologies_df.plot.bar(x='year', y='Web Technologies', rot=0)
    IT_Web_Technologies_dfgraph.figure.savefig('year_vs_Web_Technologies.jpg')

Year vs Average Score in IT Foundation Skill

● From the bar graph we notice that the performance of students has improved steadily each year since 2018.
● This suggests that more students have started focusing on IT Foundation Skills, either by themselves or at the college level.
● Additionally, the improving score may reflect the growing importance of IT Foundation Skills in various industries.

Year vs Average Score in Web Technologies

● From the bar graph we notice that the performance of students in 2019 decreased slightly compared to 2018.
● In 2020, however, the performance increased quite significantly, which may reflect the growing importance of Web Technology.
● The overall growth of the score suggests that students are getting better at Web Technology, which may indicate that the industry requirement for websites increased significantly in 2020.

Code for performance in Angular and Java over years

[Code not included in the source text]

Year vs Average Score in Angular JS

After analysing the given dataset, it can be deduced that:
● The Angular JS performance is increasing year by year.
● The average Quiz+Assignment performance is between 30% and 40%.
● This may be the result of the increased use of Angular JS and the corporate interest in it.

Year vs Average Score in Java

After analysing the given dataset, it can be deduced that:
● The average Java performance is between 15% and 20%.
● The highest Quiz+Assignment performance is in the year 2020.
● The significant increase from the lowest to the highest value may be due to increased learning under lockdown.
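
The year-wise statements for Angular JS and Java rest on a per-year average and on how that average changes from one year to the next. A minimal sketch of that calculation follows. The data is made up and the column names (year, Angular JS, Java) are assumptions rather than the dataset's actual headers; it is not the code used to produce the graphs.

```python
import pandas as pd

# Made-up records for illustration only; column names are hypothetical.
scores = pd.DataFrame({
    "year": [2018, 2018, 2019, 2019, 2020, 2020],
    "Angular JS": [30.0, 32.0, 33.0, 35.0, 38.0, 40.0],
    "Java": [15.0, 17.0, 16.0, 18.0, 19.0, 21.0],
})

# Average score per year for each course.
yearly_mean = scores.groupby("year").mean()

# Change in the average relative to the previous year (first year is NaN).
yearly_change = yearly_mean.diff()

# True when the Angular JS average never drops from one year to the next.
angular_increasing = yearly_mean["Angular JS"].is_monotonic_increasing

print(yearly_mean)
print(yearly_change)
print("Angular JS rising every year:", angular_increasing)
```

Reporting the year-on-year change next to the averages makes it easier to tell a steady rise from a single jump in one year.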