Uploaded by Vivek S

Case study - Group 5

Python Case Study (Batch1Group5)
A study of the performance of students across the country who have taken
particular courses.
College vs Average Score
The x-axis shows the different colleges and the y-axis shows the average
percentage scored by each college's students in the ‘IT Foundation Skills’
course. The graph lets us compare the average performance of students
from each college.
Code
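import pandas as pd

df = pd.read_excel('/content/Learning Dash Board.xlsx')
# drop the first row (index 0) before converting types
df = df.drop([0])
# the score column is read as object; convert it to float
df = df.astype({'Unnamed: 15': 'float'})
# mean 'IT Foundation Skills' score ('Unnamed: 15') per college ('Unnamed: 2')
graphing = df.groupby(['Unnamed: 2']).agg({'Unnamed: 15': 'mean'})
graphing.plot(kind='bar')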
Branch vs Average Score
The x-axis shows the different branches the students study in and the
y-axis shows the average percentage scored by each branch's students in
the ‘IT Foundation Skills’ course. The graph lets us compare the average
performance of students from each branch.
Code
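import pandas as pd

df = pd.read_excel('/content/Learning Dash Board.xlsx')
# drop the first row (index 0) before converting types
df = df.drop([0])
# the score column is read as object; convert it to float
df = df.astype({'Unnamed: 15': 'float'})
# mean 'IT Foundation Skills' score ('Unnamed: 15') per branch ('Unnamed: 6')
graphing = df.groupby(['Unnamed: 6']).agg({'Unnamed: 15': 'mean'})
graphing.plot(kind='bar')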
College vs Average Score (Web
Technology)
The x-axis represents the different colleges and the y-axis represents
the students' average score in the ‘Web Technology’ course.
Inference: NIT has the best performance and SRM has the lowest average
score in this course. AMBS and BTIS perform almost equally. This plot can
therefore help in choosing a college for anyone who wants to pursue a
career in ‘Web Technology’.
Code
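The aggregation behind this plot is not shown on the slide. A minimal sketch of how it could be computed, assuming (as in the other snippets of this case study) that 'Unnamed: 2' is the college column and 'Unnamed: 23' is the Web Technologies score. The helper name mean_score_by and the sample rows are made up for illustration and are not the case-study data:

```python
import pandas as pd


def mean_score_by(df, group_col, score_col):
    """Return the mean of score_col for each value of group_col, highest first."""
    # Scores may be read from Excel as text; non-numeric cells become NaN
    scores = pd.to_numeric(df[score_col], errors="coerce")
    return scores.groupby(df[group_col]).mean().sort_values(ascending=False)


# Made-up rows for illustration only
sample = pd.DataFrame({
    "Unnamed: 2": ["NIT", "NIT", "SRM", "SRM", "AMBS"],
    "Unnamed: 23": ["80", "90", "40", "60", "70"],
})

web_by_college = mean_score_by(sample, "Unnamed: 2", "Unnamed: 23")
print(web_by_college)
```

Calling the same helper with 'Unnamed: 6' as the grouping column would give the branch-wise view, and `web_by_college.plot(kind='bar')` would draw the bar chart.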
Branch vs Average Score (Web
Technology)
The x-axis represents the students' engineering backgrounds and the
y-axis represents the students' average score in the ‘Web Technology’
course.
Inference: Surprisingly, Civil has the best performance and EEE has the
lowest average score in this course. Mech and CSC performed equally well,
followed by BIO and IP with almost equivalent performance.
Code
College vs
Performance in
Angular JS
It is evident from the graph that BTIS and SRM do very well in Angular JS
and that NIT takes the last place. This could be due to a high percentage
of absentees at the lower-ranked colleges, which reduces their mean
performance.
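The effect of absentees on a college's mean can be illustrated with a small example. This is a sketch under stated assumptions: absent students are assumed to be recorded with a score of 0, the column name 'AngularJS' is hypothetical, and the rows are made up rather than taken from the dataset.

```python
import pandas as pd

# Made-up scores; 0 stands for an absent student (an assumption about the encoding)
scores = pd.DataFrame({
    "College": ["NIT", "NIT", "NIT", "NIT", "BTIS", "BTIS"],
    "AngularJS": [0, 0, 60, 80, 50, 70],
})

# Mean over every enrolled student, absentees included
mean_all = scores.groupby("College")["AngularJS"].mean()

# Mean over students who actually have a score
attended = scores[scores["AngularJS"] > 0]
mean_attended = attended.groupby("College")["AngularJS"].mean()

# Share of absentees per college
absent_share = (scores["AngularJS"] == 0).groupby(scores["College"]).mean()

comparison = pd.DataFrame({
    "mean_all": mean_all,
    "mean_attended": mean_attended,
    "absent_share": absent_share,
})
print(comparison)
```

In this toy data the college with half of its students absent drops from 70 to 35 once absentees are counted, while the college with no absentees is unchanged. Comparing the two means on the real data would show whether absenteeism explains the ranking.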
Code Snippet for Plotting the
graph
Code Snippet for aggregation
Engineering Background vs
Performance (Angular JS)
The inference we can make here is that CSC, EEE and IP students have a
better grasp of Angular JS than Civil or Bio students. Surprisingly, Mech
students are well versed in it despite not usually having it in their
curriculum.
Code for aggregation
Code Snippet-JAVA SKILLS BY COLLEGE
# Question - JAVA SKILLS BY COLLEGE
from matplotlib import pyplot as plt

# object to float
data['QUIZ+ASSIGNMENT'] = data['QUIZ+ASSIGNMENT'].astype('float')
# converting percentages to fractions
data['QUIZ+ASSIGNMENT'] = data['QUIZ+ASSIGNMENT'] / 100
# mean Java score per college
java_skills_by_college = data.groupby(['College'])['QUIZ+ASSIGNMENT'].mean()
print(java_skills_by_college)
# bar chart of the per-college means; the label is needed for the legend
plt.bar(java_skills_by_college.index, java_skills_by_college, color='b',
        label='Mean QUIZ+ASSIGNMENT')
plt.title('JAVA SKILLS BY COLLEGE')
plt.ylabel('Mean performance')
plt.xlabel("College")
plt.legend()
plt.show()
JAVA SKILLS BY COLLEGE
Inferences
The analysis is done for a total of 296 students from different colleges.
The analysis clearly shows that none of the colleges has exceptional
skills in Java, and the average performance of all colleges is nearly
the same.
* BTIS and SRM have the highest
averages.
Code Snippet-JAVA SKILLS BY BRANCH
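# Question - JAVA SKILLS BY STREAMS
# mean Java score per engineering background (column name as spelled in the data)
java_skills_by_branches = data.groupby('Engineering backgroud')['QUIZ+ASSIGNMENT'].mean()
print(java_skills_by_branches)
# bar chart of the per-branch means; the label is needed for the legend
plt.bar(java_skills_by_branches.index, java_skills_by_branches, color='b',
        label='Mean QUIZ+ASSIGNMENT')
plt.title('JAVA SKILLS BY BRANCHES')
plt.ylabel('Mean performance')
plt.xlabel("Branch")
plt.legend()
plt.show()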
JAVA SKILLS BY BRANCH
Inference
The highest performance is by Mechanical, even though it is a non-IT
branch, closely followed by EEE and IP. Surprisingly, CSC has the lowest
average.
College Vs CGPA
After analysing the given dataset, it can be deduced that
1. The average CGPA of each of the 5 colleges in question lies between
7.5 and 7.9.
2. The maximum difference in CGPA between any 2 colleges in this list is
< 0.5.
3. NIT has the highest average CGPA, about 7.9.
4. IIT has the lowest average CGPA, about 7.63.
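These statements can be checked against the per-college means reported in this case study. A minimal sketch, using the five averages from the College vs CGPA table; the variable names are illustrative:

```python
# Average CGPA per college, as reported in the College vs CGPA table
cgpa_by_college = {
    "AMBS": 7.705128,
    "BTIS": 7.793651,
    "IIT": 7.632812,
    "NIT": 7.892157,
    "SRM": 7.666667,
}

# College with the highest and the lowest average CGPA
highest = max(cgpa_by_college, key=cgpa_by_college.get)
lowest = min(cgpa_by_college, key=cgpa_by_college.get)

# Largest gap between any two colleges is the gap between those two
spread = cgpa_by_college[highest] - cgpa_by_college[lowest]

print(f"Highest: {highest} ({cgpa_by_college[highest]:.2f})")
print(f"Lowest: {lowest} ({cgpa_by_college[lowest]:.2f})")
print(f"Maximum difference: {spread:.2f}")
```

With these values the gap between the best and worst college is roughly a quarter of a grade point, well within the 0.5 bound stated in the list.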
Branch Vs CGPA
1. The maximum difference in CGPA between any two branches is around 0.4.
2. The Bio stream has the highest average CGPA, at 7.8.
3. Following closely behind Bio are the Civil, IP and IT branches, all of
which have a CGPA > 7.75.
4. The lowest average CGPA among all the branches is recorded for EEE,
whose CGPA stands at 7.45.
College Vs CGPA and Branch VS CGPA
College    CGPA
AMBS       7.705128
BTIS       7.793651
IIT        7.632812
NIT        7.892157
SRM        7.666667
Code for College Vs CGPA and Branch Vs CGPA
from matplotlib import pyplot as plt
import pandas as pd
from matplotlib import style as s

#converting csv to dataframe
data = pd.read_csv('/content/sample_data/FinalCollegeReport.csv')
cgpa = data.iloc[0:300, 5].values
college = data.iloc[0:300, 2].values
eng_background = data.iloc[0:300, 6]

# average CGPA per college
df = data[1:300][['College', 'CGPA']].groupby('College').mean()
print(df)
fig = plt.figure(dpi=128, figsize=(10, 6))
plt.plot(df)
plt.savefig("CGPA College Wise.png")

# average CGPA per branch
df = data[1:300][['Background', 'CGPA']].groupby('Background').mean()
print(df)
fig = plt.figure(dpi=128, figsize=(10, 6))
plt.plot(df)
plt.savefig("CGPA Branch Wise.png")

# number of students per branch
df = data[1:300][['College', 'Background']].groupby('Background').count()
print(df)
fig = plt.figure(dpi=128, figsize=(10, 6))
plt.plot(df)
Code
import pandas as pd

data = pd.read_excel('/content/Learning Dash Board.xlsx')
# keep the year and the two score columns; skip the first row
# .copy() avoids modifying a view of the original frame
new_data = data[['Unnamed: 4', 'Unnamed: 15', 'Unnamed: 23']].iloc[1:].copy()
new_data['Unnamed: 15'] = new_data['Unnamed: 15'].astype(float)
new_data['Unnamed: 23'] = new_data['Unnamed: 23'].astype(float)
# give the columns readable names
new_data = new_data.rename(columns={'Unnamed: 4': 'year',
                                    'Unnamed: 15': 'IT Foundation Skills',
                                    'Unnamed: 23': 'Web Technologies'})
# mean score per year for both courses
grouped = new_data.groupby('year')
new_grouped = grouped.agg('mean')
Code
# one frame per course, with the year as an explicit column for the x-axis
IT_foundation_df = pd.DataFrame({'year': [2018, 2019, 2020],
                                 'IT Foundation Skills': new_grouped['IT Foundation Skills']})
IT_Web_Technologies_df = pd.DataFrame({'year': [2018, 2019, 2020],
                                       'Web Technologies': new_grouped['Web Technologies']})
# bar chart of the yearly means, saved to disk
IT_foundation_graph = IT_foundation_df.plot.bar(x='year', y='IT Foundation Skills', rot=0)
IT_foundation_graph.figure.savefig('year_vs_IT_foundation.jpg')
IT_Web_Technologies_dfgraph = IT_Web_Technologies_df.plot.bar(x='year', y='Web Technologies', rot=0)
IT_Web_Technologies_dfgraph.figure.savefig('year_vs_Web_Technologies.jpg')
Year vs Average Score
in IT Foundation Skill
● From the bar graph we notice that the performance of students has been
improving steadily each year since 2018.
● This suggests that more students have started focusing on IT Foundation
Skills, either by themselves or at a college level.
● Additionally, the improving score may reflect the growing importance of
IT Foundation Skills in various industries.
Year vs Average Score
in Web Technologies
● From the bar graph we notice that the performance of students in 2019
decreased slightly compared to 2018.
● In 2020, however, the performance increased quite significantly, which
may reflect the growing importance of Web Technology.
● The overall growth of the score suggests that students are getting
better at Web Technology, which may indicate that the industry demand
for websites increased significantly in 2020.
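A dip followed by a rise, as described above, can be quantified as a year-over-year change. A minimal sketch, assuming the yearly means are available as a pandas Series indexed by year; the numbers below are made up for illustration and are not the case-study results:

```python
import pandas as pd

# Made-up yearly mean scores for illustration only
yearly = pd.Series([52.0, 50.0, 61.0], index=[2018, 2019, 2020],
                   name="Web Technologies")

# Absolute change from the previous year (first year has no predecessor)
change = yearly.diff()

# Relative change from the previous year, in percent
pct_change = yearly.pct_change() * 100

trend = pd.DataFrame({"mean": yearly, "change": change, "pct_change": pct_change})
print(trend)
```

Applied to the real per-year means (for example the 'Web Technologies' column of new_grouped), a negative change for 2019 and a positive one for 2020 would confirm the pattern read off the bar graph.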
Code for performance in Angular and
Java over years
Year vs Average Score
in Angular JS
After analysing the given dataset, it can be deduced that
The Angular JS performance is increasing year by year.
The average Quiz+Assignment performance is between 30% and 40%.
This may be the result of the increasing use of Angular JS and the
corporate interest in it.
Year vs Average Score
in Java
After analysing the given dataset, it can be deduced that
The average Java performance is between 15% and 20%.
The highest Quiz+Assignment performance is in the year 2020.
The significant increase from the lowest to the highest value may be due
to increased learning under lockdown.