Uploaded by Suheyl

Week 1 Python Review

Python Review
Code Formatting
• Python uses indentation to mark blocks of code.
• Indenting incorrectly will cause an error.
• # starts a comment.
• In some constructs, a colon starts a new block: for example, function definitions, if/elif/else clauses, and for and while loops.
Example
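A minimal sketch of how colons and indentation work together. The function name describe_number and its return strings are made up for illustration and do not come from the slides.

```python
# Each colon opens a block; the indented lines below it belong to that block.
def describe_number(n):      # hypothetical helper, not from the slides
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"

labels = []
for value in [-2, 0, 7]:     # the loop body is everything indented under "for"
    labels.append(describe_number(value))

print(labels)  # ['negative', 'zero', 'positive']
```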
Whitespace is ignored inside () and [].
Example
x = (1 + 2 + 3 + 4)
nested_list = [[1, 1, 1], [2, 5, 4], [2, 1, 5]] # avoid naming a variable "list": it hides the built-in list type
Import
• Some Python features must be imported from the libraries that contain them.
Example:
# Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python
import matplotlib.pyplot as plt
# NumPy offers high-level mathematical functions and a multidimensional array structure (known as ndarray) for manipulating large data sets
import numpy as np
Variables and objects
Just assign a value to create a variable; there is no need to declare a type.
Using a name before creating it causes an error (NameError).
A = 3
A = [1, 2, 3]
A = 'Text'
Assignment creates references, not copies
A = [1, 2, 3]
B = A
A[0] = 5
print(B) # B is [5, 2, 3]
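To make the difference between a reference and a copy concrete, here is a short sketch. The name C and the use of the list method copy() are additions for illustration; A[:] or list(A) would produce a copy as well.

```python
A = [1, 2, 3]
B = A            # B refers to the same list object as A
C = A.copy()     # C is a new list holding the same items

A[0] = 5

print(B)               # [5, 2, 3]  (changed, because B and A are one object)
print(C)               # [1, 2, 3]  (unchanged, because C is an independent copy)
print(B is A, C is A)  # True False
```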
You can assign several variables at the same time
x, y = 2, 3
You can use multiple assignment to swap values
x, y = y, x
You can chain assignments
x = y = z = 3
Arithmetic Operations
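x = 1 + 2 # x is 3
y = 1 - 3 # y is -2
z = 2 * 2 # z is 4
l = 3**2 # l is 9 (exponentiation)
f = 5 % 2 # f is 1 (remainder)
g = 5 / 2 # g is 2.5 (true division always returns a float)
h = 5 // 2 # h is 2 (floor division)
m = 5 / float(2) # m is 2.5
n = int(5 / 2) # n is 2 (int() truncates toward zero)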
Numerical types: int, float, complex
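A brief sketch of the three numerical types in use. The variable names and values are chosen only for illustration.

```python
a = 7            # int
b = 7.0          # float
c = 3 + 4j       # complex: real part 3, imaginary part 4

print(type(a).__name__, type(b).__name__, type(c).__name__)  # int float complex
print(a == b)                # True: the values are equal even though the types differ
print(type(a + b).__name__)  # float: mixing int and float gives a float
print(abs(c))                # 5.0, the magnitude sqrt(3**2 + 4**2)

q, r = divmod(17, 5)         # quotient and remainder in one step
print(q, r)                  # 3 2, the same as 17 // 5 and 17 % 5
```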
Comparison
Operation    Meaning
<            less than
<=           less than or equal
>            greater than
>=           greater than or equal
==           equal
!=           not equal
is           object identity
is not       negated object identity
Example
x = [0, 1, 2, 3, 4]
y = x
z = x[:]
x == y # True
x is y # True
x == z # True
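The case the example leaves out is x is z. A small sketch, reusing the same three names, of how equality and identity can disagree:

```python
x = [0, 1, 2, 3, 4]
y = x        # second name for the same list
z = x[:]     # slice copy: a new list with equal contents

print(x == z)   # True:  same contents
print(x is z)   # False: different objects

x.append(5)
print(y)        # [0, 1, 2, 3, 4, 5]  (y follows x)
print(z)        # [0, 1, 2, 3, 4]     (z does not)
```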
Bitwise operators: & (AND), | (OR), ~ (NOT)
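A short sketch of the bitwise operators on integers. The values 12 and 10 are arbitrary; the last line contrasts them with the logical keywords and, or, not, which act on truth values rather than on bits.

```python
a = 0b1100   # 12
b = 0b1010   # 10

print(a & b)   # 8   (0b1000: bits set in both)
print(a | b)   # 14  (0b1110: bits set in either)
print(~a)      # -13 (for Python ints, ~n equals -n - 1)

# Logical operators work on truth values instead of bits
print(True and False, True or False, not True)  # False True False
```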
Math
Command name    Description
abs(a)          absolute value
ceil(a)         rounds up
cos(a)          cosine, in radians
floor(a)        rounds down
log(a)          logarithm, base e
log10(a)        logarithm, base 10
max(a, b)       larger of two values
min(a, b)       smaller of two values
round(a)        nearest whole number
sin(a)          sine, in radians
sqrt(a)         square root

Constant        Description
e               2.7182818...
pi              3.1415926...
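In standard Python, abs, max, min and round are built-in, while the other functions and both constants come from the math module. A minimal sketch with arbitrary input values:

```python
import math

# Built-ins: no import needed
print(abs(-3.5), max(2, 9), min(2, 9))     # 3.5 9 2
print(round(2.6), round(2.5), round(3.5))  # 3 2 4  (ties go to the nearest even number)

# Functions and constants from the math module
print(math.ceil(2.1), math.floor(2.9))     # 3 2
root = math.sqrt(16)
print(root)                                # 4.0
print(math.log10(1000))                    # 3.0
print(math.isclose(math.log(math.e), 1))   # True: log is base e
print(math.sin(math.pi / 2), math.cos(0))  # 1.0 1.0
```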
Strings
• single or double quotation marks
• triple quotes for multi-line strings
x = 'statistics for data science' # single quotes
y = "statistics for data science" # double quotes
# escaping with a backslash lets you include characters that are otherwise not allowed
Z = "My name is \"Farah\""
V = 'super long string \
that has more than one part, \
but are all written in one line.' # very long string
V = '''super long string
that has more than one part,
written in many lines.''' # very long string
Strings cont.
"\t" # tab character
len(string) # length of string
• Strings are concatenated with + and repeated with *
s = 3 * 'bla' + 'umm' # s is 'blablablaumm'
• Two or more string literals next to each other are automatically concatenated
a = 'Py' 'thon' # a is 'Python'
b = a + '3' # b is 'Python3'
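A small runnable sketch that pulls the string rules together. The names quote and multi are introduced here for illustration only.

```python
s = 3 * 'bla' + 'umm'
print(s, len(s))          # blablablaumm 12

quote = "My name is \"Farah\""
print(quote)              # My name is "Farah"

print(len("a\tb"))        # 3: the escape \t counts as one character

multi = '''line one
line two'''
print(multi.count("\n"))  # 1: triple-quoted strings keep their line breaks
```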
Lists
int_list = [1, 2, 3]
mixed_list = ["Farah", 1, False]
list_of_lists = [int_list, mixed_list]
len(int_list) # length of the list is 3
list_sum = sum(int_list) # sum of integers in the list is 6
Get the i-th element of a list
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
first_element = x[0] # is 0, lists are 0-indexed
second_element = x[1] # is 1
last_element = x[-1] # is 9, the last element
second_to_last_element = x[-2] # is 8, the second-to-last element
Get a part of a list
a = x[1:4] # [1, 2, 3]
b = x[:4] # [0, 1, 2, 3]
c = x[-2:] # [8, 9]
d = x[3:] # [3, 4, ..., 9]
e = x[1:-1] # [1, 2, ..., 8] without first and last
f = x[:] # [0, 1, 2, ..., 9]
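A few properties of slices that follow from the examples above, as a hedged sketch. The name copy_of_x is made up for illustration.

```python
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(x[1:4])               # [1, 2, 3]  (start is included, stop is excluded)
print(len(x[1:4]))          # 3, which is stop - start
print(x[:4] + x[4:] == x)   # True: the two halves meet with no overlap
print(x[8:100])             # [8, 9]: a slice past the end is cut short instead of raising an error

copy_of_x = x[:]            # a slice is a new list
copy_of_x[0] = 99
print(x[0])                 # 0: changing the slice copy leaves x alone
```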
Lists cont
Checking for elements in a list
2 in [1, 2, 3] # True
5 in [1, 2, 3] # False
Concatenating lists
a = [3, 2, 1]
b = [6, 5, 4]
a.extend(b) # a is [3, 2, 1, 6, 5, 4]
a = [3, 2, 1]
b = [6, 5, 4]
c = a + b # c is [3, 2, 1, 6, 5, 4] and a stays the same.
Modifying lists
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a[2] = a[2] * 2 # a is [0, 1, 4, 3, 4, 5, 6, 7, 8, 9, 10]
a[-1] = 0 # a is [0, 1, 4, 3, 4, 5, 6, 7, 8, 9, 0]
# a list times 2 is repeated, not doubled: [4, 3] * 2 is [4, 3, 4, 3], so the list grows by two items
a[2:4] = a[2:4] * 2 # a is [0, 1, 4, 3, 4, 3, 4, 5, 6, 7, 8, 9, 0]
del a[:2] # a is [4, 3, 4, 3, 4, 5, 6, 7, 8, 9, 0]
del a[:] # a is []
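Note that * applied to a list repeats the list rather than multiplying its elements. A minimal sketch contrasting the two, on a shortened list; the list comprehension is one common way to double each element and is not taken from the slides.

```python
a = [0, 1, 4, 3, 4, 5]

repeated = a[2:4] * 2               # list repetition
doubled = [v * 2 for v in a[2:4]]   # element-wise doubling

print(repeated)   # [4, 3, 4, 3]
print(doubled)    # [8, 6]

a[2:4] = doubled  # same-length replacement keeps the list length at 6
print(a)          # [0, 1, 8, 6, 4, 5]
```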
Lists cont.
• Strings can also be accessed like lists, but they can't be modified (they are immutable).
a = 'class'
b = a[0] # 'c'
c = a[:2] # 'cl'
d = a[-2:] # 'ss'
a[:2] = 'Cl' # TypeError, because strings are immutable
a = 'Cl' + a[2:] # a is now 'Class'
Functions-range()
x = list(range(2, 5))
print(x) # [2, 3, 4]
for i in range(3):
    print(i) # prints 0, 1, 2
for i in range(2, 6):
    print(i) # prints 2, 3, 4, 5
for i in range(0, 20, 2):
    print(i) # prints 0, 2, 4, 6, 8, 10, 12, 14, 16, 18
for i in range(10, 2, -1):
    print(i) # prints 10, 9, 8, ..., 3
x = ['I', 'love', 'Python']
for i in range(len(x)):
    print(i, x[i])
Output:
0 I
1 love
2 Python
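Two things worth checking by hand: range always includes its start and excludes its stop, and the index-plus-item loop can also be written with the built-in enumerate. The sketch below is an alternative, not the slides' own method; the name pairs is made up for illustration.

```python
x = ['I', 'love', 'Python']

print(list(range(10, 2, -1)))   # [10, 9, 8, 7, 6, 5, 4, 3]: start included, stop excluded

pairs = []
for i, word in enumerate(x):    # yields (index, item) pairs
    pairs.append((i, word))
print(pairs)                    # [(0, 'I'), (1, 'love'), (2, 'Python')]
```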
Functions-sort ()
sorted(a): returns a new sorted list without changing the original list
a.sort(): sorts the original list a in place
a = [4, 1, 2, 3]
b = sorted(a) # b is [1, 2, 3, 4], a is unchanged
a.sort() # a is changed to [1, 2, 3, 4]
Modifying sorted()
# sort the list by absolute value from largest to smallest
a = [4, -1, 2, -3]
b = sorted(a, key=abs, reverse=True) # b is [4, -3, 2, -1]
# key is a function that is called to transform the collection's items before they are compared.
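Any one-argument function can serve as key. A short sketch with made-up sample data (words and pairs are hypothetical), using the built-in len and a small lambda:

```python
words = ['pear', 'fig', 'banana', 'kiwi']    # hypothetical sample data

by_length = sorted(words, key=len)           # compare len(word) instead of the word itself
print(by_length)   # ['fig', 'pear', 'kiwi', 'banana']  (equal lengths keep their original order)
print(words)       # ['pear', 'fig', 'banana', 'kiwi']  (sorted() leaves the original alone)

pairs = [('b', 2), ('a', 3), ('c', 1)]       # hypothetical sample data
by_second = sorted(pairs, key=lambda p: p[1], reverse=True)  # sort on the second item, largest first
print(by_second)   # [('a', 3), ('b', 2), ('c', 1)]
```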
if-else
c = 1
if c > 3: # check the stricter condition first, otherwise its branch can never run
    a = "coool"
elif c > 2: # elif stands for 'else if'
    a = "cool"
else: # when all else fails use else
    a = "Meh"
print(a) # prints Meh, because c is 1
Loops
i = 0
while i < 10:
    print(i, "is below 10") # Error if we forget to indent
    i += 1
for i in range(10):
    if i == 2:
        continue # go to next iteration
    if i == 6:
        break # quit the loop
    print(i) # prints 0, 1, 3, 4, 5
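To check the effect of continue and break without reading printed output, the values can be collected in a list. This is a sketch for illustration; the names printed and below_ten are not from the slides.

```python
printed = []
for i in range(10):
    if i == 2:
        continue        # skip the rest of this iteration
    if i == 6:
        break           # leave the loop entirely
    printed.append(i)
print(printed)          # [0, 1, 3, 4, 5]

# The same kind of counting loop written with while; forgetting "i += 1" would loop forever
i = 0
below_ten = []
while i < 10:
    below_ten.append(i)
    i += 1
print(len(below_ten))   # 10
```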
Important Python libraries for data science
• NumPy
– Handling multidimensional arrays
• pandas
– DataFrame
– Handling labeled tabular data
• Matplotlib
– Plotting
Reading and Writing Data
From a .csv file
import pandas as pd
data = pd.read_csv(r'C:\Users\Farah\Desktop\MATH37198 FALL2022\Week1\1B_Height_in_inches.csv')
From Excel, use
pd.read_excel()
Saving data to .csv
data.to_csv('C:/Users/Farah/Desktop/Farah.csv')
To Excel, use
data.to_excel()
Plotting
Line graph
• Good for trends.
• Use plt.plot.
• You can specify different marker and line styles, colors, etc.
import numpy as np
import matplotlib.pyplot as plt
# Jupyter magic: plot graphics will appear in your notebook
%matplotlib inline
# 12 years (2000, 2002, ..., 2022), one for each of the 12 population values
years = list(range(2000, 2024, 2))
Population_In_M = [100, 105, 120, 125, 135, 150, 159, 164, 171, 180, 186, 196]
# create a line chart, years on x-axis, population on y-axis
plt.plot(years, Population_In_M, color='green', marker='o', linestyle='solid')
# add a title
plt.title("Population Growth")
# add a label to the y-axis
plt.ylabel("Population in Million")
# add a label to the x-axis
plt.xlabel("Year")
plt.show()
Scatterplots
• Good for visualizing the relationship between two paired sets of data
# create a scatter plot, years on x-axis, population on y-axis
plt.scatter(years, Population_In_M)
# add a title
plt.title("Population Growth")
# label to the y-axis
plt.ylabel("Population in Million")
# add a label to the x-axis
plt.xlabel("Year")
plt.show()
import matplotlib.pyplot as plt
# Jupyter magic: plot graphics will appear in your notebook
%matplotlib inline
# Population_In_M2 must already be defined as a second list with one value per year
# create a line chart with two series, years on x-axis, population on y-axis
plt.plot(years, Population_In_M, 'green',
         years, Population_In_M2, 'red')
# add a title
plt.title("Population Growth")
# add a label to the y-axis
plt.ylabel("Population in Million")
# add a label to the x-axis
plt.xlabel("Year")
# legend entries follow the order in which the series were plotted
plt.legend(['Polar Bears', 'Brown Bears'])
plt.show()
Bar charts
• Good for presenting/comparing numbers across a discrete set of items
# create a bar chart, years on x-axis, population on y-axis
plt.bar(years, Population_In_M)
# add a title
plt.title("Population Growth")
# label to the y-axis
plt.ylabel("Population in Million")
# add a label to the x-axis
plt.xlabel("Year")
plt.show()
Histogram
Graphical representation of a frequency distribution table
Create Frequency Distribution
# read csv
data = pd.read_csv(r'C:\Users\Farah\Desktop\MATH37198 FALL2022\Week1\1B_Height_in_inches.csv')
# explore the dataset
data.head()
# creating frequency distribution table
freqdist = data.value_counts()
print(freqdist)
# plotting histogram
plt.hist(data)
# add a title
plt.title("Heights")
# label to the y-axis
plt.ylabel("Count")
# add a label to the x-axis
plt.xlabel("Height in inches")
plt.show()
Frequency distribution (output of print(freqdist))
Height in inches    Count
67                  6
65                  5
69                  5
66                  5
68                  4
71                  4
64                  4
63                  3
73                  2
72                  2
61                  1
62                  1
74                  1