LECTURE 7 The Standard Library

THE STANDARD LIBRARY
Python has a fantastically large standard library.
Some modules are more useful than others (e.g. sys and string).
Some modules are relatively obscure.
Some of the Standard Library is really just a collection of built-in functions.
We’ll start by looking at the built-in functionality – what we can use without importing
anything.
BUILT-IN FUNCTIONS
Python has about 80 built-in functions that are free to use without importing a
module.
Here’s a list.
It would be hard to cover each function and its applications in a lecture period so I
encourage you all to look at this list and see what’s available to you.
There are also some non-essential built-in functions, but they are deprecated or
have been superseded.
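As a small illustration, the sketch below uses a handful of built-ins (len, sum, max, sorted, zip, enumerate) without importing anything. The names and scores are made up for this example, and print is called with parentheses so the snippet should behave the same under Python 2.7 and Python 3.

```python
# Hypothetical data, made up for this illustration.
names = ["ada", "grace", "linus"]
scores = [91, 78, 85]

count = len(scores)                    # number of items
average = sum(scores) / float(count)   # float() avoids integer division in Python 2
best = max(scores)                     # largest value
ranked = sorted(zip(scores, names), reverse=True)  # (score, name) pairs, highest first

# enumerate() numbers the pairs, starting from 1 here.
for place, (score, name) in enumerate(ranked, 1):
    print("{0}. {1}: {2}".format(place, name, score))
```

None of these names needed an import statement, which is the point of the built-in namespace.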
BUILT-IN CONSTANTS
• False
• True
• None: absence of a value.
• NotImplemented
>>> x = [1,2,3]
>>> 1 in x
True
>>> if ((1 in x) == True):
...     print "yay!"
...
yay!
>>> def myfunc():
...     pass
...
>>> i = myfunc()
>>> print i
None
BUILT-IN TYPES AND EXCEPTIONS
We’ve already covered most of the built-in types and looked at some built-in
exceptions. I recommend finding the complete lists in the official documentation.
STRING SERVICES
The string module (as opposed to the string data type) and the re module
provide some really useful string manipulation operations.
There are a number of other modules that fall under the “string services” header, but
these two are the most ubiquitous: any larger Python program will almost certainly
use string, and re is fairly common.
THE STRING MODULE
The string module provides a variety of useful string operations and constants. To
use the string module, you have only to import string. The string module can
roughly be broken down into the following sections:
• Constants
• Formatting
• Templating
• General Purpose Functions and Deprecated Functions
STRING CONSTANTS
The string constants are a number of built-in constants defined by the string module.
>>> import string
>>> string.ascii_letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> string.digits
'0123456789'
>>> string.hexdigits
'0123456789abcdefABCDEF'
>>> string.printable
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
>>> string.whitespace
'\t\n\x0b\x0c\r '
More listed in the docs.
Now, I have predefined strings that can be used for a variety of useful operations.
For example, … say I wanted to create a cipher…
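One possible cipher built on these constants is a simple Caesar shift. This is only a sketch: the function name caesar_shift and the shift of 3 are choices made for this example, not something the string module provides.

```python
import string

def caesar_shift(text, shift):
    """Rotate ASCII letters by `shift` places; leave everything else unchanged."""
    alphabet = string.ascii_lowercase
    shift %= len(alphabet)                      # also makes negative shifts work
    rotated = alphabet[shift:] + alphabet[:shift]
    # Build a lookup from each letter to its rotated counterpart.
    mapping = dict(zip(alphabet, rotated))
    mapping.update(zip(alphabet.upper(), rotated.upper()))
    return "".join(mapping.get(ch, ch) for ch in text)

secret = caesar_shift("Hello, World!", 3)
plain = caesar_shift(secret, -3)
print(secret)   # Khoor, Zruog!
print(plain)    # Hello, World!
```

Because the alphabet comes from string.ascii_lowercase, there is no need to type it out by hand.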
STRING FORMATTING
The str.format() method allows you to specify “replacement fields” within a string by
using curly braces. Everything outside of the curly braces is literal text.
• Positional arguments
>>> import string
>>> '{0} {1} {2}'.format('Hello', 'Python', 'Class!')
'Hello Python Class!'
>>> '{} {} {}'.format('Hello', 'Python', 'Class!')
'Hello Python Class!'
>>> '{3}, {0} {1} {2}'.format('My', 'name', 'is', 'Yoda')
'Yoda, My name is'
>>> '{0} {2} {3}, {1} {2} {3}'.format('Walk', 'Talk', 'this', 'way')
'Walk this way, Talk this way'
STRING FORMATTING
• Keyword arguments
>>> 'Today is {day} and it is {temp} degrees Fahrenheit outside in {city}'.format(day='Jan 23', temp='65', city='Tallahassee')
'Today is Jan 23 and it is 65 degrees Fahrenheit outside in Tallahassee'
STRING FORMATTING
You can also access the attributes of an object inside of the format() method.
>>> class Fraction:
...     def __init__(self, num, denom):
...         self.num = num
...         self.denom = denom
...
>>> myfrac = Fraction(3,5)
>>> 'The numerator is {0.num} and the denominator is {0.denom}'.format(myfrac)
'The numerator is 3 and the denominator is 5'
STRING FORMATTING
You can also access items in a sequence object.
>>> fibonacci = [0, 1, 1, 2, 3, 5, 8]
>>> '{0[2]} + {0[3]} = {0[4]}'.format(fibonacci)
'1 + 2 = 3'
And align text.
>>> '{:<50}'.format('To the left, to the left')
'To the left, to the left                          '
>>> '{:>50}'.format('All right, All right, All right')
'                   All right, All right, All right'
>>> '{:^50}'.format('Let\'s center ourselves')
"              Let's center ourselves              "
>>> '{:*^50}'.format('Let\'s center ourselves')
"**************Let's center ourselves**************"
STRING FORMATTING
You can also change the precision and display of your float values.
>>> '{:.2f}'.format(3.4385)
'3.44'
>>> '{:.4f}'.format(3.4385)
'3.4385'
STRING FORMATTING
Some fun things:
• Add commas to your big numbers.
• Define precision for your percentages.
• Use formatting specific to a type.
>>> '{:,}'.format(1234567890)
'1,234,567,890'
>>> points = 19.5
>>> total = 22
>>> 'Correct answers: {:.2%}'.format(points/total)
'Correct answers: 88.64%'
>>> import datetime
>>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
>>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
'2010-07-04 12:15:58'
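The alignment, width and precision options from the previous slides can be combined in a single replacement field. The sketch below lines up a small table; the rows and column widths are invented for the example.

```python
# Hypothetical rows: (item, quantity, unit price).
rows = [("apples", 3, 0.5), ("watermelon", 12, 3.25)]

# <12   : left-align in 12 columns
# >5    : right-align in 5 columns
# >10.2f: right-align in 10 columns, two digits after the decimal point
template = "{:<12}{:>5}{:>10.2f}"

lines = [template.format(name, qty, price) for name, qty, price in rows]
for line in lines:
    print(line)
```

Every formatted line comes out 27 characters wide (12 + 5 + 10), so the columns line up when printed in a fixed-width font.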
STRING TEMPLATING
Templating provides simpler substitution methods – there are also some very
advanced uses for templates but we’ll skip that for now.
>>> import string
>>> mytemp = string.Template("$who likes $what")
>>> mytemp.substitute(who='tim', what='kung pao')
'tim likes kung pao'
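One detail worth knowing: substitute() raises KeyError if a placeholder is left without a value, while safe_substitute() leaves it in the text. A minimal sketch using the same template as above:

```python
import string

mytemp = string.Template("$who likes $what")

full = mytemp.substitute(who="tim", what="kung pao")

# substitute() insists on a value for every placeholder.
try:
    mytemp.substitute(who="tim")
    missing = None
except KeyError as err:
    missing = err.args[0]      # name of the placeholder that had no value

# safe_substitute() fills in what it can and keeps the rest as-is.
partial = mytemp.safe_substitute(who="tim")

print(full)
print(missing)
print(partial)
```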
STRING FUNCTIONS
Probably the most useful and most used part of the string module is its numerous
string functions. Note that these functions are considered deprecated in Python 2.x
and were removed in Python 3.x, where only the equivalent string methods remain.
The difference between Python 2.x and Python 3.x is this:
>>> string.capitalize("word")     # Python 2.x only
'Word'
>>> "word".capitalize()           # Python 2.x and 3.x
'Word'
STRING FUNCTIONS
Here’s a sample of the many useful string functions predefined for you!
>>> import string
>>> string.atoi("345")
345
>>> string.capitalize("word")
'Word'
>>> string.find("expression", "press")
2
>>> string.rfind("This is a sentence.", " ")
9
>>> string.count("mississippi", "iss")
2
>>> string.lower("WORD")
'word'
>>> l = string.split("Look at all these words I have!")
>>> l
['Look', 'at', 'all', 'these', 'words', 'I', 'have!']
>>> " ".join(l)
'Look at all these words I have!'
>>> string.rstrip("exceptional", "al")
'exception'
>>> string.strip("Temp: 59F", "Temp: ")
'59F'
>>> string.swapcase("HeLlO")
'hElLo'
>>> string.replace("cout >> x >> endl;", ">>", "<<")
'cout << x << endl;'
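Since the module-level functions only exist in Python 2.x, here is a sketch of the same operations written as string methods, which work in both 2.x and 3.x. The last example (the "Temperature" string) is added for illustration: it shows that strip() removes a set of characters, not a prefix.

```python
# Method equivalents of the string-module functions shown above.
number = int("345")                                    # replaces string.atoi
capital = "word".capitalize()
first = "expression".find("press")                     # index of first occurrence
last_space = "This is a sentence.".rfind(" ")          # index of last occurrence
occurrences = "mississippi".count("iss")
words = "Look at all these words I have!".split()
rejoined = " ".join(words)
swapped = "HeLlO".swapcase()
fixed = "cout >> x >> endl;".replace(">>", "<<")

# strip()/rstrip() take a *set of characters* to remove from the ends.
trimmed = "exceptional".rstrip("al")
reading = "Temp: 59F".strip("Temp: ")
surprise = "Temperature: 59F".strip("Temp: ")          # stops at 'r', not at the colon

print(reading)
print(surprise)
```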
RE MODULE
So you see now what makes Python support such rapid development – how long
would it take you to do each of those things in C++ or C (using the standard library)?
Let’s move on to the next popular string service: re
re is the module that defines all of the built-in regular expression operations and
supports Unicode as well as 8-bit strings.
RE SYNTAX
Let’s do a short intro to regular expressions for those of you who are not familiar.
A regular expression is a sequence of characters, some with special meaning, that
defines an entire set of strings.
A single character simply matches itself (e.g. ‘A’ matches the string “A”).
Characters like ‘|’, ’.’, ‘$’, ‘*’, ‘+’ have special meaning and allow us to construct
regular expressions concisely.
RE SYNTAX
Symbol   Meaning                                  Example
|        Alternation                              A|B = {'A', 'B'}
.        Match any char but newline               . = {'a', 'b', 'c', '%', '#', ' ', '(', ...}
*        Match 0 or more repetitions              (str)* = {'', 'str', 'strstr', 'strstrstr', ...}
+        Match 1 or more repetitions              (str)+ = {'str', 'strstr', 'strstrstr', ...}
?        Preceding RE optional                    ab? = {'a', 'ab'}
{m}      Match exactly m copies of previous RE    A{6} = {'AAAAAA'}
[]       Characters in a set                      [0-9a-z] = {'0', 'a', '1', 'b', '2', ...}
There are a lot of options – check out the official docs to see what you can do.
Most of these examples are based off of the tutorialspoint page for RE in Python – very helpful!
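The rows of the symbol table can be checked directly. This sketch uses re.fullmatch, which requires Python 3.4 or later (it is not available in the Python 2 used elsewhere in these slides); the test strings are chosen for this example.

```python
import re

# (pattern, candidate string, should the whole string match?)
checks = [
    (r"A|B",      "B",      True),
    (r".",        "%",      True),
    (r".",        "\n",     False),   # '.' does not match a newline by default
    (r"(str)*",   "",       True),    # zero repetitions are allowed
    (r"(str)+",   "",       False),   # at least one repetition is required
    (r"(str)+",   "strstr", True),
    (r"ab?",      "a",      True),    # the 'b' is optional
    (r"ab?",      "abb",    False),
    (r"A{6}",     "AAAAAA", True),
    (r"A{6}",     "AAAAA",  False),   # only five copies
    (r"[0-9a-z]", "7",      True),
    (r"[0-9a-z]", "Z",      False),   # uppercase is not in the set
]

# fullmatch() succeeds only if the pattern matches the entire string.
results = [re.fullmatch(pattern, text) is not None for pattern, text, _ in checks]

for (pattern, text, expected), got in zip(checks, results):
    print("{0!r:12} {1!r:10} {2}".format(pattern, text, got))
```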
RAW STRINGS
A note before we do some examples:
Typically, regular expressions are passed to re methods as raw strings
(e.g. r'[0-9]+'). In a string prefixed with an 'r', backslashes are kept as literal
characters instead of starting escape sequences.
By keeping regular expressions raw, we can avoid these kinds of issues:
r"\\"   : regular expression matching a literal backslash.
"\\\\"  : the same regular expression written as an ordinary string.
REGULAR EXPRESSIONS
re.match(pattern, string, flags = 0)
• Returns a MatchObject if pattern matches at the beginning of string. Else returns None.
• We can access the matched expression using the method group().
• We can pass an index to group (i.e. group(1)) to access matched subgroups. These are
denoted with parentheses in the regular expression.
>>> import re
>>> line = "Cats are smarter than dogs"
>>> match_obj = re.match(r'(.*) are (.*?) .*', line, re.I)
>>> if match_obj:
...     print "match_obj.group() : ", match_obj.group()
...     print "match_obj.group(1): ", match_obj.group(1)
...     print "match_obj.group(2): ", match_obj.group(2)
... else:
...     print "No match!"
...
match_obj.group() : Cats are smarter than dogs
match_obj.group(1): Cats
match_obj.group(2): smarter
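The second group in the pattern above is written (.*?) rather than (.*). The ? after * makes the repetition non-greedy, so it stops at the first point where the rest of the pattern can still match. The sketch below, in Python 3 syntax, compares the two forms; the variable names are chosen for this example.

```python
import re

line = "Cats are smarter than dogs"

# Non-greedy second group: takes as little as possible.
lazy = re.match(r"(.*) are (.*?) .*", line, re.I)
# Greedy second group: takes as much as possible.
greedy = re.match(r"(.*) are (.*) .*", line, re.I)

print(lazy.group(2))     # smarter
print(greedy.group(2))   # smarter than
print(lazy.groups())     # all subgroups at once, as a tuple

# re.I makes the match case-insensitive, so 'ARE' still matches 'are'.
shouting = re.match(r"(.*) ARE (.*?) .*", line, re.I)
```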
REGULAR EXPRESSIONS
• The match() function is just checking to match the regular expression from the
beginning of the string.
• Use search() to check for a match anywhere within the string.
>>> import re
>>> line = "Cats are smarter than dogs"
>>> match_obj = re.match(r'dogs', line, re.I)
>>> if match_obj:
...     print "Match is: ", match_obj.group()
... else:
...     print "No Match!"
...
No Match!
>>> import re
>>> line = "Cats are smarter than dogs"
>>> search_obj = re.search(r'dogs', line, re.I)
>>> if search_obj:
...     print "Match is: ", search_obj.group()
... else:
...     print "No Match!"
...
Match is: dogs
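Here are the two calls side by side, as a sketch in Python 3 syntax. The match object returned by search() also reports where the match was found, through start() and span().

```python
import re

line = "Cats are smarter than dogs"

at_start = re.match(r"dogs", line, re.I)    # only tries position 0
anywhere = re.search(r"dogs", line, re.I)   # scans the whole string
also_start = re.match(r"cats", line, re.I)  # matches 'Cats' because of re.I

print(at_start)             # None
print(anywhere.group())     # dogs
print(anywhere.span())      # (22, 26)
print(also_start.group())   # Cats
```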
REGULAR EXPRESSIONS
Use the sub() function to perform substitution.
\d matches any digit.
\D matches any non-digit.
$ matches at the end of the string (and, with re.MULTILINE, at the end of each line).
>>> import re
>>> phone = "2004-959-559 # This is Phone Number"
>>> num = re.sub(r'#.*$', "", phone)
>>> print "Phone Num : ", num
Phone Num : 2004-959-559
>>> num = re.sub(r'\D', "", phone)
>>> print "Phone Num : ", num
Phone Num : 2004959559
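The same substitutions as a Python 3 sketch. The third pattern is an addition for illustration: the first substitution leaves the space that preceded the comment, and putting \s* in front of the # removes it as well.

```python
import re

phone = "2004-959-559 # This is Phone Number"

no_comment = re.sub(r"#.*$", "", phone)     # drop everything from '#' to the end
digits_only = re.sub(r"\D", "", phone)      # drop every non-digit character
tidy = re.sub(r"\s*#.*$", "", phone)        # also drop whitespace before '#'

print(repr(no_comment))    # '2004-959-559 '  (note the trailing space)
print(repr(digits_only))   # '2004959559'
print(repr(tidy))          # '2004-959-559'
```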
REGULAR EXPRESSIONS
You can compile your regular expressions for future use if they are used multiple
times within a program.
Just call compile() with your re string.
It creates a regular expression object, with the usual methods.
>>> my_re = re.compile(r'([a-zA-Z])([a-zA-Z_0-9]*)')
>>> match_obj = my_re.match("myint_1")
>>> match_obj.group()
'myint_1'
>>> match_obj.group(1)
'm'
>>> match_obj.group(2)
'yint_1'
>>> match_obj = my_re.match("totalBalance")
>>> match_obj.group()
'totalBalance'
>>> match_obj.group(1)
't'
>>> match_obj.group(2)
'otalBalance'
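A compiled pattern is convenient when the same expression is applied to many strings. The sketch below reuses the pattern from above in a loop; the candidate names are invented for the example. Note that match() only anchors the start of the string, so a string such as "my-int" still matches on its leading "my".

```python
import re

identifier_re = re.compile(r'([a-zA-Z])([a-zA-Z_0-9]*)')

# Hypothetical inputs for this illustration.
candidates = ["myint_1", "totalBalance", "2fast", "_private"]

# Keep the candidates whose first character is a letter.
valid = [name for name in candidates if identifier_re.match(name)]
print(valid)

# match() does not require the pattern to consume the whole string.
partial = identifier_re.match("my-int").group()
print(partial)
```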
REGULAR EXPRESSIONS
The official regular expressions docs have a ton of cool examples including
implementing C’s scanf, making a phonebook, etc.
Next time, we’ll be talking about specialized data types in the Python Standard
Library.