Uploaded by vinaykumar456575

Module 4

Module IV
Python Programming: Classes, inheritance, generators, standard library (part I),
command line arguments, string pattern matching, internet access, data compression.
Classes concept
Python is an object-oriented programming language. Object-oriented
programming (OOP) focuses on creating reusable patterns of code, in contrast to
procedural programming, which focuses on explicit sequenced instructions. When
working on complex programs in particular, object-oriented programming lets you
reuse code and write code that is more readable, which in turn makes it more
maintainable.
One of the most important concepts in object-oriented programming is the
distinction between classes and objects, which are defined as follows:


Class — A blueprint created by a programmer for an object. This defines a
set of attributes that will characterize any object that is instantiated from this
class.
Object — An instance of a class. This is the realized version of the class,
where the class is manifested in the program.
These are used to create patterns (in the case of classes) and then make use of the
patterns (in the case of objects).
In this tutorial, we’ll go through creating classes, instantiating objects, initializing
attributes with the constructor method, and working with more than one object of
the same class.
Classes
Classes are like a blueprint or a prototype that you can define to use to create
objects.
We define classes by using the class keyword, similar to how we define
functions by using the def keyword.
Methods are a special kind of function that are defined within a class.
The argument to these functions is the word self, which is a reference to objects
that are made based on this class. To reference instances (or objects) of the
class, self will always be the first parameter, but it need not be the only one.
Objects
An object is an instance of a class. An object is an active entity: it has state, and
that state can change while the program runs. A class is a logical entity: it only
describes what its objects will look like.
All data members and methods are public by default in Python.
The self Parameter
The self parameter is a reference to the current instance of the class, and is used
to access variables that belong to that instance.
It does not have to be named self; you can call it whatever you like, but it has
to be the first parameter of any instance method in the class:
SAMPLE PROGRAM: 1
class Cse:
    def __init__(self):
        self.name = None      # name is a public data member
        self.strength = None  # strength is a public data member

    def set(self, n, s):
        self.name = n
        self.strength = s

    def get(self):
        print(self.name, "\t", self.strength)

# Creation of an object and calling its methods
cs = Cse()
cs.set('4B8', 67)
cs.get()
cs.name = '4B4'  # accessing a public data member directly
cs.get()
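Note:
A data member is marked as public, protected or private through its name:
self.name -- public data member
self._name -- protected data member
self.__name -- private data member
These are naming conventions rather than enforced access rules; a name with two
leading underscores is additionally name-mangled by Python.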
Accessing data members
__init__() is the constructor. It is invoked whenever an object is created and
initializes the data members of that object.
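The naming conventions in the note above can be tried out directly. The following is a minimal sketch; the class name Account and its attributes are made up for illustration and do not come from the notes.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner          # public
        self._branch = "main"       # "protected" by convention only
        self.__balance = balance    # "private": stored as _Account__balance

    def balance(self):
        # Inside the class the double-underscore name works normally.
        return self.__balance


acct = Account("asha", 100)
print(acct.owner)                     # public attribute
print(acct._branch)                   # still reachable, but discouraged
print(acct.balance())                 # access through a method
print(hasattr(acct, "__balance"))     # the unmangled name does not exist
print(acct._Account__balance)         # the mangled name is still reachable
```

The last two lines show that "private" in Python means "renamed", not "inaccessible".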
SAMPLE PROGRAM: 2
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def myfunc(self):
        print("Hello my name is " + self.name)

p1 = Person("John", 36)
p1.myfunc()
Inheritance
Inheritance is a feature of object-oriented programming. It specifies that one class
acquires all the properties and behaviors of a parent class. By using inheritance you can
define a new class with little or no change to an existing class. The new class is known
as the derived class or child class, and the class from which it inherits is called the base
class or parent class.
It provides reusability of code.
Python Inheritance Terminologies
1. Superclass: The class from which attributes and methods will be inherited.
2. Subclass: The class which inherits the members from superclass.
3. Method Overriding: Redefining in a subclass a method that was
already defined in the superclass.
Inheritance example
Full example of Python inheritance:
class User:
    name = ""

    def __init__(self, name):
        self.name = name

    def printName(self):
        print("Name = " + self.name)

class Programmer(User):
    def __init__(self, name):
        self.name = name

    def doPython(self):
        print("Programming Python")

brian = User("brian")
brian.printName()
diana = Programmer("Diana")
diana.printName()
diana.doPython()
The output:
Name = brian
Name = Diana
Programming Python
Brian is an instance of User and can only access the method printName. Diana is
an instance of Programmer, a class with inheritance from User, and can access
both the methods in Programmer and User.
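A subclass can also replace a method it inherits and still call the parent's version through super(). The sketch below is illustrative only; the describe method and the language attribute are assumptions added for this example and are not part of the program above.

```python
class User:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "User " + self.name


class Programmer(User):
    def __init__(self, name, language):
        super().__init__(name)       # reuse the parent's initialisation
        self.language = language

    def describe(self):              # replaces User.describe for Programmer objects
        return super().describe() + " writes " + self.language


diana = Programmer("Diana", "Python")
description = diana.describe()
print(description)
print(isinstance(diana, User))       # a Programmer is also a User
```

Calling super().__init__(name) avoids repeating the assignment to self.name in the subclass.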
Multiple Inheritance
In Python a class can inherit from more than one class. The resulting class will
have all the methods and attributes from the parent classes.
In essence, it’s called multiple inheritance because a class can inherit from
multiple classes.
In the example below class C inherits from both class A and class B. If an
object is created with class C, it has the methods of classes A, B and C.
Keep in mind that if you create an object from class A or class B, it will only
have the methods and attributes of that class.
class A:
    def A(self):
        print('A')

class B:
    def B(self):
        print('B')

# B must not itself inherit from A here: class C(A, B) with B(A)
# raises a TypeError because no consistent method order exists.
class C(A, B):
    def C(self):
        print('C')

obj = C()
obj.A()
obj.B()
obj.C()
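When both parents define a method with the same name, Python decides which one to call from the method resolution order (MRO), which follows the order of the base classes in the class statement. A small sketch, with a hypothetical method name who chosen for this illustration:

```python
class A:
    def who(self):
        return "A"

class B:
    def who(self):
        return "B"

class C(A, B):      # A is listed first, so A.who wins
    pass

winner = C().who()
order = [cls.__name__ for cls in C.__mro__]
print(winner)
print(order)
```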
Generators in Python:
There is a lot of overhead in building an iterator in Python: we have to implement a
class with __iter__() and __next__() methods, keep track of internal state,
raise StopIteration when there are no more values to return, and so on.
This is both lengthy and counterintuitive. Generators come to the rescue in such
situations.
Python generators are a simple way of creating iterators. All the overhead we
mentioned above is automatically handled by generators in Python.
Simply speaking, a generator is a function that returns an object (iterator) which we can
iterate over (one value at a time).
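To make the overhead concrete, here is a sketch that writes the same counter twice: once as an iterator class and once as a generator function. The names CountUpTo and count_up_to are invented for this comparison.

```python
class CountUpTo:
    """Hand-written iterator that produces 1, 2, ..., limit."""
    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration      # signal that no values are left
        self.current += 1
        return self.current


def count_up_to(limit):
    """Generator with the same behaviour; state is kept automatically."""
    current = 0
    while current < limit:
        current += 1
        yield current


from_class = list(CountUpTo(3))
from_generator = list(count_up_to(3))
print(from_class)
print(from_generator)
```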
Creating Generators in Python:
It is fairly simple to create a generator in Python. It is as easy as defining a normal
function with a yield statement instead of a return statement.
If a function contains at least one yield statement (it may contain
other yield or return statements), it becomes a generator function.
Both yield and return will return some value from a function.
The difference is that, while a return statement terminates a function
entirely, a yield statement pauses the function, saving all its state, and later continues
from there on successive calls.
# A simple generator function
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n

    n += 1
    print('This is printed second')
    yield n

    n += 1
    print('This is printed at last')
    yield n
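The pausing behaviour can be observed by stepping through the generator with the built-in next(). This sketch repeats my_gen so that it runs on its own; each call to next() resumes the function until the following yield.

```python
def my_gen():
    n = 1
    print('This is printed first')
    yield n
    n += 1
    print('This is printed second')
    yield n
    n += 1
    print('This is printed at last')
    yield n


gen = my_gen()          # nothing is printed yet: the body has not started
first = next(gen)       # runs up to the first yield
second = next(gen)      # resumes after the first yield
third = next(gen)
print(first, second, third)

finished = False
try:
    next(gen)           # no yield is left
except StopIteration:
    finished = True
    print('Generator is exhausted')
```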
Python generators with a loop:
def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1, -1, -1):
        yield my_str[i]

# For loop to reverse the string
# Output:
# o
# l
# l
# e
# h
for char in rev_str("hello"):
    print(char)
Python Generator Expression
Simple generators can be easily created on the fly using generator expressions. It
makes building generators easy.
my_list = [1, 3, 6, 10]
# square each term using list comprehension
# Output: [1, 9, 36, 100]
[x**2 for x in my_list]
# same thing can be done using generator expression
# Output: <generator object <genexpr> at 0x0000000002EBDAF8>
(x**2 for x in my_list)
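A generator expression produces nothing until it is consumed, and it can be consumed only once. A short sketch of both points, using the same list:

```python
my_list = [1, 3, 6, 10]

squares = (x**2 for x in my_list)
first = next(squares)            # 1
second = next(squares)           # 9
rest_total = sum(squares)        # 36 + 100, the values not yet consumed
leftover = list(squares)         # the generator is now exhausted

# A generator expression can be passed straight to a function.
total = sum(x**2 for x in my_list)
print(first, second, rest_total, leftover, total)
```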
Why generators are used in Python?
1. Easy to Implement
Generators can be implemented in a clear and concise way compared to their iterator class
counterpart.
2. Memory Efficient
A normal function that returns a sequence will create the entire sequence in memory before
returning the result. This is overkill if the number of items in the sequence is very large.
A generator implementation of such a sequence is memory friendly and is preferred since it only
produces one item at a time.
3. Represent Infinite Stream
Generators are an excellent medium to represent an infinite stream of data. Infinite streams cannot
be stored in memory, and since generators produce only one item at a time, they can represent
an infinite stream of data.
4. Pipelining Generators
Generators can be used to pipeline a series of operations.
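Points 3 and 4 can be combined in one sketch: an infinite generator feeds two further generators, and itertools.islice takes only the first few results. The function names naturals, squares and only_even are invented for this illustration.

```python
from itertools import islice

def naturals():
    """Infinite stream 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1

def squares(numbers):
    for n in numbers:
        yield n * n

def only_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

# Each stage pulls one item at a time from the stage before it.
pipeline = only_even(squares(naturals()))
first_five = list(islice(pipeline, 5))
print(first_five)
```

Because every stage is lazy, only as many natural numbers are generated as are needed to produce five even squares.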
Internet connection in Python:
There are a lot of Python modules and packages available that allow developers
to perform different Internet-related tasks. Here we look only at urllib and smtplib, the
important modules for basic Internet usage.
urllib
urllib is a Python package that can be used for opening URLs. It defines functions and classes to
help with URL actions. With Python you can access and retrieve data from the Internet, such as
XML, HTML and JSON, and work with this data directly. Python 2 also had
urllib2, a module similar to urllib; in Python 3 its functionality lives in urllib.request.
The urlopen function opens the given link. The response it returns is an object that
works as a context manager and has methods such as
getcode().
In the example below urlopen is used to open the YouTube link. getcode()
returns the standard response code: 200 if the request was processed successfully, or another
code describing the situation.
Example-
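The original example is not reproduced in these notes, so the following is only a sketch of what such a program might look like in Python 3. The helper names fetch_status and describe_status are assumptions made for this illustration; urlopen, getcode(), HTTPError and URLError come from the standard library.

```python
from http import HTTPStatus
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

def describe_status(code):
    """Return the standard reason phrase for an HTTP status code."""
    return HTTPStatus(code).phrase

def fetch_status(url, timeout=10):
    """Open url and return its status code, or None if the server cannot be reached."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.getcode()
    except HTTPError as err:
        return err.code          # the server answered with an error status such as 404
    except URLError:
        return None              # no answer at all (DNS failure, refused connection, ...)

# Example call (needs network access, so it is left commented out):
#     code = fetch_status("https://www.youtube.com")
#     print(code, describe_status(code))
print(describe_status(200))
```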
SMTP
The smtplib module defines an SMTP client session object that can be used to send mail to any
Internet machine with an SMTP or ESMTP listener daemon.
SMTP stands for Simple Mail Transfer Protocol.
The smtplib module is useful for communicating with mail servers to send mail.
Sending mail is done with Python's smtplib using an SMTP server.
The example below shows how the smtplib module works. First we create
an SMTP object for the SMTP server that is going to be used. This is done with the command
smtplib.SMTP('server address', port), line 2 in the program. Next we log in to
the sender's account on the server. This is done with the command
object.login('mailaddress', 'password'). The only thing left now is to send the mail. This is done with
the command object.sendmail('senders mail', 'receivers mail', message).
Example-
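The original program is not reproduced in these notes. The sketch below follows the three steps described above (create the SMTP object, log in, send). The addresses, the helper names build_message and send_mail, and the use of STARTTLS are assumptions made for this illustration; a real server may need different settings.

```python
import smtplib
from email.message import EmailMessage

def build_message(sender, recipient, subject, body):
    """Create a simple text e-mail."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_mail(host, port, user, password, msg):
    """Log in to the SMTP server and send msg.

    Not called in this sketch: it needs a real server and account.
    Assumes the server supports STARTTLS.
    """
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.sendmail(msg["From"], [msg["To"]], msg.as_string())

# Placeholder addresses, for illustration only
message = build_message("sender@example.com", "receiver@example.com",
                        "Test", "Hello from smtplib")
print(message["Subject"])
```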
Data Compression:
The zlib compression format is free to use, and is not covered by any patent, so you can safely
use it in commercial products as well. It is a lossless compression format (which means you don't
lose any data between compression and decompression), and has the advantage of being portable
across different platforms. Another important benefit of this compression mechanism is that it
doesn't expand the data.
The main use of the zlib library is in applications that require compression and decompression of
arbitrary data, whether it be a string, structured in-memory content, or files.
The most important functionalities included in this library are compression and decompression.
Compression and decompression can both be done as one-off operations, or by splitting the
data into chunks as you would with a stream of data. Both modes of operation are explained in
this article.
One of the best things, in my opinion, about the zlib library is that it is compatible with
the gzip file format/tool (which is also based on DEFLATE), which is one of the most widely
used compression applications on Unix systems.
Compression
Compressing a String of Data
The zlib library provides us with the compress function, which can be used to compress a string
of data. The syntax of this function is very simple, taking only two arguments:
compress(data, level=-1)
Here the argument data contains the bytes to be compressed, and level is an integer value that
can take the values -1 or 0 to 9. This parameter determines the level of compression, where level
1 is the fastest and yields the lowest level of compression. Level 9 is the slowest, yet it yields the
highest level of compression. The value -1 represents the default, which is level 6. The default
value has a balance between speed and compression. Level 0 yields no compression.
An example of using the compress method on a simple string is shown below:
import zlib
import binascii

data = b'Hello world'  # zlib works on bytes, not str
compressed_data = zlib.compress(data, 2)
print('Original data: ' + data.decode())
print('Compressed data: ' + binascii.hexlify(compressed_data).decode())
And the result is as follows:
$ python compress_str.py
Original data: Hello world
Compressed data: 785ef348cdc9c95728cf2fca49010018ab043d
Figure 1
If we change the level to 0 (no compression), then line 5 becomes:
compressed_data = zlib.compress(data, 0)
And the new result is:
$ python compress_str.py
Original data: Hello world
Compressed data: 7801010b00f4ff48656c6c6f20776f726c6418ab043d
Figure 2
You may notice a few differences comparing the outputs when using 0 or 2 for the compression
level. Using a level of 2 we get a string (formatted in hexadecimal) of length 38, whereas with a
level of 0 we get a hex string with length 44. This difference in length is due to the lack of
compression in using level 0.
If you don't format the string as hexadecimal, as I've done in this example, and view the output
data you'll probably notice that the input string is still readable even after being "compressed",
although it has a few extra formatting characters around it.
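The effect of the level argument is easier to see on larger, repetitive input. The sketch below compares a few levels; the test data is made up, and exact sizes depend on the zlib build, so none are quoted here.

```python
import zlib

data = b'Hello world ' * 1000     # 12000 bytes of highly repetitive input

sizes = {}
for level in (0, 1, 6, 9):
    sizes[level] = len(zlib.compress(data, level))
    print('level', level, '->', sizes[level], 'bytes')

# Whatever the level, decompression returns the original bytes.
roundtrip_ok = zlib.decompress(zlib.compress(data, 9)) == data
print(roundtrip_ok)
```

Level 0 stores the data uncompressed and adds a small amount of framing, so its output is slightly larger than the input.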
Compressing Large Data Streams
Large data streams can be managed with the compressobj() function, which returns a
compression object. The syntax is as follows:
compressobj(level=-1,
            method=DEFLATED,
            wbits=15,
            memLevel=8,
            strategy=Z_DEFAULT_STRATEGY[, zdict])
The main difference between the arguments of this function and the compress() function is (aside
from the data parameter) the wbits argument, which controls the window size, and whether or
not the header and trailer are included in the output.
The possible values for wbits are:
Value         Window size logarithm      Output
+9 to +15     Base 2                     Includes zlib header and trailer
-9 to -15     Absolute value of wbits    No header and trailer
+25 to +31    Low 4 bits of the value    Includes gzip header and trailing checksum
Table 1
The method argument represents the compression algorithm used. Currently the only possible
value is DEFLATED, which is the only method defined in the RFC 1950. The strategy argument
relates to compression tuning. Unless you really know what you're doing I'd recommend to not
use it and just use the default value.
The following code shows how to use the compressobj() function:
import zlib
import binascii

data = b'Hello world'  # zlib works on bytes, not str
# wbits=-15 produces a raw stream with no header or trailer
compress = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
compressed_data = compress.compress(data)
compressed_data += compress.flush()
print('Original: ' + data.decode())
print('Compressed data: ' + binascii.hexlify(compressed_data).decode())
After running this code, the result is:
$ python compress_obj.py
Original: Hello world
Compressed data: f348cdc9c95728cf2fca490100
Figure 3
As we can see from the figure above, the phrase "Hello world" has been compressed. Typically
this method is used for compressing data streams that won't fit into memory at once. Although
this example does not have a very large stream of data, it serves the purpose of showing the
mechanics of the compressobj() function.
You may also be able to see how it would be useful in a larger application in which you can
configure the compression and then pass around the compression object to other
methods/modules. This can then be used to compress chunks of data in series.
You may also be able to see how it would be useful in a scenario where you have a data stream
to compress. Instead of having to accumulate all of the data in memory, you can just
call compress.compress(data) on each data chunk and then move on to the
next chunk while leaving the previous one to be cleaned up by garbage collection, calling
compress.flush() once after the last chunk to finish the stream.
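The chunk-by-chunk pattern described above might look like the following sketch. An in-memory io.BytesIO object stands in for a large file or network stream, and the chunk size is an arbitrary choice for this illustration.

```python
import io
import zlib

CHUNKSIZE = 1024
source = io.BytesIO(b'Hello world\n' * 5000)   # stands in for a large stream

compressor = zlib.compressobj()
compressed_parts = []
while True:
    chunk = source.read(CHUNKSIZE)
    if not chunk:
        break
    # compress() may return an empty bytes object while it buffers input
    compressed_parts.append(compressor.compress(chunk))
# flush() is called once, at the end, to emit whatever is still buffered
compressed_parts.append(compressor.flush())

compressed = b''.join(compressed_parts)
print(len(source.getvalue()), '->', len(compressed), 'bytes')
```

In a real program each compressed part would be written to a file or socket as soon as it is produced instead of being collected in a list.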
Compressing a File
We can also use the compress() function to compress the data in a file. The syntax is the same as
in the first example.
In the example below we will compress a PNG image file named "logo.png" (which, I should
note, is already a compressed version of the original raw image).
The example code is as follows:
import zlib

# Read the whole file as bytes
with open('logo.png', 'rb') as f:
    original_data = f.read()

compressed_data = zlib.compress(original_data, zlib.Z_BEST_COMPRESSION)

# Fraction of the original size that was saved
compress_ratio = (float(len(original_data)) - float(len(compressed_data))) / float(len(original_data))

print('Compressed: %d%%' % (100.0 * compress_ratio))
In the above code, the zlib.compress(...) line uses the constant Z_BEST_COMPRESSION,
which, as the name suggests, gives us the best compression level this algorithm has to offer. The
compress_ratio calculation then gives the fraction of the original size that was saved: the
difference between the original and compressed lengths, divided by the original length.
The result is as follows:
$ python compress_file.py
Compressed: 13%
Figure 4
As we can see, the file was compressed by 13%.
The only difference between this example and our first one is the source of the data. However, I
think it is important to show so you can get an idea of what kind of data can be compressed,
whether it be just an ASCII string or binary image data. Simply read in your data from the file
like you normally would and call the compress method.
Saving Compressed Data to a File
The compressed data can also be saved to a file for later use. The example below shows how to
save some compressed text into a file:
import zlib

my_data = b'Hello world'  # zlib works on bytes, not str
compressed_data = zlib.compress(my_data, 2)

# Compressed data is binary, so open the file in 'wb' mode
with open('outfile.txt', 'wb') as f:
    f.write(compressed_data)
The above example compresses our simple "Hello world" string and saves the compressed data
into a file named "outfile.txt". The "outfile.txt" file, when opened with our text editor, looks as
follows:
Figure 5
Decompression
Decompressing a String of Data
A compressed string of data can be easily decompressed by using the decompress() function. The
syntax is as follows:
decompress(data, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)
This function decompresses the bytes in the data argument. The wbits argument can be used to
manage the size of the history buffer. The default value matches the largest window size. It also
asks for the inclusion of the header and trailer of the compressed file. The possible values are:
Value                          Window size logarithm      Input
+8 to +15                      Base 2                     Includes zlib header and trailer
-8 to -15                      Absolute value of wbits    Raw stream with no header and trailer
+24 to +31 = 16 + (8 to 15)    Low 4 bits of the value    Includes gzip header and trailer
+40 to +47 = 32 + (8 to 15)    Low 4 bits of the value    zlib or gzip format
Table 2
The initial value of the buffer size is indicated in the bufsize argument. However, the important
aspect about this parameter is that it doesn't need to be exact, because if extra buffer size is
needed, it will automatically be increased.
The following example shows how to decompress the string of data compressed in our previous
example:
import zlib

data = b'Hello world'  # zlib works on bytes, not str
compressed_data = zlib.compress(data, 2)
decompressed_data = zlib.decompress(compressed_data)
print('Decompressed data: ' + decompressed_data.decode())
The result is as follows:
$ python decompress_str.py
Decompressed data: Hello world
Figure 5
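The wbits values listed in Table 2 can be exercised with a short sketch. It uses the standard gzip module only to produce gzip-format input; the variable names are made up for this illustration.

```python
import gzip
import zlib

data = b'Hello world'

# gzip-format input: wbits = 16 + 15 expects a gzip header and trailer
gz_bytes = gzip.compress(data)
from_gzip = zlib.decompress(gz_bytes, wbits=31)

# wbits = 32 + 15 detects zlib or gzip format automatically
zlib_bytes = zlib.compress(data)
auto_gzip = zlib.decompress(gz_bytes, wbits=47)
auto_zlib = zlib.decompress(zlib_bytes, wbits=47)

# Raw stream: negative wbits on both sides, no header or trailer
raw = zlib.compressobj(wbits=-15)
raw_bytes = raw.compress(data) + raw.flush()
from_raw = zlib.decompress(raw_bytes, wbits=-15)

print(from_gzip, auto_gzip, auto_zlib, from_raw)

# With the default wbits, gzip input is rejected
try:
    zlib.decompress(gz_bytes)
    rejected = False
except zlib.error:
    rejected = True
print(rejected)
```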
Decompressing Large Data Streams
Decompressing big data streams may require memory management due to the size or source of
your data. It's possible that you may not be able to use all of the available memory for this task
(or you don't have enough memory), so the decompressobj() method allows you to divide up a
stream of data in to several chunks which you can decompress separately.
The syntax of the decompressobj() function is as follows:
decompressobj(wbits=15[, zdict])
This function returns a decompression object, which is what you use to decompress the individual
chunks of data. The wbits argument has the same characteristics as in the decompress() function
previously explained.
The following code shows how to decompress a big stream of data that is stored in a file. Firstly,
the program creates a file named "compressed.dat", which contains the compressed data. Note that the
data is compressed using a value of wbits equal to +15. This ensures the creation of a header and
a trailer in the data.
The file is then decompressed using chunks of data. Again, in this example the file doesn't
contain a massive amount of data, but nevertheless, it serves the purpose of explaining the buffer
concept.
The code is as follows:
import zlib

data = b'Hello world'  # zlib works on bytes, not str
compress = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, +15)
compressed_data = compress.compress(data)
compressed_data += compress.flush()
print('Original: ' + data.decode())
print('Compressed data:', compressed_data)

# Write the compressed bytes in binary mode
with open('compressed.dat', 'wb') as f:
    f.write(compressed_data)

CHUNKSIZE = 1024
data2 = zlib.decompressobj()
decompressed_data = b''

# Decompress stream chunks, accumulating the output of every chunk
with open('compressed.dat', 'rb') as my_file:
    buf = my_file.read(CHUNKSIZE)
    while buf:
        decompressed_data += data2.decompress(buf)
        buf = my_file.read(CHUNKSIZE)

decompressed_data += data2.flush()
print('Decompressed data: ' + decompressed_data.decode())
After running the above code, we obtain the following results:
$ python decompress_data.py
Original: Hello world
Compressed data: b'x\x9c\xf3H\xcd\xc9\xc9W(\xcf/\xcaI\x01\x00\x18\xab\x04='
Decompressed data: Hello world
Figure 6
Decompressing Data from a File
The compressed data contained in a file can be easily decompressed, as you've seen in previous
examples. This example is very similar to the previous one in that we're decompressing data that
originates from a file, except that in this case we're going back to using the one-off decompress
method, which decompresses the data in a single method call. This is useful
when your data is small enough to easily fit in memory.
This can be seen from the following example:
import zlib

with open('compressed.dat', 'rb') as f:
    compressed_data = f.read()

decompressed_data = zlib.decompress(compressed_data)
print(decompressed_data.decode())  # decode bytes to text for display
The above program opens the file "compressed.dat" created in a previous example, which
contains the compressed "Hello world" string.
In this example, once the compressed data is retrieved and stored in the
variable compressed_data, the program decompresses the stream and shows the result on the
screen. As the file contains a small amount of data, the example uses the decompress() function.
However, as the previous example shows, we could also decompress the data using
the decompressobj() function.
After running the program we get the following result:
$ python decompress_file.py
Hello world
Figure 7
String Pattern Matching:
Regular expressions. These are tiny programs that process text.
We access regular expressions through the re library. We call
methods like re.match().
With methods, such as match() and search(), we run these little
programs. More advanced methods like groupdict can process
groups. Findall handles multiple matches. It returns a list.
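As a first taste of these methods, the sketch below uses match() with named groups, groupdict() and findall(). The log line and the pattern are invented for this illustration; the pattern syntax itself is explained in the tables that follow.

```python
import re

line = "2023-07-14 ERROR disk full"
pattern = r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)"

fields = {}
m = re.match(pattern, line)          # match() anchors at the start of the string
if m:
    fields = m.groupdict()           # named groups as a dictionary
print(fields)

words = re.findall(r"[a-z]+", line)  # every run of lowercase letters, as a list
print(words)
```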
Regular Expression Patterns
Except for control characters, (+ ? . * ^ $ ( ) [ ] { } | \), all characters
match themselves. You can escape a control character by preceding it with
a backslash.
The following table lists the regular expression syntax that is available in Python:
Sr.No.   Pattern     Description
1        ^           Matches beginning of line.
2        $           Matches end of line.
3        .           Matches any single character except newline. Using the re.DOTALL (s) flag allows it to match newline as well.
4        [...]       Matches any single character in brackets.
5        [^...]      Matches any single character not in brackets.
6        re*         Matches 0 or more occurrences of preceding expression.
7        re+         Matches 1 or more occurrences of preceding expression.
8        re?         Matches 0 or 1 occurrence of preceding expression.
9        re{n}       Matches exactly n occurrences of preceding expression.
10       re{n,}      Matches n or more occurrences of preceding expression.
11       re{n,m}     Matches at least n and at most m occurrences of preceding expression.
12       a|b         Matches either a or b.
13       (re)        Groups regular expressions and remembers matched text.
Character classes
Sr.No.   Example        Description
1        [Pp]ython      Match "Python" or "python"
2        rub[ye]        Match "ruby" or "rube"
3        [aeiou]        Match any one lowercase vowel
4        [0-9]          Match any digit; same as [0123456789]
5        [a-z]          Match any lowercase ASCII letter
6        [A-Z]          Match any uppercase ASCII letter
7        [a-zA-Z0-9]    Match any of the above
8        [^aeiou]       Match anything other than a lowercase vowel
9        [^0-9]         Match anything other than a digit
Special Character Classes
Sr.No.   Example   Description
1        .         Match any character except newline
2        \d        Match a digit: [0-9]
3        \D        Match a nondigit: [^0-9]
4        \s        Match a whitespace character: [ \t\r\n\f]
5        \S        Match nonwhitespace: [^ \t\r\n\f]
6        \w        Match a single word character: [A-Za-z0-9_]
7        \W        Match a nonword character: [^A-Za-z0-9_]
Repetition Cases
Sr.No.   Example    Description
1        ruby?      Match "rub" or "ruby": the y is optional
2        ruby*      Match "rub" plus 0 or more ys
3        ruby+      Match "rub" plus 1 or more ys
4        \d{3}      Match exactly 3 digits
5        \d{3,}     Match 3 or more digits
6        \d{3,5}    Match 3, 4, or 5 digits
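The repetition cases can be checked with re.fullmatch(), which succeeds only when the whole string fits the pattern. The sample strings below are chosen for this sketch.

```python
import re

patterns = [r"ruby?", r"ruby*", r"ruby+", r"\d{3,5}"]
samples = ["rub", "ruby", "rubyyy", "12", "123", "123456"]

# results[(pattern, sample)] is True when the whole sample matches the pattern
results = {}
for pat in patterns:
    for s in samples:
        results[(pat, s)] = re.fullmatch(pat, s) is not None

for s in samples:
    print(s, [results[(pat, s)] for pat in patterns])
```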
What is a Regular Expression?
It's a string pattern written in a compact syntax, that allows us to quickly check
whether a given string matches or contains a given pattern.
The power of regular expressions is that they can specify patterns, not just
fixed characters.
Basic patterns
a, X, 9      ordinary characters just match themselves exactly.
. ^ $ * + ? { [ ] \ | ( )      meta-characters with special meanings (see below)
.            (a period) matches any single character except newline '\n'
\w           matches a "word" character: a letter or digit or underbar [a-zA-Z0-9_].
             It only matches a single character, not a whole word.
\W           matches any non-word character.
\w+          matches one or more word characters
\b           boundary between word and non-word
\s           matches a single whitespace character: space, newline, return, tab, form feed [ \n\r\t\f]
\S           matches any non-whitespace character.
\t, \n, \r   tab, newline, return
\D           matches anything but a digit
\d           matches a decimal digit [0-9]
\d{1,5}      matches a run of 1 to 5 digits
\d{5}        ({n} form) matches exactly 5 digits in a row
^            matches the start of the string
$            matches the end of the string
*            matches 0 or more repetitions of whatever precedes it
?            matches 0 or 1 occurrences of whatever precedes it
Use \. to match a period or \\ to match a backslash. If you are unsure whether a character
has special meaning, such as '@', you can put a backslash in front of it, \@, to make sure
it is treated just as a character.
re.findall
The findall() is probably the single most powerful function in the re module
and we will use that function in this script.
In the example below we create a string containing text with several email
addresses.
We then create a variable (emails) that will contain a list of all the found
email strings.
Lastly, we use a for loop so that we can do something with each email string
that is found.
import re

text = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
## Here re.findall() returns a list of all the found email strings
emails = re.findall(r'[\w.-]+@[\w.-]+', text)  ## ['alice@google.com', 'bob@abc.com']
for email in emails:
    # do something with each found email string
    print(email)
We can also apply this to files. If you have a file and want to find all the matches
in it, just feed its text into findall() and let it return a list of
all the matches in a single step.
read() returns the whole text of a file in a single string.
import re

# Open the file; the with-statement closes it automatically
with open('test.txt', 'r') as f:
    # Feed the file text into findall(); it returns a list of all the found strings
    strings = re.findall(r'some pattern', f.read())
re.search
The re.search() method takes a regular expression pattern and a string and
searches for that pattern within the string.
The syntax is re.search(pattern, string).
where:
pattern
regular expression to be matched.
string
the string which would be searched to match the pattern anywhere in the string.
It searches for first occurrence of RE pattern within string with optional flags.
If the search is successful, search() returns a match object or None otherwise.
Therefore, the search is usually immediately followed by an if-statement to test
if the search succeeded.
It is common to use the 'r' at the start of the pattern string, that designates
a python "raw" string which passes through backslashes without change which is
very handy for regular expressions.
This example searches for the pattern 'word:' followed by a 3 letter word.
The code match = re.search(r'word:\w\w\w', text) stores the search result in a variable
named "match".
Then the if-statement tests the match: if it is true, the search succeeded and
match.group() is the matching text (e.g. 'word:cat').
If the match is false (None), the search did not succeed, and there is no matching text.
import re

text = 'an example word:cat!!'
match = re.search(r'word:\w\w\w', text)
# If-statement after search() tests if it succeeded
if match:
    print('found', match.group())  ## 'found word:cat'
else:
    print('did not find')
The example below uses the | operator, which searches
for either of the patterns on each side of it.
import re

programming = ["Python", "Perl", "PHP", "C++"]
pat = "^B|^P|i$|H$"
for lang in programming:
    if re.search(pat, lang, re.IGNORECASE):
        print(lang, "FOUND")
    else:
        print(lang, "NOT FOUND")
The output of above script will be:
Python FOUND
Perl FOUND
PHP FOUND
C++ NOT FOUND
re.sub
The re.sub() function in the re module can be used to replace substrings.
The syntax for re.sub() is re.sub(pattern,repl,string).
That will replace the matches in string with repl.
In this example, I will replace all occurrences of the re pattern ("cool")
in string (text) with repl ("good").
import re

text = "Python for beginner is a very cool website"
# re.sub() returns a new string; the original text is unchanged
text2 = re.sub("cool", "good", text)
print(text2)
Here is another example (taken from Google's Python class) which searches for
all the email addresses, and changes them to keep the user (\1) but have
yo-yo-dyne.com as the host.
import re

text = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
## re.sub(pat, replacement, text) -- returns new string with all replacements,
## \1 is group(1), \2 group(2) in the replacement
print(re.sub(r'([\w.-]+)@([\w.-]+)', r'\1@yo-yo-dyne.com', text))
## purple alice@yo-yo-dyne.com, blah monkey bob@yo-yo-dyne.com blah dishwasher
re.compile
With the re.compile() function we can compile a pattern into a pattern object,
which has methods for various operations such as searching for pattern matches
or performing string substitutions.
Let's see two examples, using the re.compile() function.
The first example checks if the input from the user contains only letters,
spaces or . (no digits).
Any other character is not allowed.
import re

# Matches any character that is NOT a letter, whitespace or a period
name_check = re.compile(r"[^A-Za-z\s.]")
name = input("Please, enter your name: ")
while name_check.search(name):
    print("Please enter your name correctly!")
    name = input("Please, enter your name: ")
The second example checks if the input from the user contains only numbers,
parentheses, spaces or hyphens (no letters).
Any other character is not allowed.
import re

# Matches any character that is NOT a digit, whitespace, hyphen or parenthesis.
# The hyphen is escaped so it is not read as a character range.
phone_check = re.compile(r"[^0-9\s\-()]")
phone = input("Please, enter your phone: ")
while phone_check.search(phone):
    print("Please enter your phone correctly!")
    phone = input("Please, enter your phone: ")
The output of above script will be:
Please, enter your phone: s
Please enter your phone correctly!
It will continue to ask until you put in numbers only.
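The same kind of compiled pattern can be wrapped in small functions, which makes the checks reusable and testable without keyboard input. This is a sketch; the function names and sample inputs are invented for this illustration.

```python
import re

# Each pattern matches a single character that is NOT allowed
name_check = re.compile(r"[^A-Za-z\s.]")
phone_check = re.compile(r"[^0-9\s\-()]")

def is_valid_name(name):
    """True when name contains only letters, whitespace and periods."""
    return name_check.search(name) is None

def is_valid_phone(phone):
    """True when phone contains only digits, whitespace, hyphens and parentheses."""
    return phone_check.search(phone) is None

for candidate in ["Dr. Ada Lovelace", "R2D2"]:
    print(candidate, is_valid_name(candidate))
for candidate in ["(040) 555-0199", "555-CALL"]:
    print(candidate, is_valid_phone(candidate))
```

Note that these checks only reject forbidden characters; an empty string passes both, so a real form would also need a length check.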
Find Email Domain in Address
Let's end this article about regular expressions in Python with a neat script I
found on stackoverflow.
@
scan till you see this character
[\w.]
a set of characters to potentially match, so \w is all alphanumeric characters,
and the trailing period . adds to that set of characters.
+
one or more of the previous set.
Because this regex is matching the period character and every alphanumeric
after an @, it'll match email domains even in the middle of sentences.
import re

s = 'My name is Conrad, and blahblah@gmail.com is my email.'
domain = re.search(r"@[\w.]+", s)
print(domain.group())
outputs:
@gmail.com