Uploaded by 丁文淇

PythonDataStructures

python-3.8.5-amd64.exe
Built-in data structures: lists, strings, dictionaries, tuples, and sets.
• Lists are enclosed in brackets: lst = [1, 2, "a"]
• Strings are enclosed in quotes: text = 'abc' or "def"
• Dictionaries are built with curly brackets: dic = {"a": 1, "b": 2}
• Tuples are enclosed in parentheses: tup = (1, 2, "a")
• Sets are made with the set() function or with curly brackets: st = {1, 2, "a"}
✓ Lists, strings, and tuples are ordered sequences of objects. Unlike
strings, which contain only characters, lists and tuples can contain
objects of any type. Lists and tuples are like arrays. Tuples, like
strings, are immutable. Lists are mutable, so they can be extended or
reduced at will. Sets are mutable, unordered collections of unique
elements. Python has no built-in array type in the core language; you
can use a list instead, or import the standard-library array module or
the NumPy package.
In Python, lists are part of the language. They are everywhere!
Contents:
• Quick example
• Difference between append() and extend()
• Other list methods
• Operators & slicing
• List comprehension
• Filtering Lists
• Lists as Stacks
• Lists as Queues
• How to copy a list
• Inserting items into a sorted list
• Nested lists

Quick example
>>> list1 = [1, 2, 3]
>>> list1[0]
1
>>> list1.append(1)
>>> list1
[1, 2, 3, 1]
Difference between append() and extend()
>>> stack = ['a','b']
>>> stack.append('c')
>>> stack
['a', 'b', 'c']
>>> stack.append(['d', 'e', 'f'])
>>> stack
['a', 'b', 'c', ['d', 'e', 'f']]
>>> stack = ['a', 'b', 'c']
>>> stack.extend(['d', 'e', 'f'])
>>> stack
['a', 'b', 'c', 'd', 'e', 'f']
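The difference can be seen side by side in a short script. This is a minimal sketch: the name letters is made up for the illustration, and it relies on the fact that extend() accepts any iterable (a string is iterated character by character) and that += on a list behaves like extend().

```python
# 'letters' is a hypothetical name used only for this illustration.
letters = ['a', 'b']

letters.append('cd')   # append adds its argument as ONE element
letters.extend('ef')   # extend iterates over its argument: adds 'e', then 'f'
letters += ['g']       # in-place concatenation, same effect as extend(['g'])

print(letters)         # ['a', 'b', 'cd', 'e', 'f', 'g']
```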

Other list methods:
index()
insert()
remove()
pop()
count()
sort()
sorted()
reverse()

>>> my_list = ['a','b','c','b','a']
>>> my_list.index('b')
1
>>> my_list = ['a','b','c','b','a']
>>> my_list.index('b', 2)
3
>>> my_list.insert(2, 'a')
>>> my_list
['a', 'b', 'a', 'c', 'b', 'a']
>>> my_list.remove('a')
>>> my_list
['b', 'a', 'c', 'b', 'a']
>>> my_list.pop()
'a'
>>> my_list
['b', 'a', 'c', 'b']
>>> del my_list[2]
>>> my_list
['b', 'a', 'b']
>>> my_list.count('b')
2
>>> my_list.sort()
>>> my_list
['a', 'b', 'b']
>>> my_list.sort(reverse=True)   # or my_list.reverse(), since the list is already sorted
>>> my_list
['b', 'b', 'a']
>>> my_list = ['a','b','c','b','a']
>>> my_list2 = sorted(my_list)
>>> my_list2
['a', 'a', 'b', 'b', 'c']
list.sort() modifies the list in-place, while sorted() builds and returns
a new sorted list from the original list.
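As a small sketch of that distinction, the script below sorts the same data both ways. The list words and the choice of key functions are assumptions made for the example; both sort() and sorted() accept a key argument.

```python
# Hypothetical sample data for the illustration.
words = ['pear', 'Apple', 'fig']

# sorted() returns a new list and leaves the original untouched.
by_length = sorted(words, key=len)
print(by_length)   # ['fig', 'pear', 'Apple']
print(words)       # ['pear', 'Apple', 'fig']

# list.sort() reorders the list in place and returns None.
result = words.sort(key=str.lower)
print(result)      # None
print(words)       # ['Apple', 'fig', 'pear']
```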
• Operators
The + operator can be used to extend a list:
>>> my_list = [1, 2]
>>> my_list = my_list + [3, 4]
>>> my_list
[1, 2, 3, 4]
The * operator eases the creation of lists with repeated values:
>>> my_list = my_list * 3
>>> my_list
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
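One caveat is worth a sketch: * repeats references, not copies, so repeating a list that contains another list gives several references to the same inner list. The names row, shared and independent below are chosen for the illustration.

```python
# Repeating immutable values is safe.
row = [0] * 3
print(row)                 # [0, 0, 0]

# Repeating a list of lists copies references to the SAME inner list.
shared = [row] * 2
shared[0][0] = 1
print(shared)              # [[1, 0, 0], [1, 0, 0]]

# A list comprehension builds a fresh inner list on every iteration.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
print(independent)         # [[1, 0, 0], [0, 0, 0]]
```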

Slicing: list[first index : last index : step]. The element at the last
index is excluded. By default, the first index is 0, the last index is
the length of the list (so the slice runs to the end), and the step is 1.
>>> a = [0, 1, 2, 3, 4, 5]
>>> a[2:]
[2, 3, 4, 5]
>>> a[:2]
[0, 1]
>>> a[2:-1]
[2, 3, 4]
>>> a[:]       # or a[::]
[0, 1, 2, 3, 4, 5]
>>> a[::-1]
[5, 4, 3, 2, 1, 0]
>>> a[::2]
[0, 2, 4]

List comprehension: alternative to loops over a sequence
>>> evens = []
>>> for i in range(10):
...     if i % 2 == 0:
...         evens.append(i)
...
>>> evens
[0, 2, 4, 6, 8]
The following list comprehension is shorter and more efficient:
>>> [i for i in range(10) if i % 2 == 0]
[0, 2, 4, 6, 8]

Filtering Lists
>>> alist = [1, 2, 3, 4, 5, 6, 7, 8]
>>> [elem*2 for elem in alist if elem>3]
[8, 10, 12, 14, 16]

Lists as Stacks: a last-in, first-out (LIFO) data structure
>>> stack = ['a', 'b', 'c', 'd']
>>> stack.append('e')
>>> stack.append('f')
>>> stack
['a', 'b', 'c', 'd', 'e', 'f']
>>> stack.pop()
'f'
>>> stack
['a', 'b', 'c', 'd', 'e']
>>> stack.pop()
'e'
>>> stack
['a', 'b', 'c', 'd']

Lists as Queues, a first-in, first-out (FIFO) data structure
>>> queue = ['a', 'b', 'c', 'd']
>>> queue.append('e')
>>> queue.append('f')
>>> queue
['a', 'b', 'c', 'd', 'e', 'f']
>>> queue.pop(0)
'a'
>>> queue.pop(0)
'b'
>>> queue
['c', 'd', 'e', 'f']
The pop() method removes the element at the specified position. If
the index is not given, then by default the last element is popped out
and removed, i.e., list.pop([index=-1])
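A list works as a queue, but pop(0) has to shift every remaining element. As a hedged alternative that is not part of the slides, the standard library provides collections.deque, which removes items from the left end efficiently. A minimal sketch:

```python
from collections import deque

# Same data as the list-based queue, held in a deque instead.
queue = deque(['a', 'b', 'c', 'd'])
queue.append('e')          # enqueue at the right end

first = queue.popleft()    # dequeue from the left end (first in, first out)
print(first)               # a
print(list(queue))         # ['b', 'c', 'd', 'e']
```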

How to copy a list? Don't do list2 = list1, which creates a second
reference to the same list, not a copy. There are three ways to copy a list:
>>> l2 = list(l1)
>>> l2 = l1[:]
>>> import copy
>>> l2 = copy.copy(l1)
Examples:
>>> l1 = [1, 2, 3, 4]
>>> l2 = list(l1)
>>> l1[1] = 5
>>> l1
[1, 5, 3, 4]
>>> l2
[1, 2, 3, 4]
Actually, these techniques for copying a list create shallow copies.
This means that nested objects are not copied, only referenced. Consider this:
>>> a = [1, 2, [3, 4]]
>>> b = copy.copy(a)
>>> a[2][0] = 10
>>> a
[1, 2, [10, 4]]
>>> b
[1, 2, [10, 4]]
The value of b[2][0] changes with a[2][0]!
To get around this problem, you must perform a deep copy:
>>> a = [1, 2, [3, 4]]
>>> b = copy.deepcopy(a)
>>> a[2][0] = 10
>>> a
[1, 2, [10, 4]]
>>> b
[1, 2, [3, 4]]
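The shallow/deep distinction can also be checked directly with the is operator, which tests whether two names refer to the same object. A minimal sketch, with the names shallow and deep chosen for the illustration:

```python
import copy

a = [1, 2, [3, 4]]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

# The outer list is a new object in both cases...
print(shallow is a)          # False
print(deep is a)             # False

# ...but only the deep copy duplicates the nested list.
print(shallow[2] is a[2])    # True: the inner list is shared
print(deep[2] is a[2])       # False: the inner list was copied
```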

Inserting items into a sorted list: bisect module provides tools…
>>> x = [3, 9, 4, 1, 2]
>>> x.sort()
>>> x
[1, 2, 3, 4, 9]
>>> import bisect
>>> bisect.insort(x, 7)
>>> x
[1, 2, 3, 4, 7, 9]
To know the index where a value would be inserted, you could use
>>> x = [3, 9, 4, 1, 2]
>>> x.sort()            # x → [1, 2, 3, 4, 9]
>>> bisect.bisect(x, 7)
4
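When the value is already present, the insertion point depends on which side is requested. The sketch below uses bisect_left() and bisect_right() from the same module (bisect.bisect is an alias of bisect_right); the sample list is an assumption for the example.

```python
import bisect

x = [1, 2, 3, 4, 9]

# 4 is already in the list: the insertion point can be before or after it.
left = bisect.bisect_left(x, 4)
right = bisect.bisect_right(x, 4)
print(left)    # 3
print(right)   # 4

# insort() inserts at the bisect_right position and keeps the list sorted.
bisect.insort(x, 4)
print(x)       # [1, 2, 3, 4, 4, 9]
```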

Nested lists are list objects where the elements in the lists can be
lists themselves, e.g. a table as a list of rows or columns as follows.
>>> table1 = [[20, 25, 30, 35, 40], [68, 77, 86, 95, 104]]
>>> table2 = [[20, 68],[25, 77],[30, 86],[35, 95],[40, 104]]
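The two layouts hold the same data, one as a list of columns and one as a list of rows. As a sketch of how to convert between them, zip(*table1) pairs up the i-th elements of each inner list; the variable pair is a name chosen for the example.

```python
table1 = [[20, 25, 30, 35, 40], [68, 77, 86, 95, 104]]

# zip(*table1) yields tuples (20, 68), (25, 77), ...; convert each to a list.
table2 = [list(pair) for pair in zip(*table1)]
print(table2)         # [[20, 68], [25, 77], [30, 86], [35, 95], [40, 104]]

# Nested indexing: first index picks the row, second picks the column.
print(table2[3][1])   # 95
```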

Strings are immutable sequences of characters. Many methods ease the
creation and manipulation of strings, as shown below.
Contents:
• Creating a string (and special characters)
• Strings are immutable
• Formatter
• Operators
• Methods
✓Methods to query information
✓Methods that return a modified version of the string
✓Methods to find position of substrings
✓Methods to build or decompose a string
Creating a string (and special characters)
Single and double quotes are special characters. There are many
ways to define a string using either single, double or triple quotes:

>>> text = 'The surface of the circle is 2 pi R = '
>>> text = "The surface of the circle is 2 pi R = "
>>> text = '''The surface of the circle is 2 pi R = '''
>>> text = """The surface of the circle is 2 pi R = """
Strings in double quotes work exactly the same as strings in single
quotes, but allow single quote characters inside them. The advantage of
triple quotes (''' or """) is that you can write multi-line strings and
use single and double quotes freely within them.
>>> text = """ a str with special char " and ' in """
Otherwise, you have to do the following:
>>> text = " a str with special char \" and \' in "

Substring specification: s[i:j] extracts the substring starting with
character number i and ending with character number j-1.
>>> s = "Berlin: 18.4 C at 4 pm"
>>> s[8:]       # from index 8 to the end of the string
'18.4 C at 4 pm'
>>> s[8:12]     # index 8, 9, 10 and 11 (not 12!)
'18.4'
A negative upper index counts from the right such that s[-1] is the
last element, s[-2] is the next last element, and so on.
>>> s[8:-1]
'18.4 C at 4 p'

Strings are immutable: you can access characters but cannot change them!
>>> s[0:6]
'Berlin'
>>> s[0] = 'b'
TypeError: 'str' object does not support item assignment
>>> new = 'b' + s[1:6]
>>> new
'berlin'
Formatter (old)
In Python, the % sign lets you produce formatted output. The syntax
is simply: string % values. If you have more than one value, they
should be placed within parentheses, as a tuple.

>>> book = "Pride and Prejudice"
>>> author = "Jane Austen"
>>> price = 123
>>> print("%s by %s costs USD$%d." % (book, author, price))
Pride and Prejudice by Jane Austen costs USD$123.
To escape the sign %, just double it.
>>> print("This is a percent sign: %%")
This is a percent sign: %
Formatter (new)
str.format() and f-strings (Python 3.6+) have become the more common
ways to format a string with arguments, e.g.,

>>> print("{0} by {1} costs USD${2}.".format(book, author, price))
Pride and Prejudice by Jane Austen costs USD$123.
>>> print(f"{book} by {author} costs USD${price}.")
Pride and Prejudice by Jane Austen costs USD$123.
More examples:
>>> print("{a} and {b}".format(a=2, b=1.2345))
2 and 1.2345
>>> print("{a:4d} and {b:5.2f}".format(a=2, b=1.2345))
   2 and  1.23
>>> print("{a:<4d} and {b:<5.2f}".format(a=2, b=1.2345))
2    and 1.23 
Numbers are right-aligned within the field width by default; < left-aligns them.
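Padding is hard to see in printed output, so the sketch below repeats the same format specifications in f-strings and puts a | character after each field to make the width visible. The | markers and the names right and left are additions for the illustration.

```python
a, b = 2, 1.2345

# Width 4 for the integer, width 5 with 2 decimals for the float.
right = f"{a:4d}|{b:5.2f}|"     # default alignment for numbers: right
left = f"{a:<4d}|{b:<5.2f}|"    # '<' forces left alignment

print(right)   # "   2| 1.23|"
print(left)    # "2   |1.23 |"
```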
Operators
The operators + and * can be used to create new strings:

>>> s1 = "This is "
>>> s2 = "a test."
>>> s = s1 + s2
>>> s
'This is a test.'
>>> ss = s + s
>>> ss
'This is a test.This is a test.'
>>> s3 = s*3
>>> s3
'This is a test.This is a test.This is a test.'
Operators >, >=, ==, <=, < and != can be used to compare strings.
Methods to query information:
Check the type of alpha/numeric characters present in a string, e.g.
isdigit(), isalpha(), isalnum(), isupper(), islower(), istitle(), …

>>> "44".isdigit()   → True
>>> "44".isalpha()   → False
>>> "44".isalnum()   → True
>>> "Aa".isupper()   → False
>>> "aa".islower()   → True
>>> "Aa".istitle()   → True
Count the occurrence of a character or get the length of a string:
>>> mystr = "This is a string"
>>> mystr.count('i')
3
>>> len(mystr)
16

Methods that return a modified copy of the original string:
>>> mystr = "this is a dummy string"
>>> mystr.title()
'This Is A Dummy String'
>>> mystr.capitalize()
'This is a dummy string'
>>> mystr.upper()
'THIS IS A DUMMY STRING'
>>> mystr.lower()
'this is a dummy string'
>>> mystr = "this is string 1 and that is string 2."
>>> mystr.replace('is', 'was')
'thwas was string 1 and that was string 2.'
>>> mystr.replace(' is', ' was', 1)
'this was string 1 and that is string 2.'
Remove leading and trailing spaces with the strip() methods:
>>> mystr = " string with left and right spaces "
>>> mystr.strip()
'string with left and right spaces'
>>> mystr.rstrip()
' string with left and right spaces'
>>> mystr.lstrip()
'string with left and right spaces '
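partition() splits a string at the first occurrence of a separator and
returns a tuple of three elements p[0], p[1], p[2]: the part before the
separator, the separator itself, and the part after it. rpartition()
splits at the last occurrence instead.
>>> mystr = "this is a line"
>>> p = mystr.partition('is')
>>> p
('th', 'is', ' is a line')
>>> mystr.rpartition('is')
('this ', 'is', ' a line')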
Methods to find position of substrings in a string:
str.find() returns the start index of the first occurrence of the substring.

>>> mystr = "This is a dummy string"
>>> mystr.find('is')       # returns -1 if not found
2
>>> mystr.find('is', 4)    # starting at index 4 and …
5
>>> mystr.rfind('is')      # last occurrence
5
str.index() is like find() but raises a ValueError when the substring is not found.
>>> mystr.index('is')
2
>>> mystr.rindex('is')
5
Methods to build or decompose a string: join() and split()
The join() method returns a string created by joining the elements of
an iterable, using the string it is called on as the separator. The
following two statements are inverse operations:
>>> line = delimiter.join(words)
>>> words = line.split(delimiter)

>>> message = ' '.join(['this' ,'is', 'a', 'string'])
>>> message
'this is a string'
>>> message.split(' ')
['this', 'is', 'a', 'string']
An illustration of the usefulness of split() and join():
>>> line = 'This is a line of words separated by space'
>>> words = line.split()
>>> words
['This', 'is', 'a', 'line', 'of', 'words', 'separated', 'by', 'space']
>>> line2 = ' '.join(words[2:])
>>> line2
'a line of words separated by space'
If a string spans multiple lines, you can split it with splitlines(),
which returns a list where each line is a list item.
>>> 'An example\n of\nmultilines sentence'.splitlines()
['An example', ' of', 'multilines sentence']
Note that split() removes the splitter:
>>> "this is an example".split(" is ")
['this', 'an example']
If you want to keep the splitter as well, use partition()
>>> "this is an example".partition(" is ")
('this', ' is ', 'an example')
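Combining these methods is a common way to pull fields out of a text line. The sketch below parses a made-up "name = value unit" line; the input string and its layout are assumptions chosen for the illustration.

```python
# Hypothetical input line in the form "name = value unit".
line = "temperature = 18.4 C"

# partition() splits once, at the first '=' only.
key, sep, value = line.partition("=")
key, value = key.strip(), value.strip()

# split() with no argument splits on any run of whitespace.
number, unit = value.split()

print(key, float(number), unit)   # temperature 18.4 C
```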
In Python, tuples are part of the standard language. A tuple is a data
structure very similar to a list, but it cannot be changed. That is, a
tuple can be viewed as a constant list. The main practical difference
is that tuples are generally somewhat faster to create and consume less
memory than lists, because they are immutable.
Contents:
• Constructing tuples
• Tuple methods
• Interests of tuples
• Misc.

Constructing tuples: place values within parentheses (,…):
>>> t1 = (1, 2, 3)
>>> t1[0]
1
Or simply by using commas (without parentheses)
>>> t2 = 1, 2
>>> t2
(1, 2)
You can concatenate tuples and use augmented assignment (*=, +=).
Note that if you want to create a tuple with a single element, you
must add a trailing comma, e.g.
>>> t3 = (1, 0)
>>> t3 += (1,)
>>> t3
(1, 0, 1)
Tuple methods
Tuples are optimized, which makes them very simple objects. Only two
methods are available:
(1) index() to find the index of the first occurrence of a value
(2) count() to count the number of occurrences of a value

>>> tp = (1,2,3,1)
>>> tp.index(2)
1
>>> tp.count(1)
2
Interests of tuples
Tuples are useful because they (1) are faster than lists, (2) protect
the data, which is immutable, and more, e.g.
(3) tuples as key/value pairs to build dictionaries:

>>> d = dict([('jan', 1), ('feb', 2), ('march', 3)])
Equivalent to d = dict(jan=1, feb=2, march=3)
Equivalent to d = {'jan':1, 'feb':2, 'march':3}
>>> d
{'jan': 1, 'feb': 2, 'march': 3}
>>> d['feb']
2
(4) assigning multiple values:
>>> (x,y,z) = ('a','b','c')
>>> x
'a'
>>> (x,y,z) = range(3)
>>> y
1
(5) tuple unpacking: extract tuple elements automatically.
>>> data = (1,2,3)
>>> x, y, z = data
>>> z
3
(6) tuples can be used to swap values:
>>> (x,y) = (y,x)
Warning! consider this example:
>>> def swap(a, b):
...     (b, a) = (a, b)
...
>>> a = 2; b = 3
>>> swap(a, b)
Note that a is still 2 and b is still 3!
The assignment inside swap() only rebinds the function's local names
a and b; the caller's variables are not affected.
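One way to get a working swap function is to return the values in reverse order and unpack the result in the caller. This is a minimal sketch; the function name swapped is hypothetical.

```python
def swapped(a, b):
    """Return the two arguments in reverse order, as a tuple."""
    return b, a


a, b = 2, 3
a, b = swapped(a, b)   # unpack the returned tuple into the caller's names
print(a, b)            # 3 2
```

In practice the direct form a, b = b, a does the same without a function call.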
Misc.
(1) Length (finding the length of a tuple)

>>> t= (1,2)
>>> len(t)
2
(2) Slicing (extracting a segment)
>>> t = (1,2,3,4,5)
>>> t[2:]
(3, 4, 5)
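(3) Copy a tuple (just use the assignment)
>>> t = (1, 2, 3, 4, 5)
>>> newt = t
>>> t[0] = 5
TypeError: 'tuple' object does not support item assignment
>>> newt
(1, 2, 3, 4, 5)
Warning! The "=" sign never copies an object; it binds a second name
to the same object, for tuples as well as for lists. With a list this
is a trap, because a change made through one name is visible through
the other. A tuple cannot be changed in place, so sharing it is
harmless and a plain assignment is all you need.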
(4) Tuples are not fully immutable!
If a value within a tuple is mutable, then you can change it.
>>> t = (1, 2, [3, 10])
>>> t[2][0] = 9
>>> t
(1, 2, [9, 10])
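One practical consequence is that a tuple containing a mutable value cannot be hashed, so it cannot serve as a dictionary key or set element. A short sketch, with the names plain, mixed and lookup invented for the example:

```python
plain = (1, 2, 3)          # only immutable elements: hashable
mixed = (1, 2, [3, 10])    # contains a list: not hashable

lookup = {plain: 'ok'}     # works: a plain tuple can be a dictionary key

try:
    lookup[mixed] = 'fails'
    hashable = True
except TypeError:
    # Hashing a tuple hashes its elements, and lists are unhashable.
    hashable = False

print(hashable)            # False
```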
(5) Convert a tuple to a string
str() works on any object, including tuples.
>>> str(t)
'(1, 2, [9, 10])'
(6) Math and comparison
>>> t = (1, 2, 3)
>>> max(t)
3
Much of the functionality of lists is also available for tuples, except
for operations that change the content:
>>> t = (1, 3, 5, 7)
>>> t[1] = -1      → TypeError: 'tuple' object does not support item assignment
>>> t.append(9)    → AttributeError: 'tuple' object has no attribute 'append'
>>> del t[2]       → TypeError: 'tuple' object doesn't support item deletion
However, it is OK to add two tuples like this (the name is bound to a new tuple!):
>>> t = t + (9, 11)
>>> t
(1, 3, 5, 7, 9, 11)
>>> t[2:]
(5, 7, 9, 11)
A list is a collection of objects indexed by an integer going from 0 to
the number of elements minus one. Instead of looking up an element
through an integer index, it can be handier or more intuitive to use
text. Roughly speaking, a list whose index can be text is called a
dictionary in Python.
A dictionary {key1:value1, key2:value2, …} is a collection of items.
Each item is a pair made of a key and a value. Dictionaries are not
sorted; since Python 3.7 they keep their items in insertion order. You
can access the keys or the values independently.
Contents:
• Quick example
• Methods to query information
• Methods to create new dictionary
• Combining dictionaries
• Iterators
Quick example
You can access the keys or the values independently.

>>> d = {'first':'string value', 'second':[1,2]}
>>> d.keys()
dict_keys(['first', 'second'])
>>> d.values()
dict_values(['string value', [1, 2]])
You can access to the value of a given key as follows:
>>> d['first']
'string value'
Warning: You cannot have duplicate keys in a dictionary.
Warning: Dictionaries cannot be indexed by position. Since Python 3.7 they keep insertion order, but they are not sorted.
Methods to query information
In addition to the keys and values methods, there is also the items
method, which returns a view of the items in the form (key, value).
The items are returned in insertion order.

>>> d = {'first':'string value', 'second':[1,2]}
>>> d.items()
dict_items([('first', 'string value'), ('second', [1, 2])])
You can check for the existence of a specific key with the in operator:
>>> 'first' in d
True
The has_key() method of Python 2 was removed in Python 3; use k in d
instead.
To get the value corresponding to a specific key, use get or pop:
>>> d.get('first')
'string value'
The difference between get() and pop() is that pop also removes the
corresponding item from the dictionary:
>>> d.pop('first')
'string value'
>>> d
{'second': [1, 2]}
Finally, popitem() removes and returns a pair (key, value); you do
not choose which one: since Python 3.7 it is the most recently inserted item.
>>> d.popitem()
('second', [1, 2])
>>> d
{}
Looking up keys that are not present in the dictionary? e.g.
>>> poly1 = {0: -1, 2: 1, 7: 3}
>>> poly1[1]
KeyError: 1
This operation results in a KeyError since 1 is not a registered key in
poly1. We need to do either:
>>> if key in poly1:
...     value = poly1[key]
Or use
>>> value = poly1.get(key, 0.0)
where poly1.get() returns poly1[key] if key is in poly1 and the
default value 0.0 if not.
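To see why a default is useful, the sketch below assumes (from the name only, the slides do not say so) that poly1 stores a polynomial as {power: coefficient}, so that missing powers have coefficient 0. The helper names coefficient and evaluate are hypothetical.

```python
# Assumption: keys are powers of x, values are coefficients,
# i.e. poly1 stands for -1 + x**2 + 3*x**7.
poly1 = {0: -1, 2: 1, 7: 3}


def coefficient(poly, power):
    """Return the coefficient of x**power, or 0.0 if the power is absent."""
    return poly.get(power, 0.0)


def evaluate(poly, x):
    """Evaluate the polynomial at x by summing coefficient * x**power."""
    return sum(c * x ** p for p, c in poly.items())


print(coefficient(poly1, 1))   # 0.0 (no KeyError)
print(coefficient(poly1, 2))   # 1
print(evaluate(poly1, 2))      # -1 + 4 + 3*128 = 387
```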

Methods to create new dictionary
Assignment does not create a new dictionary; after d2 = d1 both names
refer to the same object:
>>> d1 = {'a': [1,2]}
>>> d2 = d1
>>> d2['a'] = [1,2,3,4]
>>> d1['a']
[1, 2, 3, 4]
To create a new object, use the copy method (shallow copy):
>>> d2 = d1.copy()
You can clear a dictionary (i.e., remove all its items) using clear():
>>> d2.clear()
>>> d2
{}
The clear() method deletes all items, whereas the del statement deletes just one:
>>> d = {'a':1, 'b':2, 'c':3}
>>> del d['a']
>>> d
→
{'b': 2, 'c': 3}
Create a new item with a default value (if not provided, None is the
default). setdefault() returns the value stored under the key:
>>> d2.setdefault('third', '')
''
>>> d2['third']
''
Create a dictionary given a set of keys:
>>> d3 = {}.fromkeys(['first', 'second', 'third'])
>>> d3
{'first': None, 'second': None, 'third': None}
Just keep in mind that the fromkeys() method creates a new
dictionary with the given keys, each with a default corresponding
value of None.
Combining dictionaries
Given two dictionaries d1 and d2, you can add all key/value pairs
from d2 into d1 by using the update method (instead of looping and
assigning each pair yourself):

>>> d1 = {'a':1}
>>> d2 = {'a':2, 'b':2}
>>> d1.update(d2)
>>> d1
{'a': 2, 'b': 2}
The items in the supplied dictionary are added to the old one,
overwriting any items there with the same keys.
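update() changes d1 in place. When both originals should stay intact, one alternative (a sketch, not something the slides cover) is to build a new dictionary with the ** unpacking syntax available since Python 3.5; later entries win on duplicate keys, just as with update(). The name merged is chosen for the example.

```python
d1 = {'a': 1}
d2 = {'a': 2, 'b': 2}

# Build a new dictionary; values from d2 override those from d1.
merged = {**d1, **d2}

print(merged)   # {'a': 2, 'b': 2}
print(d1)       # {'a': 1}  (unchanged)
```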
Iterators
Dictionary provides iterators over values, keys, or items:

>>> d = {'first':'string value', 'second':[1,2]}
>>> [x for x in d.values()]
['string value', [1, 2]]
>>> [x for x in d.keys()]
['first', 'second']
>>> [x for x in d.items()]
[('first', 'string value'), ('second', [1, 2])]
Sets {,…} are constructed from a sequence (or some other iterable object).
Since sets cannot contain duplicates, they are usually used to build
collections of unique items (e.g., a set of identifiers).
Contents:
• Quick example
• Ordering
• Operators

Quick example
>>> a = set([1, 2, 3, 4])
>>> b = {3, 4, 5, 6}
>>> a | b # Union
{1, 2, 3, 4, 5, 6}
>>> a & b # Intersection
{3, 4}
>>> a < b # Proper subset
False
>>> a - b # Difference
{1, 2}
>>> a ^ b # Symmetric Difference
{1, 2, 5, 6}
Note: The union, intersection, subset, difference and symmetric difference
can also be called as methods rather than operator symbols.
Ordering
The ordering of set elements is arbitrary and should not be relied on.
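When a set is used to remove duplicates and the result must come out in a predictable order, the order has to be imposed explicitly. The sketch below shows two common options; the list ids and the result names are assumptions for the example, and the second option relies on dictionaries keeping insertion order (Python 3.7+).

```python
# Hypothetical list of identifiers containing duplicates.
ids = [3, 1, 3, 2, 1]

# Option 1: deduplicate with a set, then sort for a deterministic order.
unique_sorted = sorted(set(ids))
print(unique_sorted)      # [1, 2, 3]

# Option 2: keep the order of first appearance by using dictionary keys.
unique_in_order = list(dict.fromkeys(ids))
print(unique_in_order)    # [3, 1, 2]
```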
• Operators
Each operator is associated with a symbol (e.g., &) and a method name
(e.g., intersection).

>>> a = set([1, 2, 3]); b = set([2, 3, 4])
>>> c = a.intersection(b)
>>> c
{2, 3}
Note that c = a.intersection(b) is equivalent to c = a & b
>>> c.issubset(a)
True
>>> c <= a
True
>>> c.issuperset(a)
False
>>> c >= a
False
>>> a.difference(b)
{1}
>>> a - b
{1}
>>> a.symmetric_difference(b)
{1, 4}
>>> a ^ b
{1, 4}
>>> d=a.copy()
>>> d
{1, 2, 3}