ISOM2007 – Programming for Business Analytics
NumPy
Hongchuan Shen
Faculty of Business Administration
University of Macau
Learning Outcomes
• Understand how to perform data analysis with the
NumPy module (i.e. NumPy Structure and Operations)
NumPy arrays
• NumPy Arrays, also known as ndarrays,
are multi-dimensional array objects that
can hold items of the same data type.
• NumPy arrays are grid-like data
structures that can be indexed and
iterated upon, similar to lists.
• However, unlike Python lists, NumPy
arrays can have any number of dimensions
and are more efficient in terms of
memory and performance.
• NumPy arrays are fundamental to the
NumPy library, and most of the
functionalities provided by NumPy
revolve around these arrays
• An array can be used to represent a
vector (a one-dimensional set of numerical
values) or a matrix (multi-dimensional).
Differences from Python
Lists
Both lists and NumPy arrays can be used to
store data, but there are crucial
differences.
• Performance: NumPy arrays are more
efficient in terms of memory and speed.
They provide fast numerical operations
which are highly optimized for
performance.
• Functionality: NumPy arrays have more
advanced and diverse functionality for
mathematical and scientific
computations.
• Homogeneity: Unlike lists, NumPy arrays
hold items of the same data type,
ensuring more consistent data.
• Dimensions: NumPy arrays can have any
number of dimensions.
NumPy Structures &
Operations
• Compare the processing time of a
Python list and a NumPy array
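A minimal sketch of such a comparison, assuming a workload of squaring one million integers. The workload, the size n, and all variable names are illustrative choices rather than the slide's original example. Timings depend on the machine, so no specific numbers are claimed here.

```python
import time
import numpy as np

n = 1_000_000  # illustrative size (an assumption, not from the slides)

# Pure-Python list: square each element with a list comprehension
py_list = list(range(n))
t0 = time.perf_counter()
list_result = [x * x for x in py_list]
list_seconds = time.perf_counter() - t0

# NumPy array: the same computation as one vectorized expression
np_array = np.arange(n, dtype=np.int64)
t0 = time.perf_counter()
array_result = np_array * np_array
array_seconds = time.perf_counter() - t0

print(f"list:  {list_seconds:.4f} s")
print(f"array: {array_seconds:.4f} s")
```

On most machines the array version is expected to be noticeably faster, because the loop runs in compiled code instead of the Python interpreter.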
NumPy Arrays - Attributes
• Shape: An attribute (ndarray.shape)
to get the shape of the array.
• Dtype: An attribute (ndarray.dtype)
that returns the data type of the
elements in the array.
• Size: An attribute (ndarray.size)
that returns the total number of
elements in the array.
• Ndim: An attribute (ndarray.ndim)
that returns the number of array
dimensions.
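The four attributes above can be read directly from any array. A small sketch, using an illustrative 2x3 array that is not from the slides:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(a.shape)  # (2, 3): 2 rows, 3 columns
print(a.dtype)  # an integer type; the exact width depends on the platform
print(a.size)   # 6
print(a.ndim)   # 2
```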
Creating NumPy Arrays
• Creating an Array with NumPy
• NumPy arrays, instances of the ndarray
class, are data structures that can be
created in a number of different ways.
• You can create an array from an
existing Python list or tuple, or use
one of the many built-in NumPy
convenience functions:
• empty(): Creates a new array whose
elements are uninitialized.
• zeros(): Creates a new array whose
elements are initialized to zero.
• ones(): Creates a new array whose
elements are initialized to one.
• empty_like(): Creates a new array
whose shape matches the input array
and whose values are uninitialized.
• zeros_like(): Creates a new array
whose shape matches the input array
and whose values are initialized to
zero.
• ones_like(): Creates a new array
whose shape matches the input array
and whose values are initialized to
one.
Example #1:
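A minimal sketch of the creation functions listed earlier (empty, zeros, ones and their _like variants). The shapes and the template array are illustrative assumptions, not the slide's original example.

```python
import numpy as np

z = np.zeros((2, 3))        # 2x3 array filled with 0.0
o = np.ones(4, dtype=int)   # length-4 array filled with 1
e = np.empty((2, 2))        # 2x2 array; contents are arbitrary until assigned

template = np.array([[1, 2], [3, 4]])  # hypothetical input array
z_like = np.zeros_like(template)  # same shape and dtype as template, all 0
o_like = np.ones_like(template)   # same shape and dtype as template, all 1
e_like = np.empty_like(template)  # same shape and dtype, uninitialized
```

Because empty() and empty_like() skip initialization, their contents should always be overwritten before being read.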
• array(): Creates an array from a
regular Python list or tuple.
• arange(): Creates a new array in a
certain range, similar to the
range() function.
• linspace(): Creates an evenly
spaced sequence in a specified
interval. This function is similar
to numpy.arange(), but instead of a
step size, it takes the number of
samples.
• arange():
numpy.arange([start, ]stop, [step, ]dtype=None)
Parameters:
• start: (optional) The starting of an interval. The
default is 0.
• stop: The end of an interval (exclusive).
• step: (optional) The spacing between values. The
default is 1.
• dtype: (optional) The type of output array.
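A short sketch of how these parameters behave; the argument values are illustrative.

```python
import numpy as np

a = np.arange(5)               # [0 1 2 3 4]; only stop is given
b = np.arange(2, 10, 2)        # [2 4 6 8]; stop=10 is excluded
c = np.arange(0, 1, 0.25)      # [0.   0.25 0.5  0.75]
d = np.arange(3, dtype=float)  # [0. 1. 2.]
```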
• linspace():
numpy.linspace(start, stop, num=50, endpoint=True, retstep=False,
dtype=None, axis=0)
Parameters:
• start: The starting value of the sequence.
• stop: The end value of the sequence.
• num: (optional, default=50) The number of evenly
spaced samples to generate.
• endpoint: (optional, default=True) If True, stop
is the last sample; otherwise, it is not
included.
• retstep: (optional, default=False) If True,
returns samples and step between the samples.
• dtype: (optional) The type of output array.
• axis: (optional, default=0) The axis in the
result to store the samples
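A short sketch of the num, endpoint, and retstep parameters; the intervals are illustrative.

```python
import numpy as np

a = np.linspace(0, 1, num=5)                       # [0.   0.25 0.5  0.75 1.  ]
b = np.linspace(0, 1, num=5, endpoint=False)       # [0.  0.2 0.4 0.6 0.8]
c, step = np.linspace(0, 10, num=5, retstep=True)  # step is 2.5
default = np.linspace(0, 1)                        # 50 samples when num is omitted
```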
Example #2:
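A minimal sketch of array() applied to lists and tuples. The values are illustrative assumptions, not the slide's original example.

```python
import numpy as np

from_list = np.array([1, 2, 3])         # 1-D array from a list
from_tuple = np.array((1.5, 2.5, 3.5))  # 1-D array from a tuple
nested = np.array([[1, 2], [3, 4]])     # 2-D array from a nested list
mixed = np.array([1, 2.5])              # the int is upcast so all items are float
```

The last line illustrates homogeneity: NumPy picks one data type that can hold every item.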
Example #3:
Array Operations
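A minimal sketch of common element-wise array operations, assuming this is the kind of operation the slide covers. The arrays are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

total = a + b      # [11 22 33]
product = a * b    # [10 40 90]; element-wise, not a matrix product
scaled = a * 2     # [2 4 6]; the scalar is applied to every element
squared = a ** 2   # [1 4 9]
mask = b > 15      # [False  True  True]
```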
Indexing and Slicing
• Indexing and slicing in NumPy
arrays are quite similar to regular
Python lists.
• Indexing allows you to access
individual elements in an array,
while slicing allows you to access
a sequence of data within the
array.
• Unlike Python lists, NumPy allows
you to index and slice arrays in
multiple dimensions.
Indexing and Slicing - 1D
array
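A minimal sketch of indexing and slicing a one-dimensional array; the array is illustrative.

```python
import numpy as np

a = np.arange(10, 20)  # [10 11 12 13 14 15 16 17 18 19]

first = a[0]           # 10
last = a[-1]           # 19
middle = a[2:5]        # [12 13 14]; the end index is excluded
every_other = a[::2]   # [10 12 14 16 18]
reversed_a = a[::-1]   # the array in reverse order
```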
Indexing and Slicing - 2D
array
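A minimal sketch of indexing and slicing a two-dimensional array, where the row index comes first and the column index second. The matrix is illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

element = m[1, 2]  # 6: row 1, column 2
row = m[0]         # [1 2 3]
column = m[:, 1]   # [2 5 8]
block = m[:2, 1:]  # [[2 3], [5 6]]: first two rows, last two columns
```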
• Indexing Arrays – access elements
in an array
• Special Indexing
• Access the elements by index array &
access by Boolean mask array
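Both kinds of special indexing can be sketched on a small illustrative array:

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

idx = np.array([0, 2, 4])  # index array: the positions to pick
picked = a[idx]            # [10 30 50]

mask = a > 25              # [False False  True  True  True]
filtered = a[mask]         # [30 40 50]
```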
NumPy Functions and
Methods
• NumPy provides a host of functions
and methods to perform computations
on arrays.
• These include mathematical
functions, statistical functions,
and array manipulation functions.
• Functions are general NumPy
operations, while methods are
called on individual NumPy objects.
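A small sketch of the distinction between functions and methods, using an illustrative 2x2 array:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

total_fn = np.sum(a)        # function form: 10
total_m = a.sum()           # method form of the same operation: 10
col_means = a.mean(axis=0)  # statistical method along columns: [2. 3.]
roots = np.sqrt(a)          # mathematical function; ndarray has no sqrt method
flat = a.reshape(4)         # array manipulation: [1 2 3 4]
```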
Linear algebra with NumPy
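A minimal sketch of a few linear algebra operations, assuming the slide covered the matrix product and the numpy.linalg submodule. The matrices are illustrative.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 2.0]])

matmul = A @ B              # matrix product: [[2. 2.], [1. 6.]]
transposed = A.T            # transpose
det = np.linalg.det(A)      # determinant, approximately 5.0
inverse = np.linalg.inv(A)  # inverse; A @ inverse is close to the identity
```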
• vstack and hstack
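A short sketch of the two stacking functions on illustrative 1-D arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack((a, b))  # stacked as rows, shape (2, 3): [[1 2 3], [4 5 6]]
h = np.hstack((a, b))  # joined end to end, shape (6,): [1 2 3 4 5 6]
```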
NumPy mutability
• A NumPy array is mutable: its elements can be changed in place.
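A minimal sketch of what mutability means in practice, including the fact that a slice shares data with the original array. The values are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
a[0] = 100       # in-place change: a is now [100 2 3 4]

view = a[1:3]    # a slice is a view onto the same data
view[0] = -1     # this also changes a[1]

safe = a.copy()  # an independent copy
safe[3] = 0      # a[3] is unchanged
```

Use copy() whenever later changes should not affect the original array.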
Random Numbers in NumPy
• NumPy's random module provides
functions for generating random
numbers from different
distributions.
• The random module's rand() and
randn() functions generate random
numbers from the uniform distribution
on [0, 1) and the standard normal
distribution, respectively.
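A short sketch of both functions; the shapes are illustrative, and the printed values differ on every run.

```python
import numpy as np

u = np.random.rand(3, 2)   # uniform on [0, 1), shape (3, 2)
n = np.random.randn(1000)  # standard normal (mean 0, std 1), 1000 samples

print(u)
print(n.mean(), n.std())   # close to 0 and 1, but not exactly
```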
• The random module's randint()
function generates random integers
within a specified range.
• The random module's choice()
function randomly selects an
element from a given 1-D array.
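A short sketch of randint() and choice(); the dice range and the option labels are illustrative.

```python
import numpy as np

dice = np.random.randint(1, 7, size=10)   # integers from 1 to 6; high=7 is excluded
options = np.array(["A", "B", "C"])       # hypothetical 1-D array to sample from
one = np.random.choice(options)           # a single randomly chosen element
many = np.random.choice(options, size=5)  # five draws, with replacement by default
```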
• The random module's seed() function
sets the random seed, which allows
for reproducible results.
• The seed determines the sequence of
random numbers generated. The same
seed will always produce the same
sequence of random numbers.
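A minimal sketch of reproducibility with seed(); the seed value 42 is an arbitrary illustrative choice.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)          # reset to the same seed
second = np.random.rand(3)  # identical to first
```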
Multi-Dimensional Array
• Higher-dimensional arrays can be
created in the same way as a
one-dimensional array.
• First create a one-dimensional
array with the correct number of
elements, and then use the reshape
function from NumPy to create the
n-dimensional array.
• Example to create a 4x5 matrix
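A minimal sketch of this approach, assuming the 20 elements are simply the integers 0 to 19 (the slide's original values are not available):

```python
import numpy as np

flat = np.arange(20)               # one-dimensional array with 4 * 5 = 20 elements
matrix = np.reshape(flat, (4, 5))  # 4 rows, 5 columns

print(matrix)
```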
• Example to create a 3x4x5 array
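A minimal sketch, assuming the 60 elements are the integers 0 to 59 (the slide's original values are not available):

```python
import numpy as np

flat = np.arange(60)               # 3 * 4 * 5 = 60 elements
cube = np.reshape(flat, (3, 4, 5)) # 3 blocks, each with 4 rows and 5 columns

print(cube.shape)  # (3, 4, 5)
```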
• Create special two-dimensional
array (Part I)
• Create special two-dimensional
array (Part II)
• Slicing Multi-Dimensional Arrays
• Multi-dimensional arrays can be sliced
(or indexed); the only trick is to
remember the proper ordering for the
elements.
• Each dimension is differentiated by a
comma in the slicing operation.
• A two-dimensional array is sliced with
[start1:end1, start2:end2].
• A three-dimensional array is sliced
with [start1:end1, start2:end2,
start3:end3], and so on for
higher dimensions.
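The comma-separated slicing pattern can be sketched on illustrative arrays filled with consecutive integers:

```python
import numpy as np

m = np.arange(20).reshape(4, 5)
sub2d = m[1:3, 2:4]       # rows 1-2, columns 2-3: [[7 8], [12 13]]

c = np.arange(60).reshape(3, 4, 5)
sub3d = c[0:2, 1:3, 0:2]  # shape (2, 2, 2)
```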
• Perform Slicing on a Two-Dimensional Matrix (Part I)
• Perform Slicing on a Two-Dimensional Matrix (Part II)
• Perform Slicing on a Two-Dimensional Matrix (Part III)
Multi-Dimensional Array
[Figure: a three-dimensional array drawn as a cube with axes X, Y, and Z, with labels "Time frame" and "Salespersons"]
• Perform Slicing on a Three-Dimensional Matrix (Part I)
• Perform Slicing on a Three-Dimensional Matrix (Part II)
• Perform Slicing on a Three-Dimensional Matrix (Part III)
• Slicing on a 3-Dimensional matrix
(Part IV)
• Special Indexing – Boolean Mask
Access
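A minimal sketch of Boolean mask access on a two-dimensional array. The values and the threshold of 5 are illustrative.

```python
import numpy as np

m = np.array([[1, 8, 3],
              [9, 2, 7]])

mask = m > 5        # same shape as m, True where the condition holds
selected = m[mask]  # 1-D result in row order: [8 9 7]

m[m > 5] = 0        # a mask can also be used for assignment
```

After the assignment, m is [[1 0 3], [0 2 0]], while selected keeps its values because mask indexing returns a copy.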
Data Analysis with NumPy
Solving Systems of Linear
Equations with NumPy
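A minimal sketch using np.linalg.solve, assuming the illustrative system 2x + y = 5 and x + 3y = 10 (the slide's original system is not available):

```python
import numpy as np

# Coefficient matrix and right-hand side of the system
#   2x +  y = 5
#    x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

solution = np.linalg.solve(A, b)  # approximately [1. 3.], i.e. x = 1, y = 3
print(solution)
```

The result can be checked by confirming that A @ solution is close to b.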
Advanced Data Analysis
with NumPy
In this example, we simulate a random walk: a sequence in which each value differs from the previous one by a step of 1 or -1. We use np.random.choice() to select the steps at random and
np.cumsum() to compute their cumulative sum, which gives the position of the
walk after each step.
We then find the minimum and maximum positions of the walk with np.min() and np.max(), and
the step at which the walk first reaches 10 or -10 by calling argmax() on a Boolean array that
is True wherever the absolute position is at least 10.
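A minimal sketch of the random walk described above. The seed, the number of steps, and the variable names are illustrative assumptions. The check with any() is an added safeguard, since argmax() returns 0 when the Boolean array contains no True value.

```python
import numpy as np

np.random.seed(0)  # illustrative seed, used only for reproducibility
n_steps = 1000     # illustrative walk length

steps = np.random.choice([-1, 1], size=n_steps)  # each step is -1 or 1
walk = np.cumsum(steps)                          # position after each step

lowest = np.min(walk)
highest = np.max(walk)

reached = np.abs(walk) >= 10  # True wherever the walk is 10 or more away from 0
if reached.any():
    first_hit = reached.argmax()  # index of the first True
else:
    first_hit = None              # the walk never reached 10 or -10

print(lowest, highest, first_hit)
```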
Detailed intro to arrays
Advanced numpy tricks