deep learning
1 Neural networks and deep learning
1.1) math-programming notation
dvar denotes dJ/dvar, the derivative of the final output J with respect to the variable var
(in Week 2/Derivatives with a Computation Graph, slide 7:24):
dJ/dv = 3, written in code as dv = 3
dJ/da = 5, written in code as da = 5
1.2) Broadcasting
bsxfun serves the same purpose in MATLAB.
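A minimal NumPy sketch of broadcasting (the arrays are made-up examples): a (3, 1) column and a (1, 4) row are stretched to a common (3, 4) shape, the same kind of implicit expansion bsxfun does in MATLAB.

import numpy as np

A = np.array([[1.0], [2.0], [3.0]])        # shape (3, 1)
b = np.array([[10.0, 20.0, 30.0, 40.0]])   # shape (1, 4)
C = A + b                                  # broadcast to shape (3, 4)
print(C.shape)                             # (3, 4)

# Typical pattern: normalize each column of X by its column sum.
X = np.array([[56.0, 0.0, 4.4], [1.2, 104.0, 52.0]])
percentage = 100 * X / X.sum(axis=0, keepdims=True)   # (2,3) / (1,3) broadcasts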
1.3) sigmoid backward
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-3f1a403310df> in <module>()
      1 dAL, linear_activation_cache = linear_activation_backward_test_case()
      2
----> 3 dA_prev, dW, db = linear_activation_backward(dAL, linear_activation_cache, activation = "sigmoid")
      4 print ("sigmoid:")
      5 print ("dA_prev = "+ str(dA_prev))

<ipython-input-20-2e6d36797376> in linear_activation_backward(dA, cache, activation)
     25     elif activation == "sigmoid":
     26         ### START CODE HERE ### (≈ 2 lines of code)
---> 27         dZ = sigmoid_backward(dA, cache)
     28         dA_prev, dW, db = linear_backward(dZ, cache)
     29         ### END CODE HERE ###

/home/jovyan/work/Week 4/Building your Deep Neural Network - Step by Step/dnn_utils_v2.py in sigmoid_backward(dA, cache)
     74     Z = cache
     75
---> 76     s = 1/(1+np.exp(-Z))
     77     dZ = dA * s * (1-s)
     78
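Cause: cache here is still the full (linear_cache, activation_cache) tuple (see the test case in 1.5), so inside sigmoid_backward the line Z = cache makes Z a tuple and np.exp(-Z) raises the TypeError. The fix, as in the working code in 1.7, is to unpack the cache first and pass each part to the right helper:

linear_cache, activation_cache = cache                 # split the nested cache
dZ = sigmoid_backward(dA, activation_cache)            # activation part holds Z
dA_prev, dW, db = linear_backward(dZ, linear_cache)    # linear part holds A_prev, W, b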
1.4) dnn_utils.py (relu backward)
TypeError                                 Traceback (most recent call last)
<ipython-input-27-628c45c83d14> in <module>()
      1 dAL, linear_activation_cache = linear_activation_backward_test_case()
----> 2 dA_prev, dW, db = linear_activation_backward(dAL, linear_activation_cache, activation = "relu")
      3 print ("relu:")
      4 print ("dA_prev = "+ str(dA_prev))
      5 print ("dW = " + str(dW))

<ipython-input-26-db74fa02e6dd> in linear_activation_backward(dA, cache, activation)
     20         ### START CODE HERE ### (≈ 2 lines of code)
     21         #dZ = relu_backward(dA, activation_cache)
---> 22         dZ = relu_backward(dA, cache)
     23         dA_prev, dW, db = linear_backward(dZ, linear_cache)
     24         ### END CODE HERE ###

/home/jovyan/work/Week 4/Building your Deep Neural Network - Step by Step/dnn_utils_v2.py in relu_backward(dA, cache)
     54
     55     # When z <= 0, you should set dz to 0 as well.
---> 56     dZ[Z <= 0] = 0
     57
     58     assert (dZ.shape == Z.shape)
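Same root cause as 1.3: the full cache tuple is passed to relu_backward, so Z ends up being a tuple and the comparison Z <= 0 fails with a TypeError. Unpacking the cache into linear_cache and activation_cache (the commented-out line 21 was already on the right track) fixes it.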
1.5) cache (week 4 deep learning programming)
1.5.1) test case
dAL, linear_activation_cache = linear_activation_backward_test_case()
print("linear_activation_cache = ", linear_activation_cache)
linear_activation_cache =
((array([[-2.1361961 ,  1.64027081],
         [-1.79343559, -0.84174737],
         [ 0.50288142, -1.24528809]]),                 # A – linear cache
  array([[-1.05795222, -0.90900761,  0.55145404]]),    # W – linear cache
  array([[ 2.29220801]])),                             # b – linear cache
 array([[ 0.04153939, -1.11792545]]))                  # Z – activation cache
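So the cache is a nested tuple of the form ((A_prev, W, b), Z); it can be unpacked the same way the backward code in 1.7 does:

linear_cache, activation_cache = linear_activation_cache
A_prev, W, b = linear_cache     # inputs of the linear step
Z = activation_cache            # input of the activation step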
1.6) forward code
def linear_forward(A, W, b):
    Z = np.dot(W, A) + b
    cache = (A, W, b)
    return Z, cache
def linear_activation_forward(A_prev, W, b, activation):
    if activation == "sigmoid":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
    elif activation == "relu":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache
Sigmoid: σ(Z) = σ(WA + b) = 1 / (1 + e^-(WA + b)). This function returns two items: the activation value "a" and a "cache" that contains "Z".
ReLU: The mathematical formula for ReLU is A = RELU(Z) = max(0, Z). We have provided you with the relu function. This function returns two items: the activation value "A" and a "cache" that contains "Z".
def L_model_forward(X, parameters):
    caches = []
    A = X
    L = len(parameters) // 2   # number of layers in the neural network
    for l in range(1, L):
        A_prev = A
        A, cache = linear_activation_forward(
            A_prev, parameters["W" + str(l)],
            parameters["b" + str(l)], activation = "relu")
        caches.append(cache)
    AL, cache = linear_activation_forward(
        A, parameters["W" + str(L)], parameters["b" + str(L)],
        activation = "sigmoid")
    caches.append(cache)
    return AL, caches
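A minimal usage sketch (the layer sizes and random initialization below are made up for illustration; the assignment uses its own initialization helper). parameters must hold W1, b1, ..., WL, bL with shapes (n_l, n_{l-1}) and (n_l, 1):

import numpy as np
np.random.seed(1)
layer_dims = [5, 4, 1]                 # 5 inputs -> 4 hidden units -> 1 output
parameters = {}
for l in range(1, len(layer_dims)):
    parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
    parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))

X = np.random.randn(5, 3)              # 3 examples
AL, caches = L_model_forward(X, parameters)
print(AL.shape)                        # (1, 3)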
def compute_cost(AL, Y):
    m = Y.shape[1]
    # Compute loss from aL and y.
    cost = - np.sum(np.multiply(Y, np.log(AL))
                    + np.multiply((1 - Y), np.log(1 - AL))) / m
    cost = np.squeeze(cost)   # makes sure the cost's shape is what we expect (e.g. turns [[17]] into 17)
    return cost
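Written out, this is the cross-entropy cost:
J = -(1/m) * Σ_{i=1..m} [ y(i) * log(AL(i)) + (1 - y(i)) * log(1 - AL(i)) ]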
1.7) backward
def linear_backward(dZ, cache):
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db
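These lines implement the standard gradients of the linear step Z = W·A_prev + b:
dW = (1/m) · dZ · A_prev^T,   db = (1/m) · Σ over examples of dZ,   dA_prev = W^T · dZ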
def linear_activation_backward(dA, cache, activation):
    linear_cache, activation_cache = cache
    if activation == "relu":
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
    return dA_prev, dW, db
def L_model_backward(AL, Y, caches):
    grads = {}
    L = len(caches)           # the number of layers
    m = AL.shape[1]
    Y = Y.reshape(AL.shape)   # after this line, Y is the same shape as AL
    dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    # Lth layer (SIGMOID -> LINEAR) gradients.
    # Inputs: "dAL, current_cache". Outputs: grads["dAL-1"], grads["dWL"], grads["dbL"]
    current_cache = caches[L-1]
    grads["dA" + str(L-1)], grads["dW" + str(L)], grads["db" + str(L)] = \
        linear_activation_backward(dAL, current_cache, activation = "sigmoid")

    # Loop from l=L-2 to l=0
    for l in reversed(range(L-1)):
        # lth layer: (RELU -> LINEAR) gradients.
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = \
            linear_activation_backward(
                grads["dA" + str(l+1)], current_cache, activation = "relu")
        grads["dA" + str(l)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
    return grads
def update_parameters(parameters, grads, learning_rate):
    L = len(parameters) // 2   # number of layers in the neural network
    # Update rule for each parameter. Use a for loop.
    for l in range(L):
        parameters["W" + str(l+1)] = parameters["W" + str(l+1)] \
            - learning_rate * grads["dW" + str(l + 1)]
        parameters["b" + str(l+1)] = parameters["b" + str(l+1)] \
            - learning_rate * grads["db" + str(l + 1)]
    return parameters
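Putting the pieces above together, a minimal training loop could look like the sketch below (learning_rate and num_iterations are arbitrary values, and initialize_parameters_deep stands for whatever initialization helper is used; only L_model_forward, compute_cost, L_model_backward and update_parameters come from this section):

def train(X, Y, layer_dims, learning_rate = 0.0075, num_iterations = 2500):
    parameters = initialize_parameters_deep(layer_dims)    # hypothetical init helper
    for i in range(num_iterations):
        AL, caches = L_model_forward(X, parameters)        # forward pass
        cost = compute_cost(AL, Y)                         # cross-entropy cost
        grads = L_model_backward(AL, Y, caches)            # backward pass
        parameters = update_parameters(parameters, grads, learning_rate)
        if i % 100 == 0:
            print("cost after iteration", i, ":", cost)
    return parameters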
2 Improve DNN – hyperparameters tuning
2.1) random_mini_batches()
import math
import numpy as np

# GRADED FUNCTION: random_mini_batches
def random_mini_batches(X, Y, mini_batch_size = 64, seed = 0):
    """
    Creates a list of random minibatches from (X, Y)
    Arguments:
    X -- input data, of shape (input size, number of examples)
    Y -- true "label" vector (1 for blue dot / 0 for red dot), of shape (1, number of examples)
    mini_batch_size -- size of the mini-batches, integer

    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """
    np.random.seed(seed)   # to make your "random" minibatches the same as ours
    m = X.shape[1]         # number of training examples
    mini_batches = []

    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[:, permutation]
    shuffled_Y = Y[:, permutation].reshape((1, m))

    # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case.
    num_complete_minibatches = math.floor(m / mini_batch_size)   # number of mini batches of size mini_batch_size in your partitioning
    for k in range(0, num_complete_minibatches):
        ### START CODE HERE ### (approx. 2 lines)
        # slice the *shuffled* data, not the original X/Y
        mini_batch_X = shuffled_X[:, k * mini_batch_size : (k+1) * mini_batch_size]
        mini_batch_Y = shuffled_Y[:, k * mini_batch_size : (k+1) * mini_batch_size]
        ### END CODE HERE ###
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)

    # Handling the end case (last mini-batch < mini_batch_size)
    if m % mini_batch_size != 0:
        ### START CODE HERE ### (approx. 2 lines)
        mini_batch_X = shuffled_X[:, num_complete_minibatches * mini_batch_size : m]
        mini_batch_Y = shuffled_Y[:, num_complete_minibatches * mini_batch_size : m]
        ### END CODE HERE ###
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)

    return mini_batches
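A quick usage sketch (shapes and the epoch count are made up; numpy is imported above): one pass over all minibatches per epoch, re-seeding so the shuffle differs between epochs.

X = np.random.randn(12288, 148)              # e.g. 148 flattened 64x64x3 images
Y = (np.random.rand(1, 148) > 0.5).astype(int)

for epoch in range(10):
    minibatches = random_mini_batches(X, Y, mini_batch_size = 64, seed = epoch)
    for mini_batch_X, mini_batch_Y in minibatches:
        pass   # forward pass, cost, backward pass, parameter update on this minibatch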
3 large data/model download
3.1) use split
e.g.,
split --bytes=100MB xydevNPY.tar.gz xydevNPYpart
-rw-r--r-- 1 jovyan users 100000000 Mar 2 06:23 xydevNPYpartaa
-rw-r--r-- 1 jovyan users 6396638 Mar 2 06:23 xydevNPYpartab
-rw-r--r-- 1 jovyan users 106396638 Mar 2 06:04 xydevNPY.tar.gz
The split parts (a, b) were downloaded into
C:\fajin\workspace\deepLearning\5_sequenceModel\Week_3\Trigger word detection\XY_dev
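To rebuild the archive after downloading the parts, they can be concatenated back in order and then extracted (standard coreutils/tar usage, assuming the part names above):

cat xydevNPYpartaa xydevNPYpartab > xydevNPY.tar.gz
tar -xzf xydevNPY.tar.gz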
3.2) split --help
jovyan@48c312d67fd2:~/work$ split --help
Usage: split [OPTION]... [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is 'x'. With no INPUT, or when INPUT
is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
-a, --suffix-length=N generate suffixes of length N (default 2)
--additional-suffix=SUFFIX append an additional SUFFIX to file names
-b, --bytes=SIZE        put SIZE bytes per output file
-C, --line-bytes=SIZE put at most SIZE bytes of lines per output file
-d, --numeric-suffixes[=FROM] use numeric suffixes instead of alphabetic;
FROM changes the start value (default 0)
-e, --elide-empty-files do not generate empty output files with '-n'
--filter=COMMAND write to shell COMMAND; file name is $FILE
-l, --lines=NUMBER put NUMBER lines per output file
-n, --number=CHUNKS generate CHUNKS output files; see explanation
below
-u, --unbuffered        immediately copy input to output with '-n r/...'
    --verbose           print a diagnostic just before each output file is opened
--help display this help and exit
--version output version information and exit
The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).
CHUNKS may be:
N       split into N files based on size of input
K/N output Kth of N to stdout
l/N split into N files without splitting lines
l/K/N output Kth of N to stdout without splitting lines
r/N like 'l' but use round robin distribution
r/K/N likewise but only output Kth of N to stdout
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/split>
or available locally via: info '(coreutils) split invocation'
jovyan@48c312d67fd2:~/work$ split --bytes=100MB xydevNPY.tar.gz xydevNPYpart
jovyan@48c312d67fd2:~/work$ ls -la
-rw-r--r-- 1 jovyan users 100000000 Mar 2 06:23 xydevNPYpartaa
-rw-r--r-- 1 jovyan users 6396638 Mar 2 06:23 xydevNPYpartab
-rw-r--r-- 1 jovyan users 106396638 Mar 2 06:04 xydevNPY.tar.gz
4 Keras error
4.1) not using Lambda on input x for model
AttributeError: 'Tensor' object has no attribute '_keras_history'
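This error typically appears when a raw backend/TensorFlow operation is applied directly to a Keras tensor, so the result loses the Keras metadata (_keras_history) that Model needs. A minimal sketch of the fix, wrapping the operation in a Lambda layer (the layer sizes and the l2_normalize op here are made-up examples):

from keras.layers import Input, Dense, Lambda
from keras.models import Model
import keras.backend as K

x = Input(shape=(10,))
# y = K.l2_normalize(x, axis=-1)                      # raw backend op -> '_keras_history' error in Model()
y = Lambda(lambda t: K.l2_normalize(t, axis=-1))(x)   # wrapped in Lambda -> stays a Keras tensor
out = Dense(1, activation="sigmoid")(y)
model = Model(inputs=x, outputs=out)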
5 jupyter notebook
5.1) change home dir
Open Anaconda Prompt (2019) and enter the command below:
C:\>jupyter notebook --notebook-dir="C:\fajin\workspace\deepLearning\2_improve_dnn"
6 linux
6.1) kernel version
jovyan@48c312d67fd2:~/work$ uname -r
4.14.70-67.55.amzn1.x86_64
jovyan@48c312d67fd2:~/work$ cat /proc/version
Linux version 4.14.70-67.55.amzn1.x86_64 (mockbuild@gobi-build-60002) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Tue Sep 18 10:36:30 UTC 2018
jovyan@48c312d67fd2:~/work$ uname -mrsn
Linux 48c312d67fd2 4.14.70-67.55.amzn1.x86_64 x86_64
jovyan@48c312d67fd2:~/work$ uname -a
Linux 48c312d67fd2 4.14.70-67.55.amzn1.x86_64 #1 SMP Tue Sep 18 10:36:30 UTC 2018 x86_64 GNU/Linux