deep learning
1 Neural networks and deep learning
1.1) math-programming notation
dvar denotes dJ/dvar, the derivative of the final output J with respect to the variable var
(in Week 2/Derivatives with a Computation Graph, slide 7:24):
dJ/dv = 3, written in code as dv = 3
dJ/da = 5, written in code as da = 5
1.2) Broadcasting
bsxfun serves the same purpose in MATLAB.
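A minimal NumPy sketch of broadcasting (the arrays are made-up examples): a (3, 1) column and a (1, 4) row are stretched to a common (3, 4) shape, the same kind of implicit expansion bsxfun does in MATLAB.

import numpy as np

A = np.array([[1.0], [2.0], [3.0]])        # shape (3, 1)
b = np.array([[10.0, 20.0, 30.0, 40.0]])   # shape (1, 4)
C = A + b                                  # broadcast to shape (3, 4)
print(C.shape)                             # (3, 4)

# Typical pattern: normalize each column of X by its column sum.
X = np.array([[56.0, 0.0, 4.4], [1.2, 104.0, 52.0]])
percentage = 100 * X / X.sum(axis=0, keepdims=True)   # (2,3) / (1,3) broadcasts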
1.3) sigmoid backward
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-3f1a403310df> in <module>()
      1 dAL, linear_activation_cache = linear_activation_backward_test_case()
      2
----> 3 dA_prev, dW, db = linear_activation_backward(dAL, linear_activation_cache, activation = "sigmoid")
      4 print ("sigmoid:")
      5 print ("dA_prev = "+ str(dA_prev))

<ipython-input-20-2e6d36797376> in linear_activation_backward(dA, cache, activation)
     25     elif activation == "sigmoid":
     26         ### START CODE HERE ### (≈ 2 lines of code)
---> 27         dZ = sigmoid_backward(dA, cache)
     28         dA_prev, dW, db = linear_backward(dZ, cache)
     29         ### END CODE HERE ###

/home/jovyan/work/Week 4/Building your Deep Neural Network - Step by Step/dnn_utils_v2.py in sigmoid_backward(dA, cache)
     74     Z = cache
     75
---> 76     s = 1/(1+np.exp(-Z))
     77     dZ = dA * s * (1-s)
     78
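Cause: cache here is still the full (linear_cache, activation_cache) tuple (see the test case in 1.5), so inside sigmoid_backward the line Z = cache makes Z a tuple and np.exp(-Z) raises the TypeError. The fix, as in the working code in 1.7, is to unpack the cache first and pass each part to the right helper:

linear_cache, activation_cache = cache                 # split the nested cache
dZ = sigmoid_backward(dA, activation_cache)            # activation part holds Z
dA_prev, dW, db = linear_backward(dZ, linear_cache)    # linear part holds A_prev, W, b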
1.4) dnn_utils.py (relu backward)
TypeError                                 Traceback (most recent call last)
<ipython-input-27-628c45c83d14> in <module>()
      1 dAL, linear_activation_cache = linear_activation_backward_test_case()
----> 2 dA_prev, dW, db = linear_activation_backward(dAL, linear_activation_cache, activation = "relu")
      3 print ("relu:")
      4 print ("dA_prev = "+ str(dA_prev))
      5 print ("dW = " + str(dW))

<ipython-input-26-db74fa02e6dd> in linear_activation_backward(dA, cache, activation)
     20         ### START CODE HERE ### (≈ 2 lines of code)
     21         #dZ = relu_backward(dA, activation_cache)
---> 22         dZ = relu_backward(dA, cache)
     23         dA_prev, dW, db = linear_backward(dZ, linear_cache)
     24         ### END CODE HERE ###

/home/jovyan/work/Week 4/Building your Deep Neural Network - Step by Step/dnn_utils_v2.py in relu_backward(dA, cache)
     54
     55     # When z <= 0, you should set dz to 0 as well.
---> 56     dZ[Z <= 0] = 0
     57
     58     assert (dZ.shape == Z.shape)
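Same root cause as 1.3: the full cache tuple is passed to relu_backward, so Z ends up being a tuple and the comparison Z <= 0 fails with a TypeError. Unpacking the cache into linear_cache and activation_cache (the commented-out line 21 was already on the right track) fixes it.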
1.5) cache (week 4 deep learning programming)
1.5.1) test case
dAL, linear_activation_cache = linear_activation_backward_test_case()
print("linear_activation_cache = ", linear_activation_cache)
linear_activation_cache =
((array([[-2.1361961 ,  1.64027081],
         [-1.79343559, -0.84174737],
         [ 0.50288142, -1.24528809]]),                 # A – linear cache
  array([[-1.05795222, -0.90900761,  0.55145404]]),    # W – linear cache
  array([[ 2.29220801]])),                             # b – linear cache
 array([[ 0.04153939, -1.11792545]]))                  # Z – activation cache
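So the cache is a nested tuple of the form ((A_prev, W, b), Z); it can be unpacked the same way the backward code in 1.7 does:

linear_cache, activation_cache = linear_activation_cache
A_prev, W, b = linear_cache     # inputs of the linear step
Z = activation_cache            # input of the activation step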
1.6) forward code
def linear_forward(A, W, b):
    Z = np.dot(W, A) + b
    cache = (A, W, b)
    return Z, cache
def linear_activation_forward(A_prev, W, b, activation):
    if activation == "sigmoid":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
    elif activation == "relu":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache
Sigmoid: σ(Z) = σ(WA + b) = 1 / (1 + e^-(WA + b)). This function returns two items: the activation value "a" and a "cache" that contains "Z".
ReLU: The mathematical formula for ReLU is A = RELU(Z) = max(0, Z). We have provided you with the relu function. This function returns two items: the activation value "A" and a "cache" that contains "Z".
def L_model_forward(X, parameters):
    caches = []
    A = X
    L = len(parameters) // 2   # number of layers in the neural network
    for l in range(1, L):
        A_prev = A
        A, cache = linear_activation_forward(
            A_prev, parameters["W" + str(l)],
            parameters["b" + str(l)], activation = "relu")
        caches.append(cache)
    AL, cache = linear_activation_forward(
        A, parameters["W" + str(L)], parameters["b" + str(L)],
        activation = "sigmoid")
    caches.append(cache)
    return AL, caches
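A minimal usage sketch (the layer sizes and random initialization below are made up for illustration; the assignment uses its own initialization helper). parameters must hold W1, b1, ..., WL, bL with shapes (n_l, n_{l-1}) and (n_l, 1):

import numpy as np
np.random.seed(1)
layer_dims = [5, 4, 1]                 # 5 inputs -> 4 hidden units -> 1 output
parameters = {}
for l in range(1, len(layer_dims)):
    parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
    parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))

X = np.random.randn(5, 3)              # 3 examples
AL, caches = L_model_forward(X, parameters)
print(AL.shape)                        # (1, 3)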
def compute_cost(AL, Y):
    m = Y.shape[1]
    # Compute loss from aL and y.
    cost = - np.sum(np.multiply(Y, np.log(AL))
                    + np.multiply((1 - Y), np.log(1 - AL))) / m
    cost = np.squeeze(cost)   # makes sure the cost's shape is what we expect (e.g. turns [[17]] into 17)
    return cost
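Written out, this is the cross-entropy cost:
J = -(1/m) * Σ_{i=1..m} [ y(i) * log(AL(i)) + (1 - y(i)) * log(1 - AL(i)) ]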
1.7) backward
def linear_backward(dZ, cache):
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db
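These lines implement the standard gradients of the linear step Z = W·A_prev + b:
dW = (1/m) · dZ · A_prev^T,   db = (1/m) · Σ over examples of dZ,   dA_prev = W^T · dZ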
def linear_activation_backward(dA, cache, activation):
    linear_cache, activation_cache = cache
    if activation == "relu":
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
    return dA_prev, dW, db
def L_model_backward(AL, Y, caches):
    grads = {}
    L = len(caches)           # the number of layers
    m = AL.shape[1]
    Y = Y.reshape(AL.shape)   # after this line, Y is the same shape as AL
    dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    # Lth layer (SIGMOID -> LINEAR) gradients.
    # Inputs: "dAL, current_cache". Outputs: grads["dAL-1"], grads["dWL"], grads["dbL"]
    current_cache = caches[L-1]
    grads["dA" + str(L-1)], grads["dW" + str(L)], grads["db" + str(L)] = \
        linear_activation_backward(dAL, current_cache, activation = "sigmoid")

    # Loop from l=L-2 to l=0
    for l in reversed(range(L-1)):
        # lth layer: (RELU -> LINEAR) gradients.
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = \
            linear_activation_backward(
                grads["dA" + str(l+1)], current_cache, activation = "relu")
        grads["dA" + str(l)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
    return grads
def update_parameters(parameters, grads, learning_rate):
    L = len(parameters) // 2   # number of layers in the neural network
    # Update rule for each parameter. Use a for loop.
    for l in range(L):
        parameters["W" + str(l+1)] = parameters["W" + str(l+1)] \
            - learning_rate * grads["dW" + str(l + 1)]
        parameters["b" + str(l+1)] = parameters["b" + str(l+1)] \
            - learning_rate * grads["db" + str(l + 1)]
    return parameters
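Putting the pieces above together, a minimal training loop could look like the sketch below (learning_rate and num_iterations are arbitrary values, and initialize_parameters_deep stands for whatever initialization helper is used; only L_model_forward, compute_cost, L_model_backward and update_parameters come from this section):

def train(X, Y, layer_dims, learning_rate = 0.0075, num_iterations = 2500):
    parameters = initialize_parameters_deep(layer_dims)    # hypothetical init helper
    for i in range(num_iterations):
        AL, caches = L_model_forward(X, parameters)        # forward pass
        cost = compute_cost(AL, Y)                         # cross-entropy cost
        grads = L_model_backward(AL, Y, caches)            # backward pass
        parameters = update_parameters(parameters, grads, learning_rate)
        if i % 100 == 0:
            print("cost after iteration", i, ":", cost)
    return parameters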
2 Improve DNN – hyperparameters tuning
2.1) random_mini_batches()
import math
import numpy as np

# GRADED FUNCTION: random_mini_batches
def random_mini_batches(X, Y, mini_batch_size = 64, seed = 0):
    """
    Creates a list of random minibatches from (X, Y)
    Arguments:
    X -- input data, of shape (input size, number of examples)
    Y -- true "label" vector (1 for blue dot / 0 for red dot), of shape (1, number of examples)
    mini_batch_size -- size of the mini-batches, integer

    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """
    np.random.seed(seed)   # to make your "random" minibatches the same as ours
    m = X.shape[1]         # number of training examples
    mini_batches = []

    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[:, permutation]
    shuffled_Y = Y[:, permutation].reshape((1, m))

    # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case.
    num_complete_minibatches = math.floor(m / mini_batch_size)   # number of mini batches of size mini_batch_size in your partitioning
    for k in range(0, num_complete_minibatches):
        ### START CODE HERE ### (approx. 2 lines)
        # slice the *shuffled* data, not the original X/Y
        mini_batch_X = shuffled_X[:, k * mini_batch_size : (k+1) * mini_batch_size]
        mini_batch_Y = shuffled_Y[:, k * mini_batch_size : (k+1) * mini_batch_size]
        ### END CODE HERE ###
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)

    # Handling the end case (last mini-batch < mini_batch_size)
    if m % mini_batch_size != 0:
        ### START CODE HERE ### (approx. 2 lines)
        mini_batch_X = shuffled_X[:, num_complete_minibatches * mini_batch_size : m]
        mini_batch_Y = shuffled_Y[:, num_complete_minibatches * mini_batch_size : m]
        ### END CODE HERE ###
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)

    return mini_batches
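A quick usage sketch (shapes and the epoch count are made up; numpy is imported above): one pass over all minibatches per epoch, re-seeding so the shuffle differs between epochs.

X = np.random.randn(12288, 148)              # e.g. 148 flattened 64x64x3 images
Y = (np.random.rand(1, 148) > 0.5).astype(int)

for epoch in range(10):
    minibatches = random_mini_batches(X, Y, mini_batch_size = 64, seed = epoch)
    for mini_batch_X, mini_batch_Y in minibatches:
        pass   # forward pass, cost, backward pass, parameter update on this minibatch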
3 large data/model download
3.1) use split
e.g.,
split --bytes=100MB xydevNPY.tar.gz xydevNPYpart
-rw-r--r-- 1 jovyan users 100000000 Mar 2 06:23 xydevNPYpartaa
-rw-r--r-- 1 jovyan users 6396638 Mar 2 06:23 xydevNPYpartab
-rw-r--r-- 1 jovyan users 106396638 Mar 2 06:04 xydevNPY.tar.gz
The split parts (a, b) were downloaded into
C:\fajin\workspace\deepLearning\5_sequenceModel\Week_3\Trigger word detection\XY_dev
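To rebuild the archive after downloading the parts, they can be concatenated back in order and then extracted (standard coreutils/tar usage, assuming the part names above):

cat xydevNPYpartaa xydevNPYpartab > xydevNPY.tar.gz
tar -xzf xydevNPY.tar.gz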
3.2) split --help
jovyan@48c312d67fd2:~/work$ split --help
Usage: split [OPTION]... [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is 'x'. With no INPUT, or when INPUT
is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
-a, --suffix-length=N generate suffixes of length N (default 2)
--additional-suffix=SUFFIX append an additional SUFFIX to file names
-b, --bytes=SIZE        put SIZE bytes per output file
-C, --line-bytes=SIZE put at most SIZE bytes of lines per output file
-d, --numeric-suffixes[=FROM] use numeric suffixes instead of alphabetic;
FROM changes the start value (default 0)
-e, --elide-empty-files do not generate empty output files with '-n'
--filter=COMMAND write to shell COMMAND; file name is $FILE
-l, --lines=NUMBER put NUMBER lines per output file
-n, --number=CHUNKS generate CHUNKS output files; see explanation
below
-u, --unbuffered        immediately copy input to output with '-n r/...'
    --verbose           print a diagnostic just before each output file is opened
--help display this help and exit
--version output version information and exit
The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).
CHUNKS may be:
N       split into N files based on size of input
K/N output Kth of N to stdout
l/N split into N files without splitting lines
l/K/N output Kth of N to stdout without splitting lines
r/N like 'l' but use round robin distribution
r/K/N likewise but only output Kth of N to stdout
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/split>
or available locally via: info '(coreutils) split invocation'
jovyan@48c312d67fd2:~/work$ split --bytes=100MB xydevNPY.tar.gz xydevNPYpart
jovyan@48c312d67fd2:~/work$ ls -la
-rw-r--r-- 1 jovyan users 100000000 Mar 2 06:23 xydevNPYpartaa
-rw-r--r-- 1 jovyan users 6396638 Mar 2 06:23 xydevNPYpartab
-rw-r--r-- 1 jovyan users 106396638 Mar 2 06:04 xydevNPY.tar.gz
4 Keras error
4.1) not using Lambda on input x for model
AttributeError: 'Tensor' object has no attribute '_keras_history'
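This error typically appears when a raw backend/TensorFlow operation is applied directly to a Keras tensor, so the result loses the Keras metadata (_keras_history) that Model needs. A minimal sketch of the fix, wrapping the operation in a Lambda layer (the layer sizes and the l2_normalize op here are made-up examples):

from keras.layers import Input, Dense, Lambda
from keras.models import Model
import keras.backend as K

x = Input(shape=(10,))
# y = K.l2_normalize(x, axis=-1)                      # raw backend op -> '_keras_history' error in Model()
y = Lambda(lambda t: K.l2_normalize(t, axis=-1))(x)   # wrapped in Lambda -> stays a Keras tensor
out = Dense(1, activation="sigmoid")(y)
model = Model(inputs=x, outputs=out)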
5 jupyter notebook
5.1) change home dir
Open Anaconda Prompt (2019) and enter the command below:
C:\>jupyter notebook --notebook-dir="C:\fajin\workspace\deepLearning\2_improve_dnn"
6 linux
6.1) kernel version
jovyan@48c312d67fd2:~/work$ uname -r
4.14.70-67.55.amzn1.x86_64
jovyan@48c312d67fd2:~/work$ cat /proc/version
Linux version 4.14.70-67.55.amzn1.x86_64 (mockbuild@gobi-build-60002) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Tue Sep 18 10:36:30 UTC 2018
jovyan@48c312d67fd2:~/work$ uname -mrsn
Linux 48c312d67fd2 4.14.70-67.55.amzn1.x86_64 x86_64
jovyan@48c312d67fd2:~/work$ uname -a
Linux 48c312d67fd2 4.14.70-67.55.amzn1.x86_64 #1 SMP Tue Sep 18 10:36:30 UTC 2018 x86_64 GNU/Linux