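Contents
Assignment 1. Building Your Deep Neural Network
  1. Import packages
  2. Main flow of the algorithm
  3. Initialization
    3.1 Two-layer neural network
    3.2 Multi-layer neural network
  4. Forward propagation
    4.1 Linear module
    4.2 Linear-activation module
    4.3 Multi-layer model
  5. Cost function
  6. Backward propagation
    6.1 Linear module
    6.2 Linear-activation module
    6.3 Multi-layer model
    6.4 Gradient descent: updating the parameters
Assignment 2. Deep Neural Network for Image Classification
  1. Import packages
  2. Dataset
  3. Building the model
    3.1 Two-layer neural network
    3.2 Multi-layer neural network
    3.3 General steps
  4. Two-layer neural network
  5. Multi-layer neural network
  6. Results analysis
  7. Test with your own image
Quiz: see the reference blog post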
Assignment 1. Building Your Deep Neural Network
1. Import packages
import numpy as np
import h5py
import matplotlib.pyplot as plt
from testCases_v2 import *
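from dnn_utils_v2 import sigmoid, sigmoid_backward, relu, relu_backward

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)

2. Main flow of the algorithm
Initialize the parameters, run forward propagation, compute the cost, run backward propagation, then update the parameters.

3. Initialization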
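Notes for week 4: 01. Neural Networks and Deep Learning, W4. Deep Neural Networks
3.1 Two-layer neural network
Model structure: LINEAR -> RELU -> LINEAR -> SIGMOID
Weights: np.random.randn(shape) * 0.01
Biases: np.zeros(shape)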
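The initialization rule can be motivated with a small experiment. The following is a sketch under made-up layer sizes and data (not part of the assignment): with all-zero weights every hidden unit computes the same value, while small random weights give each unit a different output, so the units can learn different features. Biases can start at zero because the random weights already break the symmetry.

```python
import numpy as np

def linear_part(W1, b1, X):
    """Pre-activation of a hidden layer: Z1 = W1 X + b1."""
    return W1 @ X + b1

rng = np.random.default_rng(0)           # hypothetical seed, for repeatability only
X = rng.standard_normal((3, 5))          # 3 input features, 5 examples (toy data)
b1 = np.zeros((4, 1))                    # zero biases, as in the rule above

# All-zero weights: every one of the 4 hidden units outputs the same row.
Z_zero = linear_part(np.zeros((4, 3)), b1, X)

# Small random weights: the rows differ from unit to unit.
Z_rand = linear_part(rng.standard_normal((4, 3)) * 0.01, b1, X)

rows_identical_zero = np.allclose(Z_zero, Z_zero[0])
rows_identical_rand = np.allclose(Z_rand, Z_rand[0])
print(rows_identical_zero, rows_identical_rand)
```

Identical rows also receive identical gradients, so with zero weights the hidden units would stay copies of each other throughout training.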
# GRADED FUNCTION: initialize_parameters
def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    parameters -- python dictionary containing your parameters:
                  W1 -- weight matrix of shape (n_h, n_x)
                  b1 -- bias vector of shape (n_h, 1)
                  W2 -- weight matrix of shape (n_y, n_h)
                  b2 -- bias vector of shape (n_y, 1)
    """
    np.random.seed(1)

    ### START CODE HERE ### (≈ 4 lines of code)
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    ### END CODE HERE ###

    assert(W1.shape == (n_h, n_x))
    assert(b1.shape == (n_h, 1))
    assert(W2.shape == (n_y, n_h))
    assert(b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters

3.2 Multi-layer neural network
Model structure: [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID
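To make the shape bookkeeping for this structure concrete, here is a small sketch that lists the shape of every W and b for a given list of layer sizes. The helper name `parameter_shapes` is hypothetical; the layer sizes are the ones used for the deep model in Assignment 2 of this post.

```python
def parameter_shapes(layer_dims):
    """Return the shape of each Wl and bl for layer sizes layer_dims[0..L]."""
    shapes = {}
    for l in range(1, len(layer_dims)):
        shapes["W" + str(l)] = (layer_dims[l], layer_dims[l - 1])
        shapes["b" + str(l)] = (layer_dims[l], 1)
    return shapes

shapes = parameter_shapes([12288, 20, 7, 5, 1])
n_layers = len(shapes) // 2                      # the input layer is not counted
n_params = sum(rows * cols for rows, cols in shapes.values())

for name, shape in shapes.items():
    print(name, shape)
print("layers:", n_layers, "trainable parameters:", n_params)
```

Almost all of the parameters sit in W1, because the input layer (12288 pixel values) is far wider than any hidden layer.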
# GRADED FUNCTION: initialize_parameters_deep
def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                  Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                  bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)  # number of entries in layer_dims (input layer included)

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        ### END CODE HERE ###

        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))

    return parameters

4. Forward propagation
4.1 Linear module
Vectorized formula:
$$Z^{[l]} = W^{[l]}A^{[l-1]} + b^{[l]}$$
where $A^{[0]} = X$.
Compute $Z$ and cache $A, W, b$.
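A small numeric instance of the vectorized formula may help with the shapes. The numbers below are made up for illustration: a layer with 3 units receives 2 inputs for 4 examples, and the bias column is broadcast across the examples.

```python
import numpy as np

W = np.array([[1.0,  2.0],
              [0.0, -1.0],
              [3.0,  1.0]])                    # (3, 2): 3 units, 2 inputs each
A_prev = np.array([[1.0, 0.0, 2.0, -1.0],
                   [2.0, 1.0, 0.0,  1.0]])     # (2, 4): one column per example
b = np.array([[0.5], [0.0], [-1.0]])           # (3, 1): broadcast over the 4 columns

Z = W @ A_prev + b                             # (3, 4)
cache = (A_prev, W, b)                         # what the backward pass will need

print(Z.shape)
print(Z)
```

Each column of Z is the pre-activation of one example, which is why Z has shape (size of current layer, number of examples).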
# GRADED FUNCTION: linear_forward
def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter
    cache -- a python tuple containing "A", "W" and "b"; stored for computing the backward pass efficiently
    """
    ### START CODE HERE ### (≈ 1 line of code)
    Z = np.dot(W, A) + b
    ### END CODE HERE ###

    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)

    return Z, cache

4.2 Linear-activation module
Compute the activation output $A$ and cache $Z$, which is needed during backward propagation. The activation functions provided with the assignment return both.
$$A^{[l]} = g(Z^{[l]}) = g(W^{[l]}A^{[l-1]} + b^{[l]})$$
where $g$ is the activation function, for example ReLU or sigmoid.
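The post imports `sigmoid` and `relu` from `dnn_utils_v2` without showing them. The sketch below is an assumption about what such helpers could look like, based only on the description that they return both A and a cache holding Z; the `_sketch` names are hypothetical and this is not the assignment's actual source.

```python
import numpy as np

def sigmoid_sketch(Z):
    """Return A = 1 / (1 + exp(-Z)) together with Z as the activation cache."""
    A = 1.0 / (1.0 + np.exp(-Z))
    return A, Z

def relu_sketch(Z):
    """Return A = max(0, Z) together with Z as the activation cache."""
    A = np.maximum(0, Z)
    return A, Z

Z = np.array([[-2.0, 0.0, 3.0]])
A_sig, cache_sig = sigmoid_sketch(Z)
A_relu, cache_relu = relu_sketch(Z)
print(A_sig)
print(A_relu)
```

Returning Z alongside A avoids recomputing the linear step later, since the derivative of either activation is evaluated at Z.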
# GRADED FUNCTION: linear_activation_forward
def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value
    cache -- a python tuple containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """
    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
        ### END CODE HERE ###

    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
        ### END CODE HERE ###

    assert (A.shape == (W.shape[0], A_prev.shape[1]))
    cache = (linear_cache, activation_cache)

    return A, cache

4.3 Multi-layer model
The first $L-1$ layers use ReLU; the last layer uses sigmoid.
# GRADED FUNCTION: L_model_forward
def L_model_forward(X, parameters):
    """
    Implement forward propagation for the [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID computation

    Arguments:
    X -- data, numpy array of shape (input size, number of examples)
    parameters -- output of initialize_parameters_deep()

    Returns:
    AL -- last post-activation value
    caches -- list of caches containing:
                every cache of linear_relu_forward() (there are L-1 of them, indexed from 0 to L-2)
                the cache of linear_sigmoid_forward() (there is one, indexed L-1)
    """
    caches = []
    A = X
    L = len(parameters) // 2  # number of layers in the neural network

    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A
        ### START CODE HERE ### (≈ 2 lines of code)
        A, cache = linear_activation_forward(A_prev, parameters['W' + str(l)], parameters['b' + str(l)], 'relu')
        caches.append(cache)  # each layer's cache: ((A_prev, W, b), Z)
        ### END CODE HERE ###

    # Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
    ### START CODE HERE ### (≈ 2 lines of code)
    AL, cache = linear_activation_forward(A, parameters['W' + str(L)], parameters['b' + str(L)], 'sigmoid')
    caches.append(cache)
    ### END CODE HERE ###

    assert(AL.shape == (1, X.shape[1]))

    return AL, caches

This completes the forward pass. AL holds the predicted probabilities, which are used to compute the cost.
5. Cost function
Compute the cost:
$$-\frac{1}{m} \sum\limits_{i = 1}^{m} \bigg(y^{(i)}\log\left(a^{[L](i)}\right) + (1-y^{(i)})\log\left(1- a^{[L](i)}\right) \bigg)$$
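A quick numeric check of the cross-entropy formula, as a sketch with made-up predictions: confident correct predictions give a small cost, confident wrong ones a large cost, and predicting 0.5 everywhere gives exactly ln 2 ≈ 0.693. The helper name `cross_entropy_cost` is hypothetical.

```python
import numpy as np

def cross_entropy_cost(AL, Y):
    """Average binary cross-entropy over the m examples (columns)."""
    m = Y.shape[1]
    return float(-np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m)

Y = np.array([[1, 0, 1]])                                   # true labels
good = cross_entropy_cost(np.array([[0.9, 0.1, 0.8]]), Y)   # mostly right
bad = cross_entropy_cost(np.array([[0.2, 0.7, 0.4]]), Y)    # mostly wrong
uninformed = cross_entropy_cost(np.full((1, 3), 0.5), Y)    # always 0.5

print(good, uninformed, bad)
```

The ln 2 baseline explains the starting cost of the two-layer model later in this post (0.693 at iteration 0): with weights scaled by 0.01, the initial outputs are all close to 0.5.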
# GRADED FUNCTION: compute_cost
def compute_cost(AL, Y):
    """
    Implement the cross-entropy cost function defined above.

    Arguments:
    AL -- probability vector corresponding to your label predictions, shape (1, number of examples)
    Y -- true "label" vector (for example: containing 0 if non-cat, 1 if cat), shape (1, number of examples)

    Returns:
    cost -- cross-entropy cost
    """
    m = Y.shape[1]

    # Compute loss from aL and y.
    ### START CODE HERE ### (≈ 1 line of code)
    cost = np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / (-m)
    ### END CODE HERE ###

    cost = np.squeeze(cost)  # To make sure the cost's shape is what we expect (e.g. this turns [[17]] into 17).
    assert(cost.shape == ())

    return cost

6. Backward propagation
Compute the gradients of the loss function.
6.1 Linear module
$$dW^{[l]} = \frac{\partial \mathcal{L}}{\partial W^{[l]}} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$$
$$db^{[l]} = \frac{\partial \mathcal{L}}{\partial b^{[l]}} = \frac{1}{m} \sum_{i = 1}^{m} dZ^{[l](i)}$$
$$dA^{[l-1]} = \frac{\partial \mathcal{L}}{\partial A^{[l-1]}} = W^{[l]T} dZ^{[l]}$$
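The three formulas can be sanity-checked with finite differences. The sketch below uses a made-up quadratic loss (not the assignment's cross-entropy) so that dZ is easy to write down; `toy_cost` and `numerical_grad` are hypothetical helpers. Note the convention: dW and db carry the 1/m factor of the averaged cost, while dA_prev does not, so it is compared against m times the numerical gradient of the averaged cost.

```python
import numpy as np

def toy_cost(W, b, A_prev):
    """J = (1/m) * sum over examples of 0.5 * ||z_i||^2, with Z = W A_prev + b."""
    Z = W @ A_prev + b
    return 0.5 * np.sum(Z ** 2) / A_prev.shape[1]

def numerical_grad(f, x, eps=1e-6):
    """Central finite differences of the scalar f with respect to each entry of x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(1)
A_prev = rng.standard_normal((3, 4))
W = rng.standard_normal((2, 3))
b = rng.standard_normal((2, 1))
m = A_prev.shape[1]

dZ = W @ A_prev + b                          # for this toy loss, dL/dZ equals Z
dW = dZ @ A_prev.T / m                       # formula for dW
db = np.sum(dZ, axis=1, keepdims=True) / m   # formula for db
dA_prev = W.T @ dZ                           # formula for dA_prev (no 1/m)

dW_num = numerical_grad(lambda w: toy_cost(w, b, A_prev), W)
db_num = numerical_grad(lambda v: toy_cost(W, v, A_prev), b)
dA_num = m * numerical_grad(lambda a: toy_cost(W, b, a), A_prev)

print(np.max(np.abs(dW - dW_num)),
      np.max(np.abs(db - db_num)),
      np.max(np.abs(dA_prev - dA_num)))
```

The same kind of check can be applied to a full network by perturbing one parameter at a time, at the price of two forward passes per parameter.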
# GRADED FUNCTION: linear_backward
def linear_backward(dZ, cache):
    """
    Implement the linear portion of backward propagation for a single layer (layer l)

    Arguments:
    dZ -- Gradient of the cost with respect to the linear output (of current layer l)
    cache -- tuple of values (A_prev, W, b) coming from the forward propagation in the current layer

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    A_prev, W, b = cache
    m = A_prev.shape[1]

    ### START CODE HERE ### (≈ 3 lines of code)
    dW = np.dot(dZ, A_prev.T) / m
    db = 1 / m * np.sum(dZ, axis=1, keepdims=True)
    dA_prev = np.dot(W.T, dZ)
    ### END CODE HERE ###

    assert (dA_prev.shape == A_prev.shape)
    assert (dW.shape == W.shape)
    assert (db.shape == b.shape)

    return dA_prev, dW, db

6.2 Linear-activation module
$$dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$$
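The post imports `relu_backward` and `sigmoid_backward` from `dnn_utils_v2` without listing them. The following is a sketch of how the element-wise rule dZ = dA * g'(Z) could be implemented for the two activations; the `_sketch` names are hypothetical and this is an assumption, not the assignment's actual source.

```python
import numpy as np

def relu_backward_sketch(dA, Z):
    """g'(Z) is 1 where Z > 0 and 0 elsewhere, so dZ copies dA and zeroes the rest."""
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

def sigmoid_backward_sketch(dA, Z):
    """g'(Z) = s * (1 - s) with s = sigmoid(Z)."""
    s = 1.0 / (1.0 + np.exp(-Z))
    return dA * s * (1 - s)

Z = np.array([[-1.0, 0.0, 2.0]])     # cached pre-activations
dA = np.array([[0.5, 0.5, 0.5]])     # incoming gradient

dZ_relu = relu_backward_sketch(dA, Z)
dZ_sig = sigmoid_backward_sketch(dA, Z)
print(dZ_relu)
print(dZ_sig)
```

Both helpers need only Z, which is why the forward pass stores Z as the activation cache.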
# GRADED FUNCTION: linear_activation_backward
def linear_activation_backward(dA, cache, activation):
    """
    Implement the backward propagation for the LINEAR->ACTIVATION layer.

    Arguments:
    dA -- post-activation gradient for current layer l
    cache -- tuple of values (linear_cache, activation_cache) we store for computing backward propagation efficiently
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    linear_cache, activation_cache = cache

    if activation == "relu":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###

    elif activation == "sigmoid":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = sigmoid_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###

    return dA_prev, dW, db

6.3 Multi-layer model
Backward propagation starts from the derivative of the loss with respect to the output AL:
dAL = -np.divide(Y, AL) + np.divide(1 - Y, 1 - AL)
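A useful consequence of this starting gradient, shown here as a small sketch with made-up values: multiplying dAL by the sigmoid derivative AL * (1 - AL) collapses to AL - Y, the familiar output-layer gradient of logistic regression.

```python
import numpy as np

AL = np.array([[0.8, 0.3, 0.6]])     # made-up sigmoid outputs
Y = np.array([[1, 0, 1]])            # made-up labels

dAL = -np.divide(Y, AL) + np.divide(1 - Y, 1 - AL)
dZL = dAL * AL * (1 - AL)            # chain rule through the sigmoid

print(dZL)
print(AL - Y)
```

Some implementations use AL - Y directly for the last layer, which avoids dividing by values of AL that are very close to 0 or 1.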
# GRADED FUNCTION: L_model_backward
def L_model_backward(AL, Y, caches):
    """
    Implement the backward propagation for the [LINEAR->RELU] * (L-1) -> LINEAR -> SIGMOID group

    Arguments:
    AL -- probability vector, output of the forward propagation (L_model_forward())
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat)
    caches -- list of caches containing:
                every cache of linear_activation_forward() with "relu" (it's caches[l], for l in range(L-1), i.e. l = 0...L-2)
                the cache of linear_activation_forward() with "sigmoid" (it's caches[L-1])

    Returns:
    grads -- A dictionary with the gradients
             grads["dA" + str(l)] = ...
             grads["dW" + str(l)] = ...
             grads["db" + str(l)] = ...
    """
    grads = {}
    L = len(caches)  # the number of layers
    m = AL.shape[1]
    Y = Y.reshape(AL.shape)  # after this line, Y is the same shape as AL

    # Initializing the backpropagation
    ### START CODE HERE ### (1 line of code)
    dAL = -np.divide(Y, AL) + np.divide(1 - Y, 1 - AL)
    ### END CODE HERE ###

    # Lth layer (SIGMOID -> LINEAR) gradients.
    # Inputs: "AL, Y, caches". Outputs: "grads["dAL"], grads["dWL"], grads["dbL"]"
    ### START CODE HERE ### (approx. 2 lines)
    current_cache = caches[L-1]
    grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)] = linear_activation_backward(dAL, current_cache, 'sigmoid')
    ### END CODE HERE ###

    for l in reversed(range(L-1)):
        # lth layer: (RELU -> LINEAR) gradients.
        # Inputs: "grads["dA" + str(l + 2)], caches".
        # Outputs: "grads["dA" + str(l + 1)], grads["dW" + str(l + 1)], grads["db" + str(l + 1)]"
        ### START CODE HERE ### (approx. 5 lines)
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(grads['dA' + str(l + 2)], current_cache, 'relu')
        grads["dA" + str(l + 1)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
        ### END CODE HERE ###

    return grads

6.4 Gradient descent: updating the parameters
$$W^{[l]} = W^{[l]} - \alpha \, dW^{[l]}$$
$$b^{[l]} = b^{[l]} - \alpha \, db^{[l]}$$
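The update rule is easiest to see on a one-parameter problem. The sketch below is a toy example, not the assignment's code: it applies the rule to f(w) = (w - 3)^2, whose gradient is 2(w - 3), using the same "W1" and "dW1" key naming as the parameter and gradient dictionaries in this post. The helper `gradient_step` and the learning rate are made up for illustration.

```python
def gradient_step(params, grads, learning_rate):
    """Apply param = param - learning_rate * d(param) to every entry."""
    return {name: value - learning_rate * grads["d" + name]
            for name, value in params.items()}

params = {"W1": 0.0}                                # start far from the minimum at 3
for _ in range(50):
    grads = {"dW1": 2 * (params["W1"] - 3.0)}       # gradient of (w - 3)^2
    params = gradient_step(params, grads, learning_rate=0.1)

print(params["W1"])
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 per step; a rate above 1.0 would make the iterates diverge on this function, which is why the learning rate has to be chosen with care.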
# GRADED FUNCTION: update_parameters
def update_parameters(parameters, grads, learning_rate):
    """
    Update parameters using gradient descent

    Arguments:
    parameters -- python dictionary containing your parameters
    grads -- python dictionary containing your gradients, output of L_model_backward

    Returns:
    parameters -- python dictionary containing your updated parameters
                  parameters["W" + str(l)] = ...
                  parameters["b" + str(l)] = ...
    """
    L = len(parameters) // 2  # number of layers in the neural network

    # Update rule for each parameter. Use a for loop.
    ### START CODE HERE ### (≈ 3 lines of code)
    for l in range(L):
        parameters["W" + str(l + 1)] = parameters['W' + str(l + 1)] - learning_rate * grads['dW' + str(l + 1)]
        parameters["b" + str(l + 1)] = parameters['b' + str(l + 1)] - learning_rate * grads['db' + str(l + 1)]
    ### END CODE HERE ###

    return parameters

Assignment 2. Deep Neural Network for Image Classification
Use the functions above to build a deep neural network and predict whether an image shows a cat.
1. Import packages
import time
import numpy as np
import h5py
import matplotlib.pyplot as plt
import scipy
from PIL import Image
from scipy import ndimage
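from dnn_app_utils_v2 import *

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)

2. Dataset
Related post: 01. Neural Networks and Deep Learning, W2. Neural Network Basics (assignment: logistic regression for image recognition)
This uses the dataset from the 01 W2 assignment, where logistic regression reached only 70% accuracy.
Load the data:
train_x_orig, train_y, test_x_orig, test_y, classes = load_data()

View the data: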
# Example of a picture
index = 1
plt.imshow(train_x_orig[index])
print("y = " + str(train_y[0, index]) + ". It's a " + classes[train_y[0, index]].decode("utf-8") + " picture.")

Check the data sizes:
# Explore your dataset
m_train = train_x_orig.shape[0]
num_px = train_x_orig.shape[1]
m_test = test_x_orig.shape[0]

print("Number of training examples: " + str(m_train))
print("Number of testing examples: " + str(m_test))
print("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print("train_x_orig shape: " + str(train_x_orig.shape))
print("train_y shape: " + str(train_y.shape))
print("test_x_orig shape: " + str(test_x_orig.shape))
print("test_y shape: " + str(test_y.shape))

Number of training examples: 209
Number of testing examples: 50
Each image is of size: (64, 64, 3)
train_x_orig shape: (209, 64, 64, 3)
train_y shape: (1, 209)
test_x_orig shape: (50, 64, 64, 3)
test_y shape: (1, 50)

Vectorize the image data:
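The flattening step in this section, `reshape(m, -1).T`, puts each image into its own column. A toy sketch with two made-up 2×2×3 "images" shows why the order of the two operations matters: reshaping straight to (features, m) yields the same shape but interleaves pixels from different images.

```python
import numpy as np

# Two tiny fake images, shape (m=2, height=2, width=2, channels=3).
imgs = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)

flat = imgs.reshape(imgs.shape[0], -1).T       # (12, 2): column i is image i
wrong = imgs.reshape(-1, imgs.shape[0])        # (12, 2) too, but pixels are mixed

print(flat.shape, wrong.shape)
print(np.array_equal(flat[:, 0], imgs[0].ravel()))    # column 0 is exactly image 0
print(np.array_equal(wrong[:, 0], imgs[0].ravel()))   # column 0 is not image 0
```

Because both results have the same shape, the mistake raises no error; checking one column against the original image is a cheap safeguard.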
# Reshape the training and test examples
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T  # The "-1" makes reshape flatten the remaining dimensions
test_x_flatten = test_x_orig.reshape(test_x_orig.shape[0], -1).T

# Standardize data to have feature values between 0 and 1.
train_x = train_x_flatten / 255.
test_x = test_x_flatten / 255.

print("train_x's shape: " + str(train_x.shape))
print("test_x's shape: " + str(test_x.shape))

train_x's shape: (12288, 209)  # 12288 = 64 * 64 * 3
test_x's shape: (12288, 50)

3. Building the model
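3.1 Two-layer neural network
Structure: LINEAR -> RELU -> LINEAR -> SIGMOID
3.2 Multi-layer neural network
Structure: [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID
3.3 General steps
1. Initialize the parameters / define the hyperparameters
2. Loop for n_iters iterations:
   a. Forward propagation
   b. Compute the cost function
   c. Backward propagation
   d. Update the parameters (using the parameters and the gradients)
3. Use the trained parameters to predict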
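The general steps can be sketched end to end on a toy problem. The example below is an illustration under stated assumptions, not the assignment's model: it uses a single LINEAR -> SIGMOID layer (logistic regression) on made-up, linearly separable 2-D data, with a made-up learning rate and iteration count, so that each step fits on one or two lines.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200))               # toy data: 2 features, 200 examples
Y = (X[0:1] + X[1:2] > 0).astype(float)         # label 1 when x1 + x2 > 0

# 1. Initialize parameters and hyperparameters
#    (zeros are fine here because there is only one layer, so no symmetry to break).
W = np.zeros((1, 2))
b = np.zeros((1, 1))
learning_rate = 0.5
costs = []

# 2. Iterate
for i in range(200):
    # a. forward propagation
    A = 1.0 / (1.0 + np.exp(-(W @ X + b)))
    # b. cost (clipped to keep log() finite)
    A_safe = np.clip(A, 1e-12, 1 - 1e-12)
    costs.append(-np.mean(Y * np.log(A_safe) + (1 - Y) * np.log(1 - A_safe)))
    # c. backward propagation
    dZ = A - Y
    dW = dZ @ X.T / X.shape[1]
    db = np.mean(dZ, axis=1, keepdims=True)
    # d. parameter update
    W -= learning_rate * dW
    b -= learning_rate * db

# 3. Predict with the trained parameters
predictions = (1.0 / (1.0 + np.exp(-(W @ X + b))) > 0.5).astype(float)
accuracy = np.mean(predictions == Y)
print(costs[0], costs[-1], accuracy)
```

The two models in the following sections follow the same skeleton; only the forward and backward steps grow to cover more layers.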
4. Two-layer neural network
Define the parameters:
### CONSTANTS DEFINING THE MODEL ####
n_x = 12288  # num_px * num_px * 3
n_h = 7  # number of hidden units
n_y = 1
layers_dims = (n_x, n_h, n_y)

Build the model:
# GRADED FUNCTION: two_layer_model
def two_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):
    """
    Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- dimensions of the layers (n_x, n_h, n_y)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- if True, print the cost every 100 iterations

    Returns:
    parameters -- a dictionary containing W1, W2, b1, and b2
    """
    np.random.seed(1)
    grads = {}
    costs = []  # to keep track of the cost
    m = X.shape[1]  # number of examples
    (n_x, n_h, n_y) = layers_dims

    # Initialize the parameters dictionary by calling one of the functions implemented earlier
    ### START CODE HERE ### (≈ 1 line of code)
    parameters = initialize_parameters(n_x, n_h, n_y)
    ### END CODE HERE ###

    # Get W1, b1, W2 and b2 from the dictionary parameters.
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID.
        # Inputs: "X, W1, b1, W2, b2". Outputs: "A1, cache1, A2, cache2".
        ### START CODE HERE ### (≈ 2 lines of code)
        A1, cache1 = linear_activation_forward(X, W1, b1, 'relu')
        A2, cache2 = linear_activation_forward(A1, W2, b2, 'sigmoid')
        ### END CODE HERE ###

        # Compute cost
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(A2, Y)
        ### END CODE HERE ###

        # Initializing backward propagation
        dA2 = -np.divide(Y, A2) + np.divide(1 - Y, 1 - A2)

        # Backward propagation. Inputs: "dA2, cache2, cache1".
        # Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
        ### START CODE HERE ### (≈ 2 lines of code)
        dA1, dW2, db2 = linear_activation_backward(dA2, cache2, 'sigmoid')
        dA0, dW1, db1 = linear_activation_backward(dA1, cache1, 'relu')
        ### END CODE HERE ###

        # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
        grads['dW1'] = dW1
        grads['db1'] = db1
        grads['dW2'] = dW2
        grads['db2'] = db2

        # Update parameters.
        ### START CODE HERE ### (approx. 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Retrieve W1, b1, W2, b2 from parameters
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters

Training:
parameters = two_layer_model(train_x, train_y, layers_dims=(n_x, n_h, n_y), num_iterations=2500, print_cost=True)

Cost after iteration 0: 0.693049735659989
Cost after iteration 100: 0.6464320953428849
Cost after iteration 200: 0.6325140647912678
Cost after iteration 300: 0.6015024920354665
Cost after iteration 400: 0.5601966311605747
Cost after iteration 500: 0.5158304772764729
Cost after iteration 600: 0.4754901313943325
Cost after iteration 700: 0.43391631512257495
Cost after iteration 800: 0.4007977536203887
Cost after iteration 900: 0.35807050113237976
Cost after iteration 1000: 0.33942815383664127
Cost after iteration 1100: 0.30527536361962654
Cost after iteration 1200: 0.2749137728213016
Cost after iteration 1300: 0.24681768210614846
Cost after iteration 1400: 0.19850735037466097
Cost after iteration 1500: 0.17448318112556657
Cost after iteration 1600: 0.1708076297809689
Cost after iteration 1700: 0.11306524562164715
Cost after iteration 1800: 0.09629426845937145
Cost after iteration 1900: 0.08342617959726863
Cost after iteration 2000: 0.07439078704319078
Cost after iteration 2100: 0.06630748132267933
Cost after iteration 2200: 0.0591932950103817
Cost after iteration 2300: 0.05336140348560554
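Cost after iteration 2400: 0.04855478562877016

Prediction
Training set accuracy: 0.9999999999999998
predictions_train = predict(train_x, train_y, parameters)
# Accuracy: 0.9999999999999998

Test set accuracy: 0.72, slightly better than the 0.70 of the earlier logistic regression. The gap between training and test accuracy shows that the model overfits the training set.
predictions_test = predict(test_x, test_y, parameters)
# Accuracy: 0.72

5. Multi-layer neural network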
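Define the parameters. The list has 5 entries because it includes the input layer; the network itself has 4 layers (3 hidden layers plus the output layer).
### CONSTANTS ###
layers_dims = [12288, 20, 7, 5, 1]  # 4-layer model (input size followed by 4 layer sizes)

Build the model: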
# GRADED FUNCTION: L_layer_model
def L_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):  # lr was 0.009
    """
    Implements a L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.

    Arguments:
    X -- data, numpy array of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    np.random.seed(1)
    costs = []  # keep track of cost

    # Parameters initialization.
    ### START CODE HERE ###
    parameters = initialize_parameters_deep(layers_dims)
    ### END CODE HERE ###

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        ### START CODE HERE ### (≈ 1 line of code)
        AL, caches = L_model_forward(X, parameters)
        ### END CODE HERE ###

        # Compute cost.
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(AL, Y)
        ### END CODE HERE ###

        # Backward propagation.
        ### START CODE HERE ### (≈ 1 line of code)
        grads = L_model_backward(AL, Y, caches)
        ### END CODE HERE ###

        # Update parameters.
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters

Training:
parameters = L_layer_model(train_x, train_y, layers_dims, num_iterations=2500, print_cost=True)

Cost after iteration 0: 0.771749
Cost after iteration 100: 0.672053
Cost after iteration 200: 0.648263
Cost after iteration 300: 0.611507
Cost after iteration 400: 0.567047
Cost after iteration 500: 0.540138
Cost after iteration 600: 0.527930
Cost after iteration 700: 0.465477
Cost after iteration 800: 0.369126
Cost after iteration 900: 0.391747
Cost after iteration 1000: 0.315187
Cost after iteration 1100: 0.272700
Cost after iteration 1200: 0.237419
Cost after iteration 1300: 0.199601
Cost after iteration 1400: 0.189263
Cost after iteration 1500: 0.161189
Cost after iteration 1600: 0.148214
Cost after iteration 1700: 0.137775
Cost after iteration 1800: 0.129740
Cost after iteration 1900: 0.121225
Cost after iteration 2000: 0.113821
Cost after iteration 2100: 0.107839
Cost after iteration 2200: 0.102855
Cost after iteration 2300: 0.100897
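Cost after iteration 2400: 0.092878

Prediction
Training set accuracy: 0.9856459330143539
pred_train = predict(train_x, train_y, parameters)
# Accuracy: 0.9856459330143539

Test set accuracy: 0.8, better than both logistic regression (0.70) and the two-layer network (0.72).
pred_test = predict(test_x, test_y, parameters)
# Accuracy: 0.8

The next course covers hyperparameter tuning systematically, to improve the model's performance further.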
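6. Results analysis
def print_mislabeled_images(classes, X, y, p):
    """
    Plots images where predictions and truth were different.
    X -- dataset
    y -- true labels
    p -- predictions
    """
    a = p + y
    # p + y == 1 only when one of them is 0 and the other is 1, i.e. a wrong prediction
    mislabeled_indices = np.asarray(np.where(a == 1))
    plt.rcParams['figure.figsize'] = (40.0, 40.0)  # set default size of plots
    num_images = len(mislabeled_indices[0])
    for i in range(num_images):
        index = mislabeled_indices[1][i]

        plt.subplot(2, num_images, i + 1)
        plt.imshow(X[:, index].reshape(64, 64, 3), interpolation='nearest')
        plt.axis('off')
        plt.title("Prediction: " + classes[int(p[0, index])].decode("utf-8") + " \n Class: " + classes[y[0, index]].decode("utf-8"))

print_mislabeled_images(classes, test_x, test_y, pred_test)

Characteristics of the misclassified images:
- The cat's body is in an unusual position
- The cat appears against a background of a similar color
- Unusual cat color or breed
- Camera angle
- Brightness of the picture
- Scale variation (the cat is very large or very small in the image)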
7. Test with your own image
## START CODE HERE ##
my_image = "my_image.jpg"  # change this to the name of your image file
my_label_y = [1]  # the true class of your image (1 -> cat, 0 -> non-cat)
## END CODE HERE ##

fname = "images/" + my_image
image = Image.open(fname)
# Note: the training data above was divided by 255; this input is passed to predict() unscaled.
my_image = np.array(image.resize((num_px, num_px))).reshape((num_px * num_px * 3, 1))
my_predicted_image = predict(my_image, my_label_y, parameters)

plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your L-layer model predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") + "\" picture.")

Accuracy: 1.0
y = 1.0, your L-layer model predicts a "cat" picture.

My CSDN blog: https://michael.blog.csdn.net/