**Table of Contents**

- Programming Assignment 1
  - 1. Basic numpy functions (1.1 sigmoid, 1.2 sigmoid derivative, 1.3 reshape, 1.4 normalization, 1.5 broadcasting)
  - 2. Vectorization (2.1 L1/L2 loss functions)
- Programming Assignment 2: image 🐱 recognition
  - 1. Importing packages
  - 2. Data overview
  - 3. General structure of the algorithm
  - 4. Building the algorithm (4.1 helper functions, 4.2 initializing parameters, 4.3 forward/backward propagation, 4.4 gradient descent, 4.5 merging everything into a model, 4.6 analysis, 4.7 testing with your own picture)
  - 5. Summary

For the multiple-choice quiz, please refer to the linked companion post.

# Programming Assignment 1

## 1. Basic numpy functions

### 1.1 Writing the sigmoid function

```python
import math

def basic_sigmoid(x):
    """
    Compute sigmoid of x.

    Arguments:
    x -- A scalar

    Return:
    s -- sigmoid(x)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    s = 1 / (1 + math.pow(math.e, -x))  # or s = 1 / (1 + math.exp(-x))
    ### END CODE HERE ###
    return s
```

The `math` package is not recommended here: deep learning works mostly with vectors, and `math` cannot operate on them element-wise.

```python
### One reason why we use numpy instead of math in Deep Learning ###
x = [1, 2, 3]
basic_sigmoid(x)
# you will see this give an error when you run it, because x is a vector.
```

```python
import numpy as np

# example of np.exp
x = np.array([1, 2, 3])
print(np.exp(x))  # result is (exp(1), exp(2), exp(3))
# [ 2.71828183  7.3890561  20.08553692]
# numpy can operate on whole vectors at once
```

The sigmoid function written with numpy:

```python
import numpy as np  # this means you can access numpy functions by writing np.function() instead of numpy.function()

def sigmoid(x):
    """
    Compute the sigmoid of x

    Arguments:
    x -- A scalar or numpy array of any size

    Return:
    s -- sigmoid(x)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    s = 1 / (1 + np.exp(-x))
    ### END CODE HERE ###
    return s

x = np.array([1, 2, 3])
sigmoid(x)
# array([0.73105858, 0.88079708, 0.95257413])
```

### 1.2 Writing the derivative of the sigmoid function

```python
# GRADED FUNCTION: sigmoid_derivative

def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.

    Arguments:
    x -- A scalar or numpy array

    Return:
    ds -- Your computed gradient.
    """
    ### START CODE HERE ### (≈ 2 lines of code)
    s = sigmoid(x)
    ds = s * (1 - s)
    ### END CODE HERE ###
    return ds

x = np.array([1, 2, 3])
sigmoid_derivative(x)
print("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))
# sigmoid_derivative(x) = [0.19661193 0.10499359 0.04517666]
```
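As a quick sanity check that is not part of the original assignment, the analytic derivative can be compared against a centered finite difference. The helper name `check_sigmoid_derivative`, the step `h`, and the test values are illustrative choices:

```python
# Hypothetical sanity check: compare sigmoid_derivative against a centered
# finite difference. The step size h is an arbitrary small number.
def check_sigmoid_derivative(x, h=1e-5):
    numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
    analytic = sigmoid_derivative(x)
    return np.max(np.abs(numeric - analytic))

x = np.array([1.0, 2.0, 3.0])
print(check_sigmoid_derivative(x))  # should be tiny, on the order of 1e-10 or smaller
```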
### 1.3 The reshape operation

Flatten the image data into a column vector. A dimension you do not want to compute by hand can be set to `-1` and numpy will infer it automatically:

```python
# GRADED FUNCTION: image2vector
def image2vector(image):
    """
    Argument:
    image -- a numpy array of shape (length, height, depth)

    Returns:
    v -- a vector of shape (length*height*depth, 1)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    v = image.reshape(-1, 1)
    ### END CODE HERE ###
    return v

# This is a 3 by 3 by 2 array, typically images will be (num_px_x, num_px_y, 3) where 3 represents the RGB values
image = np.array([[[ 0.67826139,  0.29380381],
                   [ 0.90714982,  0.52835647],
                   [ 0.4215251 ,  0.45017551]],

                  [[ 0.92814219,  0.96677647],
                   [ 0.85304703,  0.52351845],
                   [ 0.19981397,  0.27417313]],

                  [[ 0.60659855,  0.00533165],
                   [ 0.10820313,  0.49978937],
                   [ 0.34144279,  0.94630077]]])

print("image2vector(image) = " + str(image2vector(image)))
```

Output:

```
image2vector(image) = [[0.67826139]
 [0.29380381]
 [0.90714982]
 [0.52835647]
 [0.4215251 ]
 [0.45017551]
 [0.92814219]
 [0.96677647]
 [0.85304703]
 [0.52351845]
 [0.19981397]
 [0.27417313]
 [0.60659855]
 [0.00533165]
 [0.10820313]
 [0.49978937]
 [0.34144279]
 [0.94630077]]
```

### 1.4 Normalization

Normalization usually makes gradient descent converge faster. For example, with

$$x = \begin{bmatrix} 0 & 3 & 4 \\ 2 & 6 & 4 \end{bmatrix}$$

the row norms are

$$\|x\| = \text{np.linalg.norm(x, axis=1, keepdims=True)} = \begin{bmatrix} 5 \\ \sqrt{56} \end{bmatrix}$$

and the row-normalized matrix is

$$x\_normalized = \frac{x}{\|x\|} = \begin{bmatrix} 0 & \frac{3}{5} & \frac{4}{5} \\ \frac{2}{\sqrt{56}} & \frac{6}{\sqrt{56}} & \frac{4}{\sqrt{56}} \end{bmatrix}$$

```python
# GRADED FUNCTION: normalizeRows

def normalizeRows(x):
    """
    Implement a function that normalizes each row of the matrix x (to have unit length).

    Argument:
    x -- A numpy matrix of shape (n, m)

    Returns:
    x -- The normalized (by row) numpy matrix. You are allowed to modify x.
    """
    ### START CODE HERE ### (≈ 2 lines of code)
    # Compute x_norm as the norm 2 of x. Use np.linalg.norm(..., ord=2, axis=..., keepdims=True)
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)
    # Divide x by its norm.
    x = x / x_norm
    ### END CODE HERE ###
    return x

x = np.array([[0, 3, 4],
              [1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))
# normalizeRows(x) = [[0.         0.6        0.8       ]
#  [0.13736056 0.82416338 0.54944226]]
```

### 1.5 Broadcasting

See the official numpy broadcasting documentation for the full rules; a small shape demo follows the softmax example below.

For a row vector $x \in \mathbb{R}^{1\times n}$:

$$softmax(x) = softmax(\begin{bmatrix} x_1 & x_2 & ... & x_n \end{bmatrix}) = \begin{bmatrix} \frac{e^{x_1}}{\sum_{j}e^{x_j}} & \frac{e^{x_2}}{\sum_{j}e^{x_j}} & ... & \frac{e^{x_n}}{\sum_{j}e^{x_j}} \end{bmatrix}$$

For a matrix $x \in \mathbb{R}^{m \times n}$, where $x_{ij}$ is the element in the $i^{th}$ row and $j^{th}$ column of $x$, we have

$$softmax(x) = softmax\begin{bmatrix}
x_{11} & x_{12} & \dots & x_{1n} \\
x_{21} & x_{22} & \dots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \dots & x_{mn}
\end{bmatrix}
= \begin{bmatrix}
\frac{e^{x_{11}}}{\sum_{j}e^{x_{1j}}} & \frac{e^{x_{12}}}{\sum_{j}e^{x_{1j}}} & \dots & \frac{e^{x_{1n}}}{\sum_{j}e^{x_{1j}}} \\
\frac{e^{x_{21}}}{\sum_{j}e^{x_{2j}}} & \frac{e^{x_{22}}}{\sum_{j}e^{x_{2j}}} & \dots & \frac{e^{x_{2n}}}{\sum_{j}e^{x_{2j}}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{e^{x_{m1}}}{\sum_{j}e^{x_{mj}}} & \frac{e^{x_{m2}}}{\sum_{j}e^{x_{mj}}} & \dots & \frac{e^{x_{mn}}}{\sum_{j}e^{x_{mj}}}
\end{bmatrix}
= \begin{pmatrix}
softmax\text{(first row of x)} \\
softmax\text{(second row of x)} \\
... \\
softmax\text{(last row of x)}
\end{pmatrix}$$

```python
# GRADED FUNCTION: softmax

def softmax(x):
    """
    Calculates the softmax for each row of the input x.
    Your code should work for a row vector and also for matrices of shape (n, m).

    Argument:
    x -- A numpy matrix of shape (n, m)

    Returns:
    s -- A numpy matrix equal to the softmax of x, of shape (n, m)
    """
    ### START CODE HERE ### (≈ 3 lines of code)
    # Apply exp() element-wise to x. Use np.exp(...).
    x_exp = np.exp(x)
    # Create a vector x_sum that sums each row of x_exp. Use np.sum(..., axis=1, keepdims=True).
    x_sum = np.sum(x_exp, axis=1, keepdims=True)
    # Compute softmax(x) by dividing x_exp by x_sum. It should automatically use numpy broadcasting.
    s = x_exp / x_sum
    ### END CODE HERE ###
    return s

x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0, 0]])
print("softmax(x) = " + str(softmax(x)))
```

Output:

```
softmax(x) = [[9.80897665e-01 8.94462891e-04 1.79657674e-02 1.21052389e-04
  1.21052389e-04]
 [8.78679856e-01 1.18916387e-01 8.01252314e-04 8.01252314e-04
  8.01252314e-04]]
```
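To make the broadcasting used inside `softmax` concrete, here is a minimal sketch (not from the original exercise) showing how a `(2, 5)` matrix divided by a `(2, 1)` column is automatically expanded row by row; the input values simply reuse the example above:

```python
import numpy as np

# Minimal broadcasting sketch: dividing a (2, 5) matrix by a (2, 1) column
# stretches the column so that each row is divided by its own scalar.
x_exp = np.exp(np.array([[9, 2, 5, 0, 0],
                         [7, 5, 0, 0, 0]]))
x_sum = np.sum(x_exp, axis=1, keepdims=True)
print(x_exp.shape, x_sum.shape)   # (2, 5) (2, 1)
s = x_exp / x_sum                 # broadcasting expands x_sum to shape (2, 5)
print(s.sum(axis=1))              # each row of the softmax now sums to 1.0
```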
## 2. Vectorization

Vectorized computation is not only more concise but usually far faster than explicit Python loops; a small timing sketch is shown below.
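As a rough illustration of that claim (not part of the original post), the same dot product can be timed with a Python loop and with `np.dot`; the vector length of one million is an arbitrary choice and the exact timings depend on the machine:

```python
import time
import numpy as np

# Hypothetical timing sketch: explicit loop vs. vectorized dot product.
n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

tic = time.time()
dot_loop = 0.0
for i in range(n):
    dot_loop += a[i] * b[i]
toc = time.time()
print("for loop: %f ms" % ((toc - tic) * 1000))

tic = time.time()
dot_vec = np.dot(a, b)
toc = time.time()
print("np.dot:   %f ms" % ((toc - tic) * 1000))

print(np.isclose(dot_loop, dot_vec))  # same result, very different run time
```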
### 2.1 L1 / L2 loss functions

$$L_1(\hat{y}, y) = \sum_{i=0}^m |y^{(i)} - \hat{y}^{(i)}|$$

```python
def L1(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)

    Returns:
    loss -- the value of the L1 loss function defined above
    """
    ### START CODE HERE ### (≈ 1 line of code)
    loss = np.sum(abs(yhat - y))
    ### END CODE HERE ###
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat, y)))
# L1 = 1.1
```

$$L_2(\hat{y}, y) = \sum_{i=0}^m (y^{(i)} - \hat{y}^{(i)})^2$$

Note that for a vector, `np.dot(a, a)` gives the sum of squares of its entries:

```python
import numpy as np
a = np.array([1, 2, 3])
np.dot(a, a)  # 14
```

```python
# GRADED FUNCTION: L2

def L2(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)

    Returns:
    loss -- the value of the L2 loss function defined above
    """
    ### START CODE HERE ### (≈ 1 line of code)
    loss = np.dot(yhat - y, yhat - y)
    ### END CODE HERE ###
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat, y)))
# L2 = 0.43
```

# Programming Assignment 2: image 🐱 recognition

Use a neural-network-style logistic regression to recognize cat pictures.

## 1. Importing packages

```python
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset

%matplotlib inline
```

## 2. Data overview

- Work out the dimensions of the data
- Reshape the data
- Standardize the data

There is a training set with labels: y = 1 means cat, y = 0 means not a cat. There is a labelled test set. Each image has 3 channels.

Load the data:

```python
# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
```

Preview a picture:

```python
# Example of a picture
index = 24
plt.imshow(train_set_x_orig[index])
print("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
# y = [1], it's a 'cat' picture.
```

Data dimensions:

```python
### START CODE HERE ### (≈ 3 lines of code)
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
### END CODE HERE ###

print("Number of training examples: m_train = " + str(m_train))
print("Number of testing examples: m_test = " + str(m_test))
print("Height/Width of each image: num_px = " + str(num_px))
print("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print("train_set_x shape: " + str(train_set_x_orig.shape))
print("train_set_y shape: " + str(train_set_y.shape))
print("test_set_x shape: " + str(test_set_x_orig.shape))
print("test_set_y shape: " + str(test_set_y.shape))
```

Output:

```
Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)
```

Flatten the sample image matrices:

```python
# Reshape the training and test examples

### START CODE HERE ### (≈ 2 lines of code)
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T
test_set_x_flatten = test_set_x_orig.reshape(m_test, -1).T
### END CODE HERE ###

print("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print("train_set_y shape: " + str(train_set_y.shape))
print("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print("test_set_y shape: " + str(test_set_y.shape))
print("sanity check after reshaping: " + str(train_set_x_flatten[0:5, 0]))
```

Output:

```
train_set_x_flatten shape: (12288, 209)
train_set_y shape: (1, 209)
test_set_x_flatten shape: (12288, 50)
test_set_y shape: (1, 50)
sanity check after reshaping: [17 31 56 22 33]
```

The pixel values range from 0 to 255, so standardize the data:

```python
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
```

## 3. General structure of the algorithm

Build a logistic regression classifier, approached with a neural-network mindset.

## 4. Building the algorithm

1. Define the model structure (e.g. the number of input features)
2. Initialize the model parameters
3. Loop:
   - compute the current loss (forward propagation)
   - compute the current gradients (backward propagation)
   - update the parameters (gradient descent)

### 4.1 Helper functions

The sigmoid function:

```python
# GRADED FUNCTION: sigmoid

def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    s = 1 / (1 + np.exp(-z))
    ### END CODE HERE ###
    return s
```

### 4.2 Initializing parameters

For logistic regression the parameters can all be initialized to 0; for a multi-layer neural network they cannot, because identical weights would keep every hidden unit computing the same function (the symmetry is never broken).

```python
# GRADED FUNCTION: initialize_with_zeros

def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    w = np.zeros((dim, 1))
    b = 0
    ### END CODE HERE ###

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b
```
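A quick usage check, added here for illustration (the dimension `2` is an arbitrary choice, not data from the assignment):

```python
# Hypothetical quick check of initialize_with_zeros with a toy dimension.
w, b = initialize_with_zeros(2)
print("w = " + str(w))  # [[0.] [0.]]
print("b = " + str(b))  # 0
```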
### 4.3 Forward and backward propagation

Forward propagation:

- Given the features $X$, compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, ..., a^{(m-1)}, a^{(m)})$
- Compute the cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}\big[y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)})\big]$

Gradients:

$$\frac{\partial J}{\partial w} = \frac{1}{m} X (A-Y)^T$$

$$\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)} - y^{(i)})$$

```python
# GRADED FUNCTION: propagate

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    ### START CODE HERE ### (≈ 2 lines of code)
    A = sigmoid(np.dot(w.T, X) + b)                                # compute activation
    # w is a column vector, A is a row vector; dot is matrix multiplication
    cost = np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / (-m)  # compute cost
    # Y is a row vector; * is element-wise multiplication
    ### END CODE HERE ###

    # BACKWARD PROPAGATION (TO FIND GRAD)
    ### START CODE HERE ### (≈ 2 lines of code)
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y, axis=1) / m
    ### END CODE HERE ###

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost

w, b, X, Y = np.array([[1], [2]]), 2, np.array([[1, 2], [3, 4]]), np.array([[1, 0]])
grads, cost = propagate(w, b, X, Y)
print("dw = " + str(grads["dw"]))
print("db = " + str(grads["db"]))
print("cost = " + str(cost))
```

Output:

```
dw = [[0.99993216]
 [1.99980262]]
db = [0.49993523]
cost = 6.000064773192205
```
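As an optional sanity check that is not part of the assignment, the analytic gradients returned by `propagate` can be compared with numerical gradients obtained by nudging each parameter; `grad_check_propagate` and `eps` are illustrative names and choices:

```python
# Hypothetical gradient check for propagate(); eps is an arbitrary small step.
def grad_check_propagate(w, b, X, Y, eps=1e-7):
    grads, _ = propagate(w, b, X, Y)
    num_dw = np.zeros((w.shape[0], 1))
    for i in range(w.shape[0]):
        w_plus, w_minus = w.copy().astype(float), w.copy().astype(float)
        w_plus[i, 0] += eps
        w_minus[i, 0] -= eps
        _, cost_plus = propagate(w_plus, b, X, Y)
        _, cost_minus = propagate(w_minus, b, X, Y)
        num_dw[i, 0] = (cost_plus - cost_minus) / (2 * eps)   # numerical dJ/dw_i
    _, cost_bp = propagate(w, b + eps, X, Y)
    _, cost_bm = propagate(w, b - eps, X, Y)
    num_db = (cost_bp - cost_bm) / (2 * eps)                  # numerical dJ/db
    return np.max(np.abs(num_dw - grads["dw"])), abs(num_db - np.squeeze(grads["db"]))

# Reuses the toy w, b, X, Y defined in the test cell above.
print(grad_check_propagate(w, b, X, Y))  # both differences should be tiny (≈1e-8 or less)
```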
### 4.4 Updating parameters (gradient descent)

```python
# GRADED FUNCTION: optimize

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

    Tips:
    You basically need to write down two steps and iterate through them:
        1) Calculate the cost and the gradient for the current parameters. Use propagate().
        2) Update the parameters using gradient descent rule for w and b.
    """
    costs = []

    for i in range(num_iterations):

        # Cost and gradient calculation (≈ 1-4 lines of code)
        ### START CODE HERE ###
        grads, cost = propagate(w, b, X, Y)
        ### END CODE HERE ###

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # update rule (≈ 2 lines of code)
        ### START CODE HERE ###
        w = w - learning_rate * dw
        b = b - learning_rate * db
        ### END CODE HERE ###

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 training examples
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w,
              "b": b}

    grads = {"dw": dw,
             "db": db}

    return params, grads, costs

params, grads, costs = optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009, print_cost=False)

print("w = " + str(params["w"]))
print("b = " + str(params["b"]))
print("dw = " + str(grads["dw"]))
print("db = " + str(grads["db"]))
```

Output:

```
w = [[0.1124579 ]
 [0.23106775]]
b = [1.55930492]
dw = [[0.90158428]
 [1.76250842]]
db = [0.43046207]
```

The learned parameters can then be used to make predictions:

- Compute the predictions $\hat{Y} = A = \sigma(w^T X + b)$
- Convert them to class labels: probabilities no greater than 0.5 are labelled 0, the rest 1

```python
# GRADED FUNCTION: predict

def predict(w, b, X):
    """
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    """
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    ### START CODE HERE ### (≈ 1 line of code)
    A = sigmoid(np.dot(w.T, X) + b)
    ### END CODE HERE ###

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        ### START CODE HERE ### (≈ 4 lines of code)
        Y_prediction[0][i] = 0 if A[0][i] <= 0.5 else 1
        ### END CODE HERE ###

    assert(Y_prediction.shape == (1, m))

    return Y_prediction

print("predictions = " + str(predict(w, b, X)))
# predictions = [[1. 1.]]
```
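In the spirit of the vectorization section in Part 1, the per-column loop in `predict` can also be replaced by a single broadcast comparison. `predict_vectorized` is an illustrative variant, not the graded solution:

```python
# Illustrative vectorized alternative to the thresholding loop in predict().
def predict_vectorized(w, b, X):
    w = w.reshape(X.shape[0], 1)
    A = sigmoid(np.dot(w.T, X) + b)   # probabilities, shape (1, m)
    return (A > 0.5).astype(float)    # broadcast comparison, still shape (1, m)

print(predict_vectorized(w, b, X))  # should match predict(w, b, X) above
```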
### 4.5 Merging all functions into a model

```python
# GRADED FUNCTION: model

def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
    """
    Builds the logistic regression model by calling the function you've implemented previously

    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations

    Returns:
    d -- dictionary containing information about the model.
    """
    ### START CODE HERE ###

    # initialize parameters with zeros (≈ 1 line of code)
    w, b = initialize_with_zeros(X_train.shape[0])

    # Gradient descent (≈ 1 line of code)
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost=print_cost)

    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]

    # Predict test/train set examples (≈ 2 lines of code)
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    ### END CODE HERE ###

    # Print train/test Errors
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}

    return d

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2000, learning_rate=0.005, print_cost=True)
```

Output:

```
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.584508
Cost after iteration 200: 0.466949
Cost after iteration 300: 0.376007
Cost after iteration 400: 0.331463
Cost after iteration 500: 0.303273
Cost after iteration 600: 0.279880
Cost after iteration 700: 0.260042
Cost after iteration 800: 0.242941
Cost after iteration 900: 0.228004
Cost after iteration 1000: 0.214820
Cost after iteration 1100: 0.203078
Cost after iteration 1200: 0.192544
Cost after iteration 1300: 0.183033
Cost after iteration 1400: 0.174399
Cost after iteration 1500: 0.166521
Cost after iteration 1600: 0.159305
Cost after iteration 1700: 0.152667
Cost after iteration 1800: 0.146542
Cost after iteration 1900: 0.140872
train accuracy: 99.04306220095694 %
test accuracy: 70.0 %
```

The model does very well on the training set but only moderately on the test set: it is overfitting.
```python
# Example of a picture that was wrongly classified.
index = 24
plt.imshow(test_set_x[:, index].reshape((num_px, num_px, 3)))
print("y = " + str(test_set_y[0, index]) + ", you predicted that it is a \"" + classes[int(d["Y_prediction_test"][0, index])].decode("utf-8") + "\" picture.")
# y = 1, you predicted that it is a "cat" picture.
```

Change `index` to compare other test-set predictions against their true labels.

Plot the cost curve:

```python
# Plot learning curve (with costs)
costs = np.squeeze(d["costs"])
plt.plot(costs)
plt.ylabel("cost")
plt.xlabel("iterations (per hundreds)")
plt.title("Learning rate = " + str(d["learning_rate"]))
plt.show()
```

Increasing the number of training iterations to 3000 (from 2000 above) gives:

```
train accuracy: 99.52153110047847 %
test accuracy: 68.0 %
```

Training accuracy goes up while test accuracy goes down: this is overfitting.

### 4.6 Analysis

Comparison across different learning rates:

```python
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=1500, learning_rate=i, print_cost=False)
    print('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label=str(models[str(i)]["learning_rate"]))

plt.ylabel("cost")
plt.xlabel("iterations")

legend = plt.legend(loc="upper center", shadow=True)
frame = legend.get_frame()
frame.set_facecolor("0.90")
plt.show()
```

Output:

```
learning rate is: 0.01
train accuracy: 99.52153110047847 %
test accuracy: 68.0 %

-------------------------------------------------------

learning rate is: 0.001
train accuracy: 88.99521531100478 %
test accuracy: 64.0 %

-------------------------------------------------------

learning rate is: 0.0001
train accuracy: 68.42105263157895 %
test accuracy: 36.0 %

-------------------------------------------------------
```

If the learning rate is too large, the cost can oscillate and fail to converge (in this example 0.01 is not too bad and eventually converges). A low cost does not by itself mean a good model; always check for overfitting, i.e. doing very well on the training set but poorly on the test set.

### 4.7 Testing the model with your own picture

```python
## START CODE HERE ## (PUT YOUR IMAGE NAME)
my_image = "cat1.jpg"   # change this to the name of your image file
## END CODE HERE ##

# We preprocess the image to fit your algorithm.
fname = "images/" + my_image
image = Image.open(fname)
my_image = np.array(image.resize((num_px, num_px))).reshape((1, num_px*num_px*3)).T
my_predicted_image = predict(d["w"], d["b"], my_image)

plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") + "\" picture.")
```

## 5. Summary

- Preprocessing the data matters: check the dimensions, flatten and standardize the data.
- Write each building block as its own function: initialization, forward/backward propagation, gradient-descent parameter updates.
- Combine the building blocks into a model.
- Tune the learning rate and other hyperparameters.