
Contents

I. Implementation
    1. Utility functions provided by Andrew Ng
        sigmoid
        Derivative of sigmoid
        relu
        Derivative of relu
    2. Implementation code
        Imports and configuration
        Parameter initialization
        Forward propagation
        Cost computation
        Backward propagation
        Parameter update
        Assembling the model
    3. Issues and thoughts

I. Implementation

1. Utility functions provided by Andrew Ng

These functions are only shown here for reference. They are utility functions written by Andrew Ng; the implementation below imports them. See the provided attachment for details.

sigmoid

```python
def sigmoid(Z):
    A = 1 / (1 + np.exp(-Z))
    cache = Z
    return A, cache
```

Derivative of sigmoid

```python
def sigmoid_backward(dA, cache):
    Z = cache
    s = 1 / (1 + np.exp(-Z))
    dZ = dA * s * (1 - s)
    return dZ
```

relu

```python
def relu(Z):
    A = np.maximum(0, Z)
    cache = Z
    return A, cache
```

Derivative of relu

```python
def relu_backward(dA, cache):
    Z = cache
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ
```

2. Implementation code

Imports and configuration

```python
import numpy as np
import h5py
import matplotlib.pyplot as plt
from testCases_v2 import *
from dnn_utils_v2 import sigmoid, sigmoid_backward, relu, relu_backward

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)
```

Parameter initialization

```python
def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                  Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                  bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)

    for l in range(1, L):
        parameters["W%d" % l] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters["b%d" % l] = np.zeros((layer_dims[l], 1))

    return parameters
```
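A minimal sketch of how the initializer can be exercised; the layer sizes below are arbitrary and chosen only for illustration:

```python
# Hypothetical architecture: input size 5, a hidden layer of 4 units, an output layer of 3 units.
layer_dims = [5, 4, 3]
params = initialize_parameters_deep(layer_dims)

for l in range(1, len(layer_dims)):
    # W_l has shape (layer_dims[l], layer_dims[l-1]); b_l has shape (layer_dims[l], 1).
    print("W%d" % l, params["W%d" % l].shape, "b%d" % l, params["b%d" % l].shape)
# Expected shapes: W1 (4, 5), b1 (4, 1), W2 (3, 4), b2 (3, 1)
```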
Forward propagation

```python
def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter
    cache -- a python dictionary containing "A", "W" and "b";
             stored for computing the backward pass efficiently
    """
    Z = np.dot(W, A) + b
    cache = (A, W, b)

    return Z, cache


def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value
    cache -- a python dictionary containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """
    if activation == "sigmoid":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
    elif activation == "relu":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)

    cache = (linear_cache, activation_cache)

    return A, cache


def L_model_forward(X, parameters):
    """
    Implement forward propagation for the [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID computation

    Arguments:
    X -- data, numpy array of shape (input size, number of examples)
    parameters -- output of initialize_parameters_deep()

    Returns:
    AL -- last post-activation value
    caches -- list of caches containing:
              every cache of linear_relu_forward() (there are L-1 of them, indexed from 0 to L-2)
              the cache of linear_sigmoid_forward() (there is one, indexed L-1)
    """
    caches = []
    A = X
    L = len(parameters) // 2  # number of layers in the neural network

    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A
        ### START CODE HERE ### (≈ 2 lines of code)
        A, linear_activation_cache = linear_activation_forward(
            A_prev, parameters["W%s" % l], parameters["b%s" % l], activation="relu")
        caches.append(linear_activation_cache)
        ### END CODE HERE ###

    # Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
    ### START CODE HERE ### (≈ 2 lines of code)
    AL, linear_activation_cache = linear_activation_forward(
        A, parameters["W%s" % L], parameters["b%s" % L], activation="sigmoid")
    caches.append(linear_activation_cache)
    ### END CODE HERE ###

    return AL, caches
```

Cost computation

```python
def compute_cost(AL, Y):
    m = Y.shape[1]

    # Compute loss from aL and y.
    ### START CODE HERE ### (≈ 1 lines of code)
    cost = -1. / m * (np.dot(np.log(AL), Y.T) + np.dot(np.log(1 - AL), (1 - Y).T))
    ### END CODE HERE ###

    cost = np.squeeze(cost)
    return cost
```
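For reference, compute_cost above implements the cross-entropy cost used throughout the course:

$$J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\!\left(a^{[L](i)}\right) + \left(1-y^{(i)}\right)\log\!\left(1-a^{[L](i)}\right)\right]$$

The two np.dot terms in the code compute exactly these two sums over the m examples.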
Backward propagation

```python
def linear_backward(dZ, cache):
    """
    Implement the linear portion of backward propagation for a single layer (layer l)

    Arguments:
    dZ -- Gradient of the cost with respect to the linear output (of current layer l)
    cache -- tuple of values (A_prev, W, b) coming from the forward propagation in the current layer

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    A_prev, W, b = cache
    m = A_prev.shape[1]

    ### START CODE HERE ### (≈ 3 lines of code)
    dA_prev = np.dot(W.T, dZ)
    dW = 1. / m * np.dot(dZ, A_prev.T)
    db = 1. / m * np.sum(dZ, axis=1, keepdims=True)
    ### END CODE HERE ###

    return dA_prev, dW, db


def linear_activation_backward(dA, cache, activation):
    """
    Implement the backward propagation for the LINEAR->ACTIVATION layer.

    Arguments:
    dA -- post-activation gradient for current layer l
    cache -- tuple of values (linear_cache, activation_cache) we store for computing backward propagation efficiently
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    linear_cache, activation_cache = cache

    if activation == "relu":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###
    elif activation == "sigmoid":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = sigmoid_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###

    return dA_prev, dW, db


def L_model_backward(AL, Y, caches):
    """
    Implement the backward propagation for the [LINEAR->RELU] * (L-1) -> LINEAR -> SIGMOID group

    Arguments:
    AL -- probability vector, output of the forward propagation (L_model_forward())
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat)
    caches -- list of caches containing:
              every cache of linear_activation_forward() with "relu" (it's caches[l], for l in range(L-1) i.e. l = 0...L-2)
              the cache of linear_activation_forward() with "sigmoid" (it's caches[L-1])

    Returns:
    grads -- A dictionary with the gradients
             grads["dA" + str(l)] = ...
             grads["dW" + str(l)] = ...
             grads["db" + str(l)] = ...
    """
    grads = {}
    L = len(caches)  # the number of layers
    m = AL.shape[1]
    Y = Y.reshape(AL.shape)  # after this line, Y is the same shape as AL

    # Initializing the backpropagation
    ### START CODE HERE ### (1 line of code)
    grads["dA" + str(L)] = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    ### END CODE HERE ###

    # Lth layer (SIGMOID -> LINEAR) gradients
    layer = L
    grads["dA" + str(layer - 1)], grads["dW" + str(layer)], grads["db" + str(layer)] = \
        linear_activation_backward(grads["dA" + str(layer)], caches[layer - 1], activation="sigmoid")

    # lth layer (RELU -> LINEAR) gradients
    for l in reversed(range(L - 1)):
        layer = l + 1
        grads["dA" + str(layer - 1)], grads["dW" + str(layer)], grads["db" + str(layer)] = \
            linear_activation_backward(grads["dA" + str(layer)], caches[layer - 1], activation="relu")

    return grads
```

Parameter update

```python
def update_parameters(parameters, grads, learning_rate):
    """
    Update parameters using gradient descent

    Arguments:
    parameters -- python dictionary containing your parameters
    grads -- python dictionary containing your gradients, output of L_model_backward

    Returns:
    parameters -- python dictionary containing your updated parameters
                  parameters["W" + str(l)] = ...
                  parameters["b" + str(l)] = ...
    """
    L = len(parameters) // 2  # number of layers in the neural network

    for l in range(1, L + 1):
        parameters["W" + str(l)] = parameters["W" + str(l)] - learning_rate * grads["dW" + str(l)]
        parameters["b" + str(l)] = parameters["b" + str(l)] - learning_rate * grads["db" + str(l)]

    return parameters
```

Assembling the model

```python
def L_layer_model(X, Y, layers_dims, learning_rate=0.0075,
                  num_iterations=3000, print_cost=False):  # lr was 0.009
    """
    Implements a L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.

    Arguments:
    X -- data, numpy array of shape (number of examples, num_px * num_px * 3)
    Y -- true "label" vector (containing 0 if cat, 1 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    np.random.seed(1)
    costs = []

    # Parameters initialization.
    ### START CODE HERE ###
    parameters = initialize_parameters_deep(layers_dims)
    ### END CODE HERE ###

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        ### START CODE HERE ### (≈ 1 line of code)
        AL, caches = L_model_forward(X, parameters)
        ### END CODE HERE ###

        # Compute cost.
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(AL, Y)
        ### END CODE HERE ###

        # Backward propagation.
        ### START CODE HERE ### (≈ 1 line of code)
        grads = L_model_backward(AL, Y, caches)
        ### END CODE HERE ###

        # Update parameters.
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Print the cost every 100 training examples
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per tens)')
    plt.title("Learning rate = " + str(learning_rate))
    plt.show()

    return parameters
```

3. Issues and thoughts

Apart from the test case for the L-layer backward pass, every other step, as well as the final result, is correct.

Comparing my code's output against the test case's expected output shows that the two do not match at all. So I searched online for other people's answers; in every case, the only place where their code differs from mine is the L-layer backward propagation ((1) source, (2) source).

As far as these two answers go, their way of writing it looks wrong to me, yet they match the expected output and I do not. Even after I rewrote my code to look like theirs, it still did not match. So the mismatch probably comes from my test case; I did not check whether their test cases are identical to mine. In any case, only this one test case fails and everything after it is correct, which suggests my implementation is fine.

As for why their code seems incorrect to me, look at my L_model_backward above: the formula that initializes grads["dA" + str(L)] gives the gradient of the output layer L's activations, and each subsequent backward step should return the gradient of the previous layer's activations together with the gradients of the current layer's W and b, as the backward-propagation formulas show.
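Written out for reference (these are the standard formulas from the course, and they match the grads["dA" + str(L)] initialization and linear_backward above):

$$dA^{[L]} = -\left(\frac{Y}{A^{[L]}} - \frac{1-Y}{1-A^{[L]}}\right), \qquad dZ^{[l]} = dA^{[l]} * g^{[l]\prime}\!\left(Z^{[l]}\right)$$

$$dW^{[l]} = \frac{1}{m}\,dZ^{[l]}A^{[l-1]T}, \qquad db^{[l]} = \frac{1}{m}\sum_{i=1}^{m}dZ^{[l](i)}, \qquad dA^{[l-1]} = W^{[l]T}dZ^{[l]}$$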