Table of contents

Assignment 1:
  1. Cosine similarity
  2. Word analogy
  3. Debiasing word vectors
    3.1 Neutralizing bias for non-gender-specific words
    3.2 Equalization algorithm for gender-specific words
Assignment 2: Emojify
  1. Baseline model: Emojifier-V1
    1.1 Dataset
    1.2 Model overview
    1.3 Implementing Emojifier-V1
    1.4 Testing on the training set
  2. Emojifier-V2: Using LSTMs in Keras
    2.1 Model overview
    2.2 Keras and mini-batching
    2.3 The Embedding layer
    2.4 Building Emojifier-V2
  Quiz, reference posts

Notes: W2. Natural Language Processing and Word Embeddings

Assignment 1:

- Load pre-trained word vectors and measure similarity with the cosine of the angle, cos(θ)
- Use word embeddings to solve word analogy problems
- Modify word embeddings to reduce gender bias

import numpy as np
from w2v_utils import *

This assignment uses 50-dimensional GloVe vectors to represent words:

words, word_to_vec_map = read_glove_vecs('data/glove.6B.50d.txt')

1. Cosine similarity

$$\text{CosineSimilarity(u, v)} = \frac{u \cdot v}{||u||_2 \, ||v||_2} = \cos(\theta)$$

where $||u||_2 = \sqrt{\sum_{i=1}^{n} u_i^2}$.

# GRADED FUNCTION: cosine_similarity

def cosine_similarity(u, v):
    """
    Cosine similarity reflects the degree of similarity between u and v

    Arguments:
        u -- a word vector of shape (n,)
        v -- a word vector of shape (n,)

    Returns:
        cosine_similarity -- the cosine similarity between u and v defined by the formula above.
    """
    distance = 0.0

    ### START CODE HERE ###
    # Compute the dot product between u and v (≈1 line)
    dot = np.dot(u, v)
    # Compute the L2 norm of u (≈1 line)
    norm_u = np.linalg.norm(u)
    # Compute the L2 norm of v (≈1 line)
    norm_v = np.linalg.norm(v)
    # Compute the cosine similarity defined by formula (1) (≈1 line)
    cosine_similarity = dot / (norm_u * norm_v)
    ### END CODE HERE ###

    return cosine_similarity
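A quick sanity check along the lines of the original notebook's test cell (the specific words here are only illustrative choices; any words in the GloVe vocabulary work). Related words should give a similarity close to 1, unrelated words a value much closer to 0:

father = word_to_vec_map["father"]
mother = word_to_vec_map["mother"]
ball = word_to_vec_map["ball"]
crocodile = word_to_vec_map["crocodile"]

# Related pair: expect a value close to 1
print("cosine_similarity(father, mother) = ", cosine_similarity(father, mother))
# Unrelated pair: expect a value much closer to 0
print("cosine_similarity(ball, crocodile) = ", cosine_similarity(ball, crocodile))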
2. Word analogy

For example: man → woman is as king → queen.

# GRADED FUNCTION: complete_analogy

def complete_analogy(word_a, word_b, word_c, word_to_vec_map):
    """
    Performs the word analogy task as explained above: a is to b as c is to ____.

    Arguments:
        word_a -- a word, string
        word_b -- a word, string
        word_c -- a word, string
        word_to_vec_map -- dictionary that maps words to their corresponding vectors.

    Returns:
        best_word -- the word such that v_b - v_a is close to v_best_word - v_c, as measured by cosine similarity
    """
    # convert words to lower case
    word_a, word_b, word_c = word_a.lower(), word_b.lower(), word_c.lower()

    ### START CODE HERE ###
    # Get the word embeddings e_a, e_b and e_c (≈1-3 lines)
    e_a, e_b, e_c = word_to_vec_map[word_a], word_to_vec_map[word_b], word_to_vec_map[word_c]
    ### END CODE HERE ###

    words = word_to_vec_map.keys()
    max_cosine_sim = -100   # Initialize max_cosine_sim to a large negative number
    best_word = None        # Initialize best_word with None, it will help keep track of the word to output

    # loop over the whole word vector set
    for w in words:
        # to avoid best_word being one of the input words, pass on them.
        if w in [word_a, word_b, word_c]:
            continue

        ### START CODE HERE ###
        # Compute cosine similarity between the vector (e_b - e_a) and the vector ((w's vector representation) - e_c) (≈1 line)
        cosine_sim = cosine_similarity(e_b - e_a, word_to_vec_map[w] - e_c)

        # If the cosine_sim is more than the max_cosine_sim seen so far,
        # then: set the new max_cosine_sim to the current cosine_sim and the best_word to the current word (≈3 lines)
        if cosine_sim > max_cosine_sim:
            max_cosine_sim = cosine_sim
            best_word = w
        ### END CODE HERE ###

    return best_word

Test:

triads_to_try = [('italy', 'italian', 'spain'), ('india', 'delhi', 'japan'), ('man', 'woman', 'boy'), ('small', 'smaller', 'large')]
for triad in triads_to_try:
    print('{} -> {} :: {} -> {}'.format(*triad, complete_analogy(*triad, word_to_vec_map)))

Output:

italy -> italian :: spain -> spanish
india -> delhi :: japan -> tokyo
man -> woman :: boy -> girl
small -> smaller :: large -> larger

Extra tests:

good -> ok :: bad -> oops   (a poor result)
father -> dad :: mother -> mom

3. Debiasing word vectors

Examine the gender bias reflected in word embeddings, and explore algorithms for reducing it.

g = word_to_vec_map['woman'] - word_to_vec_map['man']
print(g)

Output (a 50-dimensional vector):

[-0.087144    0.2182     -0.40986    -0.03922    -0.1032      0.94165
 -0.06042     0.32988     0.46144    -0.35962     0.31102    -0.86824
  0.96006     0.01073     0.24337     0.08193    -1.02722    -0.21122
  0.695044   -0.00222     0.29106     0.5053     -0.099454    0.40445
  0.30181     0.1355     -0.0606     -0.07131    -0.19245    -0.06115
 -0.3204      0.07165    -0.13337    -0.25068714 -0.14293    -0.224957
 -0.149       0.048882    0.12191    -0.27362    -0.165476   -0.20426
  0.54376    -0.271425   -0.10245    -0.32108     0.2516     -0.33455
 -0.04371     0.01258   ]

print('List of names and their similarities with constructed vector:')

# girls and boys name
name_list = ['john', 'marie', 'sophie', 'ronaldo', 'priya', 'rahul', 'danielle', 'reza', 'katy', 'yasmin']

for w in name_list:
    print(w, cosine_similarity(word_to_vec_map[w], g))

Output:

List of names and their similarities with constructed vector:
john -0.23163356145973724
marie 0.315597935396073
sophie 0.31868789859418784
ronaldo -0.31244796850329437
priya 0.17632041839009402
rahul -0.16915471039231716
danielle 0.24393299216283895
reza -0.07930429672199553
katy 0.2831068659572615
yasmin 0.2331385776792876

As you can see, female first names tend to have a positive cosine similarity with the vector g, while male first names tend to have a negative one. These results look acceptable.

Try some other words:

print('Other words and their similarities:')
word_list = ['lipstick', 'guns', 'science', 'arts', 'literature', 'warrior', 'doctor', 'tree', 'receptionist', 'technology', 'fashion', 'teacher', 'engineer', 'pilot', 'computer', 'singer']
for w in word_list:
    print(w, cosine_similarity(word_to_vec_map[w], g))

Output:

Other words and their similarities:
lipstick 0.2769191625638267
guns -0.1888485567898898
science -0.06082906540929701
arts 0.008189312385880337
literature 0.06472504433459932
warrior -0.20920164641125288
doctor 0.11895289410935041
tree -0.07089399175478091
receptionist 0.3307794175059374
technology -0.13193732447554302
fashion 0.03563894625772699
teacher 0.17920923431825664
engineer -0.0803928049452407
pilot 0.0010764498991916937
computer -0.10330358873850498
singer 0.1850051813649629
These results reflect certain gender stereotypes. For example, "computer" is closer to "man", while "literature" is closer to "woman".

Below we see how to reduce the bias of these vectors, using the algorithm proposed by Bolukbasi et al. (2016).

Note that some word pairs such as "actor"/"actress" or "grandmother"/"grandfather" should remain gender-specific, while other words such as "receptionist" or "technology" should be neutralized, i.e. not be associated with gender. When debiasing, you have to treat these two types of words differently.

3.1 Neutralizing bias for non-gender-specific words

$$e^{bias\_component} = \frac{e \cdot g}{||g||_2^2} * g$$
$$e^{debiased} = e - e^{bias\_component}$$

def neutralize(word, g, word_to_vec_map):
    """
    Removes the bias of "word" by projecting it on the space orthogonal to the bias axis.
    This function ensures that gender neutral words are zero in the gender subspace.

    Arguments:
        word -- string indicating the word to debias
        g -- numpy-array of shape (50,), corresponding to the bias axis (such as gender)
        word_to_vec_map -- dictionary mapping words to their corresponding vectors.

    Returns:
        e_debiased -- neutralized word vector representation of the input "word"
    """
    ### START CODE HERE ###
    # Select word vector representation of "word". Use word_to_vec_map. (≈ 1 line)
    e = word_to_vec_map[word]

    # Compute e_biascomponent using the formula given above. (≈ 1 line)
    e_biascomponent = np.dot(e, g) / np.linalg.norm(g)**2 * g

    # Neutralize e by subtracting e_biascomponent from it
    # e_debiased should be equal to its orthogonal projection. (≈ 1 line)
    e_debiased = e - e_biascomponent
    ### END CODE HERE ###

    return e_debiased

Test:

e = "receptionist"
print("cosine similarity between " + e + " and g, before neutralizing: ", cosine_similarity(word_to_vec_map["receptionist"], g))

e_debiased = neutralize("receptionist", g, word_to_vec_map)
print("cosine similarity between " + e + " and g, after neutralizing: ", cosine_similarity(e_debiased, g))

Output:

cosine similarity between receptionist and g, before neutralizing:  0.3307794175059374
cosine similarity between receptionist and g, after neutralizing:  -2.099120994400013e-17

After neutralizing, the similarity between "receptionist" and the gender axis is essentially 0: it leans neither towards "man" nor towards "woman".

3.2 Equalization algorithm for gender-specific words

Next, see how debiasing can be applied to word pairs such as "actress" and "actor".
Equalization is applied to pairs of words that should differ only through the gender property.
As a concrete example, suppose "actress" is closer to "babysit" than "actor" is. By neutralizing "babysit" we can reduce the gender stereotype associated with babysitting, but this still does not guarantee that "actor" and "actress" are equidistant from "babysit". The equalization algorithm handles that.

$$\mu = \frac{e_{w1} + e_{w2}}{2}$$
$$\mu_{B} = \frac{\mu \cdot \text{bias\_axis}}{||\text{bias\_axis}||_2^2} * \text{bias\_axis}$$
$$\mu_{\perp} = \mu - \mu_{B}$$
$$e_{w1B} = \frac{e_{w1} \cdot \text{bias\_axis}}{||\text{bias\_axis}||_2^2} * \text{bias\_axis}$$
$$e_{w2B} = \frac{e_{w2} \cdot \text{bias\_axis}}{||\text{bias\_axis}||_2^2} * \text{bias\_axis}$$
$$e_{w1B}^{corrected} = \sqrt{|1 - ||\mu_{\perp}||_2^2|} * \frac{e_{w1B} - \mu_B}{|(e_{w1} - \mu_{\perp}) - \mu_B|}$$
$$e_{w2B}^{corrected} = \sqrt{|1 - ||\mu_{\perp}||_2^2|} * \frac{e_{w2B} - \mu_B}{|(e_{w2} - \mu_{\perp}) - \mu_B|}$$
$$e_1 = e_{w1B}^{corrected} + \mu_{\perp}$$
$$e_2 = e_{w2B}^{corrected} + \mu_{\perp}$$
def equalize(pair, bias_axis, word_to_vec_map):
    """
    Debias gender specific words by following the equalize method described in the figure above.

    Arguments:
        pair -- pair of strings of gender specific words to debias, e.g. ("actress", "actor")
        bias_axis -- numpy-array of shape (50,), vector corresponding to the bias axis, e.g. gender
        word_to_vec_map -- dictionary mapping words to their corresponding vectors

    Returns:
        e_1 -- word vector corresponding to the first word
        e_2 -- word vector corresponding to the second word
    """
    ### START CODE HERE ###
    # Step 1: Select word vector representation of "word". Use word_to_vec_map. (≈ 2 lines)
    w1, w2 = pair[0], pair[1]
    e_w1, e_w2 = word_to_vec_map[w1], word_to_vec_map[w2]

    # Step 2: Compute the mean of e_w1 and e_w2 (≈ 1 line)
    mu = (e_w1 + e_w2) / 2

    # Step 3: Compute the projections of mu over the bias axis and the orthogonal axis (≈ 2 lines)
    mu_B = np.dot(mu, bias_axis) / np.linalg.norm(bias_axis)**2 * bias_axis
    mu_orth = mu - mu_B

    # Step 4: Use equations (7) and (8) to compute e_w1B and e_w2B (≈2 lines)
    e_w1B = np.dot(e_w1, bias_axis) / np.linalg.norm(bias_axis)**2 * bias_axis
    e_w2B = np.dot(e_w2, bias_axis) / np.linalg.norm(bias_axis)**2 * bias_axis

    # Step 5: Adjust the Bias part of e_w1B and e_w2B using the formulas (9) and (10) given above (≈2 lines)
    corrected_e_w1B = np.sqrt(np.abs(1 - np.linalg.norm(mu_orth)**2)) * np.divide((e_w1B - mu_B), np.abs(e_w1 - mu_orth - mu_B))
    corrected_e_w2B = np.sqrt(np.abs(1 - np.linalg.norm(mu_orth)**2)) * np.divide((e_w2B - mu_B), np.abs(e_w2 - mu_orth - mu_B))

    # Step 6: Debias by equalizing e1 and e2 to the sum of their corrected projections (≈2 lines)
    e1 = corrected_e_w1B + mu_orth
    e2 = corrected_e_w2B + mu_orth
    ### END CODE HERE ###

    return e1, e2

Test:

print("cosine similarities before equalizing:")
print("cosine_similarity(word_to_vec_map[\"man\"], gender) = ", cosine_similarity(word_to_vec_map["man"], g))
print("cosine_similarity(word_to_vec_map[\"woman\"], gender) = ", cosine_similarity(word_to_vec_map["woman"], g))
print()
e1, e2 = equalize(("man", "woman"), g, word_to_vec_map)
print("cosine similarities after equalizing:")
print("cosine_similarity(e1, gender) = ", cosine_similarity(e1, g))
print("cosine_similarity(e2, gender) = ", cosine_similarity(e2, g))

Output:

cosine similarities before equalizing:
cosine_similarity(word_to_vec_map["man"], gender) =  -0.11711095765336832
cosine_similarity(word_to_vec_map["woman"], gender) =  0.35666618846270376

cosine similarities after equalizing:
cosine_similarity(e1, gender) =  -0.7165727525843935
cosine_similarity(e2, gender) =  0.7396596474928909

After equalizing, the two similarities have opposite signs and roughly equal magnitudes.

Assignment 2: Emojify

Use word vector representations to build an Emojifier that makes your messages more expressive.
Because the model works with word embeddings, it can generalize: even a word that never appeared next to a given emoji in the training set can still learn to trigger that emoji.

Import some packages:

import numpy as np
from emo_utils import *
import emoji
import matplotlib.pyplot as plt

%matplotlib inline

1. Baseline model: Emojifier-V1

1.1 Dataset

X: 127 sentences (strings)
Y: integer labels from 0 to 4, the emoji associated with each sentence

Load the dataset: 127 training examples and 56 test examples.

X_train, Y_train = read_csv('data/train_emoji.csv')
X_test, Y_test = read_csv('data/tesss.csv')

maxLen = len(max(X_train, key=len).split())
print(max(X_train, key=len).split())

Output:

['I', 'am', 'so', 'impressed', 'by', 'your', 'dedication', 'to', 'this', 'project']

The longest sentence has 10 words.

Look at the dataset:

index = 3
print(X_train[index], label_to_emoji(Y_train[index]))

Output:

Miss you so much ❤️

1.2 Model overview

The baseline model averages the GloVe vectors of the words in a sentence and feeds the average through a softmax classifier.

For convenience, change the shape of Y from (m, 1) to a one-hot representation of shape (m, 5):

Y_oh_train = convert_to_one_hot(Y_train, C=5)
Y_oh_test = convert_to_one_hot(Y_test, C=5)

index = 52
print(Y_train[index], "is converted into one hot", Y_oh_train[index])

Output:

3 is converted into one hot [0. 0. 0. 1. 0.]
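convert_to_one_hot() comes from emo_utils and is not shown in this post. A minimal sketch of what such a helper can look like, using np.eye the same way the mislabelled-examples cell later in this post does (the _sketch name is made up here):

import numpy as np

def convert_to_one_hot_sketch(Y, C):
    # Row i of np.eye(C) is the one-hot vector for class i,
    # so indexing with the label array maps each label to its one-hot row.
    return np.eye(C)[Y.reshape(-1)]

# e.g. label 3 with C = 5 classes -> [0. 0. 0. 1. 0.]
print(convert_to_one_hot_sketch(np.array([3]), C=5))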
1.3 Implementing Emojifier-V1

Use the pre-trained 50-dimensional GloVe embeddings:

word_to_index, index_to_word, word_to_vec_map = read_glove_vecs('data/glove.6B.50d.txt')

Check that the mapping works:

word = "cucumber"
index = 289846
print("the index of", word, "in the vocabulary is", word_to_index[word])
print("the", str(index) + "th word in the vocabulary is", index_to_word[index])

Output:

the index of cucumber in the vocabulary is 113317
the 289846th word in the vocabulary is potatos

Implement sentence_to_avg():

- convert each sentence to lower case and split it into words;
- represent each word with its GloVe vector, then average the vectors over the sentence.

# GRADED FUNCTION: sentence_to_avg

def sentence_to_avg(sentence, word_to_vec_map):
    """
    Converts a sentence (string) into a list of words (strings). Extracts the GloVe representation of each word
    and averages its value into a single vector encoding the meaning of the sentence.

    Arguments:
        sentence -- string, one training example from X
        word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation

    Returns:
        avg -- average vector encoding information about the sentence, numpy-array of shape (50,)
    """
    ### START CODE HERE ###
    # Step 1: Split sentence into list of lower case words (≈ 1 line)
    words = sentence.lower().split()

    # Initialize the average word vector, should have the same shape as your word vectors.
    avg = np.zeros(word_to_vec_map[words[0]].shape)

    # Step 2: average the word vectors. You can loop over the words in the list "words".
    for w in words:
        avg += word_to_vec_map[w]
    avg = avg / len(words)
    ### END CODE HERE ###

    return avg

Test:

avg = sentence_to_avg("Morrocan couscous is my favorite dish", word_to_vec_map)
print("avg = ", avg)

Output:

avg =  [-0.008005    0.56370833 -0.50427333  0.258865    0.55131103  0.03104983
 -0.21013718  0.16893933 -0.09590267  0.141784   -0.15708967  0.18525867
  0.6495785   0.38371117  0.21102167  0.11301667  0.02613967  0.26037767
  0.05820667 -0.01578167 -0.12078833 -0.02471267  0.4128455   0.5152061
  0.38756167 -0.898661   -0.535145    0.33501167  0.68806933 -0.2156265
  1.797155    0.10476933 -0.36775333  0.750785    0.10282583  0.348925
 -0.27262833  0.66768    -0.10706167 -0.283635    0.59580117  0.28747333
 -0.3366635   0.23393817  0.34349183  0.178405    0.1166155  -0.076433
  0.1445417   0.09808667]

Model

After processing each sentence with sentence_to_avg(), run forward propagation, compute the loss, back-propagate, and update the parameters:

$$z^{(i)} = W \cdot avg^{(i)} + b$$
$$a^{(i)} = softmax(z^{(i)})$$
$$\mathcal{L}^{(i)} = - \sum_{k=0}^{n_y - 1} Yoh^{(i)}_k * \log(a^{(i)}_k)$$
# GRADED FUNCTION: model

def model(X, Y, word_to_vec_map, learning_rate=0.01, num_iterations=400):
    """
    Model to train word vector representations in numpy.

    Arguments:
        X -- input data, numpy array of sentences as strings, of shape (m, 1)
        Y -- labels, numpy array of integers between 0 and 7, numpy-array of shape (m, 1)
        word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation
        learning_rate -- learning_rate for the stochastic gradient descent algorithm
        num_iterations -- number of iterations

    Returns:
        pred -- vector of predictions, numpy-array of shape (m, 1)
        W -- weight matrix of the softmax layer, of shape (n_y, n_h)
        b -- bias of the softmax layer, of shape (n_y,)
    """
    np.random.seed(1)

    # Define number of training examples
    m = Y.shape[0]   # number of training examples
    n_y = 5          # number of classes
    n_h = 50         # dimensions of the GloVe vectors

    # Initialize parameters using Xavier initialization
    W = np.random.randn(n_y, n_h) / np.sqrt(n_h)
    b = np.zeros((n_y,))

    # Convert Y to Y_onehot with n_y classes
    Y_oh = convert_to_one_hot(Y, C=n_y)

    # Optimization loop
    for t in range(num_iterations):   # Loop over the number of iterations
        for i in range(m):            # Loop over the training examples

            ### START CODE HERE ### (≈ 4 lines of code)
            # Average the word vectors of the words from the i'th training example
            avg = sentence_to_avg(X[i], word_to_vec_map)

            # Forward propagate the avg through the softmax layer
            z = np.dot(W, avg) + b
            a = softmax(z)

            # Compute cost using the i'th training label's one hot representation and "A" (the output of the softmax)
            cost = - sum(Y_oh[i] * np.log(a))
            ### END CODE HERE ###

            # Compute gradients
            dz = a - Y_oh[i]
            dW = np.dot(dz.reshape(n_y, 1), avg.reshape(1, n_h))
            db = dz

            # Update parameters with Stochastic Gradient Descent
            W = W - learning_rate * dW
            b = b - learning_rate * db

        if t % 100 == 0:
            print("Epoch: " + str(t) + " --- cost = " + str(cost))
            pred = predict(X, Y, W, b, word_to_vec_map)

    return pred, W, b

1.4 Testing on the training set

Train the baseline model first (this produces the W and b used below), then evaluate it:

pred, W, b = model(X_train, Y_train, word_to_vec_map)

print("Training set:")
pred_train = predict(X_train, Y_train, W, b, word_to_vec_map)
print('Test set:')
pred_test = predict(X_test, Y_test, W, b, word_to_vec_map)

Output:

Training set:
Accuracy: 0.9772727272727273
Test set:
Accuracy: 0.8571428571428571

Random guessing would give about 20% accuracy on average (1 out of 5 classes), so the model does quite well for only 127 training examples.

Let's test it. The training set contains "I love you" with label ❤️; let's check what happens with "adore", a word that never appears in the training set.

X_my_sentences = np.array(["i adore you", "i love you", "funny lol", "lets play with a ball", "food is ready", "not feeling happy"])
Y_my_labels = np.array([[0], [0], [2], [1], [4], [3]])

pred = predict(X_my_sentences, Y_my_labels, W, b, word_to_vec_map)
print_predictions(X_my_sentences, pred)

Output:

Accuracy: 0.8333333333333334

5 out of 6 correct; only the last one is wrong:

i adore you ❤️   ("adore" has an embedding similar to "love")
i love you ❤️
funny lol
lets play with a ball ⚾
food is ready
not feeling happy   (misclassified: the model cannot handle negations such as "not")

Examining the errors

Printing the confusion matrix helps you see which examples the model gets wrong.
A confusion matrix shows how often an example whose true label is one class gets mislabelled by the algorithm as a different class.

print(Y_test.shape)
print('           ' + label_to_emoji(0) + '    ' + label_to_emoji(1) + '    ' + label_to_emoji(2) + '    ' + label_to_emoji(3) + '    ' + label_to_emoji(4))
print(pd.crosstab(Y_test, pred_test.reshape(56,), rownames=['Actual'], colnames=['Predicted'], margins=True))
plot_confusion_matrix(Y_test, pred_test)

2. Emojifier-V2: Using LSTMs in Keras

Let's build an LSTM model that takes word sequences as input, so that the model can take word order into account.
Emojifier-V2 still uses pre-trained word embeddings to represent words; it feeds them into an LSTM, whose job is to predict the most appropriate emoji.

Import some packages:

import numpy as np
np.random.seed(0)
from keras.models import Model
from keras.layers import Dense, Input, Dropout, LSTM, Activation
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.initializers import glorot_uniform
np.random.seed(1)

2.1 Model overview

The model feeds the padded word indices through a pre-trained embedding layer, two 128-unit LSTM layers (each followed by dropout), and a final Dense + softmax layer over the 5 emoji classes.

2.2 Keras and mini-batching

To train on mini-batches, all the sentences in a batch must have the same length. Sentences shorter than the maximum length are padded at the end with zero vectors:

$$(e_{i}, e_{love}, e_{you}, \vec{0}, \vec{0}, \ldots, \vec{0})$$
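In this post the padding is handled by sentences_to_indices() below (unused positions simply stay 0, and index 0 maps to an all-zero row of the embedding matrix). As a minimal illustration of the same idea with the Keras utility imported above, keras.preprocessing.sequence.pad_sequences, where the index sequences are made-up values:

from keras.preprocessing import sequence

# Toy index sequences of different lengths (made-up values, not real GloVe indices)
seqs = [[12, 7, 93], [5, 2]]

# Pad at the end ('post') with zeros so every sequence has length maxlen = 10
padded = sequence.pad_sequences(seqs, maxlen=10, padding='post', value=0)
print(padded)
# [[12  7 93  0  0  0  0  0  0  0]
#  [ 5  2  0  0  0  0  0  0  0  0]]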
2.3 The Embedding layer

https://keras.io/zh/layers/embeddings/

First, fill in the index of every word of every sentence:

# GRADED FUNCTION: sentences_to_indices

def sentences_to_indices(X, word_to_index, max_len):
    """
    Converts an array of sentences (strings) into an array of indices corresponding to words in the sentences.
    The output shape should be such that it can be given to Embedding() (described in Figure 4).

    Arguments:
        X -- array of sentences (strings), of shape (m, 1)
        word_to_index -- a dictionary containing the each word mapped to its index
        max_len -- maximum number of words in a sentence. You can assume every sentence in X is no longer than this.

    Returns:
        X_indices -- array of indices corresponding to words in the sentences from X, of shape (m, max_len)
    """
    m = X.shape[0]   # number of training examples

    ### START CODE HERE ###
    # Initialize X_indices as a numpy matrix of zeros and the correct shape (≈ 1 line)
    X_indices = np.zeros((m, max_len))

    for i in range(m):   # loop over training examples
        # Convert the ith training sentence to lower case and split it into words. You should get a list of words.
        sentence_words = X[i].lower().split()

        # Initialize j to 0
        j = 0

        # Loop over the words of sentence_words
        for w in sentence_words:
            # Set the (i,j)th entry of X_indices to the index of the correct word.
            X_indices[i, j] = word_to_index[w]
            # Increment j to j + 1
            j = j + 1
    ### END CODE HERE ###

    return X_indices
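A quick usage check. The example sentences are arbitrary, and the exact index values depend on the GloVe vocabulary, so no output is claimed here:

X1 = np.array(["funny lol", "lets play baseball", "food is ready for you"])
X1_indices = sentences_to_indices(X1, word_to_index, max_len=5)
print("X1 =", X1)
print("X1_indices =", X1_indices)   # shape (3, 5); unused positions stay 0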
Implement pretrained_embedding_layer():

- initialize the embedding matrix, paying attention to its shape (see the code below);
- fill the embedding matrix with the vectors from word_to_vec_map;
- define the Keras Embedding layer and set trainable=False so it cannot be modified during training (setting it to True would let the optimizer update the word embeddings);
- set the layer's weights to the embedding matrix.

# GRADED FUNCTION: pretrained_embedding_layer

def pretrained_embedding_layer(word_to_vec_map, word_to_index):
    """
    Creates a Keras Embedding() layer and loads in pre-trained GloVe 50-dimensional vectors.

    Arguments:
        word_to_vec_map -- dictionary mapping words to their GloVe vector representation.
        word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)

    Returns:
        embedding_layer -- pretrained layer Keras instance
    """
    vocab_len = len(word_to_index) + 1                 # adding 1 to fit Keras embedding (requirement)
    emb_dim = word_to_vec_map["cucumber"].shape[0]     # define dimensionality of your GloVe word vectors (= 50)

    ### START CODE HERE ###
    # Initialize the embedding matrix as a numpy array of zeros of shape (vocab_len, dimensions of word vectors = emb_dim)
    emb_matrix = np.zeros((vocab_len, emb_dim))

    # Set each row "index" of the embedding matrix to be the word vector representation of the "index"th word of the vocabulary
    for word, index in word_to_index.items():
        emb_matrix[index, :] = word_to_vec_map[word]

    # Define Keras embedding layer with the correct output/input sizes. Make sure to set trainable=False.
    embedding_layer = Embedding(vocab_len, emb_dim, trainable=False)
    ### END CODE HERE ###

    # Build the embedding layer, it is required before setting the weights of the embedding layer. Do not modify the "None".
    embedding_layer.build((None,))

    # Set the weights of the embedding layer to the embedding matrix. Your layer is now pretrained.
    embedding_layer.set_weights([emb_matrix])

    return embedding_layer

2.4 Building Emojifier-V2

https://keras.io/zh/layers/core/#input
https://keras.io/zh/layers/embeddings/#embedding
https://keras.io/zh/layers/recurrent/#lstm
https://keras.io/zh/layers/core/#dropout
https://keras.io/zh/layers/core/#dense
https://keras.io/zh/activations/
https://keras.io/zh/models/about-keras-models/#model

# GRADED FUNCTION: Emojify_V2

def Emojify_V2(input_shape, word_to_vec_map, word_to_index):
    """
    Function creating the Emojify-v2 model's graph.

    Arguments:
        input_shape -- shape of the input, usually (max_len,)
        word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation
        word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)

    Returns:
        model -- a model instance in Keras
    """
    ### START CODE HERE ###
    # Define sentence_indices as the input of the graph, it should be of shape input_shape and dtype 'int32' (as it contains indices).
    sentence_indices = Input(input_shape, dtype='int32')

    # Create the embedding layer pretrained with GloVe Vectors (≈1 line)
    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)

    # Propagate sentence_indices through your embedding layer, you get back the embeddings
    embeddings = embedding_layer(sentence_indices)

    # Propagate the embeddings through an LSTM layer with 128-dimensional hidden state
    # Be careful, the returned output should be a batch of sequences.
    X = LSTM(128, return_sequences=True)(embeddings)
    # Add dropout with a probability of 0.5
    X = Dropout(rate=0.5)(X)
    # Propagate X through another LSTM layer with 128-dimensional hidden state
    # Be careful, the returned output should be a single hidden state, not a batch of sequences.
    X = LSTM(128, return_sequences=False)(X)
    # Add dropout with a probability of 0.5
    X = Dropout(rate=0.5)(X)
    # Propagate X through a Dense layer with softmax activation to get back a batch of 5-dimensional vectors.
    X = Dense(5)(X)
    # Add a softmax activation
    X = Activation('softmax')(X)

    # Create Model instance which converts sentence_indices into X.
    model = Model(inputs=sentence_indices, outputs=X)
    ### END CODE HERE ###

    return model

Create the model:

model = Emojify_V2((maxLen,), word_to_vec_map, word_to_index)
model.summary()

Output:

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         (None, 10)                0
_________________________________________________________________
embedding_4 (Embedding)      (None, 10, 50)            20000050
_________________________________________________________________
lstm_3 (LSTM)                (None, 10, 128)           91648
_________________________________________________________________
dropout_1 (Dropout)          (None, 10, 128)           0
_________________________________________________________________
lstm_4 (LSTM)                (None, 128)               131584
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645
_________________________________________________________________
activation_1 (Activation)    (None, 5)                 0
=================================================================
Total params: 20,223,927
Trainable params: 223,877
Non-trainable params: 20,000,050
_________________________________________________________________

Note: the 20,000,050 non-trainable parameters come from the 400,001 vocabulary words times the 50-dimensional word vectors.

Configure the model:

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Train the model. First convert X and Y to the right format:

X_train_indices = sentences_to_indices(X_train, word_to_index, maxLen)
Y_train_oh = convert_to_one_hot(Y_train, C=5)

Train:

model.fit(X_train_indices, Y_train_oh, epochs=50, batch_size=32, shuffle=True)

Output:

WARNING:tensorflow:From c:\program files\python37\lib\site-packages\keras\backend\tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

Epoch 1/50
132/132 [==============================] - 1s 5ms/step - loss: 1.6088 - accuracy: 0.1970
Epoch 2/50
132/132 [==============================] - 0s 582us/step - loss: 1.5221 - accuracy: 0.3636
Epoch 3/50
132/132 [==============================] - 0s 574us/step - loss: 1.4762 - accuracy: 0.3939
(omitted)
Epoch 49/50
132/132 [==============================] - 0s 597us/step - loss: 0.0115 - accuracy: 1.0000
Epoch 50/50
132/132 [==============================] - 0s 582us/step - loss: 0.0182 - accuracy: 0.9924

Training accuracy is close to 100%.

Evaluate on the test set:

X_test_indices = sentences_to_indices(X_test, word_to_index, max_len=maxLen)
Y_test_oh = convert_to_one_hot(Y_test, C=5)
loss, acc = model.evaluate(X_test_indices, Y_test_oh)
print()
print("Test accuracy = ", acc)

Output:

56/56 [==============================] - 0s 2ms/step

Test accuracy =  0.875

Test accuracy is 87.5%.

Look at the mislabelled examples:

# This code allows you to see the mislabelled examples
C = 5
y_test_oh = np.eye(C)[Y_test.reshape(-1)]
X_test_indices = sentences_to_indices(X_test, word_to_index, maxLen)
pred = model.predict(X_test_indices)
for i in range(len(X_test)):
    x = X_test_indices
    num = np.argmax(pred[i])
    if(num != Y_test[i]):
        print('Expected emoji:' + label_to_emoji(Y_test[i]) + ' prediction: ' + X_test[i] + label_to_emoji(num).strip())

Output:

Expected emoji: prediction: work is hard
Expected emoji: prediction: This girl is messing with me ❤️
Expected emoji: prediction: work is horrible
Expected emoji: prediction: any suggestions for dinner
Expected emoji: prediction: you brighten my day ❤️
Expected emoji: prediction: go away ⚾
Expected emoji: prediction: I did not have breakfast ❤️

Test with your own example:

# Change the sentence below to see your prediction. Make sure all the words are in the Glove embeddings.
x_test = np.array(['not feeling happy'])
X_test_indices = sentences_to_indices(x_test, word_to_index, maxLen)
print(x_test[0] + ' ' + label_to_emoji(np.argmax(model.predict(X_test_indices))))

not feeling happy

This time the LSTM handles negations like "not". Other examples:

not very happy
very happy
i really love my wife ❤️

Summary

- If you have an NLP task with a small training set, word embeddings can help your algorithm significantly. They allow the model to handle words in the test set that never appeared in the training set.
- Training sequence models in Keras (and in most other deep learning frameworks) requires attention to a few important details:
  - To use mini-batches, the sequences need to be padded so that all examples in a mini-batch have the same length.
  - The Embedding() layer can be initialized with pre-trained values. These values can be kept fixed or trained further on your dataset; if the dataset is small, further training is usually not worth it.
  - LSTM() has a flag called return_sequences that decides whether to return every hidden state or only the last one.
  - You can use Dropout() right after LSTM() to regularize the network.

Original post: https://michael.blog.csdn.net/article/details/108902060
My CSDN blog: https://michael.blog.csdn.net/
Follow my WeChat official account (Michael阿明) — let's keep learning and improving together!