A walkthrough of the TensorFlow word2vec demo
word2vec comes in two flavours: CBOW and Skip-Gram.
CBOW predicts the center word from its context; Skip-Gram does the opposite and predicts the context from the center word.
The first part of this post walks through the Skip-Gram model, based on the official TensorFlow demo. The second part is a lightly modified CBOW version, mainly following:
https://www.cnblogs.com/pinard/p/7160330.html
The two parts are separated by a line of ###########################.
OK, let's get started!
###################################################################################################
The official TensorFlow demo:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/word2vec
(一) First, import the packages the rest of the code relies on (the post originally only showed a few of them; the remaining imports are the ones used later) and parse a --log_dir argument for TensorBoard summaries. Nothing special here:
import argparse
import collections
import math
import os
import random
import sys

import numpy as np
import tensorflow as tf
from six.moves import xrange  # pylint: disable=redefined-builtin
from tensorflow.contrib.tensorboard.plugins import projector

current_path = os.path.dirname(os.path.realpath(sys.argv[0]))

parser = argparse.ArgumentParser()
parser.add_argument(
    '--log_dir',
    type=str,
    default=os.path.join(current_path, 'log'),
    help='The log directory for TensorBoard summaries.')
FLAGS, unparsed = parser.parse_known_args()
print(FLAGS)
print(unparsed)
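Section (二), downloading and reading the corpus, is not reproduced in this post. For reference, in the official demo it looks roughly like the following sketch; this is what produces the `vocabulary` word list used in the next step (details such as the exact file size check may differ between demo versions):

from tempfile import gettempdir
import zipfile
from six.moves import urllib

url = 'http://mattmahoney.net/dc/'

def maybe_download(filename, expected_bytes):
  """Download a file if not present, and make sure it's the right size."""
  local_filename = os.path.join(gettempdir(), filename)
  if not os.path.exists(local_filename):
    local_filename, _ = urllib.request.urlretrieve(url + filename, local_filename)
  statinfo = os.stat(local_filename)
  if statinfo.st_size == expected_bytes:
    print('Found and verified', filename)
  else:
    raise Exception('Failed to verify ' + local_filename)
  return local_filename

filename = maybe_download('text8.zip', 31344016)

def read_data(filename):
  """Extract the first file enclosed in a zip file as a list of words."""
  with zipfile.ZipFile(filename) as f:
    data = tf.compat.as_str(f.read(f.namelist()[0])).split()
  return data

vocabulary = read_data(filename)
print('Data size', len(vocabulary))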
(三) Next, the downloaded corpus (the word list `vocabulary`) is turned into a dataset (with a few small modifications here):
vocabulary_size = 50000

def build_dataset(words, n_words):
  """Process raw inputs into a dataset."""
  count = [['UNK', -1]]
  count.extend(collections.Counter(words).most_common(n_words - 1))
  dictionary = dict()
  for word, _ in count:
    dictionary[word] = len(dictionary)
  data = list()
  unk_count = 0
  for word in words:
    index = dictionary.get(word, 0)
    if index == 0:  # dictionary['UNK']
      unk_count += 1
    data.append(index)
  count[0][1] = unk_count
  reversed_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
  return data, count, dictionary, reversed_dictionary

data, count, dictionary, reverse_dictionary = build_dataset(
    vocabulary, vocabulary_size)
del vocabulary # Hint to reduce memory.
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10], [reverse_dictionary[i] for i in data[:10]])
The line `count.extend(collections.Counter(words).most_common(n_words - 1))` counts how often each word occurs and keeps the most frequent ones (49,999 of them, plus 'UNK' for everything else) as the vocabulary; `extend` appends that list of (word, count) pairs to `count`. `data` is then the whole corpus re-encoded as word ids, `dictionary` maps word to id, and `reverse_dictionary` maps id back to word.
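To make the return values concrete, here is a tiny toy run (the words and the vocabulary size are hypothetical, and the order of equal-count words inside `count` may vary):

toy_words = ['the', 'quick', 'fox', 'the', 'lazy', 'dog', 'the']   # hypothetical corpus
toy_data, toy_count, toy_dict, toy_rev = build_dataset(toy_words, n_words=4)
print(toy_count)  # [['UNK', 2], ('the', 3), ('quick', 1), ('fox', 1)]
print(toy_dict)   # {'UNK': 0, 'the': 1, 'quick': 2, 'fox': 3}
print(toy_data)   # [1, 2, 3, 1, 0, 0, 1]   ('lazy' and 'dog' became UNK = 0)
print(toy_rev)    # {0: 'UNK', 1: 'the', 2: 'quick', 3: 'fox'}

Next comes the Skip-Gram batch generator: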
data_index = 0

def generate_batch(batch_size, num_skips, skip_window):
  global data_index
  assert batch_size % num_skips == 0
  assert num_skips <= 2 * skip_window
  batch = np.ndarray(shape=(batch_size), dtype=np.int32)
  labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
  span = 2 * skip_window + 1  # [ skip_window target skip_window ]
  buffer = collections.deque(maxlen=span)  # pylint: disable=redefined-builtin
  if data_index + span > len(data):
    data_index = 0
  buffer.extend(data[data_index:data_index + span])
  data_index += span
  for i in range(batch_size // num_skips):
    context_words = [w for w in range(span) if w != skip_window]
    words_to_use = random.sample(context_words, num_skips)
    for j, context_word in enumerate(words_to_use):
      batch[i * num_skips + j] = buffer[skip_window]
      labels[i * num_skips + j, 0] = buffer[context_word]
    if data_index == len(data):
      buffer.extend(data[0:span])
      data_index = span
    else:
      buffer.append(data[data_index])
      data_index += 1
  # Backtrack a little bit to avoid skipping words in the end of a batch
  data_index = (data_index + len(data) - span) % len(data)
  return batch, labels

batch, labels = generate_batch(batch_size=8, num_skips=2, skip_window=1)
for i in range(8):
  print(batch[i], reverse_dictionary[batch[i]], '->', labels[i, 0],
        reverse_dictionary[labels[i, 0]])
batch_size: the batch size.
num_skips: how many times a single center word is reused, i.e. how many (center, context) pairs are generated from it. For example, with num_skips=2 and the sentence "i love tensorflow very much ..........", when "tensorflow" is chosen as the center word it is used twice when producing labels:
tensorflow -> love    tensorflow -> very
skip_window: how many context words to consider on each side. With skip_window=1 the context is one word to the left and one to the right; with skip_window=2 it is two words on each side.
span: the size of the fixed window (the usual name is "sliding window") that moves over the data while batching; as the code shows, span = 2 * skip_window + 1 (context on both sides plus the center word).
buffer = collections.deque(maxlen=span): allocates that fixed-size window (here span = 3); the deque can hold at most span words at a time, so appending a new word automatically drops the oldest one. A tiny sketch of this behaviour follows.
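A minimal sketch of the sliding-window behaviour of a bounded deque (the ids are made up):

from collections import deque

window = deque(maxlen=3)        # same idea as buffer with span = 3
for word_id in [2, 23, 56, 3]:  # hypothetical word ids
  window.append(word_id)
  print(list(window))
# [2]
# [2, 23]
# [2, 23, 56]
# [23, 56, 3]   <- appending a 4th id pushes the oldest one out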
So the process goes like this. Suppose batch_size=6, num_skips=2, skip_window=1, and the data is the toy sentence
( I am looking for the missing glass-shoes who has picked it up ............. )
with word ids
2 23 56 3 45 84 123 45 23 12 1 14 ...............
batch_size // num_skips = 3, so the outer loop runs 3 times.
i=0: 2, 23, 56 ("I am looking") are loaded into the buffer first, and after this initial load data_index = 3, i.e. the word "for". `context_words = [w for w in range(span) if w != skip_window]` selects the window positions other than the target, i.e. the context; `batch[i * num_skips + j] = buffer[skip_window]` always takes the middle of the window as the target (skip_window=1), so batch becomes [23, 23]; and `labels[i * num_skips + j, 0] = buffer[context_word]` takes the context, 2 and 56, as the labels, so labels = [2, 56] (it could also be [56, 2], because the context words are drawn in random order). At the end of the iteration "for" is appended to the buffer and data_index moves to 4 ("the").
i=1: the buffer now holds 23, 56, 3, so after the assignments batch = [23, 23, 56, 56] and labels = [2, 56, 23, 3] (again, the order within each pair may differ); "the" is appended and data_index moves to 5 ("missing").
i=2: the buffer now holds 56, 3, 45, so after the assignments batch = [23, 23, 56, 56, 3, 3] and labels = [2, 56, 23, 3, 56, 45] (same caveat about order); "missing" is appended and data_index moves to 6.
The loop then ends, and one batch of size 6 has been produced:
batch = [23, 23, 56, 56, 3, 3]   labels = [2, 56, 23, 3, 56, 45]
Finally `data_index = (data_index + len(data) - span) % len(data)` backtracks data_index by span (3) positions, from 6 back to 3, i.e. to the word "for". Because data_index is a global variable, the next call loads the buffer starting at "for", so its first window is "for the missing" and its first target word is "the": each batch picks up exactly where the previous one left off.
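You can reproduce the walkthrough above directly (this overwrites the real `data` and `data_index`, so it is purely an illustration, not something to run in the middle of actual training):

data = [2, 23, 56, 3, 45, 84, 123, 45, 23, 12, 1, 14]   # hypothetical toy ids from above
data_index = 0
batch, labels = generate_batch(batch_size=6, num_skips=2, skip_window=1)
print(batch)           # [23 23 56 56  3  3]
print(labels.ravel())  # [ 2 56 23  3 56 45]   (order inside each pair may vary)
print(data_index)      # 3 -> the next batch's window starts at "for"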
(七) Next, define the hyperparameters:
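The actual hyperparameter block is not reproduced in this post; the values used by the official demo are roughly the following (all of these names, plus the graph object, are referenced by the graph definition and the training loop below):

batch_size = 128
embedding_size = 128  # Dimension of the embedding vector.
skip_window = 1       # How many words to consider left and right.
num_skips = 2         # How many times to reuse an input to generate a label.
num_sampled = 64      # Number of negative examples to sample.

# Pick 16 random words out of the 100 most frequent ones; their nearest
# neighbors are printed during training as a sanity check.
valid_size = 16
valid_window = 100
valid_examples = np.random.choice(valid_window, valid_size, replace=False)

graph = tf.Graph()

The block that follows (referred to as (八) later in this post) then builds the computation graph: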
with graph.as_default():

  # Input data.
  with tf.name_scope('inputs'):
    train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
    valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

  # Ops and variables pinned to the CPU because of missing GPU implementation
  with tf.device('/cpu:0'):
    # Look up embeddings for inputs.
    with tf.name_scope('embeddings'):
      embeddings = tf.Variable(
          tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
      embed = tf.nn.embedding_lookup(embeddings, train_inputs)

    # Construct the variables for the NCE loss
    with tf.name_scope('weights'):
      nce_weights = tf.Variable(
          tf.truncated_normal([vocabulary_size, embedding_size],
                              stddev=1.0 / math.sqrt(embedding_size)))
    with tf.name_scope('biases'):
      nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

  # Compute the average NCE loss for the batch.
  # tf.nce_loss automatically draws a new sample of the negative labels each
  # time we evaluate the loss.
  # Explanation of the meaning of NCE loss:
  # http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
  with tf.name_scope('loss'):
    loss = tf.reduce_mean(
        tf.nn.nce_loss(
            weights=nce_weights,
            biases=nce_biases,
            labels=train_labels,
            inputs=embed,
            num_sampled=num_sampled,
            num_classes=vocabulary_size))

  # Add the loss value as a scalar to summary.
  tf.summary.scalar('loss', loss)

  # Construct the SGD optimizer using a learning rate of 1.0.
  with tf.name_scope('optimizer'):
    optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)

  # Compute the cosine similarity between minibatch examples and all embeddings.
  norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keepdims=True))
  normalized_embeddings = embeddings / norm
  valid_embeddings = tf.nn.embedding_lookup(normalized_embeddings, valid_dataset)
  similarity = tf.matmul(valid_embeddings, normalized_embeddings, transpose_b=True)

  # Merge all summaries.
  merged = tf.summary.merge_all()

  # Add variable initializer.
  init = tf.global_variables_initializer()

  # Create a saver.
  saver = tf.train.Saver()
This block is best read in two parts: one builds the Skip-Gram training path (embedding lookup plus NCE loss), the other computes the cosine similarity between the validation words and every embedding. Taking them in turn.
First, the tf.nn.embedding_lookup API is documented in its source:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/embedding_ops.py
Given the embedding matrix (vocabulary_size * embedding_size) and the batch of word ids, it simply gathers the corresponding rows, so embed has shape batch_size * embedding_size.
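The loss op is tf.nn.nce_loss. It helps to know that, in the TensorFlow 1.x source (nn_impl.py), nce_loss is essentially a thin wrapper: it builds one "true" logit and num_sampled noise logits per example and then reduces them with the two helpers discussed below. A simplified excerpt (optional arguments omitted; the exact code may differ slightly between versions):

# Simplified from tensorflow/python/ops/nn_impl.py
def nce_loss(weights, biases, labels, inputs, num_sampled, num_classes):
  # One "true" logit plus num_sampled noise logits per example.
  logits, labels = _compute_sampled_logits(
      weights=weights, biases=biases, labels=labels, inputs=inputs,
      num_sampled=num_sampled, num_classes=num_classes, num_true=1,
      subtract_log_q=True)
  # Each column is scored as an independent binary decision:
  # "is this the real context word or a sampled noise word?"
  sampled_losses = sigmoid_cross_entropy_with_logits(
      labels=labels, logits=logits, name="sampled_losses")
  # sampled_losses has shape [batch_size, 1 + num_sampled]; summing each row
  # gives one NCE loss value per example.
  return _sum_rows(sampled_losses)

(2) Next, sigmoid_cross_entropy_with_logits; its TensorFlow 1.x source (reformatted) is: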
def sigmoid_cross_entropy_with_logits(  # pylint: disable=invalid-name
    _sentinel=None,
    labels=None,
    logits=None,
    name=None):
  """Computes sigmoid cross entropy given `logits`.

  Measures the probability error in discrete classification tasks in which each
  class is independent and not mutually exclusive. For instance, one could
  perform multilabel classification where a picture can contain both an
  elephant and a dog at the same time.

  For brevity, let `x = logits`, `z = labels`. The logistic loss is

        z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
      = z * -log(1 / (1 + exp(-x))) + (1 - z) * -log(exp(-x) / (1 + exp(-x)))
      = z * log(1 + exp(-x)) + (1 - z) * (-log(exp(-x)) + log(1 + exp(-x)))
      = z * log(1 + exp(-x)) + (1 - z) * (x + log(1 + exp(-x)))
      = (1 - z) * x + log(1 + exp(-x))
      = x - x * z + log(1 + exp(-x))

  For x < 0, to avoid overflow in exp(-x), we reformulate the above

        x - x * z + log(1 + exp(-x))
      = log(exp(x)) - x * z + log(1 + exp(-x))
      = - x * z + log(1 + exp(x))

  Hence, to ensure stability and avoid overflow, the implementation uses this
  equivalent formulation

      max(x, 0) - x * z + log(1 + exp(-abs(x)))

  `logits` and `labels` must have the same type and shape.

  Args:
    _sentinel: Used to prevent positional parameters. Internal, do not use.
    labels: A `Tensor` of the same type and shape as `logits`.
    logits: A `Tensor` of type `float32` or `float64`.
    name: A name for the operation (optional).

  Returns:
    A `Tensor` of the same shape as `logits` with the componentwise
    logistic losses.

  Raises:
    ValueError: If `logits` and `labels` do not have the same shape.
  """
  # pylint: disable=protected-access
  nn_ops._ensure_xent_args("sigmoid_cross_entropy_with_logits", _sentinel,
                           labels, logits)
  # pylint: enable=protected-access

  with ops.name_scope(name, "logistic_loss", [logits, labels]) as name:
    logits = ops.convert_to_tensor(logits, name="logits")
    labels = ops.convert_to_tensor(labels, name="labels")
    try:
      labels.get_shape().merge_with(logits.get_shape())
    except ValueError:
      raise ValueError("logits and labels must have the same shape (%s vs %s)"
                       % (logits.get_shape(), labels.get_shape()))

    # The logistic loss formula from above is
    #   x - x * z + log(1 + exp(-x))
    # For x < 0, a more numerically stable formula is
    #   -x * z + log(1 + exp(x))
    # Note that these two expressions can be combined into the following:
    #   max(x, 0) - x * z + log(1 + exp(-abs(x)))
    # To allow computing gradients at zero, we define custom versions of max
    # and abs functions.
    zeros = array_ops.zeros_like(logits, dtype=logits.dtype)
    cond = (logits >= zeros)
    relu_logits = array_ops.where(cond, logits, zeros)
    neg_abs_logits = array_ops.where(cond, -logits, logits)
    return math_ops.add(
        relu_logits - logits * labels,
        math_ops.log1p(math_ops.exp(neg_abs_logits)),
        name=name)
The key part is the formula at the top of the docstring:
z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
z * -log(p) + (1 - z) * -log(1 - p) is just the cross-entropy between the label z and a probability p, so yes, this function simply applies a sigmoid to the logits and then computes the cross-entropy.
As shown above, this simplifies to x - x * z + log(1 + exp(-x)).
However, when x is very negative, exp(-x) can overflow, so for x < 0 the equivalent form -x * z + log(1 + exp(x)) is used instead.
Combining the two cases gives the numerically stable expression that the implementation actually computes:
max(x, 0) - x * z + log(1 + exp(-abs(x)))
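A quick numerical check of why the max/abs form is preferred (plain numpy, just to illustrate the overflow; not part of the demo):

import numpy as np

def naive_loss(x, z):
  # Direct form x - x*z + log(1 + exp(-x)): overflows when x is very negative.
  return x - x * z + np.log1p(np.exp(-x))

def stable_loss(x, z):
  # The reformulated version used by TensorFlow.
  return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

x, z = -1000.0, 1.0
print(naive_loss(x, z))   # inf (exp(1000) overflows, with a RuntimeWarning)
print(stable_loss(x, z))  # 1000.0, the correct value of -log(sigmoid(-1000))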
For background on cross-entropy, see:
https://blog.csdn.net/rtygbwwwerr/article/details/50778098
(3) Finally, a look at _sum_rows.
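Its TensorFlow 1.x implementation is tiny: it just sums each row of the per-example loss matrix, using a matmul with a column of ones because that had a cheaper gradient than reduce_sum at the time. From memory it looks roughly like this (the exact code may differ between versions):

def _sum_rows(x):
  """Returns a vector summing up each row of the matrix x."""
  # Equivalent to math_ops.reduce_sum(x, 1), but with a cheaper gradient.
  cols = array_ops.shape(x)[1]
  ones_shape = array_ops.stack([cols, 1])
  ones = array_ops.ones(ones_shape, x.dtype)
  return array_ops.reshape(math_ops.matmul(x, ones), [-1])

So nce_loss boils down to: for each training pair, one sigmoid cross-entropy term for the true context word plus num_sampled terms for the noise words, summed per example; the graph then takes reduce_mean over the batch. With the graph defined, the next block runs the training session: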
num_steps = 100001

with tf.Session(graph=graph) as session:
  # Open a writer to write summaries.
  writer = tf.summary.FileWriter(FLAGS.log_dir, session.graph)

  # We must initialize all variables before we use them.
  init.run()
  print('Initialized')

  average_loss = 0
  for step in xrange(num_steps):
    batch_inputs, batch_labels = generate_batch(batch_size, num_skips,
                                                skip_window)
    feed_dict = {train_inputs: batch_inputs, train_labels: batch_labels}

    # Define metadata variable.
    run_metadata = tf.RunMetadata()

    # We perform one update step by evaluating the optimizer op (including it
    # in the list of returned values for session.run()
    # Also, evaluate the merged op to get all summaries from the returned
    # "summary" variable. Feed metadata variable to session for visualizing
    # the graph in TensorBoard.
    _, summary, loss_val = session.run(
        [optimizer, merged, loss],
        feed_dict=feed_dict,
        run_metadata=run_metadata)
    average_loss += loss_val

    # Add returned summaries to writer in each step.
    writer.add_summary(summary, step)
    # Add metadata to visualize the graph for the last run.
    if step == (num_steps - 1):
      writer.add_run_metadata(run_metadata, 'step%d' % step)

    if step % 2000 == 0:
      if step > 0:
        average_loss /= 2000
      # The average loss is an estimate of the loss over the last 2000 batches.
      print('Average loss at step ', step, ': ', average_loss)
      average_loss = 0

    # Note that this is expensive (~20% slowdown if computed every 500 steps)
    if step % 10000 == 0:
      sim = similarity.eval()
      for i in xrange(valid_size):
        valid_word = reverse_dictionary[valid_examples[i]]
        top_k = 8  # number of nearest neighbors
        nearest = (-sim[i, :]).argsort()[1:top_k + 1]
        log_str = 'Nearest to %s:' % valid_word
        for k in xrange(top_k):
          close_word = reverse_dictionary[nearest[k]]
          log_str = '%s %s,' % (log_str, close_word)
        print(log_str)
  final_embeddings = normalized_embeddings.eval()

  # Write corresponding labels for the embeddings.
  with open(FLAGS.log_dir + '/metadata.tsv', 'w') as f:
    for i in xrange(vocabulary_size):
      f.write(reverse_dictionary[i] + '\n')

  # Save the model for checkpoints.
  saver.save(session, os.path.join(FLAGS.log_dir, 'model.ckpt'))

  # Create a configuration for visualizing embeddings with the labels in TensorBoard.
  config = projector.ProjectorConfig()
  embedding_conf = config.embeddings.add()
  embedding_conf.tensor_name = embeddings.name
  embedding_conf.metadata_path = os.path.join(FLAGS.log_dir, 'metadata.tsv')
  projector.visualize_embeddings(writer, config)

writer.close()
This part of the source is straightforward. In order, it:
(1) trains the model;
(2) on the last step, records run metadata so the graph can be inspected in TensorBoard afterwards;
(3) every 2,000 steps, prints the average loss over the last 2,000 batches;
(4) every 10,000 steps, prints, for each of the 16 validation words chosen in (八), its 8 nearest neighbours by cosine similarity;
(5) writes the vocabulary words to metadata.tsv so that TensorBoard can label the trained embeddings when visualizing them via dimensionality reduction;
(6) saves the model as a .ckpt checkpoint and writes the projector configuration.
(十) 2-D visualization.
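The original post shows this section as screenshots. In the official demo, the 2-D visualization reduces final_embeddings with t-SNE and plots the first 500 words, roughly like this (requires scikit-learn and matplotlib; saving into the log directory is my choice here, the demo writes to a temp directory):

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_with_labels(low_dim_embs, labels, filename):
  assert low_dim_embs.shape[0] >= len(labels), 'More labels than embeddings'
  plt.figure(figsize=(18, 18))  # in inches
  for i, label in enumerate(labels):
    x, y = low_dim_embs[i, :]
    plt.scatter(x, y)
    plt.annotate(label, xy=(x, y), xytext=(5, 2),
                 textcoords='offset points', ha='right', va='bottom')
  plt.savefig(filename)

tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000, method='exact')
plot_only = 500
low_dim_embs = tsne.fit_transform(final_embeddings[:plot_only, :])
labels = [reverse_dictionary[i] for i in xrange(plot_only)]
plot_with_labels(low_dim_embs, labels, os.path.join(FLAGS.log_dir, 'tsne.png'))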
###################################################################################################
Part two: the CBOW model (a lightly modified version of the demo above). Its generate_batch returns, for every center word, all of the surrounding context words:

data_index = 0

def generate_batch(batch_size, cbow_window):
  global data_index
  assert cbow_window % 2 == 1
  span = 2 * cbow_window + 1
  # Drop the center word: each sample keeps span - 1 context words.
  batch = np.ndarray(shape=(batch_size, span - 1), dtype=np.int32)
  labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
  buffer = collections.deque(maxlen=span)
  for _ in range(span):
    buffer.append(data[data_index])
    # Cycle through data; wrap back to the start when the end is reached.
    data_index = (data_index + 1) % len(data)
  for i in range(batch_size):
    # target at the center of span
    target = cbow_window
    # Only the context words are needed as inputs, not the center word itself.
    target_to_avoid = [cbow_window]
    col_idx = 0
    for j in range(span):
      # Skip the center word.
      if j == span // 2:
        continue
      batch[i, col_idx] = buffer[j]
      col_idx += 1
    labels[i, 0] = buffer[target]
    # Update (slide) the buffer.
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  return batch, labels

batch, labels = generate_batch(batch_size=8, cbow_window=1)
for i in range(8):
  print(reverse_dictionary[batch[i, 0]], 'and', reverse_dictionary[batch[i, 1]],
        '->', reverse_dictionary[labels[i, 0]])
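The same kind of sanity check as for Skip-Gram, again with hypothetical ids (it overwrites the real data, so it is only an illustration):

data = [2, 23, 56, 3, 45, 84, 123, 45]   # hypothetical toy ids
data_index = 0
batch, labels = generate_batch(batch_size=4, cbow_window=1)
print(batch)            # [[ 2 56]
                        #  [23  3]
                        #  [56 45]
                        #  [ 3 84]]
print(labels.ravel())   # [23 56  3 45]

The graph is almost identical to the Skip-Gram one; the only real difference is how the input fed to the NCE loss is built: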
with graph.as_default():

  # Input data.
  with tf.name_scope('inputs'):
    train_dataset = tf.placeholder(tf.int32, shape=[batch_size, 2 * cbow_window])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
    valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

  # Ops and variables pinned to the CPU because of missing GPU implementation
  with tf.device('/cpu:0'):
    # Look up embeddings for inputs.
    with tf.name_scope('embeddings'):
      embeddings = tf.Variable(
          tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))

    # Construct the variables for the NCE loss
    with tf.name_scope('weights'):
      nce_weights = tf.Variable(
          tf.truncated_normal([vocabulary_size, embedding_size],
                              stddev=1.0 / math.sqrt(embedding_size)))
    with tf.name_scope('biases'):
      nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

    embeds = None
    for i in range(2 * cbow_window):
      embedding_i = tf.nn.embedding_lookup(embeddings, train_dataset[:, i])
      print('embedding %d shape: %s' % (i, embedding_i.get_shape().as_list()))
      emb_x, emb_y = embedding_i.get_shape().as_list()
      if embeds is None:
        embeds = tf.reshape(embedding_i, [emb_x, emb_y, 1])
      else:
        embeds = tf.concat([embeds, tf.reshape(embedding_i, [emb_x, emb_y, 1])], 2)

    print("Concat embedding size: %s" % embeds.get_shape().as_list())
    avg_embed = tf.reduce_mean(embeds, 2, keep_dims=False)
    print("Avg embedding size: %s" % avg_embed.get_shape().as_list())
    print('--------------------------------------------------------------------------------------------')
    print(avg_embed.shape)
    print(train_labels.shape)
    print('--------------------------------------------------------------------------------------------')

  # Compute the average NCE loss for the batch.
  # tf.nce_loss automatically draws a new sample of the negative labels each
  # time we evaluate the loss.
  # Explanation of the meaning of NCE loss:
  # http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
  with tf.name_scope('loss'):
    loss = tf.reduce_mean(
        tf.nn.nce_loss(
            weights=nce_weights,
            biases=nce_biases,
            labels=train_labels,
            inputs=avg_embed,
            num_sampled=num_sampled,
            num_classes=vocabulary_size))

  # Add the loss value as a scalar to summary.
  tf.summary.scalar('loss', loss)

  # Construct the SGD optimizer using a learning rate of 1.0.
  with tf.name_scope('optimizer'):
    optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)

  # Compute the cosine similarity between minibatch examples and all embeddings.
  norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keepdims=True))
  normalized_embeddings = embeddings / norm
  valid_embeddings = tf.nn.embedding_lookup(normalized_embeddings, valid_dataset)
  similarity = tf.matmul(valid_embeddings, normalized_embeddings, transpose_b=True)

  # Merge all summaries.
  merged = tf.summary.merge_all()

  # Add variable initializer.
  init = tf.global_variables_initializer()

  # Create a saver.
  saver = tf.train.Saver()
The one subtle point here is getting the shapes of batch and labels to match what the loss expects. Assume the embedding dimension is 128 throughout.
For the Skip-Gram model, generate_batch returns
batch: shape (batch_size,)
labels: shape (batch_size, 1)
and after (八)'s embed = tf.nn.embedding_lookup(embeddings, train_inputs) (with train_inputs fed from batch),
embed: batch_size * 128
labels: batch_size * 1
which is exactly the pair of shapes handed to tf.nn.nce_loss for training.
For the CBOW model, on the other hand, generate_batch returns
batch: shape (batch_size, span - 1), here (batch_size, 2)
labels: shape (batch_size, 1)
If we reused the Skip-Gram graph unchanged, feeding this through (八)'s embed = tf.nn.embedding_lookup(embeddings, train_inputs) would raise an error: each sample now has two input (context) words instead of one, so the shapes no longer match what the placeholder and tf.nn.nce_loss expect.
The program therefore builds the input to the loss (avg_embed) as follows.
First it takes one of the two context columns (i = 0) and looks it up with embedding_i = tf.nn.embedding_lookup(embeddings, train_dataset[:, i]); embedding_i has shape batch_size * 128, is reshaped to batch_size * 128 * 1, and is assigned to embeds.
It then takes the remaining context column (i = 1), looks it up the same way, reshapes it to batch_size * 128 * 1 again, and concatenates it onto embeds along axis 2, so embeds now has shape batch_size * 128 * 2.
Next, avg_embed = tf.reduce_mean(embeds, 2, keep_dims=False) averages over that last axis, giving avg_embed shape batch_size * 128.
Finally,
avg_embed: batch_size * 128
labels: batch_size * 1
are what get passed to tf.nn.nce_loss for training, just as in the Skip-Gram case.
If that is not very intuitive, here is an example (batch_size = 8). Say batch is
[ [1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16] ]
and, writing the embedding row for word id r as [r.1, r.2, r.3, ..., r.128] purely as labels, embeddings is
[ [1.1, 1.2, 1.3, .....................................................1.128]
  [2.1, 2.2, 2.3, .....................................................2.128]
  [3.1, 3.2, 3.3, .....................................................3.128]
  ................................
  [50000.1, 50000.2, 50000.3, ..................50000.128] ]
Then embeds is
[ [ [1.1, 2.1], [1.2, 2.2], [1.3, 2.3], .........., [1.128, 2.128] ]
  [ [3.1, 4.1], [3.2, 4.2], [3.3, 4.3], .........., [3.128, 4.128] ]
  [ [5.1, 6.1], [5.2, 6.2], [5.3, 6.3], .........., [5.128, 6.128] ]
  [ [7.1, 8.1], [7.2, 8.2], [7.3, 8.3], .........., [7.128, 8.128] ]
  [ [9.1, 10.1], [9.2, 10.2], [9.3, 10.3], .........., [9.128, 10.128] ]
  [ [11.1, 12.1], [11.2, 12.2], [11.3, 12.3], .........., [11.128, 12.128] ]
  [ [13.1, 14.1], [13.2, 14.2], [13.3, 14.3], .........., [13.128, 14.128] ]
  [ [15.1, 16.1], [15.2, 16.2], [15.3, 16.3], .........., [15.128, 16.128] ] ]
and avg_embed averages each of those pairs (reduce_mean over axis 2), so, treating the labels as actual numbers, it is the batch_size * 128 matrix
[ [1.6, 1.7, 1.8, .........., 1.628]
  [3.6, 3.7, 3.8, .........., 3.628]
  [5.6, 5.7, 5.8, .........., 5.628]
  .........................
  [15.6, 15.7, 15.8, .........., 15.628] ]
where, for example, the first entry is (1.1 + 2.1) / 2 = 1.6.
So the CBOW model simply combines the two context words element-wise: strictly speaking it takes the mean of their 128-dimensional embeddings (reduce_mean, i.e. their sum divided by 2) and feeds that single vector in as the input. A quick numpy check of the shapes is below.
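A small numpy sketch of the same reshape / concat / mean shape flow (random data, purely to check the shapes and confirm that reduce_mean is an element-wise average; not part of the model code):

import numpy as np

batch_size, embedding_size = 8, 128
emb_left = np.random.rand(batch_size, embedding_size)    # stands in for the lookup of context column 0
emb_right = np.random.rand(batch_size, embedding_size)   # stands in for the lookup of context column 1

embeds = np.concatenate([emb_left[:, :, None], emb_right[:, :, None]], axis=2)
print(embeds.shape)        # (8, 128, 2)

avg_embed = embeds.mean(axis=2)
print(avg_embed.shape)     # (8, 128)
print(np.allclose(avg_embed, (emb_left + emb_right) / 2))  # True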
The rest of the code is essentially the same as for the Skip-Gram model.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Below, the original post shows some screenshots of the run results (not reproduced here).
In them, the final average loss of the CBOW model is 3.91, versus 4.69 for the Skip-Gram model, so the CBOW run ends with a somewhat lower loss, presumably because it uses two context words to predict one word while Skip-Gram uses a single word to predict its context.
The effect is also visible in the visualization.
Visualization in TensorBoard (screenshot in the original post):
Remember to click the part highlighted in red in the screenshot to see the three-dimensional distribution.
###################################################################################################
Full code: https://github.com/Mryangkaitong/tensorflow/tree/master/word2vec
For visualization, you can use TensorBoard's built-in embedding projector; see for example:
https://blog.csdn.net/szj_huhu/article/details/75308970
http://www.360doc.com/content/17/0706/14/10408243_669327270.shtml
and similar write-ups.
Alternatively, turn the data into a .csv/.tsv file and load it directly into the standalone embedding projector at:
http://projector.tensorflow.org/
https://blog.csdn.net/u010099080/article/details/53560426?fps=1&locationNum=11
For how to produce the .csv file, see:
https://jingyan.baidu.com/article/9c69d48ff3123d13c9024e06.html
