chainer.functions.negative_sampling

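Negative sampling loss function.

In natural language processing, especially language modeling, the vocabulary can be very large. Computing the gradient of the embedding matrix can therefore take a long time.

With negative sampling, only the gradients for a small number of sampled negative examples need to be computed.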

The objective function is:

$$f(x, p) = \log \sigma(x^\top w_p) + k E_{i \sim P(i)}[\log \sigma(-x^\top w_i)],$$

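where $\sigma(\cdot)$ is the sigmoid function, $w_i$ is the weight vector for word $i$, and $p$ is the positive example. It is approximated using the $k$ examples $N$ sampled from the probability distribution $P(i)$: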

$$f(x, p) \approx \log \sigma(x^\top w_p) + \sum_{n \in N} \log \sigma(-x^\top w_n).$$
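As an illustration of the approximated objective, the following is a minimal pure-Python sketch that evaluates it for one input vector. The helper names (`log_sigmoid`, `dot`, `approx_objective`) and the toy values are assumptions made for this example; this is not the library's implementation.

```python
import math


def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dot(a, b):
    # Inner product of two equal-length sequences.
    return sum(ai * bi for ai, bi in zip(a, b))


def approx_objective(x, W, p, negatives):
    # log sigma(x . w_p) + sum over sampled negatives n of log sigma(-x . w_n)
    positive_term = log_sigmoid(dot(x, W[p]))
    negative_term = sum(log_sigmoid(-dot(x, W[n])) for n in negatives)
    return positive_term + negative_term


# Hypothetical toy data: a 2-dimensional input and 3 word weight vectors.
x = [0.5, -1.0]
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
value = approx_objective(x, W, p=0, negatives=[1, 2])
print(value)
```

Each term is the log of a probability, so the value is never positive; it moves toward zero as the positive word scores high and the sampled negatives score low.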

Each sample in $N$ is drawn from the word distribution $P(w)$, which is computed as $P(w) = \frac{1}{Z} c(w)^{\alpha}$, where $c(w)$ is the unigram count of word $w$, $\alpha$ is a hyperparameter, and $Z$ is the normalization constant.
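The word distribution and a sampler with the "takes a shape, returns integers of that shape" contract can be sketched with the standard library alone. The names `unigram_distribution` and `make_sampler`, the counts, and the default `alpha=0.75` are assumptions chosen for illustration; the source does not prescribe a value for $\alpha$, and this sketch uses nested lists rather than an array type.

```python
import random


def unigram_distribution(counts, alpha=0.75):
    # P(w) = c(w)^alpha / Z, where Z normalizes the values to sum to 1.
    powered = [c ** alpha for c in counts]
    z = sum(powered)
    return [v / z for v in powered]


def make_sampler(probs, seed=0):
    # Returns a function mapping a (rows, cols) shape to nested lists of
    # word ids drawn from probs.
    rng = random.Random(seed)
    ids = list(range(len(probs)))

    def sampler(shape):
        rows, cols = shape
        return [rng.choices(ids, weights=probs, k=cols) for _ in range(rows)]

    return sampler


# Hypothetical unigram counts for a 3-word vocabulary.
counts = [100, 10, 1]
probs = unigram_distribution(counts)
sampler = make_sampler(probs)
samples = sampler((2, 5))  # e.g. batch size 2, 5 negatives per example
print(probs)
print(samples)
```

With $\alpha < 1$ the distribution is flatter than the raw frequencies, so rare words are sampled more often than their counts alone would suggest.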

Parameters:	x (Variable) – Batch of input vectors.
	t (Variable) – Vector of ground truth labels.
	W (Variable) – Weight matrix.
	sampler (FunctionType) – Sampling function. It takes a shape and returns an integer array of that shape. Each element of the array is a sample from the word distribution. A `WalkerAlias` object built with the power distribution of word frequency is recommended.
	sample_size (int) – Number of samples.
	reduce (str) – Reduction option. Its value must be either `'sum'` or `'no'`. Otherwise, `ValueError` is raised.
Returns:	A variable holding the loss value(s) calculated by the above equation. If `reduce` is `'no'`, the output variable holds an array whose shape is the same as one of (hence both of) the input variables. If it is `'sum'`, the output variable holds a scalar value.
Return type:	Variable
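
The behavior of the `reduce` option described above can be illustrated with a small stand-alone sketch. The function `reduce_losses` and the per-example loss values are hypothetical and operate on plain Python lists; this only mirrors the documented contract (`'sum'` gives a scalar, `'no'` keeps per-example values, anything else raises `ValueError`) and is not the library's code.

```python
def reduce_losses(per_example_losses, reduce="sum"):
    # Mirror the documented contract of the `reduce` option.
    if reduce not in ("sum", "no"):
        raise ValueError("reduce must be 'sum' or 'no', got %r" % (reduce,))
    if reduce == "sum":
        return sum(per_example_losses)  # single scalar
    return list(per_example_losses)     # one value per example


# Hypothetical per-example loss values for a batch of three.
losses = [0.5, 1.25, 0.25]
total = reduce_losses(losses, "sum")
kept = reduce_losses(losses, "no")
print(total, kept)
```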