This paper uses NTNs (Neural Tensor Networks) to learn and predict relations between entity embeddings in a knowledge base, i.e., knowledge base completion. It rests on three main ideas.
A triple (e1, R, e2) states that relation R holds between entities e1 and e2. The model is trained to score existing triples high and negative samples low. The scoring function in this paper is a bilinear one, built from three terms: a tensor product e1' W e2, a linear layer V [e1; e2], and a bias b.
In my opinion, e1' W e2 captures multiplicative interactions between the two entity vectors, V [e1; e2] is a linear combination, and b is a bias; W, V, and b are relation-specific parameters of the model.
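A minimal NumPy sketch of this scoring function may help. Dimensions, parameter names, and the tanh nonlinearity are illustrative choices here, not taken from the paper's released code:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 2  # embedding dimension and number of tensor slices (toy sizes)

# Hypothetical per-relation parameters (random stand-ins)
W = rng.standard_normal((k, d, d))   # bilinear tensor: k slices of d x d
V = rng.standard_normal((k, 2 * d))  # linear layer over [e1; e2]
b = rng.standard_normal(k)           # bias
u = rng.standard_normal(k)           # output weights

def ntn_score(e1, e2):
    """g(e1, R, e2) = u' tanh(e1' W e2 + V [e1; e2] + b)."""
    bilinear = np.array([e1 @ W[i] @ e2 for i in range(k)])  # e1' W^[i] e2 per slice
    linear = V @ np.concatenate([e1, e2])
    return u @ np.tanh(bilinear + linear + b)

e1, e2 = rng.standard_normal(d), rng.standard_normal(d)
print(ntn_score(e1, e2))  # a single scalar score for the triple
```

Note how the bilinear term multiplies the two entity vectors together while the V term only sums weighted copies of them, which is exactly the multiplication-vs-summation distinction discussed below.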
The loss function is a contrastive max-margin objective. For each existing (positive) triple (e1, R, e2), either e1 or e2 is replaced with a randomly chosen entity e' to form a corrupted (negative) triple such as (e1, R, e'); the loss is then
max(0, 1 - (g(e1, R, e2) - g(e1, R, e'))) plus L2 regularization, where g is the scoring function.
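The hinge loss and the corruption step can be sketched as follows; the entity list and the scalar scores are toy stand-ins for a real model's output:

```python
import random

def margin_loss(g_pos, g_neg, margin=1.0):
    """Hinge: zero loss only if the positive triple outscores the corrupted one by the margin."""
    return max(0.0, margin - (g_pos - g_neg))

entities = ["Paris", "France", "Berlin", "Germany"]

def corrupt(triple, entities, rng=random):
    """Negative sampling: replace head or tail with a random entity."""
    e1, r, e2 = triple
    if rng.random() < 0.5:
        return (rng.choice(entities), r, e2)
    return (e1, r, rng.choice(entities))

# Toy scores standing in for g(e1, R, e2)
print(margin_loss(2.3, 0.5))  # 0.0: margin already satisfied
print(margin_loss(1.0, 0.8))  # 0.8: margin violated, gradient flows
print(corrupt(("Paris", "capital_of", "France"), entities))
```

Because the loss is zero once the margin is satisfied, training effort concentrates on triples the model still confuses with their corrupted versions.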
Representing entities through the words they contain injects semantic information, making unannotated text a usable new information source.
This improves performance over the plain NTN model with independent entity vectors.
Initializing with pre-trained word embeddings brings in distributional statistics from a large corpus and improves results further.
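The idea can be sketched as averaging word vectors to form an entity vector. The dictionary below holds random stand-ins; in the paper, these slots would be filled with embeddings pre-trained on a large unsupervised corpus:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical pre-trained word vectors (random stand-ins, dimension 4)
word_vecs = {w: rng.standard_normal(4)
             for w in ["african", "elephant", "bengal", "tiger"]}

def entity_vector(name):
    """Represent a multi-word entity as the average of its word vectors."""
    words = name.split("_")
    return np.mean([word_vecs[w] for w in words], axis=0)

v = entity_vector("african_elephant")
# Entities like "bengal_tiger" and a hypothetical "siberian_tiger" would
# share the "tiger" vector, so statistical strength is shared across entities.
```

This sharing is what lets word-level information transfer to entities never seen together in a training triple.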
Information sources matter for models; they need not be new ones, and sometimes the previously ignored ones help most.
Relations between vectors can be modeled as multiplicative interactions (bilinear forms) or as traditional linear combinations; the two can be understood as variable multiplication versus summation operations.
Contrastive max-margin is an often-used way to construct loss functions for non-probabilistic problems.
Negative sampling is widely used to construct negative examples.
Word embeddings are important information sources for KB-like, text-based applications.