A Bidirectional Hierarchical Skip-Gram Model for Text Topic Embedding | |
Suncong Zheng; Hongyun Bao; Jiaming Xu; Yuexing Hao; Zhenyu Qi; Hongwei Hao | |
2016 | |
Conference date | 2016 |
Conference location | Canada |
Abstract | Taking advantage of the large-scale corpus on the web to mine the topics within texts effectively and efficiently is an essential problem in the era of big data. We focus on the problem of learning text topic embedding in an unsupervised manner, which enjoys the properties of efficiency and scalability. Text topic embedding represents words and documents in a semantic topic space, in which words and documents with similar topics are embedded close to each other. Compared with conventional topic models, which implicitly capture document-level word co-occurrence patterns, text topic embedding alleviates the data sparsity problem and captures the semantic relevance between different words and documents. To model text topic embedding, we propose a Bidirectional Hierarchical Skip-Gram model (BHSG) based on the skip-gram model. BHSG includes two components: a semantic generation module, which learns the semantic relevance between texts, and a topic enhance module, which produces the text topic embedding from the text embedding learned in the former module. We evaluated our method on two kinds of topic-related tasks: text classification and information retrieval. The experimental results on four public datasets and one dataset that we provide all demonstrate that our proposed method can achieve better performance. |
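Proceedings publisher | IEEE |
Place of proceedings publication | Canada |
Content type | Conference paper |
Source URL | [http://ir.ia.ac.cn/handle/173211/40650] |
Collection | Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing |
Author affiliation | CASIA |
Recommended citation (GB/T 7714) | Suncong Zheng, Hongyun Bao, Jiaming Xu, et al. A Bidirectional Hierarchical Skip-Gram Model for Text Topic Embedding[C]. In:. Canada. 2016. |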