Seq2Set2Seq: A Two-stage Disentangled Method for Reply Keyword Generation in Social Media
Liu, Jie (3,4); Li, Yaguang (6); He, Shizhu (1); Wu, Shun (1); Liu, Kang (1); Liu, Shenping (2); Wang, Jiong (6); Zhang, Qing (4,5)
Journal: ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING
2024-03-01
Volume: 23; Issue: 3; Pages: 20
Keywords: social media; reply keyword prediction; text generation; multi-label classification; determinantal point processes
ISSN: 2375-4699
DOI: 10.1145/3644074
Corresponding author: Zhang, Qing (zqicl@ncut.edu.cn)
Abstract: Social media produces large amounts of content every day. How to predict the potential influence of this content from the perspective of social reply feedback is a key issue that has not been explored. We therefore propose a novel task, reply keyword prediction in social media, which aims to predict the keywords of potential replies across as many aspects as possible. One prerequisite challenge is that no accessible social media dataset labels such keywords. To address this, we propose a new dataset for studying reply keyword prediction in social media. The task can be seen as single-turn dialogue keyword prediction for an open-domain dialogue system. However, existing methods for dialogue keyword prediction cannot be adopted directly because they have two main drawbacks. First, they provide no explicit mechanism for modeling topic complementarity between keywords, which is crucial in social media for controllably modeling all aspects of replies. Second, they do not explicitly model the collocations of keywords, which makes fine-grained prediction less controllable to optimize, since much less context is available than in dialogue. To address these issues, we propose a two-stage disentangled framework that optimizes complementarity and collocation explicitly and separately. In the first stage, we use a sequence-to-set paradigm, via multi-label prediction and determinantal point processes, to generate a set of keyword seeds that satisfy complementarity. In the second stage, we adopt a set-to-sequence paradigm, via a seq2seq model guided by the keyword seeds from the set, to generate finer-grained keywords with collocation. Experiments show that this method generates keywords that are not only more diverse but also more relevant and consistent. Furthermore, keywords obtained with this method achieve better reply generation results in a retrieval-based system than those of other methods.
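Illustrative sketch (not from the paper): the first stage described in the abstract selects keyword seeds that are both likely and mutually complementary using determinantal point processes. The Python below shows one common way such a selection can work, under assumptions that the record does not confirm: the kernel is built as quality times cosine similarity, and the subset is chosen by a naive greedy search over determinants. The function names, scores, and toy vectors are all hypothetical, and the second (set-to-sequence) stage is not covered.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0  # treat a numerically singular matrix as det = 0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det


def build_kernel(scores, vectors):
    """Assumed DPP kernel: L[i][j] = score_i * score_j * cosine(vec_i, vec_j)."""
    n = len(scores)
    return [
        [scores[i] * scores[j] * cosine(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]


def greedy_dpp(kernel, k):
    """Greedily add the item that maximizes the determinant of the chosen submatrix."""
    selected = []
    for _ in range(min(k, len(kernel))):
        best, best_det = None, 0.0
        for i in range(len(kernel)):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[kernel[r][c] for c in idx] for r in idx]
            d = determinant(sub)
            if d > best_det:
                best, best_det = i, d
        if best is None:  # every remaining item is redundant with the selection
            break
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Hypothetical candidates: classifier scores and toy topic vectors.
    keywords = ["price", "cost", "battery", "camera"]
    scores = [0.9, 0.85, 0.6, 0.5]
    vectors = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    picked = greedy_dpp(build_kernel(scores, vectors), 3)
    print([keywords[i] for i in picked])
```

In this toy input, ranking by score alone would keep both "price" and "cost", which point in nearly the same direction; the determinant criterion penalizes that redundancy and favors a seed set spread over different topics.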
Funding projects: National Key Research and Development Program of China [2020AAA0109703]; National Natural Science Foundation of China [62076167]; Beijing Municipal Education Commission-Beijing Natural Fund Joint Funding Project [KZ201910028039]
WOS research area: Computer Science
Language: English
Publisher: ASSOC COMPUTING MACHINERY
WOS accession number: WOS:001208772400007
Funding organizations: National Key Research and Development Program of China; National Natural Science Foundation of China; Beijing Municipal Education Commission-Beijing Natural Fund Joint Funding Project
Content type: Journal article
Source URL: http://ir.ia.ac.cn/handle/173211/56993
Collection: Laboratory of Cognition and Decision Intelligence for Complex Systems (复杂系统认知与决策实验室)
Author affiliations:
1. Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
2. Beijing Unisound Informat Technol Co Ltd, Beijing, Peoples R China
3. Capital Normal Univ, China Language Intelligence Res Ctr, Beijing, Peoples R China
4. North China Univ Technol, Sch Informat Sci, Beijing, Peoples R China
5. CNONIX Natl Standard Applicat & Promot Lab, Beijing, Peoples R China
6. Capital Normal Univ, Coll Informat Engn, Beijing, Peoples R China
Recommended citation:
GB/T 7714: Liu, Jie, Li, Yaguang, He, Shizhu, et al. Seq2Set2Seq: A Two-stage Disentangled Method for Reply Keyword Generation in Social Media[J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23(3): 20.
APA: Liu, Jie., Li, Yaguang., He, Shizhu., Wu, Shun., Liu, Kang., ... & Zhang, Qing. (2024). Seq2Set2Seq: A Two-stage Disentangled Method for Reply Keyword Generation in Social Media. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 23(3), 20.
MLA: Liu, Jie, et al. "Seq2Set2Seq: A Two-stage Disentangled Method for Reply Keyword Generation in Social Media." ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING 23.3 (2024): 20.