Extracting categorical topics from tweets using topic model
Zheng, Lei; Han, Kai
2013
Conference name: 9th Asia Information Retrieval Societies Conference on Information Retrieval Technology, AIRS 2013
Conference location: Singapore, Singapore
Abstract: Over the past few years, microblogging websites such as Twitter have grown increasingly popular. Unlike traditional media, tweets are structured data and contain many noisy words. Topic modeling algorithms for traditional media have been well studied, but our understanding of Twitter remains limited, and few algorithms are designed specifically to mine Twitter data according to its own characteristics. Previous studies usually employ only one type of topic to analyze hot topics in the Twitter community and are strongly affected by the large number of noisy words in tweets. We have observed that Twitter users tend to discuss two types of topics: one focuses mainly on their personal lives, the other on hot issues in society. These two types of topics usually follow different distributions. In this paper, we introduce the Categorical Topic Model. This model incorporates features of Twitter data to divide topics semantically into two types and introduces a word distribution for background words to filter out noisy words. Our model is able to discover different types of topics efficiently, indicate which topics a user is interested in, and find hot issues in the Twitter community. Employing Gibbs sampling, we compare our model with Latent Dirichlet Allocation and the Author Topic Model on the TREC2011 data set; examples of discovered public topics and personal topics are also discussed.
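Indexed by: EI
Language: English
Content type: Conference paper
Source URL: [http://ir.siat.ac.cn:8080/handle/172644/4977]
Collection: Shenzhen Institutes of Advanced Technology_Institute of Biomedical and Health Engineering
Author affiliation: 2013
Recommended citation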
GB/T 7714
Zheng, Lei; Han, Kai. Extracting categorical topics from tweets using topic model[C]. In: 9th Asia Information Retrieval Societies Conference on Information Retrieval Technology, AIRS 2013. Singapore, Singapore.