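Title: Research on Personalized Micro-blog Information Recommendation Systems
Author: 彭泽环 (Peng Zehuan)
Degree type: Master's
Defense date: 2013-05-28
Degree-granting institution: Graduate University of Chinese Academy of Sciences
Location: Beijing
Advisor: 孙乐 (Sun Le)
Keywords: personalization; micro-blog; information recommendation; latent factor model
Major: Computer Software and Theory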
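Abstract (translated from the Chinese)

Micro-blogging is currently one of the most popular Internet applications and has attracted hundreds of millions of users. Through a micro-blog system, users can freely follow people they are interested in and post, share, and comment on information that interests them. Micro-blog users now produce more than 100 million posts per day, which causes severe social information overload. Recommender systems have long been a focus of social network research, and many results have already been applied to micro-blog data, for example friend recommendation, micro-blog tag recommendation, and news topic recommendation, which address social information overload to some extent. This thesis attempts to recommend hot micro-blog posts to users from the perspective of their interest circles. The main work is as follows:

•   An improved GN algorithm that incorporates edge weights (WGN) is proposed for detecting the interest communities of micro-blog users. The algorithm splits the graph into isolated vertices by removing cut edges one at a time, and then determines the final community partition according to the modularity measure Q. Experiments on computer-generated graphs, micro-blog user social graphs, and the WebKB co-citation graph show that (1) WGN detects users' interest communities effectively, and (2) incorporating edge weights improves community detection.

•   A community hot micro-blog recommendation algorithm (CWR) is proposed, based on the latent factor model (LFM) and combining explicit and latent features. The algorithm first uses stochastic gradient descent to learn a model of users' ratings of micro-blog posts from the training set, then applies the model to compute each user's rating of each post in the test data, and finally uses the average rating of each post within a community to select highly rated community hot posts to recommend to users. Experiments show that (1) combining the two kinds of features gives better recommendations than using either kind alone; (2) CWR outperforms the repost-count baseline (WRR); and (3) an analysis of the recommended content shows that CWR tends to recommend posts related to the user's interest community, whereas WRR tends to recommend posts that are popular across the whole platform.

•   Based on the two algorithms above, a real-time personalized micro-blog information recommendation system, SIRMIR, is built. After fetching the logged-in user's micro-blog social graph in real time, the system detects the user's interest communities and then recommends the day's hot posts based on those communities.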
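The abstract describes WGN only at a high level: remove edges one at a time, then keep the partition with the best modularity Q. The Python sketch below illustrates that general idea and is not the thesis implementation. It assumes GN refers to the Girvan-Newman edge-betweenness algorithm, and it assumes one particular way of using edge weights: the unweighted betweenness of each edge is divided by its weight, so strong ties are cut later, and Q is computed with weights. The function names and the toy graph are hypothetical.

```python
from collections import deque, defaultdict

def components(nodes, adj):
    """Connected components of an undirected graph given as adjacency sets."""
    seen, comps = set(), []
    for s in nodes:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        seen.add(s)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps

def edge_betweenness(nodes, adj):
    """Unweighted shortest-path edge betweenness (BFS from every source)."""
    bet = defaultdict(float)
    for s in nodes:
        dist, sigma, preds = {s: 0}, defaultdict(float), defaultdict(list)
        sigma[s] = 1.0
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = defaultdict(float)
        for v in reversed(order):  # accumulate path counts back toward the source
            for u in preds[v]:
                c = sigma[u] / sigma[v] * (1.0 + delta[v])
                bet[frozenset((u, v))] += c
                delta[u] += c
    return bet  # every vertex pair is counted from both ends

def modularity(weights, comms):
    """Weighted modularity Q of a partition, measured on the original graph."""
    total = sum(weights.values())
    label = {n: i for i, c in enumerate(comms) for n in c}
    inside, strength = defaultdict(float), defaultdict(float)
    for edge, w in weights.items():
        u, v = tuple(edge)
        strength[label[u]] += w
        strength[label[v]] += w
        if label[u] == label[v]:
            inside[label[u]] += w
    return sum(inside[i] / total - (strength[i] / (2 * total)) ** 2
               for i in range(len(comms)))

def wgn(weights):
    """weights: {frozenset({u, v}): positive weight}. Returns (partition, Q)."""
    nodes = sorted({n for e in weights for n in e})
    adj = {n: set() for n in nodes}
    for e in weights:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)
    best = components(nodes, adj)
    best_q = modularity(weights, best)
    remaining = set(weights)
    while remaining:  # keep cutting until every vertex is isolated
        bet = edge_betweenness(nodes, adj)
        # Assumption: dividing by the weight makes strong ties harder to cut.
        cut = max(remaining, key=lambda e: (bet[e] / weights[e], sorted(e)))
        u, v = tuple(cut)
        adj[u].discard(v)
        adj[v].discard(u)
        remaining.remove(cut)
        comms = components(nodes, adj)
        q = modularity(weights, comms)
        if q > best_q:
            best, best_q = comms, q
    return best, best_q

# Toy graph: two tightly connected triangles joined by one weak edge.
edges = {frozenset(p): w for p, w in [
    (("a", "b"), 3.0), (("b", "c"), 3.0), (("a", "c"), 3.0),
    (("d", "e"), 3.0), (("e", "f"), 3.0), (("d", "f"), 3.0),
    (("c", "d"), 1.0)]}
communities, q = wgn(edges)
```

On this toy graph the weak bridge between the triangles is cut first, and the partition recorded at that point has the highest Q of the whole run. Recomputing betweenness after every cut is expensive, so the sketch only suits small graphs.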

Abstract (English)

As one of the most popular web applications, micro-blogging has attracted hundreds of millions of users. Micro-blog users can follow people they are interested in, post and repost statuses, share useful information with others, and so on. Every day, micro-blog users generate over 100 million statuses, which leads to a serious social information overload problem. Recommendation in micro-blogs is a hot research topic in social networks, and work on user recommendation, tag recommendation, and news recommendation has resolved the information overload problem to some extent. In this thesis, we try to recommend useful statuses to users based on their interest communities (circles). The main contributions are as follows:

•  Based on GN, we propose WGN, a revised community detection algorithm for social graphs that takes edge weights into consideration. WGN is an iterative graph partitioning algorithm that deletes one cut edge at a time until each vertex forms its own community, and it uses the modularity measure Q to assess the quality of community detection. Experiments on computer-generated graphs, user social graphs, and the WebKB co-citation graph show that (1) WGN can accurately discover users' interest communities; (2) taking edge weights into consideration improves the performance of community detection.

•  We propose CWR, a community hot status detection algorithm that integrates explicit features and latent features. Concretely, the algorithm first trains latent factor vectors for users and terms, then combines the latent factor score (the inner product of the vectors) with the explicit feature score to obtain the final interest score. Finally, CWR computes the average score of each status within the community to discover the hot statuses with high scores. Experimental results show that (1) the recommendation system performs better with both kinds of features than with only one; (2) compared with the WRR system (which recommends using only status repost counts), CWR recommends more useful statuses to users in the community; (3) WRR tends to recommend statuses that are hot across the whole micro-blog system, which are usually not community-related, whereas CWR recommends community-related statuses.

•  Based on the above two algorithms, we build SIRMIR, a personalized micro-blog information recommendation system. The system recommends hot statuses to users based on their interest communities.
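The CWR scoring described in the second contribution above can be illustrated with a small Python sketch. It is a sketch under stated assumptions, not the thesis implementation. It assumes the interest score is a linear explicit-feature term plus the inner product of a user vector with the mean of the status's term vectors, that training minimizes squared error by stochastic gradient descent, and that labels are 1 for statuses a user engaged with and 0 otherwise. The class name, the feature layout (a bias and a normalized repost count), and the toy data are all hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class CWRSketch:
    """score(u, s) = w . x_s + p_u . mean(q_t for t in terms of s)."""

    def __init__(self, n_explicit, k=4, lr=0.05, reg=0.01, seed=0):
        self.k, self.lr, self.reg = k, lr, reg
        self.rng = random.Random(seed)
        self.w = [0.0] * n_explicit  # weights of the explicit features
        self.p = {}                  # user -> latent vector
        self.q = {}                  # term -> latent vector

    def _vec(self, table, key):
        # Lazily create a small random latent vector for unseen users or terms.
        if key not in table:
            table[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.k)]
        return table[key]

    def _status_vec(self, terms):
        vecs = [self._vec(self.q, t) for t in terms]
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def score(self, user, terms, x):
        return dot(self.w, x) + dot(self._vec(self.p, user), self._status_vec(terms))

    def sgd_step(self, user, terms, x, y):
        """One squared-error SGD update; terms is a non-empty list of distinct terms."""
        p_u, q_s = self._vec(self.p, user), self._status_vec(terms)
        err = y - (dot(self.w, x) + dot(p_u, q_s))
        self.w = [wi + self.lr * (err * xi - self.reg * wi)
                  for wi, xi in zip(self.w, x)]
        new_p = [pi + self.lr * (err * qi - self.reg * pi)
                 for pi, qi in zip(p_u, q_s)]
        for t in terms:  # each term receives 1/len(terms) of the gradient
            self.q[t] = [qi + self.lr * (err * pi / len(terms) - self.reg * qi)
                         for qi, pi in zip(self.q[t], p_u)]
        self.p[user] = new_p
        return err

    def fit(self, samples, epochs=500):
        samples = list(samples)
        for _ in range(epochs):
            self.rng.shuffle(samples)
            for user, terms, x, y in samples:
                self.sgd_step(user, terms, x, y)

def community_hot(model, members, statuses, top_n=1):
    """Rank statuses by their average predicted score over the community members."""
    avg = {sid: sum(model.score(u, terms, x) for u in members) / len(members)
           for sid, (terms, x) in statuses.items()}
    return sorted(avg, key=avg.get, reverse=True)[:top_n]

# Toy data: explicit features are [bias, normalized repost count].
statuses = {
    "s_football": (["football", "goal"], [1.0, 0.2]),
    "s_stocks": (["stocks", "market"], [1.0, 0.9]),
}
likes = {"u1": "s_football", "u2": "s_football", "u3": "s_stocks"}
samples = [(u, terms, x, 1.0 if likes[u] == sid else 0.0)
           for u in likes for sid, (terms, x) in statuses.items()]

model = CWRSketch(n_explicit=2)
model.fit(samples)
hot_sports = community_hot(model, ["u1", "u2"], statuses)
hot_finance = community_hot(model, ["u3"], statuses)
```

In this toy setting the stocks status has the higher repost count, so a ranking by repost count alone would recommend it to everyone. The latent term lets the averaged score differ between the two hypothetical communities, which mirrors the contrast the abstract draws between CWR and WRR.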

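Release date: 2013-05-31
Content type: Thesis
Source URL: [http://ir.iscas.ac.cn/handle/311060/14812]
Collection: Institute of Software_National Engineering Research Center of Fundamental Software_Theses
Recommended citation
GB/T 7714
彭泽环 (Peng Zehuan). Research on Personalized Micro-blog Information Recommendation Systems[D]. Beijing. Graduate University of Chinese Academy of Sciences. 2013.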