Learning similarity measures in non-orthogonal space

	Liu, Ning ; Zhang, Benyu ; Yan, Jun ; Yang, Qiang ; Yan, Shuicheng ; Chen, Zheng ; Bai, Fengshan ; Ma, Wei-Ying
	2004
Abstract	Many machine learning and data mining algorithms rely crucially on similarity metrics. The Cosine similarity, which calculates the inner product of two normalized feature vectors, is one of the most commonly used similarity measures. However, in many practical tasks such as text categorization and document clustering, the Cosine similarity is calculated under the assumption that the input space is orthogonal, an assumption that usually does not hold because of synonymy and polysemy. Various algorithms such as Latent Semantic Indexing (LSI) have been used to address this problem by projecting the original data into an orthogonal space. However, LSI suffers from high computational cost and data sparseness, which increase computation time and storage requirements for large-scale realistic data. In this paper, we propose a novel and effective similarity metric in the non-orthogonal input space. The basic idea of our proposed metric is that the similarity of features should affect the similarity of objects, and vice versa. A novel iterative algorithm for computing non-orthogonal space similarity measures is then proposed. Experimental results on a synthetic data set, real MSN search click-through logs, and the 20NG dataset show that our algorithm outperforms the traditional Cosine similarity and is superior to LSI. Copyright 2004 ACM.
Language	English
Source	EI
Content type	Other
Source URL	[http://hdl.handle.net/20.500.11897/329098]
Collection	School of Mathematical Sciences
Recommended citation (GB/T 7714)	Liu, Ning, Zhang, Benyu, Yan, Jun, et al. Learning similarity measures in non-orthogonal space. 2004-01-01.
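
The abstract contrasts the Cosine similarity, which treats features as orthogonal, with a similarity that accounts for related features. The following is a minimal illustrative sketch of that contrast, not the algorithm from the paper: the record gives no details of the iterative method, so the feature-similarity matrix `s`, the function `generalized_cosine`, the three-term vocabulary, and the value 0.8 are all assumptions made for this example. With `s` set to the identity matrix, the generalized form reduces to the ordinary Cosine similarity.

```python
import math


def cosine_similarity(x, y):
    """Cosine similarity: inner product of the two L2-normalized vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def bilinear(x, s, y):
    """Compute x^T S y, where s is a square matrix given as a list of rows."""
    return sum(x[i] * s[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))


def generalized_cosine(x, y, s):
    """Hypothetical non-orthogonal variant: cosine under the inner product
    induced by a feature-similarity matrix s (an assumption, not the paper's
    definition)."""
    return bilinear(x, s, y) / math.sqrt(bilinear(x, s, x) * bilinear(y, s, y))


# Two documents over an assumed vocabulary ["car", "automobile", "banana"].
doc_a = [1.0, 0.0, 0.0]  # mentions only "car"
doc_b = [0.0, 1.0, 0.0]  # mentions only "automobile"

# Orthogonal features: every term is similar only to itself.
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# Assumed feature similarities: "car" and "automobile" are near-synonyms.
synonyms = [[1.0, 0.8, 0.0],
            [0.8, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

sim_plain = cosine_similarity(doc_a, doc_b)
sim_identity = generalized_cosine(doc_a, doc_b, identity)
sim_synonyms = generalized_cosine(doc_a, doc_b, synonyms)

print(sim_plain, sim_identity, sim_synonyms)
```

In this toy case the two documents share no terms, so the plain Cosine similarity is zero, while the synonym-aware variant assigns them a positive score. In the paper the feature similarities are themselves learned together with the object similarities; here they are fixed by hand for illustration only.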