Extended VSM for XML Document Classification Using Frequent Subtrees | |
Yang, Jianwu ; Wang, Songlin | |
2010 | |
Keywords | XML Document Classification; Vector Space Model (VSM); Structured Link Vector Model (SLVM); Frequent Subtree |
Abstract | The structured link vector model (SLVM) is a representation for modeling XML documents that extends the conventional vector space model (VSM) by incorporating document structure. In this paper we describe the SLVM-based classification approach for XML documents used in the Document Mining Challenge of INEX 2009, in which closed frequent subtrees serve as the structural units for extracting content from XML documents and the Chi-square test is used for feature selection. Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; EI; CPCI-S(ISTP); 5 |
Language | English |
DOI | 10.1007/978-3-642-14556-8_44 |
Content type | Other |
Source URL | [http://ir.pku.edu.cn/handle/20.500.11897/406219] |
Collection | School of Information Science and Technology (信息科学技术学院) |
Recommended citation (GB/T 7714) | Yang, Jianwu, Wang, Songlin. Extended VSM for XML Document Classification Using Frequent Subtrees. 2010-01-01. |