ViP-CNN: Visual Phrase Guided Convolutional Neural Network
Yikang Li; Wanli Ouyang; Xiaogang Wang; Xiaoou Tang
2017
Conference Location | United States
Abstract | As the intermediate-level task connecting image captioning and object detection, visual relationship detection started to catch researchers' attention because of its descriptive power and clear structure. It detects the objects and captures their pair-wise interactions with a subject-predicate-object triplet, e.g. ⟨person-ride-horse⟩. In this paper, each visual relationship is considered as a phrase with three components. We formulate visual relationship detection as three inter-connected recognition problems and propose a Visual Phrase guided Convolutional Neural Network (ViP-CNN) to address them simultaneously. In ViP-CNN, we present a Phrase-guided Message Passing Structure (PMPS) to establish the connection among relationship components and help the model consider the three problems jointly. A corresponding non-maximum suppression method and model training strategy are also proposed. Experimental results show that our ViP-CNN outperforms the state-of-the-art method in both speed and accuracy. We further pre-train ViP-CNN on our cleansed Visual Genome Relationship dataset, which is found to perform better than pre-training on ImageNet for this task.
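The abstract's core representation is the subject-predicate-object triplet, e.g. ⟨person-ride-horse⟩, with each relationship treated as a phrase of three components. A minimal sketch of that data structure (the class and method names below are illustrative, not from the ViP-CNN codebase):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RelationshipTriplet:
    """A detected visual relationship as a phrase with three components."""
    subject: str    # e.g. the detected subject object, "person"
    predicate: str  # e.g. the pair-wise interaction, "ride"
    object: str     # e.g. the detected object, "horse"

    def as_phrase(self) -> str:
        # Render in the <subject-predicate-object> notation used in the paper.
        return f"<{self.subject}-{self.predicate}-{self.object}>"

r = RelationshipTriplet("person", "ride", "horse")
print(r.as_phrase())  # prints <person-ride-horse>
```

In ViP-CNN, each of the three components corresponds to one recognition branch, and the PMPS passes messages among the branches so the triplet is recognized jointly rather than independently.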
Language | English
Content Type | Conference Paper
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/11767]
Collection | 深圳先进技术研究院_集成所
Author Affiliation | 2017
Recommended Citation (GB/T 7714) | Yikang Li, Wanli Ouyang, Xiaogang Wang, et al. ViP-CNN: Visual Phrase Guided Convolutional Neural Network[C]. In: . United States.