End-to-end visual grounding via region proposal networks and bilinear pooling



Phrase-based visual grounding aims to localise the object in an image referred to by a textual query phrase. Most existing approaches adopt a two-stage mechanism: first, an off-the-shelf proposal generation model extracts region-based visual features; then a deep model scores the proposals against the query phrase and the extracted visual features. In contrast, the authors design an end-to-end approach to visual grounding in this study. They use a region proposal network to generate object proposals and the corresponding visual features simultaneously, and a multi-modal factorised bilinear (MFB) pooling model to fuse the multi-modal features effectively. Two novel losses are then imposed on top of the fused features to rank and refine the proposals, respectively. To verify the effectiveness of the proposed approach, the authors conduct experiments on three real-world visual grounding datasets: Flickr30k Entities, ReferItGame and RefCOCO. The experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art methods.
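To illustrate the fusion step the abstract describes, the following is a minimal NumPy sketch of multi-modal factorised bilinear pooling: each modality is projected into a shared `k * o`-dimensional space, the projections are combined by element-wise product, sum-pooled over the factor dimension `k`, and then power- and l2-normalised. The projection matrices here are random stand-ins for what would be learned parameters, and all dimensions are illustrative assumptions, not values from the paper.

```python
import numpy as np

def mfb_fuse(x, y, k=5, o=16, rng=None):
    """Sketch of multi-modal factorised bilinear (MFB) pooling.

    x: visual feature vector (e.g. from a region proposal network)
    y: textual feature vector (e.g. an encoded query phrase)
    k: factor (pooling window) size, o: fused output dimension
    U, V are random here; in a real model they are learned weights.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    U = rng.standard_normal((x.size, k * o))
    V = rng.standard_normal((y.size, k * o))
    joint = (x @ U) * (y @ V)                      # element-wise product in the expanded space
    pooled = joint.reshape(o, k).sum(axis=1)       # sum-pool over each window of k factors
    z = np.sign(pooled) * np.sqrt(np.abs(pooled))  # signed square-root (power normalisation)
    return z / (np.linalg.norm(z) + 1e-12)         # l2 normalisation

# Stand-in features with hypothetical dimensions:
visual = np.ones(128)
textual = np.ones(64)
fused = mfb_fuse(visual, textual)
print(fused.shape)  # (16,)
```

The factorisation keeps the expressiveness of a bilinear interaction between the two modalities while avoiding the full outer-product parameter count; the fused vector would then feed the ranking and refinement losses described above.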
