

Object Detection and Sentence Generation from Images


Anakha P. J., Devika Hari, Rinku Roy, Prof. Joby George
Automatically describing the content of an image in well-formed English sentences is a challenging task. The goal of this work is to generate descriptions of image regions, and a model that generates natural language descriptions of images and their regions is developed for that purpose. The approach leverages datasets of images and their sentence descriptions to learn the inter-modal correspondences between language and visual data. The alignment model is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding. A Multimodal Recurrent Neural Network architecture is then described that uses the inferred alignments to learn to generate novel descriptions of image regions. The alignment model produces state-of-the-art results in retrieval experiments on the Flickr8K dataset. The generated descriptions significantly outperform retrieval baselines on both full images and a new dataset of region-level annotations.
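The alignment idea can be illustrated with a minimal sketch. It assumes that image regions and sentence words have already been mapped into one shared embedding space (by the CNN and the bidirectional RNN respectively), that each word is matched to its most similar region, and that a max-margin ranking loss plays the role of the structured objective. The abstract states none of these details, so the function names, the scoring rule, and the margin value below are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch: score how well a sentence matches an image when region
# and word vectors already live in one shared embedding space.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def image_sentence_score(regions, words):
    """Sum, over words, of each word's best-matching region similarity.

    Assumption: every word aligns to the single region with the highest
    inner product; the abstract does not specify the scoring rule.
    """
    return sum(max(dot(r, w) for r in regions) for w in words)


def ranking_loss(scores, margin=1.0):
    """Max-margin loss over a square score matrix.

    scores[k][l] is the score of image k with sentence l; the true pairs
    sit on the diagonal and should beat every mismatch by `margin`.
    """
    loss = 0.0
    n = len(scores)
    for k in range(n):
        for l in range(n):
            if k == l:
                continue
            # Rank sentences for image k, then images for sentence k.
            loss += max(0.0, scores[k][l] - scores[k][k] + margin)
            loss += max(0.0, scores[l][k] - scores[k][k] + margin)
    return loss
```

Under these assumptions the loss is zero once every true image-sentence pair outscores all mismatched pairs by at least the margin, which is the property that makes the learned scores usable for retrieval.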


Keywords: Computer vision, Object detection, RNN


Publication Details

Published in: Volume 2 | Issue 3 | May-June 2016
Date of Publication: 2016-06-30
Print ISSN: 2395-1990
Online ISSN: 2394-4099
Page(s): 277-280
Manuscript Number: IJSRSET162346
Publisher: Technoscience Academy

Cite This Article

Anakha P. J., Devika Hari, Rinku Roy, Prof. Joby George, "Object Detection and Sentence Generation from Images", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 2, Issue 3, pp. 277-280, May-June 2016.
URL: http://ijsrset.com/IJSRSET162346.php