Publications

- "Mind Your Language: Learning Visually Grounded Dialog in a Multi-Agent Setting"
Published at the Conference on Computer Vision and Pattern Recognition (CVPR), VQA Challenge and Visual Dialog Workshop, Salt Lake City, USA, 2018

In this paper we cast Visual Dialog as a multi-agent problem and train the agents with the REINFORCE algorithm in an unsupervised self-play fashion. The multi-bot setup prevents the bots from developing their own language and thereby deviating from natural language.

- "BioAMA: Towards an End to End BioMedical Question Answering System"
Published at the Annual Meeting of the Association for Computational Linguistics (ACL), BioNLP track, Melbourne, Australia, 2018

This paper presents a novel biomedical question answering system, which was the winning entry on Task 5b of the annual BioASQ challenge in the ideal answer category. We also present novel and competitive systems for answering factoid and list-type questions.

- "Segmentation Guided Attention Networks for Visual Question Answering"
Published at the Annual Meeting of the Association for Computational Linguistics (ACL), Vancouver, Canada, 2017

This project enhanced the attention maps generated by the CNN for visual question answering by using pixel-level dense segmentation maps. The segmentation maps gave the network pixel-level grounding, which enhanced the attention maps and improved performance on the Visual7W dataset.

- "Automatic tagging and retrieval of E-Commerce products based on visual Features"
Published at NAACL, Association for Computational Linguistics (ACL) conference, San Diego, 2016

In this paper we propose an automatic image annotation system that combines a deep learning pipeline with the FastXML algorithm and achieves state-of-the-art results on the IAPR-TC12, ESP-Game, MIRFlickr and several other benchmark datasets.
We also build an extremely fast content-based image retrieval system by extending this approach.

- "A Deep Neural Network Based Approach For Vocal Extraction From Songs"
Published at IEEE’s International Conference on Signal and Image Processing Applications, 2015

This paper proposes several deep learning approaches for extracting vocals from songs.
Models such as multilayer perceptrons, autoencoders, restricted Boltzmann machines and their extension to deep belief nets were trained, and their performance on this task was compared.

- "Analyzing Newspaper Crime Reports For Identification Of Safe Transit Paths"
Published at NAACL, Association for Computational Linguistics (ACL) conference, Colorado, 2015

This paper proposes a method to find the safest path between two locations, based on a geographical model of crime intensities.
It describes how topic modeling techniques such as Latent Dirichlet Allocation and Latent Semantic Analysis, together with mathematical modeling and NLP techniques, were employed to identify that path.

- "Image Summarization using Topic Modelling"
Published at IEEE’s International Conference on Signal and Image Processing Applications, 2015

This paper proposes a method that uses topic modeling and clustering techniques to summarize a large collection of images into a smaller, representative subset.
It describes how topic modeling techniques such as Latent Dirichlet Allocation and Latent Semantic Analysis, along with methods such as K-Means clustering, can be used for image summarization.

- "A Deep Autoencoder Decoder pipeline for audio based music database search and retrieval"
Submitted to Neural Information Processing Systems (NIPS 2015), Montreal

In this paper I used a bottleneck autoencoder-decoder network and a novel alignment scheme to convert audio snippets to representative vectors, and populated an audio database with these snippets.
Fast binary-hash-based comparisons were used to compute the similarity of a query snippet to the entries in the audio database.
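
As a rough illustration of how binary hash comparisons of this kind can work, here is a minimal Python sketch. It is not the paper's implementation: the sign-threshold binarization, the function names (`binarize`, `hamming_distance`, `nearest`) and the toy four-dimensional vectors are all assumptions made for this example. The idea shown is only that once each snippet vector is packed into a bit string, similarity reduces to an XOR and a bit count.

```python
from typing import Dict, List, Sequence, Tuple


def binarize(vector: Sequence[float], threshold: float = 0.0) -> int:
    """Pack a real-valued vector into an int, one bit per dimension.

    Assumed scheme for illustration: a bit is 1 when the value exceeds
    the threshold, otherwise 0.
    """
    code = 0
    for value in vector:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def nearest(query: int, database: Dict[str, int], k: int = 1) -> List[Tuple[str, int]]:
    """Return the k database entries with the smallest Hamming distance."""
    scored = [(name, hamming_distance(query, code)) for name, code in database.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Hypothetical snippet vectors standing in for the autoencoder's outputs.
database = {
    "song_a": binarize([0.9, -0.2, 0.4, -0.7]),
    "song_b": binarize([-0.5, 0.3, -0.1, 0.8]),
}
query = binarize([0.8, -0.1, 0.5, -0.9])
best = nearest(query, database, k=1)
print(best)
```

Because each comparison is a single XOR followed by a bit count, a linear scan over a large database of codes stays cheap compared with computing distances between the original real-valued vectors.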