Projects - Vasu Sharma

Work Experience
- ArticuLab, Carnegie Mellon University
Research Assistant
Justine Cassell and Tom Mitchell | Aug 2017 - present | Pittsburgh, USA

I am working on the Socially Aware Robotic Assistant project at the ArticuLab, which focuses on building a socially aware robotic assistant. My primary focus is on combining the user’s multimodal visual, vocal and verbal cues to build an end-to-end conversational voice agent.
I am also working on natural language response generation conditioned on social and task intent, to achieve task completion and build social rapport in conversations with the voice agent.

Internships

- Citadel LLC
Summer Intern, Machine Learning
Global Quantitative Strategies Group | May 2018 - Aug 2018 | Chicago, USA

Worked on "Deep Neural Networks for Time series modeling of financial markets" and "Effective Feature scalability for Machine Learning models". In this project I explored a variety of Deep Learning models and effective training techniques to perform time series analysis on large-scale, highly noisy financial markets data. I also ensured that the models scale to arbitrarily high-dimensional feature sets.

- École Polytechnique Fédérale de Lausanne (EPFL)
Summer Intern, Machine Learning and Optimization Lab
Martin Jaggi | May 2017 - Jul 2017 | Lausanne, Switzerland

"Learning semantic sentence embeddings using generative models of text".
This project entailed creating a Generative Adversarial Network composed of Deep CNNs for text to learn semantic textual embeddings. The representations are learned in an unsupervised fashion and outperform major state-of-the-art approaches on both supervised tasks like sentiment analysis and unsupervised ones like similarity matching.

- University of Toronto
Summer Intern, Machine Learning Team
Raquel Urtasun, Sanja Fidler | May 2016 – Jul 2016 | Toronto, Canada

“FlowSeg: A Deep Learning based approach for simultaneous semantic segmentation and flow estimation from videos”
The project focused on building Deep Convolutional Neural Network architectures to study the problem of instance and semantic segmentation of videos. We experimented with novel Deep CNN architectures to jointly estimate semantic segmentation and flow from videos. The approach showed promising results on various datasets.

- Carnegie Mellon University
Summer Intern, School of Computer Science
Bhiksha Raj, Rita Singh | May 2014 – Jul 2014 | Pittsburgh, USA

"Deep Recurrent Gated Neural Networks for Dynamic Audio Denoising"
The project focused on constructing a Deep Recurrent Neural Network that reconstructs signals by denoising noise-corrupted audio through dynamic spectral subtraction.
Techniques used: Recurrent and Time Delay Neural Networks, Spectral Subtraction, Multi Layer Perceptrons and other Deep Learning techniques
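
As a rough illustration of the spectral subtraction technique listed above, the sketch below subtracts an estimated noise magnitude spectrum from each frame and clamps the result to a spectral floor. It is a static, magnitude-domain toy and not the project's dynamic, network-driven method; the function names `spectral_subtract` and `estimate_noise` and the `alpha` and `floor` parameters are assumptions made for illustration.

```python
def estimate_noise(noise_frames):
    """Average the magnitude per bin over frames assumed to contain only noise."""
    n = len(noise_frames)
    return [sum(bins) / n for bins in zip(*noise_frames)]


def spectral_subtract(frames, noise_mag, alpha=1.0, floor=0.01):
    """Subtract a noise magnitude estimate from each magnitude-spectrum frame.

    frames:    list of frames, each a list of non-negative bin magnitudes
    noise_mag: per-bin noise magnitude estimate (same length as a frame)
    alpha:     over-subtraction factor (hypothetical parameter)
    floor:     fraction of the noisy magnitude kept as a lower bound
    """
    cleaned = []
    for frame in frames:
        if len(frame) != len(noise_mag):
            raise ValueError("frame and noise estimate must have equal length")
        cleaned.append([
            max(mag - alpha * noise, floor * mag)
            for mag, noise in zip(frame, noise_mag)
        ])
    return cleaned
```

The floor keeps bins from going negative when the noise estimate exceeds the observed magnitude, which is one common way to limit artifacts in this family of methods.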

- Abzooba Inc.
Research Intern
Labhesh Patel | Aug 2016 – Ongoing | California, USA (working remotely)

Worked on building "A Smart E-commerce Virtual Assistant". Implemented features like clothing parsing from images, similar image retrieval from a huge fashion catalogue and a state-of-the-art Deep Recommender system.
Implemented a "Multi Turn Conversational Voice Agent" to facilitate user interaction. Involved the use of Memory Networks and a soft attention mechanism over previous queries and responses to figure out the best response to a given user query.
Also worked on "Query based document retrieval" by learning rich semantic document embeddings using a deep LSTM pipeline and using these to match queries to relevant documents.
"Abstractive summarization using Attention based encoder-decoder networks": Worked on building a deep residual LSTM pipeline which used temporal attention over both encoder and decoder networks to generate an abstractive summary of documents.

- Xerox Research Labs, Europe
Research Intern, Computer Vision Team
Diane Larlus, Albert Gordo | Sep 2015 – Dec 2015 | Grenoble, France

"Large Scale Image Recognition using Deep Convolutional Neural Nets"
The project primarily focused on constructing Deep Learning frameworks for image recognition. Worked on designing novel Deep Learning frameworks for the image recognition task on the ImageNet dataset. Also made extensive use of GPUs and the popular Caffe library for training Deep Convolutional Neural Nets.

- Xerox Research Labs, India
Research Intern, Speech Processing Team
Vivek Tyagi | May 2015 – Sep 2015 | Bangalore, India

Worked on 3 projects during this internship: "Application of Deep Learning for Automatic Speech Recognition", "A comprehensive analysis of Activation Functions in Deep Nets" and "A new hashing technique to enhance Deep Net performance". Received the Best Project award for this work.

- Carnegie Mellon University
Winter School, School of Computer Science
Bhiksha Raj, Rita Singh | Nov 2013 – Dec 2013 | Bangalore, India

Worked on "Identifying safest path in Real time based on crime records" and "Image Summarization using Topic Modelling". Received the Overall Best Project Award and the Gandhian Young Technological Innovation Award.

Other Projects

- "Multi Agent Deep Reinforcement Learning for Co-operative Visual Dialog"
Course Project: Deep Reinforcement Learning (Prof. Ruslan Salakhutdinov)

The project involved building 2 conversational voice agents conversing autonomously with each other to play the "20 Questions Image guessing game". We cast the problem in a Multi Agent Dialog setup, which allowed the bots to adhere to natural language and avoid the language divergence seen in prior state-of-the-art work. We outperform all state-of-the-art methods on both numerical metrics and human evaluation.

- "Data is the New Oil: Learning to Answer Questions in an Active Learning Setting"
Course Project: Neural Networks for NLP (Prof. Graham Neubig)

In this project we designed an Active Learning pipeline to generate hard examples to train a question answering model on the MovieQA dataset. We use a question generator network whose questions are checked for grammatical and formational correctness by a discriminator. A query-by-committee ensemble network is then used, and the inter-classifier confusion is used to select hard examples for active learning. Our pipeline attained accuracy comparable to the state of the art with 75% less data.
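
To illustrate the query-by-committee selection step, the sketch below ranks examples by how much a committee of classifiers disagrees about their labels. Vote entropy is used here as the disagreement measure; that choice, along with the names `vote_entropy` and `select_hard_examples`, is an assumption for illustration, since the project's exact confusion measure is not specified.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's predicted labels for one example."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_hard_examples(committee_votes, k):
    """Return the indices of the k examples the committee disagrees on most.

    committee_votes: list where entry i holds each member's label for example i
    """
    ranked = sorted(
        range(len(committee_votes)),
        key=lambda i: vote_entropy(committee_votes[i]),
        reverse=True,
    )
    return ranked[:k]
```

An example on which every member agrees has zero entropy and is ranked last, so labeling effort goes to the examples the committee finds most confusing.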

- "End to End pipeline for Question Answering"
Course Project: Question Answering (Prof. Eric Nyberg, Teruko Mitamura)

In this project we designed an end-to-end question answering pipeline based on Joint Co-attention answer generation networks. We follow a pipeline of relevant snippet ranking, sentence selection and summary generation for the ideal answer type questions in the BioASQ challenge. We achieved 1st position in the BioASQ challenge and were ranked on the highly competitive MS MARCO and SQuAD leaderboards.

- "Dynamic Co-Attention Networks for Open Domain Question Answering"
Course Project: Deep Learning (Prof. Bhiksha Raj) (Ongoing)

Working on implementing a two-stage pipeline. The first stage builds a Deep Dynamic Co-attention network that simultaneously computes attention over the question and the knowledge base and estimates the span in a large knowledge base most likely to contain the answer. A bi-directional GRU adversarial generator network then uses this predicted span to generate free-form natural language responses to the questions asked. We are working with the popular SQuAD and MS MARCO datasets.

- "Segmentation Guided Attention Networks for Visual Question Answering"
Course Project: Visual Recognition (Prof. Vinay Namboodiri)

This project involved enhancing the attention maps generated by the CNN for the task of visual question answering by using pixel-level dense segmentation maps. The segmentation maps gave the network pixel-level grounding, which sharpened the attention maps and improved performance on the Visual7W dataset.

- "Automatic Tagging of Images and Content Based Image Retrieval"
(Undergraduate Project with Prof. Harish Karnick)

Designed a novel pipeline which combined a Deep ConvNet with Extreme Learning approaches for tagging images and an LSTM for captioning them. Our model was able to handle a potentially unbounded tag set, and we also built a highly accurate Content Based Image Retrieval system on top of it.

- "Real Time Video Surveillance using Deep Convolutional Neural Networks"
Course Project: Machine Learning Techniques (Prof. Harish Karnick)

Built a real-time surveillance system which included object and entity detection and localisation along with face recognition and abnormal action detection. In this project we extended the Faster RCNN model and added time-recurrent connections to model context across the video frames. The face recognition and abnormal action detection networks were integrated into this Recurrent Faster RCNN model using a novel combination layer, and the whole network was trained in a joint end-to-end manner.

- "Visual Storytelling"
Course Project: Recent Advances in Computer Vision (Prof. Gaurav Sharma)

This task entails producing story-like descriptions for a sequence of images. We experimented with a unique GRU-based decoder which looks at all the encoder states simultaneously, allowing the model to peek into relevant parts of the encoder states using a soft attention mechanism. We also used a bidirectional encoder and implemented our own custom version of the beam search algorithm in a more parallelized fashion than the traditionally used sequential version.
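
For reference, the traditional sequential form of beam search that the paragraph above contrasts with can be sketched as follows. This is a minimal version, not the project's parallelized implementation; the `step_fn` interface (a function returning next-token probabilities for a partial sequence) and all parameter names are assumptions for illustration.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=10):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(sequence) must return a dict mapping each candidate next token
    to its probability given the sequence so far (hypothetical interface).
    """
    beams = [(0.0, [start_token])]  # (cumulative log-probability, sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == end_token:
                candidates.append((score, seq))  # finished beams carry over
                continue
            for token, prob in step_fn(seq).items():
                if prob > 0:
                    candidates.append((score + math.log(prob), seq + [token]))
        if not candidates:
            break  # nothing to expand; keep the previous beams
        # Keep only the beam_width best partial sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for _, seq in beams):
            break
    return beams[0][1]
```

With `beam_width=1` this reduces to greedy decoding; wider beams can recover sequences whose first token is not the locally most probable one.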

- "An Automatic Review generator and Restaurant Recommender System"
Course Project: Natural Language Processing (Prof. Harish Karnick)

We built a state-of-the-art recommender system along with an automatic review summarization system to provide users with quick reviews and suggestions. We also implemented a sentiment analysis system using Paragraph Vectors to represent text documents and predict ratings from user reviews. This was trained jointly with a recurrent review generator network, which enhanced the accuracy of both networks.

- "3Dify: Automatically convert 2D images and videos to 3D using Deep Neural Networks"
Project for UnitedbyHCL Hackathon

Created a Deep Convolutional Network based pipeline to automatically learn 3D anaglyph maps from 2D images. Unlike other models, this one is trained directly on 3D images and learns the depth maps as implicit representations rather than learning them explicitly. We also created a web app around it.

- "DocGen: A novel Document Embedding technique"
(Supervised by Prof. Harish Karnick)

This project involves designing a Document Embedding technique inspired by Generative Adversarial Networks. I trained a generator and a discriminator network simultaneously in an adversarial manner. Once trained, this network can be used to produce document embedding vectors, which are the latent activation values of the code layer in the network when the document is passed through it. The work is presently ongoing.

- "Artify: A Deep Neural Network based Image styling app"
Course Project: Software Engineering (Prof. TV Prabhakar)

Created a stylistic image editor Android app which implemented the Perceptual Style transfer method using Deep Convolutional Neural Networks to fuse artistic styles into a given image. Followed proper software architecture and design practices and implemented a variety of tactics to enhance quality attributes of the system.

- "Internet Procrastinator: Studying the effect of a user's browsing history on his/her GPA"
Course Project: Human Centered Computing (Prof. Nisheeth Srivastava)

Used an NLP-based pipeline including LDA and semantic document embeddings to analyze users' browsing histories, and then built interpretable machine learning models to predict a user's GPA from them. The interpretable models allowed us to analyze the features which affect the GPA the most.

- "YAMDB: Yet Another Movie Database"
Course Project: Database Management Systems (Prof. Sumit Ganguly)

Created an online movie database system similar to IMDb with a state-of-the-art Recommender System. Also implemented concepts like database sharding, scalable database systems and policies to increase fault tolerance.

- "Online File Sharing System with Collaborative Editing"
(Course Project Computing Laboratory-II under Prof. Arnab Bhattacharya)

Implemented file/folder sharing among multiple users, allowing upload/download from remote server.
Modified ShareLatex to integrate with the system, allowing multiple users to collaboratively edit and render shared TeX files. Used a Python client to synchronize the local and remote file systems.

- "Extension of NachOS"
(Course Project Operating Systems under Prof. Mainak Choudhari)

Designed various functionalities in NachOS, an instructional operating system written in C++ that runs as a secondary OS on Linux.
Implemented system calls (fork, join), scheduling algorithms (FIFO, RR, Unix Scheduler), synchronization techniques (semaphores, condition variables), demand paging, shared memory and page replacement algorithms (FIFO, LRU, LRU Clock).
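
As an illustration of the clock-style page replacement named above, the short simulation below counts page faults for a reference string. It is a Python sketch of the general second-chance idea and not the NachOS C++ code; the function name `clock_page_faults` and the choice to set the reference bit when a page is first loaded are assumptions.

```python
def clock_page_faults(references, num_frames):
    """Count page faults under the clock (second-chance) replacement policy."""
    frames = [None] * num_frames      # page held by each frame
    ref_bits = [False] * num_frames   # reference bit per frame
    hand = 0                          # position of the clock hand
    faults = 0
    for page in references:
        if page in frames:
            ref_bits[frames.index(page)] = True  # hit: mark as recently used
            continue
        faults += 1
        # Advance the hand, clearing reference bits, until a victim is found.
        while ref_bits[hand]:
            ref_bits[hand] = False
            hand = (hand + 1) % num_frames
        frames[hand] = page
        ref_bits[hand] = True  # assumption: a freshly loaded page counts as used
        hand = (hand + 1) % num_frames
    return faults
```

The reference bit approximates LRU cheaply: a page is evicted only if it has not been touched since the hand last swept past it.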

- "Education based Webapp for Primary education"
Secured 2nd Position in El-Eduvate (Techkriti, Technical Festival, IIT Kanpur)

The webapp was designed with several interactive applications and a highly attractive GUI to provide primary school students an exciting platform to learn new things away from the monotonous classroom environment.
Used HTML, CSS, JavaScript and PHP to build the webapp.

- "Karaoke Generator Webapp with vocal comparison"
(Honorable Mention at Yahoo HackU!)

Given a song, the webapp muted the vocals, fetched the lyrics and allowed the user to sing along with it.
Techniques Used: Audio Processing, Vocals extraction and removal, Heuristic based vocal comparison.

"Every day, in every way, I try to get better and better and better."