
I am a rising second-year Computer Science MS student (specialization in Machine Learning) at Georgia Tech, advised by Prof. Dhruv Batra, and I work closely with Prof. Devi Parikh. I work at the intersection of software engineering and AI research, on problems spanning Machine Learning, Computer Vision and Natural Language Processing.
I am excited about building tools for reproducible AI research and scalable software which will help in training, visualizing and evaluating machine learning models in real-time on static and dynamic datasets.
I am currently a Graduate Research Assistant with Prof. Dhruv Batra and Prof. Devi Parikh, working to simplify and standardize the evaluation of Reinforcement Learning (RL) models. I develop and manage EvalAI, a widely adopted evaluation platform built to make results reproducible, keep evaluation consistent, evaluate a model's code instead of its predictions, and measure steady progress in pushing the frontiers of AI.
I also lead an open source organization, CloudCV, where we are building several open-source software tools for reproducible AI research. Previously, I spent a year as a visiting research scholar in the Machine Learning and Perception Lab at Georgia Tech.
I am graduating in May 2021 and am on the job market. Please feel free to reach out to me at rishabhjain@gatech.edu.
News
- [Sep 2020] Our paper Dialog without Dialog was accepted to NeurIPS 2020.
- [Jun 2020] Invited speaker at EmbodiedAI workshop. [Talk]
- [May 2020] Interned at eBay with Roman Maslovskis and Uwe Mayer for summer 2020.
- [Feb 2020] Organization Administrator for Google Summer of Code 2020 with CloudCV.
- [Feb 2020] CloudCV selected as a mentoring organization in Google Summer of Code for the 6th time in a row.
- [Oct 2019] Represented CloudCV at Google Summer of Code Mentor Summit 2019, Munich, Germany.
- [Oct 2019] EvalAI accepted to the AI Systems workshop at the SOSP conference.
- [Aug 2019] Joined Georgia Tech for a Master's in Computer Science.
- [Jun 2019] Presented EvalAI at the Habitat Workshop at CVPR.
- [Mar 2019] Our paper nocaps: novel object captioning at scale was accepted to ICCV 2019.
- [Feb 2019] Served as a Google Summer of Code organization administrator with CloudCV.
- [Jan 2019] Team Lead, CloudCV.
- [Nov 2018] Served as a Google Code-In organization administrator with CloudCV.
- [Oct 2018] Represented CloudCV at Google Summer of Code Mentor Summit 2018, Google Sunnyvale.
- [Jul 2018] Joined as a Visiting Research Scholar at Georgia Tech to work with Prof. Dhruv Batra & Prof. Devi Parikh.
- [Apr 2018] Served as a Google Summer of Code mentor with CloudCV.
- [Nov 2017] Served as a Google Code-In 2017 mentor with CloudCV.
- [May 2017] Selected as a Google Summer of Code student with CloudCV.
Projects
Built an open source platform for evaluating and benchmarking AI models. We have hosted 85+ AI challenges with 9,000+ users, who have created 100,000+ submissions. More than 25 organizations from industry and academia use it to host their AI challenges. The project is open source, with 100+ contributors and 1.8M+ pageviews. Organizations such as Google, Facebook, IBM, Intel and eBay, along with research labs such as those at Stanford, CMU, MIT and Georgia Tech, use it and its forked version to host their internal challenges instead of reinventing the wheel.

In Visual Question Answering, given an image and a free-form natural language question about the image (e.g., "What kind of store is this?", "How many people are waiting in the queue?", "Is it safe to cross the street?"), the model's task is to automatically produce a concise, accurate, free-form, natural language answer ("bakery", "5", "Yes"). This demo is implemented using the Pythia model. It is used by 100K+ users.

Visual Chatbot: A chatbot that can see!
Built a visual chatbot which can hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the chatbot will ground the question in the image, infer context from the history, and answer the question accurately. This demo is implemented using the Late Fusion model from the CVPR 2017 paper. It is used by 170K+ users.

Built a demo of a single model trained on 12 datasets from four broad categories of tasks: visual question answering, caption-based image retrieval, grounding referring expressions, and multi-modal verification. Compared to independently trained single-task models, this represents a reduction from approximately 3 billion parameters to 270 million while simultaneously improving performance by 2.05 points on average across tasks. It is used by 20K+ users.

Trick or TReAT: Thematic Reinforcement for Artistic Typography Demo
Given an input word (e.g., exam) and a theme (e.g., education), the individual letters of the input word are replaced by clipart relevant to the theme that visually resembles the letters, adding creative context to the potentially boring input word.

JSS InfoConnect: Information Center for College
An officially recognized web-based application that acts as a medium for interaction between the students, faculty and management of JSSATE, Noida. It serves 10K+ requests per day and has 20K+ registered users; 20K+ notices/documents have been uploaded since its launch in 2016.

Publications
Dialog without Dialog: Learning Image-Discriminative Dialog Policies from Single-Shot Question Answering Data
EvalAI: Towards Better Evaluation Systems for AI Agents
nocaps: novel object captioning at scale

Evaluating visual and text explanations in an interactive, goal-driven human-AI task
(* denotes equal contribution)
Experience

(Jan 2019 - Present)
Leading a team of 15+ contributors to actively maintain the CloudCV project, which aims to make AI research more reproducible.
Graduate Research Assistant, Machine Learning and Perception Lab

(Aug 2019 - Present)
EvalAI: Built an open source platform called EvalAI for evaluating and benchmarking AI models. We have hosted 85+ challenges with 9,000+ users, who have created 100,000+ submissions. The project is open source with 100+ contributors, 1k+ stars and 500+ forks. More than 25 organizations use it, including Google, Facebook, IBM, Intel and eBay, and research labs such as those at MIT, Stanford, CMU and Georgia Tech also use it and its forked version to host their internal challenges instead of reinventing the wheel.
GuessWhich: Evaluating the role of interpretable explanations in making a model predictable to a human. We studied whether textual or visual explanations from an AI model, in the context of an interactive, goal-driven, collaborative human-AI task, help humans predict its behavior.
Software Engineering Intern, Structured Data and Applied Research Team

(May 2020 - Aug 2020)
Evaluating and predicting attribute values in listings: Given an image and a text description of a listing on eBay, the task is to predict the missing attributes in the listing, for instance a missing color or brand attribute. I built an end-to-end system for processing visual and textual data and for training AI models. I also trained an early fusion model of image and text data which outperformed the uni-modal image and text models by 5% and 22%, respectively, on the test dataset for the color attribute, and by 20% and 4%, respectively, on the test dataset for the brand attribute.
Visiting Research Scholar, Machine Learning and Perception Lab

(Aug 2018 - June 2019)
To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task. Dubbed nocaps, for novel object captioning at scale, our benchmark consists of 166,100 human-generated captions describing 15,100 images from the Open Images validation and test sets. These images contain more than 500 objects, of which more than 400 are never described in the COCO captions dataset.
Google Summer Of Code (GSoC)

(May 2017 - Aug 2020)
2017: Selected as a GSoC student. I developed new features for hosting AI challenges in a streamlined manner, and implemented REST APIs, frontend and several analytics features for both participants and hosts in EvalAI.
2018: Mentored a student to design a command line tool (EvalAI-CLI) for EvalAI which lets participants install and use EvalAI as a Python package.
Google Code In (GCI)

(Nov 2017 - Jan 2020)
2017: Applied with CloudCV as a mentoring organization and got it accepted into Google Code-In for the first time. I mentored high school students on open-source projects in frontend, backend and DevOps.
2018: Led a team of 10+ mentors to mentor high school students on the open-source projects EvalAI, Fabrik and Origami.
2019: Led a team of 10+ mentors to mentor high school students on the open-source projects EvalAI, EvalAI-CLI and EvalAI-ngx.
Python Software Society of India (PSSI)
iAugmentor Labs

(Jun 2016 - Aug 2016)
Developed an SVM-based classifier that takes as input a video of a person answering a job interview question and outputs a confidence score for the smiling behaviour of the person in the video.
Nibble Computer Society (NCS)

(Feb 2015 - May 2018)
Organized multiple code labs, seminars and workshops on OOP, Advanced C, C++, Google Summer of Code, Git, etc., and mentored 30+ undergraduate students on software development. I was also responsible for organizing the college's annual techno-cultural fest, Zealicon.
Invited Talks

- [Jun 2020] Invited speaker at EmbodiedAI workshop. [Talk]
- [Oct 2019] Represented CloudCV at the Google Summer of Code Mentor Summit in Munich, Germany. (Slides)
- [Jun 2019] Presented EvalAI at the Habitat workshop at CVPR. (Slides)
- [Oct 2018] Represented CloudCV at the Google Summer of Code Mentor Summit at Google Sunnyvale. (Slides)
Education

Masters in Computer Science
Georgia Institute of Technology, Atlanta, USA (Aug 2019 - Present)
- Specialization in Machine Learning
- Current GPA: 4.0/4.0
- Expected Graduation - May 2021

Bachelor of Technology in Computer Science and Engineering
JSS Academy of Technical Education, Noida, India (Aug 2014 - May 2018)
- Passed with an aggregate of 80.4% (with Hons.)
- Class Rank: 8th out of 150 Students (or top 5%)