Siddhartha Gairola
PhD
Max Planck Institute for Informatics
Towards Building Interpretable and Robust Deep Neural Networks

Deep Neural Networks (DNNs) have demonstrated great success in a variety of computer vision
(as well as language, speech, and other) tasks such as image classification, object detection, semantic segmentation,
action recognition, image captioning, visual question answering, and many more. However, these
powerful models currently serve as black-box systems that are extremely hard to interpret. This
limits their adoption in everyday life, especially in sensitive applications such as
medical diagnosis, autonomous driving, and security. Thus, building interpretable DNNs and
deriving meaningful explanations for their predictions is an important area of research. Recent
works [1,2] have proposed new architectural modules that can be easily incorporated into existing
architectures to make DNNs inherently interpretable. Similar in spirit, over the course of the
Ph.D. we will explore novel methods by which DNNs can be made more interpretable, such that the
resulting explanations are both faithful and visually sensible.
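For concreteness, one simple way such a plug-in module can work (a purely illustrative sketch, not the specific modules proposed in [1,2]) is to replace the usual global-average-pooling classifier head with an attention-pooling head whose weights double as a spatial explanation map. A minimal PyTorch sketch, assuming backbone features of shape (B, C, H, W); the class name AttentionPoolHead is hypothetical:

import torch
import torch.nn as nn

class AttentionPoolHead(nn.Module):
    # Drop-in classification head that also returns a spatial attention map,
    # so the image regions driving each prediction can be visualised.
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.attn = nn.Conv2d(in_channels, 1, kernel_size=1)   # "where to look"
        self.classifier = nn.Linear(in_channels, num_classes)  # "what it is"

    def forward(self, feats):                                   # feats: (B, C, H, W)
        a = torch.softmax(self.attn(feats).flatten(2), dim=-1)  # (B, 1, H*W), sums to 1
        pooled = (feats.flatten(2) * a).sum(-1)                  # attention-weighted pooling -> (B, C)
        logits = self.classifier(pooled)
        explanation = a.view(feats.size(0), 1, *feats.shape[-2:])
        return logits, explanation

The attention map can be upsampled to the input resolution and overlaid on the image; faithfulness then means that masking out the high-attention regions should actually change the model's prediction.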
Furthermore, it has been shown that DNNs are brittle and learn features that are prone to
adversarial attacks [3,4,5], i.e., a small, often imperceptible change to the input can result in
drastically different predictions. Another key aim of the Ph.D. is to build schemes that enable
DNNs to learn powerful, generalized representations that are robust to such issues (adversarial
attacks, domain shifts, etc.).
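To make the notion of an adversarial attack concrete, one classic attack, the Fast Gradient Sign Method (FGSM) of Goodfellow et al., perturbs an input by a small step in the direction of the sign of the loss gradient. A minimal PyTorch sketch, assuming a classifier `model` with image inputs scaled to [0, 1]; the function name and epsilon value are illustrative:

import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    # One signed-gradient step that increases the classification loss.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()        # small, bounded perturbation
    return x_adv.clamp(0.0, 1.0).detach()  # stay in the valid image range

Even with a perturbation this small (a few intensity levels per pixel), the modified image is typically indistinguishable to a human yet flips the prediction of a standard classifier; robust training aims to close exactly this gap.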
This would help create DNNs that are more robust, trustworthy, and explainable, thus allowing for
easier adoption of these powerful tools into real-world applications.

Track:
Industry Track
PhD Duration:
September 15th, 2022 - September 14th, 2025