Projects

Augmented Reality system using Planar Homographies.

A planar homography is a warp that maps pixel coordinates from one camera frame to another, under the fundamental assumption that the points lie on a single plane in the real world. This concept makes applications such as an augmented reality system or a panorama stitcher possible.
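To make the mapping concrete, here is a minimal sketch of how a single pixel is sent through a homography. The function name `apply_homography` and the matrix `H_EXAMPLE` are hypothetical and chosen only for illustration; in the actual project the matrix would be estimated from matched points between the two frames.

```python
def apply_homography(H, point):
    """Map a 2D pixel through a 3x3 homography H (row-major nested lists)."""
    x, y = point
    # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to return to ordinary pixel coordinates.
    return (xh / w, yh / w)


# Hypothetical homography: scale by 2, then translate by (10, 5).
H_EXAMPLE = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 5.0],
    [0.0, 0.0, 1.0],
]

mapped = apply_homography(H_EXAMPLE, (3.0, 4.0))
```

The division by `w` is what separates a homography from a plain affine transform: when the last row of the matrix is not `[0, 0, 1]`, it produces the perspective effect.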

Face Verification.

Face verification is the task of confirming whether a pair of images depicts the same person. It is widely used in modern applications such as the popular 'Face-unlock' feature in smartphones and document ID verification. The task can be split into two steps: face classification followed by face verification. Convolutional neural networks are the most popular choice for such tasks, so ResNet-18 is the architecture chosen here.
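One common way to carry out the verification step is to compare the feature embeddings that the trained network produces for each image. The sketch below assumes cosine similarity with a fixed threshold; the function names, the toy vectors and the threshold value of 0.5 are illustrative assumptions, not details taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero vectors")
    return dot / (norm_a * norm_b)


def same_person(emb_a, emb_b, threshold=0.5):
    """Declare a match when the embeddings are similar enough.

    The threshold is a hypothetical value; in practice it would be
    tuned on a validation set of labelled image pairs.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy 3-dimensional embeddings standing in for network outputs.
match = same_person([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
mismatch = same_person([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```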

Fake News Detection.

Fake news is rampant today, and verifying the authenticity of a news article is paramount. The aim of this project is to train various machine learning models to classify a given news article as authentic or fake. This task falls under the domain of natural language processing. The models explored in this project are a Naive Bayes classifier, a random forest classifier and logistic regression.
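As an illustration of the simplest of these models, here is a minimal multinomial Naive Bayes sketch with add-one smoothing. It is written from scratch for clarity and is not the project's code; the whitespace tokenizer, the function names and the four toy headlines are all made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0).
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set, invented purely for illustration.
docs = [
    "shocking miracle cure doctors hate",
    "aliens secretly control the government",
    "city council approves new budget",
    "central bank holds interest rates steady",
]
labels = ["fake", "fake", "authentic", "authentic"]
model = train_naive_bayes(docs, labels)
```

Calling `predict(model, "miracle cure shocking")` then scores the text against both classes and returns the more likely label.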

Real time Eye/Gaze Tracking.

Real-time eye/gaze tracking is very useful in medical science, where it offers an objective and accurate way to record and analyse visual behaviour. This use case is designed as part of a larger medical subsystem aimed at improving the quality of life of Huntington's disease (HD) patients. The eye tracking implementation makes it possible to quantify the range of voluntary eye movement of HD patients through ocular pursuit exercises. It is built on the OpenCV library.

Speech to text transcription network.

Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methods and technologies for recognising spoken language and converting it into text. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). This project builds an end-to-end speech transcription network based on an encoder-decoder structure with an attention mechanism. Levenshtein distance is the evaluation metric used to gauge the performance of the network. The architecture obtains an average Levenshtein distance of 24.3.
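For readers unfamiliar with the metric, Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another, so lower is better. Below is a minimal dynamic-programming sketch; the function name is chosen for illustration and this is not necessarily the implementation used to produce the score above.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop to save memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


distance = levenshtein("kitten", "sitting")
```

In a transcription setting the distance would be computed between each predicted transcript and its reference, then averaged over the evaluation set.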