I am a PhD student at the National University of Singapore, working on programmable networks.
My recent research interests revolve around networking and machine learning. Recent work includes incremental deployment of class-based hybridization models for campus networks and real-time monitoring of QoS metrics in Software Defined Networks (SDN). I am also working on lung cancer detection and ear biometrics using CNNs and ResNets.
My papers on leveraging WebRTC for P2P content distribution in web browsers and on character recognition in natural scene images have been accepted at international conferences. I have placed in the top 25% in various data science challenges on Kaggle.com and completed 12 MOOCs on data science and related topics.
I have played Tabla, drums and octopad at national level competitions.
A full deployment of the Software Defined Networking (SDN) paradigm poses technical, financial and business challenges. The technical challenges include scalability, fault tolerance and centralization guarantees. The financial challenges include budget constraints and the lack of a phased transition model. The business challenges include acceptability and building confidence among network operators. Therefore, a direct and sudden transition from legacy networks to pure SDN seems unlikely. A hybrid deployment of SDN is a plausible intermediate path, primarily because it provides an environment where legacy and SDN nodes can work together, so an incremental deployment strategy can be developed. Further, hybrid SDN can offer the benefits of both traditional networks and the SDN paradigm. Its advantages include adaptability to budget constraints, central programmability of the network, and fallback to time-tested legacy mechanisms. However, hybrid models bring their own challenges, such as the added complexity of running multiple paradigms together and realizing cooperation between control planes; we envision that more research is needed to maximize the benefits and limit the drawbacks.
In this paper, we present a comprehensive survey of hybrid SDN models, techniques, inter-paradigm coexistence and interaction mechanisms. First, we give an overview of the roots of hybrid SDN and then discuss its definition, architectural pillars, benefits and limitations. We then categorize the models that can be used for deploying hybrid SDN and compare them. Finally, we discuss implementation approaches for each model and the challenges that may arise in the deployment of hybrid SDN.
Abstract: Convolutional Neural Network based Human Identification using Outer Ear Images
This paper presents a deep learning approach for ear localization and recognition. The comparable complexity between human outer ear and face in terms of its uniqueness and permanence has increased interest in use of ear as a biometric. But similar to face recognition, it poses challenges such as illumination, contrast, rotation, scale and pose variation. Most of the techniques used for ear biometric authentication are based on traditional image processing techniques or hand-crafted ensemble features. Owing to extensive work in the field of computer vision using Convolutional Neural Networks (CNNs) and Histogram of Oriented Gradients (HOG), the feasibility of Deep Neural networks (DNN) in the field of ear-biometrics has been explored in this research paper. A framework for ear localization and recognition is proposed that aims to reduce the pipeline for a biometric recognition system. The proposed framework uses HOG with Support Vector Machines (SVM) for ear localization and CNN for ear recognition. CNNs combine feature extraction and ear recognition tasks into one network with an aim to resolve issues such as variations in illumination, contrast, rotation, scale and pose. The feasibility of the proposed technique has been evaluated on USTB III database. This work demonstrates 97.9% average recognition accuracy using CNNs without any image preprocessing, which shows that the proposed approach is promising in the field of biometric recognition.
Abstract: Lung Cancer Detection: A Deep Learning Approach
We present an approach to detect lung cancer from CT scans using deep residual learning. We describe a pipeline of pre-processing techniques that highlights lung regions vulnerable to cancer and extracts features using UNet and ResNet models. The feature set is fed into multiple classifiers, namely XGBoost and Random Forest, and the individual predictions are ensembled to predict the likelihood of a CT scan being cancerous. The approach achieves 84% accuracy on LIDC-IDRI, outperforming previous attempts.
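The final ensembling step can be sketched as follows. The abstract does not say how the individual predictions are combined, so the (optionally weighted) mean used here is an assumption, and `ensemble_probability` is a hypothetical helper name rather than part of the original pipeline.

```python
from typing import Optional, Sequence


def ensemble_probability(
    probabilities: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine per-classifier cancer probabilities for one CT scan.

    Assumption: a weighted mean; the actual combination rule is not stated.
    """
    if not probabilities:
        raise ValueError("need at least one classifier prediction")
    if weights is None:
        # Default to an unweighted average of the classifiers.
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight per prediction is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probabilities, weights)) / total
```

For example, the probabilities produced by the XGBoost and Random Forest classifiers for a scan could be passed in as a two-element list.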
In SDN-based networks, a certain fraction of traffic serves network management tasks such as monitoring, performance tuning, security enforcement, configuration and calculation of QoS metrics. It consists of packets of network protocols such as DHCP, MLD, MDNS and NDP. Most of the time, these packets are created and absorbed at intermediate switches. We refer to them as raw packets. OpenFlow-compliant switches send the controller cumulative statistics of sent and received traffic, which include these raw packets. Although they are not part of the data traffic, these packets get counted and add noise to the measured statistics, and thus hamper the accuracy of methods that depend on these statistics, such as the calculation of QoS metrics.
In this paper, we propose a method to estimate the fraction of network traffic that consists of raw packets in Software Defined Networks. The number of raw packets transferred depends on the number of switches and hosts in the network and is a periodic function of time. Through experiments on several network topologies, we have devised a way to find a cap on the raw packets generated in the network, using spanning tree information about the topology.
Abstract: Real Time Monitoring of Packet Loss in Software Defined Networks
To meet QoS demands from customers, ISPs currently over-provision capacity. Networks need to continuously monitor performance metrics, such as bandwidth and packet loss, in order to quickly adapt forwarding rules in response to changes in the workload. The packet loss metric is also required by network administrators and ISPs to identify clusters in the network that are vulnerable to congestion. However, the existing solutions either require special instrumentation of the network or impose significant measurement overhead.
Software-Defined Networking (SDN), an emerging paradigm in networking, advocates separation of the data plane and the control plane. It separates the network's control logic from the underlying routers and switches, leaves a logically centralized software program to control the behavior of the entire network, and introduces network programmability. Further, OpenFlow makes it possible to implement fine-grained Traffic Engineering (TE) and provides flexibility to determine and enforce end-to-end QoS parameters.
In this paper, we present an approach for online monitoring and measurement of per-flow as well as per-port packet loss statistics in SDN. The controller periodically polls all the switches of the network for port and flow statistics via OpenFlow 1.3 multipart messages. The OpenFlow-compliant switches send the controller cumulative statistics of sent and received packets, which include raw packets (control, non-user generated packets responsible for network management). Although these are not part of the end-to-end data traffic, they get counted and act as noise in the statistics, and thus hamper the accuracy of methods that rely on them. The proposed method takes the effect of raw packets into account.
Other implementations propose approaches for per-flow packet loss only. We also take into account the effect of raw packets (control, non-user generated packets), which makes our packet loss estimation more accurate than other implementations. We also present a study of extrapolation techniques for predicting packet loss within a poll interval.
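The measurement idea can be illustrated with a minimal sketch. It assumes that the controller keeps two consecutive polls of cumulative counters for the two ends of a link, and that an estimate of the raw packets counted at each end is available. The `LinkCounters` container, the way raw packets are subtracted, and the choice of linear extrapolation are all assumptions made for illustration, not the method evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass
class LinkCounters:
    """Cumulative counters for one link at one poll (hypothetical container)."""
    time_s: float    # time of the poll, in seconds
    tx_packets: int  # packets sent by the upstream port
    rx_packets: int  # packets received by the downstream port
    raw_tx: int      # estimated raw packets included in tx_packets
    raw_rx: int      # estimated raw packets included in rx_packets


def data_loss_ratio(prev: LinkCounters, curr: LinkCounters) -> float:
    """Loss ratio of data packets over one poll interval, raw packets removed."""
    sent = (curr.tx_packets - prev.tx_packets) - (curr.raw_tx - prev.raw_tx)
    received = (curr.rx_packets - prev.rx_packets) - (curr.raw_rx - prev.raw_rx)
    if sent <= 0:
        return 0.0  # nothing was sent, so no loss can be measured
    return min(1.0, max(0.0, (sent - received) / sent))


def extrapolate_loss(t0: float, loss0: float, t1: float, loss1: float,
                     t: float) -> float:
    """Linearly extrapolate the loss ratio to time t after the last poll at t1."""
    if t1 == t0:
        return loss1
    slope = (loss1 - loss0) / (t1 - t0)
    # Clamp because a loss ratio must stay within [0, 1].
    return min(1.0, max(0.0, loss1 + slope * (t - t1)))
```

Working on counter differences rather than absolute values matters because the switches report cumulative statistics; the loss for an interval is only visible in the change between two polls.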
Among the factors that determine the performance of a computationally intensive application running on a high-performance computing (HPC) system, communication between processes is vital. Our fundamental idea for optimizing Message Passing Interface (MPI) communications is to maximize the utilization of the cluster's network by deploying the Software Defined Networking (SDN) paradigm. We identify two primary issues: overuse of shortest paths, which leaves them choked, and path selection that is not optimized dynamically. We use SDN to enable dynamic routing that avoids these two issues. In this paper, we present an application-aware network routing mechanism that enhances MPI applications with the help of an adaptive routing algorithm.
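One way to picture routing that avoids choked shortest paths is a shortest-path search whose link cost grows with current utilization. The sketch below is only an illustration under that assumption: the cost formula `1 + penalty * utilization`, the `penalty` parameter and the function name are hypothetical, and this is not the adaptive routing algorithm of the paper.

```python
import heapq


def least_congested_path(links, src, dst, penalty=10.0):
    """Dijkstra over undirected links given as {(u, v): utilization in [0, 1]}.

    Each link costs 1 hop plus a penalty proportional to its utilization
    (an assumed cost model), so busy shortest paths become less attractive.
    """
    adjacency = {}
    for (u, v), utilization in links.items():
        cost = 1.0 + penalty * utilization
        adjacency.setdefault(u, []).append((v, cost))
        adjacency.setdefault(v, []).append((u, cost))

    best = {src: 0.0}
    previous = {}
    heap = [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, cost in adjacency.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))

    if dst not in best:
        return None  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(previous[path[-1]])
    return path[::-1]
```

With all utilizations at zero this reduces to plain hop-count shortest paths; as a direct link fills up, a longer but idle detour becomes cheaper.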
Abstract: Control-data plane intelligence trade-off in SDN
With the decoupling of network control and data planes, the upcoming Software Defined Networking (SDN) paradigm advocates better network control and manageability. It introduces logically centralized control, network programmability and abstraction of the underlying infrastructure from network services and applications. With global visibility of network state and central control that eases real-time monitoring, policy alterations etc., it inherently enhances network security. However, the separation of planes opens up new challenges such as denial of service (DoS) attacks, saturation attacks and man-in-the-middle attacks. The issues of controller availability, controller-switch communication delay and scalability can be solved separately by distributed controllers, out-of-band communication links and parallelization, respectively. A control-data plane intelligence trade-off has the potential to solve all of these. It increases controller availability, reduces latency for traffic engineering and decision making, and improves controller scalability. Moreover, it makes control-data plane communication more secure and substantially reduces the processing load on the controller. We present how to realize the control-data plane intelligence trade-off by extending OpenFlow.
Although the new paradigm of Software Defined Networking (SDN) has great potential to address the complex problems presented by enterprise networks, it has its own deployment and scalability issues. Further, a full SDN deployment has its own business and economic challenges. A smooth transition from legacy networks to SDN (disruption free, accommodating budget constraints, with progressive improvement in network management) requires a hybrid networking model as an inevitable intermediate step, one that allows heterogeneous paradigms to function together while the full transition is realized in phases. Therefore, the need of the hour is an incremental deployment strategy that caters to the needs of the organization. We present here a class-based hybrid SDN model for Multi Protocol Label Switching (MPLS) networks. We discuss the model, design, components, their interactions, advantages and drawbacks. We also present an implementation and evaluation of a prototype. Among legacy networks, the MPLS architecture closely resembles the SDN paradigm in terms of separation of control and data planes, flow abstraction etc. Moreover, ISPs have preferred MPLS over the years due to the benefits of virtual private networks and traffic engineering. The central idea is to partition traffic using forwarding equivalence classes at the ingress router, the rules of which can be updated via a centralized controller using OpenFlow. We therefore aim to use the standard MPLS data plane together with a control plane based on OpenFlow to arrive at a systematic incremental deployment methodology as well as a hybrid operation model.
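The idea of partitioning traffic by forwarding equivalence class (FEC) at the ingress router can be sketched as a small rule table. The sketch assumes that a class is identified by a destination prefix matched longest-prefix-first; the class name, the label values and the `handled_by` field are hypothetical, and a controller update is modelled simply as a call to `install_rule`.

```python
import ipaddress


class IngressClassifier:
    """Maps a destination address to an (MPLS label, control paradigm) pair."""

    def __init__(self):
        self._rules = {}  # ip_network -> (label, handled_by)

    def install_rule(self, prefix, label, handled_by):
        """Add or overwrite the rule for a prefix (stands in for a controller update)."""
        self._rules[ipaddress.ip_network(prefix)] = (label, handled_by)

    def classify(self, dst_ip):
        """Return the rule of the longest matching prefix, or None if no rule matches."""
        address = ipaddress.ip_address(dst_ip)
        matches = [net for net in self._rules if address in net]
        if not matches:
            return None
        longest = max(matches, key=lambda net: net.prefixlen)
        return self._rules[longest]


# Example: most of 10.0.0.0/8 stays under legacy control, while one
# subnet has been moved to the SDN-controlled class.
classifier = IngressClassifier()
classifier.install_rule("10.0.0.0/8", 100, "legacy")
classifier.install_rule("10.1.0.0/16", 200, "sdn")
```

Moving one class at a time in this way is what makes the deployment incremental: a rule change at the ingress shifts a traffic class between paradigms without touching the rest of the table.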
Abstract: Meticulous Measurement of Control Packets in SDN
The data packet statistics sent by OpenFlow-compliant switches cumulatively include statistics about control traffic, which is used for network control and management. This reduces the accuracy of QoS metric calculations and thus hampers network monitoring. We present here a novel algorithm that measures the fraction of control packets in SDN to within a 3% error rate.
Abstract: A Browser-based Distributed Framework for Content Sharing and Student Collaboration
The use of networks in the education system has become increasingly widespread in recent years. WebRTC has recently been one of the hottest topics in Web technologies for distributed systems, as it enables peer-to-peer (P2P) connectivity between machines with higher reliability and better scalability without the overhead of resource management.
In this paper, we propose a scalable and lightweight browser-based, asynchronous framework for a P2P network using a distributed lookup protocol (Chord), NodeJS and RTCDataChannel. The design uses the advantages of P2P networks for better and more sophisticated education delivery. The framework lets students share course content and hold discussions with fellow students without requiring any centralized infrastructure support.
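The core of a Chord-style lookup is that each piece of content is owned by the first peer whose identifier follows the content's identifier on a hash ring. The sketch below shows only that successor rule, in Python rather than the browser's JavaScript. It deliberately omits finger tables, joins and failures, and the ring size, peer names and class name are assumptions for illustration.

```python
import bisect
import hashlib

RING_BITS = 16  # assumed small identifier space, for readability


def ring_id(key: str) -> int:
    """Hash a peer name or content key onto the identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class Ring:
    """A static view of the ring that knows every peer (no finger tables)."""

    def __init__(self, peer_names):
        self._peers = sorted((ring_id(name), name) for name in peer_names)

    def successor(self, key: str) -> str:
        """Return the peer responsible for key: first peer id >= key id, with wrap-around."""
        position = bisect.bisect_left(self._peers, (ring_id(key), ""))
        return self._peers[position % len(self._peers)][1]


ring = Ring(["peer-a", "peer-b", "peer-c"])
owner = ring.successor("lecture-01.pdf")
```

Because ownership depends only on positions on the ring, a peer leaving affects only the keys it owned; keys held by other peers keep their owner, which is what lets such a network run without a central index.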
Abstract: Addressing Challenges in Browser Based P2P Content Sharing Framework Using WebRTC
Abstract: Comparative Study of Preprocessing and Classification Methods in Character Recognition of Natural Scene Images
This paper presents an approach to character recognition in natural scene images. Recognizing such text is a challenging problem in Computer Vision, more so than the recognition of scanned documents, for several reasons. We propose a technique for classifying characters based on a pipeline of image processing operations and ensemble machine learning techniques. This pipeline tackles problems where Optical Character Recognition (OCR) fails. We present a framework that comprises a sequence of operations such as resizing, grey scaling, thresholding, morphological opening and median filtering on the images to handle background clutter, noise, multi-sized and multi-oriented characters and variance in illumination. We used image pixels and HOG (Histogram of Oriented Gradients) as features to train three different models based on Nearest-Neighbour, Random Forest and Extra Tree classifiers. The highest accuracy among the models we tested was obtained when the input images were pre-processed and HOG features were extracted and fed into the Extra Tree classifier. The proposed steps have been experimentally shown to yield better accuracy than the present state-of-the-art classification techniques on the Chars74k dataset. In addition, the paper includes a comparative study elaborating on various image processing operations, feature extraction methods and classification techniques.
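Part of the pre-processing sequence can be sketched in plain Python on images stored as nested lists. Only grey scaling, thresholding and median filtering are shown; resizing, morphological opening, HOG extraction and the classifiers are left out. The channel-mean grey conversion, the fixed cutoff of 128 and the 3x3 window are assumptions, not the settings used in the paper.

```python
from statistics import median_low


def to_grey(rgb_image):
    """Convert rows of (r, g, b) tuples to grey levels using the channel mean."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


def threshold(grey_image, cutoff=128):
    """Binarize: pixels at or above the cutoff become 255, the rest 0."""
    return [[255 if pixel >= cutoff else 0 for pixel in row] for row in grey_image]


def median_filter(image):
    """Replace each pixel by the median of its 3x3 neighbourhood (clipped at borders)."""
    height, width = len(image), len(image[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(median_low(window))
        filtered.append(row)
    return filtered


def preprocess(rgb_image):
    """Chain the three steps in the order listed in the text."""
    return median_filter(threshold(to_grey(rgb_image)))
```

The median filter is the step that removes isolated specks left after thresholding, which is why it follows binarization in this chain.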
Honors and Awards
Merit Scholarship cum 40% Fee Waiver, BITS Pilani
Offered only to 5% of the higher degree students
KVPY Scholarship, DST, Govt. of India
Awarded to top 125 students from the country for research excellence
National Award for Science Exhibit, CBSE
CBSE National Level Science Exhibition
Ear Biometrics, A Convolutional Neural Network Approach
Ear localization with a HOG+SVM framework and ear recognition using a CNN approach with Adagrad optimization, 92.3% on the USTB III dataset
Springleaf Marketing Response, Kaggle.com
Deployed XGBoost to predict which customers will respond to a direct mail, placed in the top 16% at Kaggle.com among 2500 international participants
LLVM IR Superoptimizer using GreenThumb Advanced Compilation Techniques
Lung Cancer Detection, Deep Learning
Used ResNets for feature extraction from CT scans
Prayag Sangeet Samiti, Allahabad
Sangeet Prabhakar (BA) in Tabla percussion
Featured on TV Show: Jharkhand Ke Sitare as National Level Percussionist on Naxatra News, a popular Hindi news channel in the state of Jharkhand
National Level Championship in All India Youth Festival
Awarded by D. A. V. College Management Committee, represented 100 schools
Best Letter in Jharkhand state, 35th UPU Letter Writing Competition
Organized by the Universal Postal Union, United Nations