Guide: Balamurugan Palaniappa
Department: Industrial Engineering & Operations Research
1. Can you give a specific example application of where your work could be useful?
Several researchers have reported that neural networks are unstable under small, imperceptible, non-random input perturbations. Such perturbations do not change the object category for a human observer, so we would expect a machine learning classifier to be invariant to them. As it turns out, however, it is possible to ‘fool’ machine learning classifiers and change the network's prediction almost arbitrarily. Perturbed inputs that look indistinguishable from benign inputs yet fool the classifier are called adversarial examples in the literature.
As a popular joke in the adversarial-examples literature goes, "Machine learning has the power to make pigs fly."
As we seek to deploy machine learning systems not only in virtual domains but also in real systems, it becomes critical to examine not simply whether these systems work “most of the time”, but whether they are truly robust and reliable.
Robustness to such perturbations is imperative for deployable machine learning in real systems. Imagine a self-driving car crashing into another car because it ignored a stop sign: someone has placed a picture over the sign that looks like dirt to a human but is “read” as “parking prohibited” by the car’s computer vision system.
My thesis concentrates on designing neural networks that are robust to these perturbations and are not easily fooled. In addition, we study the interplay between robustness and model sparsity: only some of the neurons actually take part in decision making, and the rest can be ‘switched off’ while preserving robustness and accuracy. The expected outcome is faster and safer machine learning models.
2. What do you understand by scalable deep learning and lossless model compression? Can you explain these in simple words?
Deep learning models deployed in the real world usually require substantial storage, computation power, and energy. A typical deep learning model ranges in size from a few megabytes to gigabytes. As the technology moves to portable smart devices with on-device machine learning, the constraints on storage, computation, and energy become more stringent. These requirements have been shown to grow roughly in proportion to the number of parameters in the model. We look for a balance between these constraints and accuracy, i.e. we reduce the number of parameters while remaining accurate and robust to adversarial images. In the literature, this is referred to as lossless compression.
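To make the link between parameter count and storage concrete, here is a minimal back-of-the-envelope sketch. It assumes dense storage with 4 bytes (32-bit floats) per parameter; the 25-million-parameter model and the 90% pruning rate are hypothetical figures chosen for illustration, not results from the thesis.

```python
def model_size_mb(n_params, bytes_per_param=4):
    """Dense storage in megabytes (10**6 bytes) for n_params parameters."""
    return n_params * bytes_per_param / 1e6


# Hypothetical model with 25 million parameters stored as 32-bit floats.
dense = model_size_mb(25_000_000)

# The same model after 90% of its parameters are removed.
pruned = model_size_mb(2_500_000)

print(dense, pruned)  # prints: 100.0 10.0
```

In practice a sparse model also has to store which parameters survive, so the real saving is somewhat smaller than this estimate suggests.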
3. Can you draw a block diagram of your work so that we can understand the methodology of your
The methodology is simple and can be directly incorporated into mainstream machine learning. We build on the existing rich literature of evolutionary algorithms [Initialization, Mutation, Selection, Termination]:
1. Start with a trained network
2. Generate m mutated networks by randomly switching off a few parameters
3. Cross over every pair of mutated networks to generate m*(m-1)/2 child networks
4. Retrain on clean and adversarial examples
5. Measure fitness; choose the best k networks (a.k.a. parent networks)
6. Repeat until the desired compression is achieved and fitness is above the desired threshold:
A. Cross over every pair of the k parent networks to produce k*(k-1)/2 child networks
B. Retrain on clean and adversarial examples
C. Measure fitness; choose the best k networks (which now become the parents)
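The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the thesis implementation: a network is reduced to a binary mask over its parameters (1 = on, 0 = switched off), the crossover rule (keep a parameter only if both parents keep it) is one possible choice, `retrain` is a placeholder, `toy_fitness` is a hypothetical stand-in for accuracy and robustness, and `max_generations` is an added safety cap that the steps above do not mention.

```python
import itertools
import random


def mutate(mask, n_off, rng):
    """Step 2: switch off up to n_off active parameters, chosen at random."""
    active = [i for i, bit in enumerate(mask) if bit]
    mutated = list(mask)
    for i in rng.sample(active, min(n_off, len(active))):
        mutated[i] = 0
    return mutated


def crossover(a, b):
    """Assumed rule: a parameter stays on only if both parents keep it."""
    return [x & y for x, y in zip(a, b)]


def retrain(mask):
    """Placeholder for retraining on clean and adversarial examples."""
    return mask


def sparsity(mask):
    """Fraction of parameters that are switched off."""
    return 1.0 - sum(mask) / len(mask)


def evolve(n_params, fitness, m=5, k=4, n_off=1, target_sparsity=0.5,
           min_fitness=0.0, max_generations=20, seed=0):
    rng = random.Random(seed)
    trained = [1] * n_params                      # step 1: all parameters on
    population = [mutate(trained, n_off, rng) for _ in range(m)]
    best = trained
    for _ in range(max_generations):
        if len(population) < 2:                   # no pair left to cross over
            break
        # m*(m-1)/2 children in the first round, k*(k-1)/2 afterwards
        children = [retrain(crossover(a, b))
                    for a, b in itertools.combinations(population, 2)]
        children.sort(key=fitness, reverse=True)  # measure fitness
        population = children[:k]                 # keep the best k as parents
        best = population[0]
        if sparsity(best) >= target_sparsity and fitness(best) >= min_fitness:
            break
    return best


# Toy fitness (hypothetical): share of total "importance" that is kept.
importance = [5, 4, 3, 2, 1, 1, 1, 1]


def toy_fitness(mask):
    return sum(w * bit for w, bit in zip(importance, mask)) / sum(importance)


pruned = evolve(len(importance), toy_fitness, n_off=2)
print(pruned, sparsity(pruned), toy_fitness(pruned))
```

Note that k parents yield k*(k-1)/2 children, so k must be at least 3 for every generation to produce enough children to select k new parents from.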
Flowchart attached below: