I was reading a good number of Neural Architecture Search articles last weekend, and someone pointed out a number of seminar-style videos on YouTube. They are at the end of the newsletter.
Articles
Pinterest writes about how they fight misinformation. They treat the problem as multi-category classification and use an embedding-based approach for both images and text; they also use OCR to extract various pieces of text from images. The embeddings are then concatenated into a single vector to do the classification.
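As a rough illustration of that setup, here is a minimal PyTorch sketch that fuses image, text, and OCR embeddings into one vector for multi-category classification. The dimensions, names, and classifier head are illustrative assumptions, not Pinterest's actual model.

```python
import torch
import torch.nn as nn

class MisinfoClassifier(nn.Module):
    """Toy fused-embedding classifier: concatenate per-signal embeddings,
    then classify the single fused vector into misinformation categories."""

    def __init__(self, img_dim=512, txt_dim=300, ocr_dim=300, n_classes=5):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + txt_dim + ocr_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, img_emb, txt_emb, ocr_emb):
        # Concatenate all signals into one vector, as described in the article
        fused = torch.cat([img_emb, txt_emb, ocr_emb], dim=-1)
        return self.head(fused)  # multi-category logits
```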
Google published an article on how sparsifying neural networks can make them more efficient in terms of compute and memory.
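For intuition, here is a toy magnitude-pruning sketch in PyTorch that zeroes out the smallest-magnitude weights; sparse tensors like this are what enable the compute and memory savings. The function and the threshold choice are illustrative, not Google's method.

```python
import torch

def magnitude_prune(weight, sparsity=0.9):
    # Zero out the smallest-magnitude weights, keeping (1 - sparsity) of them
    k = int(sparsity * weight.numel())
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    return weight * mask, mask

w = torch.randn(64, 64)
w_sparse, mask = magnitude_prune(w, sparsity=0.9)
print(f"kept {int(mask.sum())} of {w.numel()} weights")
```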
TVM wrote about their new auto-scheduler, which generates optimized code for a chosen backend and hardware. This removes the manual parameter-optimization step and automates that search under the hood in a single step.
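A usage sketch, based on the project's auto-scheduler tutorials, looks like this: register a workload, tune it for a target, and apply the best schedule found. Treat the exact API as an assumption from the tutorials and check TVM's docs for the current version.

```python
import tvm
from tvm import te, auto_scheduler

@auto_scheduler.register_workload
def matmul(N, L, M, dtype):
    A = te.placeholder((N, L), name="A", dtype=dtype)
    B = te.placeholder((L, M), name="B", dtype=dtype)
    k = te.reduce_axis((0, L), name="k")
    C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")
    return [A, B, C]

target = tvm.target.Target("llvm")
task = auto_scheduler.SearchTask(func=matmul, args=(1024, 1024, 1024, "float32"), target=target)

log_file = "matmul.json"
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=10,  # a real search would use far more trials
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
)
task.tune(tune_option)                 # single step: search happens under the hood
sch, args = task.apply_best(log_file)  # best schedule found during the search
func = tvm.build(sch, args, target)    # optimized code for the chosen backend
```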
Papers
Calibrate Before Use: Improving Few-Shot Performance of Language Models
Few-shot learning has been very popular and successful for large language models, but these methods do not account for how biased the model is or how to calibrate its answers toward a domain. This paper provides a calibration mechanism, "contextual calibration", and reports that with it the language model can perform much better in select domains.
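The core idea fits in a few lines of NumPy: estimate the model's bias from a content-free input (such as "N/A") and undo it with a diagonal transform before picking the label. This is a minimal sketch of the paper's method; the variable names are illustrative.

```python
import numpy as np

def contextual_calibration(p, p_cf):
    """Calibrate label probabilities using a content-free input.

    p:    model's label probabilities for a real input
    p_cf: label probabilities for a content-free input like "N/A",
          which expose the model's bias toward certain labels
    """
    W = np.diag(1.0 / p_cf)  # W = diag(p_cf)^-1 counteracts the bias
    q = W @ p
    return q / q.sum()       # renormalize to a proper distribution

# Example: the model favors label 0 even on "N/A"; calibration corrects this
p_cf = np.array([0.7, 0.2, 0.1])
p = np.array([0.5, 0.4, 0.1])
print(contextual_calibration(p, p_cf))  # label 1 now wins
```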
The Lambda Networks paper brings another way to model the self-attention mechanism, using lambda functions, hence the name.
The main idea is to represent the attention-like mechanism as a linear function of the context rather than as a normalized attention distribution over positions.
By doing so, memory-hungry self-attention layers can be implemented with a much smaller memory footprint.
The following sketch, a minimal PyTorch version of the paper's einsum pseudo-code with illustrative tensor shapes, gives an idea of how this layer can be implemented in Python.
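```python
import torch

def lambda_layer(queries, keys, embeddings, values):
    """Lambda layer sketch following the paper's einsum pseudo-code.

    Shapes (illustrative): queries (b, n, k), keys (b, m, k), values (b, m, v),
    embeddings (n, m, k) are relative position embeddings.
    """
    keys = torch.softmax(keys, dim=1)  # normalize keys over the m context positions
    # Summarize the context into lambdas (linear functions) instead of attention maps
    content_lambda = torch.einsum('bmk,bmv->bkv', keys, values)
    position_lambdas = torch.einsum('nmk,bmv->bnkv', embeddings, values)
    # Apply the lambdas to each query
    content_output = torch.einsum('bnk,bkv->bnv', queries, content_lambda)
    position_output = torch.einsum('bnk,bnkv->bnv', queries, position_lambdas)
    return content_output + position_output

b, n, k, v = 2, 16, 8, 8  # batch, positions, query/key depth, value depth
out = lambda_layer(torch.randn(b, n, k), torch.randn(b, n, k),
                   torch.randn(n, n, k), torch.randn(b, n, v))
print(out.shape)  # torch.Size([2, 16, 8])
```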
CheckFreq: Frequent, Fine-Grained DNN Checkpointing describes a checkpointing mechanism for DNN training that checkpoints frequently, at fine granularity, and performs better than traditional checkpointing mechanisms by being more efficient and using less space on disk.
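The general shape of frequent checkpointing, a fast in-memory snapshot followed by an asynchronous write to disk so training can continue, can be conveyed with a toy PyTorch sketch. This is my illustration of the idea, not the paper's implementation.

```python
import copy
import threading
import torch

def checkpoint_async(model, optimizer, step, path):
    # Phase 1: snapshot training state in memory (brief pause in training)
    snapshot = {
        "step": step,
        "model": copy.deepcopy(model.state_dict()),
        "optimizer": copy.deepcopy(optimizer.state_dict()),
    }
    # Phase 2: persist to disk on a background thread so iterations continue
    writer = threading.Thread(target=torch.save, args=(snapshot, path))
    writer.start()
    return writer  # join() before the next snapshot to avoid overlapping writes
```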
Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness is a paper from Facebook research that shows how poorly calibrated state-of-the-art chatbots are, and how to calibrate them properly so that the confidence they express matches the probability that they are correct.
ArtEmis: Affective Language for Visual Art: Images trigger and inspire various emotions in people; this is partly why people go to museums. This paper looks at what types of emotions can be classified from images, specifically in the visual art domain. It is a very interesting paper in that, rather than focusing on the content of an image, it focuses on the emotions the image triggers in human beings.
The code is available.
How to represent part-whole hierarchies in a neural network: Geoffrey Hinton describes an imaginary system called GLOM that could explain how a neural network encodes various pieces of information to perform well across a variety of tasks. The system does not exist in reality, but he argues that if it could be made to work, it would give a much clearer explanation of what a neural network stores.
Are All Layers Created Equal?: The paper answers the question with a resounding no. All layers are not created equal, so much so that the authors classify layers as either critical or ambient. They measure robustness through re-initialization or re-randomization of a layer's weights: if model accuracy does not degrade, the layer is called ambient; if accuracy drops after re-initialization, the layer is called critical.
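The re-initialization probe is easy to picture in code: reset one layer's weights to their initial values and re-evaluate. The helper below is a hypothetical sketch; `eval_fn` and the name-prefix layer matching are my assumptions, not the paper's code.

```python
import copy

def probe_layer(model, init_state, layer_name, eval_fn):
    """Reset one layer to its weights at initialization and re-measure accuracy.

    init_state: state_dict captured right after model creation (assumed saved).
    eval_fn:    callable that returns accuracy for a given model (assumed provided).
    """
    probe = copy.deepcopy(model)
    state = probe.state_dict()
    for key in state:
        if key.startswith(layer_name):
            state[key] = init_state[key]  # rewind just this layer
    probe.load_state_dict(state)
    # Small accuracy drop => ambient layer; large drop => critical layer
    return eval_fn(probe)
```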
Rigging the Lottery: Making All Tickets Winners: this paper describes an end-to-end sparse training method that does better than pure pruning of a large neural network architecture. The code is available here for TensorFlow.
During training, it removes connections based on their weight magnitudes and then creates new connections based on gradient information.
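A toy NumPy sketch of one drop/grow step conveys the mechanism; the drop fraction and selection details are simplified relative to the paper.

```python
import numpy as np

def rigl_step(weights, grads, mask, drop_fraction=0.3):
    """One simplified drop/grow update on a single layer.

    weights, grads: arrays of the same shape; mask: 0/1 array of active connections.
    """
    k = int(drop_fraction * mask.sum())
    # Drop: the k active connections with the smallest weight magnitude
    drop_scores = np.where(mask == 1, np.abs(weights), np.inf)
    drop_idx = np.argsort(drop_scores, axis=None)[:k]
    # Grow: the k previously inactive connections with the largest gradients
    grow_scores = np.where(mask == 0, np.abs(grads), -np.inf)
    grow_idx = np.argsort(grow_scores, axis=None)[-k:]
    mask.flat[drop_idx] = 0
    mask.flat[grow_idx] = 1
    weights.flat[grow_idx] = 0.0  # new connections start from zero, as in the paper
    return weights * mask, mask
```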
Libraries
TorchDrift is a new library for detecting data/model drift, written for PyTorch.
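Drift detection of this kind often boils down to a two-sample test on features. Below is a generic kernel-MMD score in plain PyTorch as an illustration of the underlying idea, not TorchDrift's actual API.

```python
import torch

def mmd_drift_score(ref, cur, sigma=1.0):
    """Toy kernel-MMD drift score between reference and current feature batches.

    ref, cur: (n, d) and (m, d) feature tensors. Higher score => more drift.
    """
    def rbf(a, b):
        # RBF kernel on pairwise squared distances
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return rbf(ref, ref).mean() + rbf(cur, cur).mean() - 2 * rbf(ref, cur).mean()

ref = torch.randn(200, 16)          # features from training-time data
cur = torch.randn(200, 16) + 0.5    # shifted features at serving time
print(mmd_drift_score(ref, cur))    # noticeably above the no-drift baseline
```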
FFT support came to PyTorch with the torch.fft module, which follows an API similar to NumPy's.
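For example, a forward/inverse transform round-trip looks just like the NumPy version:

```python
import torch

x = torch.randn(8)
X = torch.fft.fft(x)        # complex spectrum, mirrors numpy.fft.fft
x_back = torch.fft.ifft(X)  # inverse transform round-trips to the input
print(torch.allclose(x, x_back.real, atol=1e-6))  # True
```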
Models
Wav2Vec is a model from Facebook research that encodes raw audio into vector representations.
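As a usage sketch, the Hugging Face port can produce those vectors from raw audio. The checkpoint and class names below reflect the transformers library and are assumptions on my part; the original fairseq release has a different interface.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

audio = torch.randn(16000).numpy()  # one second of fake 16 kHz audio
inputs = extractor(audio, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    vectors = model(**inputs).last_hidden_state  # (1, frames, hidden) vectors
print(vectors.shape)
```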
TLCBench is a benchmarking library for Apache TVM.
Videos
Berkeley published a machine learning class. If you like Stanford's CS231n, this might be another good class covering a variety of modern topics in deep learning.
Sanmi Koyejo talks about how to do metric selection for AutoML. Broadly, he talks about metric elicitation and its application in fairness settings.
Ameet Talwalkar talks about various approaches to neural architecture search. He first surveys the possible approaches and then discusses the research his group is conducting.