Articles
Google published an article on generalization in deep learning models. As the article mentions, generalization is a well-studied area in classical machine learning, but much less so in deep learning. The accompanying paper gives more detail on their approach to understanding generalization in deep learning models.
Lyft writes about how they serve features in production. They use Redis’s batch functionality (falling back to DynamoDB on cache misses) and DynamoDB’s change feeds to populate analytics jobs. It is a very interesting read on how they think about ingesting features all the way up to the serving layer.
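The cache-miss fallback pattern above can be sketched roughly as follows. The `FeatureStore` class, its method names, and the in-memory dicts standing in for Redis and DynamoDB are all hypothetical illustrations, not Lyft’s actual code.

```python
# Sketch of a read-through feature cache with a durable fallback store.
# Plain dicts stand in for Redis (cache) and DynamoDB (source of truth).

class FeatureStore:
    def __init__(self, cache, fallback):
        self.cache = cache        # fast store, e.g. Redis
        self.fallback = fallback  # durable store, e.g. DynamoDB

    def get_features(self, keys):
        """Batch-read features, backfilling the cache on misses."""
        # Batch lookup against the cache (MGET-style).
        found = {k: self.cache[k] for k in keys if k in self.cache}
        misses = [k for k in keys if k not in found]
        for k in misses:
            value = self.fallback[k]  # fall back to the durable store
            self.cache[k] = value     # backfill so the next read hits
            found[k] = value
        return found

store = FeatureStore(cache={"user:1": [0.2, 0.9]},
                     fallback={"user:1": [0.2, 0.9], "user:2": [0.5, 0.1]})
feats = store.get_features(["user:1", "user:2"])
```

The backfill-on-miss step is what keeps the cache warm without a separate sync job.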
OctoML wrote about compiling HummingBird models through TVM, showing significant speedups for RandomForest models in scikit-learn.
Papers
Distilling a Neural Network into A Decision Tree talks about how to distill a neural network’s decision process into a decision tree.
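A minimal sketch of the distillation recipe, assuming the standard setup of training the student on the teacher’s predictions rather than ground-truth labels; the `teacher` function and the decision-stump student below are made up for illustration, not the paper’s code.

```python
# Distillation recipe: query the teacher on unlabeled inputs,
# then fit a small, interpretable student to the teacher's outputs.

def teacher(x):
    # Pretend this is a trained neural network's hard prediction.
    return 1 if x > 0.37 else 0

# Build a transfer set by labeling a grid of inputs with the teacher.
xs = [i / 100.0 for i in range(100)]
teacher_labels = [teacher(x) for x in xs]

# Student: a depth-1 decision tree (decision stump). Pick the
# threshold that best agrees with the teacher.
def agreement(threshold):
    return sum((1 if x > threshold else 0) == y
               for x, y in zip(xs, teacher_labels)) / len(xs)

best_threshold = max(xs, key=agreement)
```

The same loop works with any student model; the key point is that the labels come from the teacher, not the dataset.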
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter talks about how to compress BERT into a smaller neural network through knowledge distillation.
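The knowledge-distillation objective behind this family of methods can be sketched as a cross-entropy between temperature-softened teacher and student distributions; the logits and temperature below are illustrative assumptions, not values from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    # Soften the distribution: higher temperature -> flatter probabilities.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)  # soft targets
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

teacher_logits = [3.0, 1.0, 0.2]
loss_far = distillation_loss(teacher_logits, [0.1, 2.5, 0.3])
loss_near = distillation_loss(teacher_logits, [2.9, 1.1, 0.2])
```

The loss is smaller when the student’s distribution tracks the teacher’s, which is what drives the student toward the teacher’s behavior.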
Learning by Turning: Neural Architecture Aware Optimisation talks about making the optimiser aware of the neural architecture. The main contribution of this paper is the suggestion that optimisation should be tied to the architecture: updates are applied per neuron, changing the direction of each neuron’s weight vector (a rotation) rather than treating the parameters as one flat list.
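A rough, simplified sketch of the rotation idea, assuming the goal is to change a neuron’s weight direction while preserving its norm; this is my illustration of the concept, not the paper’s exact optimiser.

```python
import math

def rotate_update(w, grad, lr=0.1):
    """Update a neuron's weight vector by rotation: remove the radial
    (norm-changing) component of the gradient, step along the tangent,
    then renormalize so the weight norm is unchanged."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    # Component of the gradient along w (the radial direction).
    radial = sum(wi * gi for wi, gi in zip(w, grad)) / (norm_w ** 2)
    # Keep only the tangential component.
    tangent = [gi - radial * wi for wi, gi in zip(w, grad)]
    new_w = [wi - lr * ti for wi, ti in zip(w, tangent)]
    # Renormalize to restore the original norm exactly.
    norm_new = math.sqrt(sum(wi * wi for wi in new_w))
    return [wi * norm_w / norm_new for wi in new_w]

w = [3.0, 4.0]                      # norm 5
w2 = rotate_update(w, [1.0, -2.0])  # direction changes, norm does not
```

The appeal is that, under normalization layers, a neuron’s weight scale is often irrelevant, so spending the update budget purely on direction is a natural architectural prior.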
Tutorials
Last week I shared TVM’s AutoScheduler. This nice tutorial shows how to configure it for GPUs within TVM.
JAX 101 covers a lot of the basics of JAX in a concise format.
Deep Implicit Layers is a workshop/tutorial that talks about:
Implicit Functions and Automatic Differentiation
Neural Ordinary Differential Equations
Differentiable Optimization
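The first topic above can be sketched in a few lines: differentiate through a fixed point using the implicit function theorem rather than unrolling the solver. The scalar tanh model and the numbers below are illustrative assumptions, not the tutorial’s code.

```python
import math

def fixed_point(x, w=0.5, iters=100):
    """Solve z = tanh(w*z + x) by simple fixed-point iteration."""
    z = 0.0
    for _ in range(iters):
        z = math.tanh(w * z + x)
    return z

def dz_dx(x, w=0.5):
    """Gradient of the fixed point z*(x) via the implicit function
    theorem, without backpropagating through the solver's iterations."""
    z = fixed_point(x, w)
    s = 1.0 - math.tanh(w * z + x) ** 2   # d tanh(u)/du at the fixed point
    # From z = g(z, x):  dz/dx = (dg/dx) / (1 - dg/dz)
    return s / (1.0 - w * s)

grad = dz_dx(0.3)
```

This is the trick that makes implicit layers memory-cheap: the backward pass needs only the converged solution, not the solver’s trajectory.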
Libraries
VoxMovies Dataset and its accompanying library are a good pair.
SLING - A natural language frame semantics parser aims to parse text in different languages to extract facts. An example is here.
HummingBird is a library that converts various machine learning models into tensor computations to speed up inference. It is a very interesting library, and it differs from most compilers and JITs in that it supports non-deep-learning methods such as decision trees and random forests.
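A toy illustration of the underlying trick: evaluate all of a tree’s node tests at once, then select the leaf whose path pattern matches, which is what lets tree inference be phrased as dense tensor ops (matrix multiplies and an argmax in the real formulation). The tiny hand-built tree below is hypothetical, not HummingBird’s actual representation.

```python
# Tree: if x[0] < 0.5: (if x[1] < 0.3: leaf0 else leaf1)
#       else:          (if x[1] < 0.7: leaf2 else leaf3)
node_feature = [0, 1, 1]          # feature tested at each internal node
node_threshold = [0.5, 0.3, 0.7]  # threshold at each internal node

# Each leaf is encoded by the node-test outcomes on its path
# (1 = "went left", 0 = "went right", None = node not on path).
leaf_paths = [[1, 1, None], [1, 0, None], [0, None, 1], [0, None, 0]]
leaf_values = [10.0, 20.0, 30.0, 40.0]

def tree_as_tensors(x):
    # Step 1: evaluate ALL node tests at once (a vectorized compare).
    tests = [1 if x[f] < t else 0
             for f, t in zip(node_feature, node_threshold)]
    # Step 2: pick the leaf whose path pattern matches the test vector.
    for path, value in zip(leaf_paths, leaf_values):
        if all(p is None or p == t for p, t in zip(path, tests)):
            return value

result = tree_as_tensors([0.6, 0.8])  # right at the root, then right
```

Doing redundant work (every node test, every input) in exchange for dense, branch-free computation is exactly the trade that makes trees fast on tensor runtimes.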
KFServing provides a way to deploy and serve various machine learning models on Kubernetes.
NeuroNER is a Named Entity Recognition library written in TensorFlow.
Classes
Bayesian Data Analysis is a good introductory class on Bayesian techniques.
Deep Learning Class is a hands-on class covering a variety of topics in deep learning in PyTorch.
Events
PyTorch is holding an Ecosystem Day on April 21st. Be sure to register for that day.
Videos
TVM Conference has a number of nice videos on TVM.
Percy Liang talked about the evolution of deep learning models over time in the following video:
Troubleshooting and debugging neural networks is not as fun as building them. This talk covers some common techniques for approaching and debugging neural networks:
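One widely used sanity check from this family of techniques is verifying that the model can overfit a single small batch: if the loss will not drop to near zero on a handful of examples, there is likely a bug in the model, the loss, or the update step. A tiny logistic-regression stand-in for the network (all data and hyperparameters made up):

```python
import math

# One tiny batch the model should be able to memorize.
batch = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1)]
w, b = [0.0, 0.0], 0.0

def loss_and_grads(w, b):
    """Mean binary cross-entropy and its gradients on the batch."""
    total, gw, gb = 0.0, [0.0, 0.0], 0.0
    for x, y in batch:
        p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        err = p - y
        gw[0] += err * x[0]
        gw[1] += err * x[1]
        gb += err
    return total / len(batch), gw, gb

first_loss, _, _ = loss_and_grads(w, b)
for _ in range(2000):
    _, gw, gb = loss_and_grads(w, b)
    w = [wi - 0.5 * gi for wi, gi in zip(w, gw)]
    b -= 0.5 * gb
final_loss, _, _ = loss_and_grads(w, b)
```

If `final_loss` plateaus well above zero on a batch this small, inspect the loss sign, the gradient wiring, and the learning rate before touching the architecture.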