Articles
An Inferential Perspective on Federated Learning, from CMU, is a nice writeup on an algorithm that casts federated learning as probabilistic inference and reports much better results than the standard federated averaging baseline.
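For readers new to the setup: the baseline these inference-based methods improve on is federated averaging (FedAvg). Below is a minimal sketch of that baseline, not the post's probabilistic algorithm; the training-loop details are illustrative.

```python
import copy
import torch

def federated_averaging(global_model, client_loaders, rounds=10, lr=0.01):
    """Minimal FedAvg sketch: each round, clients train locally on their
    own data, then the server averages the client weights element-wise."""
    for _ in range(rounds):
        client_states = []
        for loader in client_loaders:
            local = copy.deepcopy(global_model)      # client starts from global weights
            opt = torch.optim.SGD(local.parameters(), lr=lr)
            for x, y in loader:                      # one local epoch
                opt.zero_grad()
                loss = torch.nn.functional.cross_entropy(local(x), y)
                loss.backward()
                opt.step()
            client_states.append(local.state_dict())
        # server step: average the client weights
        avg = {k: torch.stack([s[k].float() for s in client_states]).mean(0)
               for k in client_states[0]}
        global_model.load_state_dict(avg)
    return global_model
```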
Stanford AI Lab (SAIL) published a nice blog post that shows how removing certain features can disproportionately hurt some groups/classes in classification tasks.
Taming Transformers uses transformers on top of a learned codebook for generative image synthesis. It is a pretty interesting paper that combines convolutional neural networks, transformers, GANs, and dictionary learning to enable image synthesis. The code and paper are also available.
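The codebook is the interesting bit: encoder outputs get snapped to their nearest learned embedding, with gradients passed through via a straight-through estimator. A minimal sketch of that quantization step (not the authors' code; shapes and sizes are illustrative):

```python
import torch

def quantize(z, codebook):
    """z: (batch, n, d) encoder outputs; codebook: (K, d) learned embeddings.
    Snap each vector to its nearest codebook entry."""
    # squared distances between every z vector and every codebook entry
    d = (z.pow(2).sum(-1, keepdim=True)
         - 2 * z @ codebook.t()
         + codebook.pow(2).sum(-1))
    idx = d.argmin(-1)                  # (batch, n) discrete codes
    z_q = codebook[idx]                 # quantized vectors
    # straight-through estimator: forward pass uses z_q, gradient flows to z
    z_q = z + (z_q - z).detach()
    return z_q, idx

codebook = torch.nn.Parameter(torch.randn(512, 256))  # K=512 codes, d=256
z = torch.randn(4, 16, 256, requires_grad=True)
z_q, codes = quantize(z, codebook)
```

The transformer is then trained autoregressively over the discrete code indices rather than over raw pixels, which is what makes high-resolution synthesis tractable.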
Equations in papers are hard to read, and in a previous letter I mentioned "named tensor" notation as one proposal to overcome this. There was also an interesting blog post that outlines how some of these formulas can be written as "illuminated equations." I am happy that we are seeing innovation and research that focuses on the communication side of research.
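If you want the code-side analogue of this idea, PyTorch ships a (prototype) named tensor API that replaces positional axis bookkeeping with names; a quick taste:

```python
import torch

# name dimensions instead of tracking axis positions by index
imgs = torch.randn(32, 3, 64, 64, names=('batch', 'channel', 'height', 'width'))

# reductions can refer to names rather than magic axis numbers
channel_mean = imgs.mean('channel')
print(channel_mean.names)   # ('batch', 'height', 'width')

# reorder dimensions by name, e.g. to a channels-last layout
nhwc = imgs.align_to('batch', 'height', 'width', 'channel')
```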
Papers
Learning the Predictability of the Future is a fun paper that learns video representations in hyperbolic space in order to predict what will happen next in a video.
The hyperbolic space is used to embed a hierarchy of actions, and thanks to that hierarchy the model can fall back to predicting a more abstract parent action when its confidence is low (the distance function that makes this work is sketched below).
There is also a seminar that explains this paper if you want to learn more.
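For context, hyperbolic embeddings typically live in the Poincaré ball, where the distance between two points $\mathbf{x}$ and $\mathbf{y}$ is

$$
d(\mathbf{x}, \mathbf{y}) = \operatorname{arcosh}\!\left(1 + 2\,\frac{\lVert \mathbf{x} - \mathbf{y} \rVert^{2}}{\bigl(1 - \lVert \mathbf{x} \rVert^{2}\bigr)\bigl(1 - \lVert \mathbf{y} \rVert^{2}\bigr)}\right).
$$

Distances blow up near the boundary of the ball, which makes the geometry tree-like: points near the origin behave like abstract parents and points near the boundary like concrete leaves, so an uncertain model can hedge by predicting closer to the origin.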
Notebooks
Transducer implementation in PyTorch shows a spellcheck example. It is well explained and very detailed. If you are interested in a write-up, you can check that out as well.
Distilling zero-shot classifier is a tutorial that shows how to train a "student" classifier from a larger "teacher" model. The code is also available here.
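The heart of any distillation setup is a loss that pulls the student's softened predictions toward the teacher's. A minimal sketch of that loss (not the tutorial's code; the temperature value is illustrative):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened distributions.
    Scaled by T^2 to keep gradient magnitudes comparable across temperatures."""
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction='batchmean') * T * T

# usage: the teacher stays frozen, the student trains on unlabeled inputs
# loss = distillation_loss(student(x), teacher(x).detach())
```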
Transformer example shows a simple transformer in PyTorch.
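And if you want the core computation of a transformer reduced to its essence, scaled dot-product attention fits in a few lines (a sketch, not the notebook's code):

```python
import math
import torch

def attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq, d_k). Returns attention-weighted values."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (batch, heads, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))  # block disallowed positions
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 8, 10, 64)   # batch=2, heads=8, seq=10, d_k=64
out = attention(q, k, v)                # (2, 8, 10, 64)
```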
Classes
Walk with fastai is a nice class that shows different capabilities of fastai. One of the videos I watched covers the callback system, which is a sweet capability that results in clean APIs.
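To give a flavor of why the callback system leads to clean APIs: a callback is just a class whose methods match named training events, and the Learner invokes them automatically. A minimal sketch, assuming fastai v2's Callback API (the callback itself is my own toy example):

```python
from fastai.callback.core import Callback

class PrintLossCallback(Callback):
    """Hooks into fastai's training loop by defining event-named methods."""
    def after_batch(self):
        # callbacks can read the Learner's state directly via attribute delegation
        if self.training:
            print(f"batch loss: {self.loss.item():.4f}")

# usage: learn = cnn_learner(dls, resnet18, cbs=PrintLossCallback())
```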
One of the subscribers sent me this tutorial, presented at KDD 2020, where LinkedIn demonstrates how they use deep learning for search and recommender systems.
MIT recently published a seminar series.
Surprises in the Quest for Robustness in ML is one of the interesting talks in the series; it covers various problems we see around robustness in ML and possible mitigations for them.
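As a concrete taste of the fragility such talks usually open with: the classic fast gradient sign method (FGSM) turns a correct prediction into a wrong one with a tiny perturbation, and it fits in a few lines (a sketch; the epsilon value is illustrative and inputs are assumed to be in [0, 1]):

```python
import torch

def fgsm_attack(model, x, y, epsilon=0.03):
    """Perturb input x in the gradient-sign direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # a tiny, often imperceptible step is enough to flip the prediction
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()
```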
Libraries
OpenAI open-sourced the DALL-E code. There is also a model card that talks about the intended use cases as well as the datasets it has been trained on.
Meltano allows you to configure data pipelines in an expressive manner.
Knowledge Repo is an open-source library for making datasets discoverable and meaningful.
Ivy is an interesting approach to building deep learning models across different frameworks such as JAX, NumPy, TensorFlow, and PyTorch. One of the submodules, which shows how memory modules can be implemented, is here.
Workshops
DEEM is a workshop co-located with SIGMOD covering applied machine learning, data management, and systems research.