Colin Raffel wrote a rather interesting article on role-playing paper-reading seminars. If you are familiar with role-playing games, this article adapts that format to a paper-reading setting. He defines various rather hilarious roles to improve the efficiency and productivity of paper-reading sessions.
Lilian Weng surveyed various approaches to removing toxic content learned by pre-trained language models. The post first outlines detection/classification of toxic content, then covers other techniques ranging from controllable text generation to something as basic as blacklisting tokens in the vocabulary. Unlearning is something I wrote about extensively in a previous newsletter.
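The vocabulary-blacklisting technique mentioned above is simple enough to sketch. This is a minimal, hypothetical illustration (the token ids, scores, and function name are mine, not from the post): before sampling the next token, push the scores of blacklisted token ids to negative infinity so they can never be emitted.

```python
import math

def block_blacklisted(scores, blacklist):
    """scores: dict of token_id -> logit.

    Returns a copy where every blacklisted token's logit is -inf,
    so softmax/sampling assigns it zero probability.
    """
    return {t: (-math.inf if t in blacklist else s) for t, s in scores.items()}

# Hypothetical next-token logits; token 102 is on the blacklist.
blocked = block_blacklisted({101: 0.3, 102: 1.7, 103: -0.2}, blacklist={102})
```

As the survey notes, this is a blunt instrument: it blocks surface forms, not meaning, which is why classification and controllable generation are needed on top.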
I strongly believe that unlearning will become more and more important: fine-tuning will improve a model for a given task, but we are currently nowhere near being able to guard these models in terms of toxicity, responsibility, and fairness.
Salesforce talks about their experience of building an AI platform. The post describes at a high level how they think about a platform for AI models and enablement for the rest of the company. Definitely worth a read if you are interested in the platformization/productionization of these models in an enterprise setting.
PyTorch released a profiler in its most recent release to help you understand where the main bottlenecks are and optimize various parts of your machine learning workflows. It also integrates with the popular VSCode IDE.
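A minimal sketch of using the new profiler (the tiny linear model here is just a stand-in workload): wrap the code you want to measure in `torch.profiler.profile` and print the aggregated per-operator table.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Stand-in workload: a small linear layer on random input.
model = torch.nn.Linear(128, 64)
x = torch.randn(32, 128)

# Profile CPU activity for the forward pass; add ProfilerActivity.CUDA
# (and a CUDA model/input) to profile GPU kernels as well.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    model(x)

# Aggregate results per operator, sorted by total CPU time.
report = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
print(report)
```

The same `profile` object can also export a Chrome trace for timeline inspection via `prof.export_chrome_trace(...)`, which is what the VSCode/TensorBoard integration builds on.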
Pinterest open-sourced their big data collaboration tool, Querybook, with a blog post. The tool aims to increase collaboration and allow people to share scripts/SQL in a standardized manner. I immediately thought of Jupyter notebooks when I looked at it. If you are sharing a number of SQL scripts across the board and it becomes hard to track what everyone is doing in a remote world, Querybook can consolidate all of these scripts in a single place.
Salesforce wrote about how they built a multi-tenant real-time prediction system in their AI platform. They leverage Kubernetes to build inference containers and ensure that different types of prediction tasks can be served in this multi-tenant environment.
Etsy wrote about how they built an ad-bidding process for their sellers. The post first gives an introduction to the ad-bidding process and then describes the neural-network-based architecture they built to predict conversion rate.
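To make the conversion-rate prediction idea concrete, here is a minimal PyTorch sketch. The actual Etsy architecture and features are not described here, so the input features, sizes, and layer choices below are hypothetical; the only essential part is the sigmoid output, which turns the network's score into a conversion probability in [0, 1].

```python
import torch

# Hypothetical conversion-rate model: 20 made-up ad/listing features in,
# one conversion probability out.
model = torch.nn.Sequential(
    torch.nn.Linear(20, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 1),
    torch.nn.Sigmoid(),  # squashes the score to a probability in [0, 1]
)

# A batch of 4 random "ad impressions".
p_convert = model(torch.randn(4, 20))
```

In an ad-bidding setting, a predicted probability like this would typically be multiplied by the value of a conversion to arrive at a bid.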
Papers
Understanding the Representation and Representativeness of Age in AI Data Sets examines age as a variable and how representative it is across different datasets. It finds that older people are under-represented in these datasets, and since society is getting older and older, the representativeness of these datasets will decrease.
OmniNet: Omnidirectional Representations from Transformers proposes a self-attention mechanism that spans the entire network. Since this approach is computationally intensive, the paper also proposes efficient ways to compute this self-attention.
Pretrained Transformers As Universal Computation Engines proposes using a frozen pretrained transformer for transfer learning, without fine-tuning, on a variety of tasks.
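The freezing trick itself is easy to sketch in PyTorch. This is not the paper's setup (they use a pretrained GPT-2; the small randomly initialized encoder and the layer sizes below are stand-ins), but it shows the mechanism: freeze every transformer parameter and train only a new input projection and output head for the downstream task.

```python
import torch

# Stand-in for a pretrained transformer (the paper uses GPT-2).
encoder = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=64, nhead=4),
    num_layers=2,
)

# Freeze all transformer parameters: no gradients, no fine-tuning.
for p in encoder.parameters():
    p.requires_grad = False

# Only these task-specific layers would be trained.
embed = torch.nn.Linear(16, 64)  # projects task inputs into the model dim
head = torch.nn.Linear(64, 2)    # e.g. a 2-class classification head

# Forward pass: (seq_len=10, batch=8, features=16) in, (batch, classes) out.
hidden = encoder(embed(torch.randn(10, 8, 16)))
out = head(hidden.mean(dim=0))  # mean-pool over the sequence dimension
```

An optimizer would then be given only `embed` and `head` parameters, e.g. `torch.optim.Adam(list(embed.parameters()) + list(head.parameters()))`.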
Libraries
PLMPapers is a collection of papers on pre-trained language models.
AutoKeras is an AutoML library for Keras.
Lightly is a self-supervised learning library built in PyTorch for images.
AutoGL is an AutoML library for graph learning.
DL-Translate is a natural language translation library built on top of HuggingFace.
Classes
CMU published a class on Deep Learning for NLP. Stanford has a similar class, and Fast.ai's class and AWS's class on neural networks are broadly similar as well.
Full Stack Deep Learning covers a variety of topics, from how to build deep learning models to how to productionize them.
Machine Learning Systems Design mostly focuses on the productionization and deployment of machine learning, with a strong emphasis on component and system design.
Conferences
MILA is organizing a conference on April 23rd.
MLSys starts on April 5th.