Articles
The article Do Language Models Know How Heavy an Elephant Is? investigates how much large language models know about the typical scales of attributes associated with nouns. It turns out they know quite a bit, but they still need task-specific training to handle various quantitative tasks. The paper Do Language Embeddings Capture Scales? introduces a new model called NumBERT, which outperforms other language models on quantitative tasks. Another interesting paper that is referenced, Inducing Distributions over Quantitative Attributes, tackles the problem of learning distributions over quantitative attributes; the dataset for that paper is also available.
Sebastian Ruder wrote about recent developments in language model fine-tuning. It is a thorough review of papers examining the various types of fine-tuning for large language models.
Multimodal Neurons from Distill shows what kinds of features very large multimodal networks such as CLIP learn for certain phrases and words. OpenAI published a similar article, and its Microscope tool supports further interactive visualization of individual blocks.
For a given input text, the neural network can learn different modalities and various facets, as the image above illustrates. In doing so, it can learn concepts with multiple “facets” in the same region. As research pushes toward bigger and bigger models, this kind of faceting across “languages”, “local areas”, and “modal types (image, video, text)” could be very useful.
Facebook AI published a new SOTA paper on image classification, object detection, and segmentation that uses large swaths of data in a self-supervised manner. The accompanying library for self-supervised learning is also open-sourced.
Papers
Keyword Spotting
I watched this excellent talk on Keyword Spotting:
There were a lot of excellent references in the talk and here are some of the papers that were referenced:
A Cascade Architecture for Keyword Spotting on Mobile Devices
Locally-Connected and Convolutional Neural Networks for Small Footprint Speaker Recognition
DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems builds new lessons on top of Wide & Deep Learning for Recommender Systems, which was published a while ago. If you are interested in new architectures and components for recommender systems, it is definitely a good read from Google.
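At the core of DCN is the cross layer, which models explicit feature interactions as x_{l+1} = x_0 ⊙ (W_l x_l + b_l) + x_l. A minimal pure-Python sketch of that formula (the function name and toy values are mine for illustration, not from the paper's code):

```python
def cross_layer(x0, xl, W, b):
    """One DCN-V2-style cross layer: x_{l+1} = x0 * (W @ xl + b) + xl (elementwise)."""
    # Affine transform of the current layer's features.
    Wx = [sum(W[i][j] * xl[j] for j in range(len(xl))) + b[i] for i in range(len(x0))]
    # Elementwise interaction with the original input, plus a residual connection.
    return [x0[i] * Wx[i] + xl[i] for i in range(len(x0))]

# Toy example: 2-d input, identity weights, zero bias.
x0 = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
out = cross_layer(x0, x0, W, b)  # x0 * x0 + x0 = [2.0, 6.0]
```

With identity weights the layer reduces to x0² + x0, which makes the explicit second-order feature crossing easy to see; stacking such layers yields higher-order crosses.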
Facebook published a paper, Addressing Some Limitations of Transformers with Feedback Memory, that tries to address a limitation of Transformers: unlike recurrent neural networks, they do not capture all possible sequential dependencies.
Learning by Turning: Neural Architecture Aware Optimization from Caltech talks about a new optimization method that incorporates the neural network architecture when updating the learning rate.
The code is available as well.
Courses
Mathematics, computer vision and machine learning is a masters-level lecture series covering a variety of machine learning topics with an application focus on computer vision. The course page also links to the presentation material.
Tübingen had a good graduate-level class on Statistical Machine Learning.
Machine Learning for Healthcare teaches various applications of machine learning in healthcare.
PlotNeuralNet allows you to draw nice pictures of various neural network architectures from Python descriptions, rendered via LaTeX/TikZ.
Videos
Dall-E and image generation, a conversation between Andrej Karpathy and Justin Johnson, is a good listen. They also recorded a follow-up episode. They originally recorded these on Clubhouse and later uploaded them to YouTube.
Libraries
PyTorch Flops Counter is an excellent library for examining the various performance characteristics of PyTorch models. It enables you to count the total number of operations in a model.
Torchdistill is a nifty PyTorch library for knowledge distillation tasks. There are two excellent notebooks showing how the library can be used: a first one and a second one.
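The soft-target loss at the heart of most distillation setups is the temperature-scaled KL divergence between teacher and student outputs. A pure-Python sketch of that term (a Hinton-style formulation for illustration, not torchdistill's own implementation):

```python
import math

def softmax(logits, T):
    """Temperature-softened softmax: higher T flattens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """Soft-target term: KL(teacher_T || student_T) * T^2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    # T^2 rescaling keeps gradient magnitudes comparable across temperatures.
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

loss = distillation_loss([5.0, 1.0, -1.0], [4.0, 2.0, 0.0])
```

In practice this term is combined with the ordinary cross-entropy on hard labels, weighted by a mixing coefficient.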
TIMM (PyTorch Image Models) has a list of image models with pre-trained weights.
Sentence Transformers provides a list of pre-trained PyTorch-based models for producing sentence embeddings.
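Sentence embeddings are usually compared with cosine similarity. A minimal sketch of that comparison, using small made-up vectors in place of real model output:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings" standing in for the model's encoded sentences.
emb_a = [0.2, 0.8, 0.1]
emb_b = [0.25, 0.75, 0.05]
sim = cosine_similarity(emb_a, emb_b)  # near 1.0 for semantically similar sentences
```

Real embeddings from these models are typically several hundred dimensions, but the similarity computation is the same.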