Articles
NamedTensor talks about adopting a different notation for neural network operations, using the infamous attention formula (from the "Attention Is All You Need" paper) as its running example. It argues that linear algebra notation falls short when it comes to neural networks and explains why named tensors are better for representing these operations.
The notation's source code is also available, and PyTorch has a prototype implementation as well.
I am also one of those people who think that the notation we have is overly complicated for communicating anything nontrivial in linear algebra. Along with einsum notation, I am glad that other people are thinking about changing these notations to make them more intuitive and easier to understand.
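To make the contrast concrete, here is a minimal sketch (shapes and axis names are my own choices, purely for illustration) of scaled dot-product attention written with einsum, plus a taste of PyTorch's named tensor prototype:

```python
import torch

# Toy shapes, not from the paper: batch=2, sequence length=5, d_k=8.
batch, seq, d_k = 2, 5, 8
Q = torch.randn(batch, seq, d_k)
K = torch.randn(batch, seq, d_k)
V = torch.randn(batch, seq, d_k)

# 'b'=batch, 'q'=query position, 'k'=key position, 'd'=feature dim.
# The subscript string names every axis, unlike bare matrix notation.
scores = torch.einsum('bqd,bkd->bqk', Q, K) / d_k ** 0.5
weights = scores.softmax(dim=-1)                    # normalize over key positions
output = torch.einsum('bqk,bkd->bqd', weights, V)

# PyTorch's named tensor prototype attaches names to the dimensions
# themselves, so reductions can refer to 'key' instead of a position.
scores_named = scores.refine_names('batch', 'query', 'key')
weights_named = scores_named.softmax(dim='key')
```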
Jax Articles
Jax is another deep learning library that is getting a lot of attention due to the flexibility and composability of its approach to building various deep learning libraries. This week, I have seen so many articles around Jax that I decided to create a separate section.
Parallel Networking with Jax talks about how to parallelize various operations in Jax, with examples built around `jax.jit`. Its notebook is also very good.
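For readers new to these transformations, here is a tiny self-contained sketch (the toy function is mine, not from the article) of what `jax.jit` and its sibling transforms look like:

```python
import jax
import jax.numpy as jnp

# Toy model: a single tanh layer, purely illustrative.
def predict(w, x):
    return jnp.tanh(x @ w)

w = jnp.ones((3, 3))
xs = jnp.ones((8, 3))

fast_predict = jax.jit(predict)        # JIT-compile the function with XLA
y = fast_predict(w, xs[0])

# vmap vectorizes over a batch axis without rewriting the function;
# jax.pmap has the same interface but shards work across devices.
ys = jax.vmap(predict, in_axes=(None, 0))(w, xs)
```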
Evolving networks in Jax is a long and very well-written in a tutorial manner which implements a paper. If you want to learn Jax while trying to understand the paper, it is excellent. If you want to learn Jax only, it is still very good.
Hyperparameter Meta-loss landscapes with Jax is the most approachable post on Jax among all of the articles in this week’s newsletter. It motivates why Jax exists and then talks about hyperparameter landscapes. Its notebook is also good.
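As a flavor of what a "meta-loss" means in practice, here is a toy sketch (entirely my own construction, not the post's code): the loss after a few inner SGD steps, differentiated with respect to the learning rate.

```python
import jax
import jax.numpy as jnp

# Inner objective: a simple quadratic with its minimum at w = 3.
def inner_loss(w):
    return jnp.sum((w - 3.0) ** 2)

# Meta-loss: run a short training loop, return the final loss.
# The hyperparameter (log learning rate) is the input, so Jax can
# differentiate straight through the unrolled optimization.
def meta_loss(log_lr):
    lr = jnp.exp(log_lr)
    w = jnp.zeros(())
    for _ in range(5):                       # five inner SGD steps
        w = w - lr * jax.grad(inner_loss)(w)
    return inner_loss(w)

meta_grad = jax.grad(meta_loss)(jnp.log(0.1))  # gradient w.r.t. the hyperparameter
```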
Autodata (automating common data operations) article explains the data landscape and why we need better tools for these common data operations. So-called data janitor work needs to be automated, and this article gives a good overview of the available tools/companies and outlines the gap between the ideal state and what is out there.
Self-organizing textures is another excellent article by Distill. It talks about Neural Cellular Automata (NCA) and how they can be used to generate various textures. It first shows how texture generation happens in nature through biology (the zebra example is excellent). Then it talks about how neural networks can be used to generate various textures.
When I read this article, I could not help but think about how different neural networks are (in terms of stochasticity) from any other signal-processing-based texture generation. If you follow any digital signal processing research, you know that most filters (high-pass, low-pass) produce some sort of texture, especially in the image processing domain. This yields nice, deterministic textures, and you can use these filters to extract various patterns. On the generation side, you can also use these filters to produce other textures. The difference is that, even when driven by noise, these textures generally follow well-defined patterns rather than looking "natural". NCAs produce very good textures that can be as good as their natural equivalents.
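To make the filter-based side of the comparison concrete, here is a minimal sketch (the kernel and sizes are arbitrary choices of mine): convolving noise with a fixed high-pass kernel gives a deterministic, repeatable texture.

```python
import numpy as np
from scipy.signal import convolve2d

# Fixed seed: same noise + same kernel -> the same texture every time.
rng = np.random.default_rng(0)
noise = rng.standard_normal((64, 64))

# A simple high-pass (Laplacian) kernel; it emphasizes fine detail.
high_pass = np.array([[ 0, -1,  0],
                      [-1,  4, -1],
                      [ 0, -1,  0]])
texture = convolve2d(noise, high_pass, mode='same')

# The result follows a well-defined, filter-shaped pattern rather than
# the organic look that NCA outputs can achieve.
```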
HuggingFace wrote a blog post on how Retrieval Augmented Generation (RAG) can be implemented with Ray. The original article explains what RAG is; HuggingFace's article focuses more on how to scale the approach in a distributed manner through the Ray library.
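For context, this is roughly what the plain, single-process RAG setup looks like through the `transformers` API (a minimal sketch following the library's documented usage, with a dummy index to keep it small; the blog post's contribution is moving the retriever behind Ray workers so it can be shared and scaled):

```python
from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
# use_dummy_dataset avoids downloading the full Wikipedia index.
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

inputs = tokenizer("who wrote attention is all you need", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```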
Representation Learning with Catalyst and Faces is an excellent article on representation learning for faces by the Catalyst team. It talks about various approaches to learning face representations and visualizes them through t-SNE and PCA. Some of the modules are available in Catalyst, and they are based on PyTorch. If you are interested in reproducing the article, they also published a good notebook.
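The visualization step itself is easy to reproduce generically; here is a minimal sketch (random placeholder embeddings stand in for real face features) using scikit-learn:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Placeholder for learned face embeddings: 500 samples, 128 dims.
embeddings = np.random.randn(500, 128)

# Project to 2D two ways; each row can then be scattered and colored
# by identity to inspect how well identities cluster.
pca_2d = PCA(n_components=2).fit_transform(embeddings)
tsne_2d = TSNE(n_components=2, init='pca', perplexity=30).fit_transform(embeddings)
```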
Books
MLStory is an interesting book about machine learning. It neither covers the fanciest, most modern deep learning algorithms, nor does it aim to. It talks about the basic intuition behind machine learning concepts. I read three chapters (Datasets, Deep Learning, and Representations/Features), and they were really good. I was surprised that in Representations/Features the authors covered wavelets and basis functions from signal processing, and that they talked about SIFT and HOG in the deep learning sections. Unlike most machine learning books, it tells you "how we got here" by talking about what came before. I also really liked the "story" line, where it builds intuition first and then covers the various concepts. Is it all great? Not really. The book is a little bit of everything: it tries to cover a number of different areas but does not cover most of them in depth. That is why I think it is a great, approachable book if you want to read about areas where you do not have much knowledge. However, it will not give you the expertise that you are looking for.
Datasets
PaperswithCode made 3000+ datasets publicly available. You can filter by "task", like object detection, recognition, or question answering. For NLP datasets, you can also filter by language.
Videos
PyTorch has deep dives on a variety of topics here.
Knowledge Graph Representation first talks about knowledge graphs and what they are. After that, it talks about how these concepts can be applied in machine learning to learn various embedding representations, which are applicable to similarity measurement and a number of other domains.
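As one concrete instance of such an embedding, here is a small sketch of TransE, a common baseline (the talk may cover different models; this only makes "embedding a triple" concrete). A fact (head, relation, tail) scores well when head + relation lands near tail in the embedding space.

```python
import torch

# Toy sizes: 100 entities, 10 relation types, 32-dim embeddings.
num_entities, num_relations, dim = 100, 10, 32
entity_emb = torch.nn.Embedding(num_entities, dim)
relation_emb = torch.nn.Embedding(num_relations, dim)

def score(head_ids, rel_ids, tail_ids):
    h = entity_emb(head_ids)
    r = relation_emb(rel_ids)
    t = entity_emb(tail_ids)
    return -(h + r - t).norm(p=2, dim=-1)  # higher = more plausible triple

# Score two hypothetical triples sharing the same relation.
plausibility = score(torch.tensor([0, 1]),
                     torch.tensor([2, 2]),
                     torch.tensor([5, 7]))

# Similarity measurement falls out of the same space: entities with
# nearby vectors behave alike under the learned relations.
```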
Trustworthy ML videos are a great watch. I especially liked Surprises in the Quest for Robust Machine Learning. It talks about why building robust machine learning systems is hard and covers various interesting findings from developing machine learning algorithms.
p.s: In the last newsletter, I misattributed the talk that Josh Tobin gave. He presented in the seminar series organized by Stanford; to correct the record, he talked about EvaluationStore in the latest episode of the Stanford MLSys seminar series.