3D Scene Understanding

And building audio filters

Mar 28, 2021

Articles

Google published their 3D scene understanding models in TensorFlow and wrote a blog post explaining the work. They released the library as a separate package here. The work extends sparse neural networks with 3D kernels for pooling and sparse convolutions.
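To make the sparse convolution idea concrete, here is a minimal pure-Python sketch. It is not the API of Google's library; the function name `sparse_conv3d` and the dictionary-based voxel layout are assumptions made for illustration. The point it shows is that work is done only at occupied voxels, so empty space costs nothing.

```python
from itertools import product


def sparse_conv3d(voxels, kernel):
    """Sparse 3D convolution that produces outputs only at occupied voxels.

    voxels: dict mapping (x, y, z) -> float feature, occupied cells only
    kernel: dict mapping (dx, dy, dz) offset -> float weight
    Uses the cross-correlation convention (no kernel flip).
    """
    out = {}
    for (x, y, z) in voxels:
        total = 0.0
        for (dx, dy, dz), weight in kernel.items():
            neighbor = (x + dx, y + dy, z + dz)
            # Empty cells contribute nothing, so they are skipped entirely.
            if neighbor in voxels:
                total += weight * voxels[neighbor]
        out[(x, y, z)] = total
    return out


# A 3x3x3 box kernel with every weight set to 1.0 (illustrative values).
kernel = {offset: 1.0 for offset in product((-1, 0, 1), repeat=3)}

# Three occupied voxels in an otherwise empty grid (made-up data).
voxels = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (5, 5, 5): 4.0}
result = sparse_conv3d(voxels, kernel)
```

The two adjacent voxels see each other, while the isolated voxel at (5, 5, 5) only sees itself. A dense convolution over the same bounding box would visit every cell, most of them empty.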

Google built a learnable filter mechanism for audio called LEAF (Learnable Audio Frontend), which replaces the traditionally fixed mel filter banks with learned filters in order to improve various audio tasks. They also open-sourced the library here.
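For context, the fixed baseline can be sketched in a few lines. The snippet below computes filter center frequencies that are equally spaced on the mel scale, using the common 2595 * log10(1 + f / 700) conversion. The function names and the choice of 8 filters up to 8000 Hz are assumptions for illustration and are not taken from the LEAF code. In a fixed frontend these values never change; a learnable frontend would treat such quantities as trainable parameters instead.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies (Hz) equally spaced on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


# Illustrative settings: 8 filters covering 0 Hz to 8000 Hz.
centers = mel_center_frequencies(n_filters=8, f_min=0.0, f_max=8000.0)
for i, f in enumerate(centers):
    print(f"filter {i}: {f:.1f} Hz")
```

The gaps between neighboring centers grow with frequency, which is the hand-designed prior that a learned frontend is free to keep or adjust during training.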

OpenML compared various formats for dataset standardization in a blog post. It is a good read if you want to learn about the new projects, their use cases, and which requirements a good dataset format should satisfy.

HuggingFace wrote about how they are fine-tuning wav2vec2 to do automatic speech recognition.
Joelle Pineau has a nice machine learning reproducibility checklist.

Papers

Prefix-Tuning: Optimizing Continuous Prompts for Generation talks about a different type of fine-tuning that is lightweight and does not require updating or copying all of the model parameters.
Fine-Grained Stochastic Architecture Search proposes a new way to search for and select neural network architectures, specifically for mobile applications.
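The prefix-tuning idea can be sketched with a single attention step. This is a simplified illustration rather than the paper's implementation: the vectors are made-up numbers, the names `prefix_keys` and `prefix_values` are hypothetical, and a real model would apply this in every layer and head. The pretrained keys and values stay frozen, and only the small prefix would be updated by an optimizer.

```python
import math


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


# Frozen keys and values, standing in for what a pretrained model computes.
frozen_keys = [[1.0, 0.0], [0.0, 1.0]]
frozen_values = [[1.0, 0.0], [0.0, 1.0]]

# Trainable prefix: the only numbers fine-tuning would change.
prefix_keys = [[0.5, 0.5]]
prefix_values = [[2.0, 2.0]]

query = [1.0, 0.0]
out_plain, w_plain = attention(query, frozen_keys, frozen_values)
out_prefixed, w_prefixed = attention(
    query, prefix_keys + frozen_keys, prefix_values + frozen_values
)
print("without prefix:", out_plain)
print("with prefix:   ", out_prefixed)
```

Because the prefix takes part in the softmax, it shifts the attention output without touching any pretrained weight, so one copy of the base model can serve many tasks with a small prefix stored per task.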

COIN: Compression with Implicit Neural Representations talks about a new way to do image compression through the weights of neural networks. Instead of storing pixel values, it fits a small neural network to the image and stores the network's weights as the compressed representation. It does not perform better than SOTA methods, but it is an interesting approach to compression through neural representations.
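A quick back-of-the-envelope calculation shows why storing weights can be smaller than storing pixels. The sketch below assumes the representation is a small fully connected network mapping pixel coordinates (x, y) to RGB; the layer sizes, the 16 bits per weight, and the 768x512 image size are illustrative assumptions, not figures taken from the paper.

```python
def mlp_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def bits_per_pixel(layer_sizes, bits_per_weight, height, width):
    """Storage cost of the network, expressed per image pixel."""
    total_bits = mlp_param_count(layer_sizes) * bits_per_weight
    return total_bits / (height * width)


# Hypothetical network: (x, y) in, three hidden layers of 32 units, RGB out.
layers = [2, 32, 32, 32, 3]
n_params = mlp_param_count(layers)
bpp = bits_per_pixel(layers, bits_per_weight=16, height=512, width=768)

RAW_BPP = 24  # uncompressed 8-bit RGB
print(f"parameters: {n_params}")
print(f"bits per pixel: {bpp:.4f} (raw image: {RAW_BPP})")
```

The calculation covers only the size side of the trade-off. Reconstruction quality depends on how well such a small network can fit the image, which is where the comparison with established codecs is decided.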

Libraries

Chatty-goose is a framework for conversational chat-bots.
pyserini is a Python interface for Anserini, an open-source information retrieval toolkit in Java built on Lucene.
torchmetrics is a new library for PyTorch metrics. It integrates with PyTorch Distributed nicely.

Classes

Michigan published Deep Learning for Computer Vision, an introductory class on deep learning applications in computer vision.

Videos

Deep Multitask and Meta Learning is a nice course on multitask learning and meta learning.

Berkeley published its Deep Learning class.

Foundations of Algorithmic Fairness is a good workshop on fairness in algorithms and machine learning.
