ChatGPT acts as a virtual machine, debugs code, and even writes a sample issue of this newsletter!
DeepMind built a tool to write scenarios for movies and TV series!
If you have been spending a lot of time with ChatGPT, you may want to take a look at the awesome prompts: https://github.com/f/awesome-chatgpt-prompts
You can open a full-fledged terminal, write Python or JS, and then debug your code further. Of course, ChatGPT does not really do any of this; it only imagines what the execution would look like when it “runs” the code.
Before going further, note that OpenAI also made a new announcement about embeddings:
Articles
Dramatron is a system that uses large language models to help authors co-write theatre scripts and screenplays. Dramatron uses hierarchical story generation for consistency across the generated text. Starting from a log line, Dramatron interactively generates character descriptions, plot points, location descriptions and dialogue. These generations provide human authors with material for compilation, editing, and rewriting.
Dramatron is conceived as a writing tool and as a source of inspiration and exploration for writers. To evaluate Dramatron’s usability and capabilities, the researchers engaged 15 playwrights and screenwriters in two-hour user study sessions to co-write scripts alongside Dramatron.
One concrete illustration of how Dramatron can be utilised by creative communities: one playwright staged four heavily edited and rewritten scripts co-written alongside Dramatron. In the public theatre show, Plays by Bots, a talented cast of experienced actors with improvisational skills gave meaning to Dramatron scripts through acting and interpretation.
Contrastive learning describes a set of techniques for training deep networks by comparing and contrasting the models' representations of data. The central idea in contrastive learning is to take the representation of a point, and pull it closer to the representations of some points (called positives) while pushing it apart from the representations of other points (called negatives). The goal is to ensure that the representations themselves are meaningful, instead of just optimizing the models to generate a particular prediction. Stanford researchers wrote three posts that explain in detail what CL is and how it is being used:
https://hazyresearch.stanford.edu/blog/2022-04-19-contrastive-1
https://hazyresearch.stanford.edu/blog/2022-04-19-contrastive-2
https://hazyresearch.stanford.edu/blog/2022-04-19-contrastive-3
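The pull-closer/push-apart idea can be sketched in a few lines of plain Python with an InfoNCE-style loss. The vectors and names below are illustrative toys, not taken from the posts:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Cross-entropy that treats the positive as the correct 'class':
    minimizing it pulls the positive's similarity up and pushes the
    negatives' similarities down."""
    sims = [dot(anchor, positive)] + [dot(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)  # stabilize the log-sum-exp
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]  # positive sits at index 0

# A positive that matches the anchor gives a much lower loss
# than a mismatched one.
anchor = [1.0, 0.0]
loss_good = info_nce_loss(anchor, [0.9, 0.1], [[-1.0, 0.0], [0.0, 1.0]])
loss_bad = info_nce_loss(anchor, [-0.9, 0.1], [[1.0, 0.0], [0.0, 1.0]])
```

In real systems the vectors come from an encoder network and positives are built by augmenting or pairing data, but the loss itself is exactly this shape.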
Vespa wrote about how they are using pre-trained bidirectional encoders for search functionality. They found that the bi-encoder and cross-encoder using the product title performed better than all the zero-shot models from the previous post. However, the bi-encoder trained on the description performs worse than lexical baselines. This result demonstrates that not all text vectorization models perform better than lexical ranking models, even when fine-tuned on the domain. There is a good notebook where you can take a look at the approach in Python.
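For readers unfamiliar with the terminology, the bi-encoder vs. cross-encoder distinction can be sketched as follows. The "encoders" here are toy bag-of-words stand-ins, not Vespa's trained models; only the structural difference matters:

```python
def embed(text):
    # Toy stand-in for a trained encoder: a tiny bag-of-words vector
    # over a fixed, made-up vocabulary.
    vocab = ["running", "shoes", "laptop", "bag"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def bi_encoder_score(query, doc):
    # Bi-encoder: query and document are embedded INDEPENDENTLY, so
    # document vectors can be precomputed and indexed ahead of time.
    q, d = embed(query), embed(doc)
    return sum(a * b for a, b in zip(q, d))

def cross_encoder_score(query, doc):
    # Cross-encoder: the model sees both texts at once (crudely faked
    # here by encoding the concatenation). Usually more accurate, but
    # it must run per query-document pair at search time.
    joint = embed(query + " " + doc)
    return sum(x * x for x in joint)

relevant = bi_encoder_score("running shoes", "running shoes for men")
irrelevant = bi_encoder_score("running shoes", "laptop bag")
```

This trade-off is why bi-encoders are typically used for retrieval over large collections and cross-encoders for re-ranking a short candidate list.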
Brown researchers compared the human visual recognition system and Deep Neural Networks (DNNs) on which features of images each considers important. They found a lot of overlap, but they also found some distinctly different parts of the images, and they introduce a mechanism called “Harmonization” to bring learnings from the human visual recognition system into DNN training (through “aligning gradients”). Alignment is measured as the mean Spearman correlation between human and DNN feature-importance maps, normalized by the average inter-rater alignment of humans. The interesting idea is the introduction of the human visual recognition system as a much more explicit inductive bias within the training of DNNs.
Models are available in the model zoo, results are compared on the project page, the code is available on GitHub, and the paper is available as well.
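The core of that alignment metric is a Spearman rank correlation between flattened feature-importance maps. A minimal plain-Python sketch (the maps below are made-up values, not from the paper):

```python
def rank(values):
    """Ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(a, b):
    """Pearson correlation computed on the ranks."""
    ra, rb = rank(a), rank(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

# Flattened feature-importance "maps" (illustrative values only)
human = [0.9, 0.1, 0.5, 0.2]
dnn_aligned = [0.8, 0.2, 0.6, 0.1]
dnn_misaligned = [0.1, 0.9, 0.2, 0.5]
```

A harmonized model, in this framing, is one whose importance maps push that correlation (normalized by human inter-rater agreement) higher.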
Libraries
EasyRec implements state-of-the-art deep learning models used in common recommendation tasks: candidate generation (matching), scoring (ranking), and multi-task learning. It improves the efficiency of generating high-performance models through simple configuration and hyperparameter tuning (HPO).
Collecting, labeling, and cleaning data for computer vision is a pain. Jump into the future and create your own data instead! Synthetic data is faster to develop with, effectively infinite, and gives you full control to prevent bias and privacy issues from creeping in. zpy makes synthetic data easy, by simplifying the simulation (sim) creation process and providing an easy way to generate synthetic data at scale.
Dragon is a new foundation model (improvement of BERT) that is pre-trained jointly from text and knowledge graphs for improved language, knowledge and reasoning capabilities. Specifically, it was trained with two simultaneous self-supervised objectives, language modeling and link prediction, that encourage deep bidirectional reasoning over text and knowledge graphs.
Dragon can be used as a drop-in replacement for BERT. It achieves better performance in various NLP tasks, and is particularly effective for knowledge and reasoning-intensive tasks such as multi-step reasoning and low-resource QA.
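For intuition about the link-prediction half of that objective: the model scores (head, relation, tail) triples so that edges actually present in the knowledge graph score higher than corrupted ones. One common bilinear scoring function, DistMult, looks like this; this is a generic sketch with invented embeddings, not Dragon's exact implementation:

```python
def distmult_score(head, relation, tail):
    # score(h, r, t) = sum_i h_i * r_i * t_i; training pushes true
    # triples above corrupted ones (e.g. with a random tail swapped in)
    return sum(h * r * t for h, r, t in zip(head, relation, tail))

# Toy 3-dimensional embeddings (illustrative values)
paris = [0.9, 0.1, 0.0]
capital_of = [1.0, 1.0, 0.0]
france = [0.8, 0.2, 0.0]
germany = [-0.7, 0.3, 0.1]

true_score = distmult_score(paris, capital_of, france)    # real edge
false_score = distmult_score(paris, capital_of, germany)  # corrupted tail
```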
SPyQL is a query language that combines:
the simplicity and structure of SQL;
with the power and readability of Python.
SELECT date.fromtimestamp(.purchase_ts) AS purchase_date, .price * .quantity AS total FROM json WHERE .department.upper() == 'IT' ORDER BY 2 DESC TO csv
SQL provides the structure of the query, while Python is used to define expressions, bringing along a vast ecosystem of packages.
SPyQL is fast and memory efficient. Take a look at the benchmarks with GB-size JSON data.
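To make the semantics of the example query above concrete, here is roughly what it computes, rewritten as plain Python over JSON records. The sample rows are invented, and this sketch ignores SPyQL's streaming/CSV-output machinery:

```python
from datetime import date

# Toy records resembling the JSON lines the query would read
rows = [
    {"purchase_ts": 1671062400, "price": 10.0, "quantity": 3, "department": "it"},
    {"purchase_ts": 1671148800, "price": 5.0, "quantity": 2, "department": "HR"},
    {"purchase_ts": 1671235200, "price": 2.0, "quantity": 50, "department": "IT"},
]

# WHERE .department.upper() == 'IT'  (a plain Python expression)
selected = [r for r in rows if r["department"].upper() == "IT"]

# SELECT date.fromtimestamp(.purchase_ts) AS purchase_date,
#        .price * .quantity AS total
result = [
    {"purchase_date": date.fromtimestamp(r["purchase_ts"]).isoformat(),
     "total": r["price"] * r["quantity"]}
    for r in selected
]

# ORDER BY 2 DESC  (the second selected column, i.e. total)
result.sort(key=lambda r: r["total"], reverse=True)
```

The point of SPyQL is that you get this behavior from the one-line query, with any Python expression allowed where SQL would only accept its own functions.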
Elegy is a high level API for Deep Learning in JAX
Main Features
😀 Easy-to-use: Elegy provides a Keras-like high-level API that makes it very easy to use for most common tasks.
💪 Flexible: Elegy provides a Pytorch Lightning-like low-level API that offers maximum flexibility when needed.
🔌 Compatible: Elegy supports various frameworks and data sources including Flax & Haiku Modules, Optax Optimizers, TensorFlow Datasets, Pytorch DataLoaders, and more.
Neural Tangents is a high-level neural network API for specifying complex, hierarchical neural networks of both finite and infinite width. Neural Tangents allows researchers to define, train, and evaluate infinite networks as easily as finite ones.
Infinite (in width or channel count) neural networks are Gaussian Processes (GPs) with a kernel function determined by their architecture. See References for details and nuances of this correspondence. Also see this listing of papers written by the creators of Neural Tangents which study the infinite width limit of neural networks.
Neural Tangents allows you to construct a neural network model from common building blocks like convolutions, pooling, residual connections, nonlinearities, and more, and obtain not only the finite model, but also the kernel function of the respective GP.
The library is written in Python using JAX and leveraging XLA to run out-of-the-box on CPU, GPU, or TPU. Kernel computation is highly optimized for speed and memory efficiency, and can be automatically distributed over multiple accelerators with near-perfect scaling.
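To see what "an infinite network is a GP with a kernel determined by its architecture" means, the simplest case has a closed form: an infinitely wide one-hidden-layer ReLU network yields the order-1 arc-cosine kernel. This is a plain-Python illustration of that correspondence, not Neural Tangents' API:

```python
import math

def relu_nngp_kernel(x, y):
    """Covariance of the GP corresponding to an infinitely wide
    one-hidden-layer ReLU network (order-1 arc-cosine kernel):
    k(x, y) = ||x|| ||y|| (sin t + (pi - t) cos t) / (2 pi),
    where t is the angle between x and y."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    cos_t = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding
    t = math.acos(cos_t)
    return (nx * ny / (2 * math.pi)) * (math.sin(t) + (math.pi - t) * cos_t)

x, y = [1.0, 0.0], [0.0, 1.0]
k_same = relu_nngp_kernel(x, x)   # t = 0  ->  ||x||^2 / 2 = 0.5
k_orth = relu_nngp_kernel(x, y)   # t = pi/2  ->  1 / (2 pi)
```

Neural Tangents generalizes this: you compose layers as in a normal network and it derives the corresponding kernel for you, including the deep and convolutional cases with no closed form you would want to write by hand.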
PyTorch Classification Uncertainty provides an easy-to-run demo using PyTorch with low computational requirements for the ideas proposed in the paper Evidential Deep Learning to Quantify Classification Uncertainty. The authors of the paper originally used TensorFlow in their implementation.
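The paper's central quantity is simple to compute once the network has produced non-negative per-class evidence e_k: the evidence becomes Dirichlet parameters alpha_k = e_k + 1, and total uncertainty is u = K / sum(alpha). A minimal sketch with illustrative evidence values (not the repo's code):

```python
def dirichlet_uncertainty(evidence):
    """Evidential classification head, per the paper:
    alpha_k = e_k + 1, p_k = alpha_k / S, u = K / S with S = sum(alpha)."""
    K = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    S = sum(alpha)
    probs = [a / S for a in alpha]
    uncertainty = K / S
    return probs, uncertainty

# Strong evidence for class 0 -> confident prediction, low uncertainty
probs, u_low = dirichlet_uncertainty([20.0, 0.0, 0.0])

# No evidence at all -> uniform prediction, maximal uncertainty (u = 1)
_, u_high = dirichlet_uncertainty([0.0, 0.0, 0.0])
```

This is what separates the approach from a plain softmax: an input the model knows nothing about yields high u instead of an arbitrarily confident probability vector.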
Click-through rate (CTR) prediction is a critical task for many industrial applications such as online advertising, recommender systems, and sponsored search. FuxiCTR provides an open-source library for CTR prediction, with key features in configurability, tunability, and reproducibility. We hope this project could benefit both researchers and practitioners with the goal of open benchmarking for CTR prediction.
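As a refresher on the task itself: CTR prediction is binary classification (clicked or not) trained with log loss, and logistic regression is the classic baseline that the deep models FuxiCTR benchmarks (Wide&Deep, DeepFM, and friends) build on. A tiny plain-Python sketch on toy data, unrelated to FuxiCTR's actual API:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_ctr(samples, epochs=200, lr=0.1):
    """Logistic regression via SGD on (features, clicked) pairs."""
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, clicked in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - clicked  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

# Toy features: [is_sports_ad, is_night]; this user clicks sports ads
data = [([1, 0], 1), ([1, 1], 1), ([0, 0], 0), ([0, 1], 0)]
w, b = train_ctr(data)

p_click = sigmoid(sum(wi * xi for wi, xi in zip(w, [1, 0])) + b)
p_skip = sigmoid(sum(wi * xi for wi, xi in zip(w, [0, 1])) + b)
```

The deep CTR models replace the linear term with learned feature interactions (factorization machines, cross networks, attention), but the loss and the predicted-probability output are the same.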
Jina is a MLOps framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer.
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models. This repository holds the competition rules and the benchmark code to run it.
ChatGPT wrote the following MLOps newsletter issue:
Sure, here's a sample newsletter about machine learning operations (MLOps) with respect to recent developments in the field:
In this issue:
Introduction to MLOps
Recent developments in machine learning
The role of MLOps in the machine learning process
Introduction to MLOps
Machine learning operations, or MLOps, is a practice that focuses on integrating machine learning into the software development lifecycle. It involves the use of tools, processes, and practices to manage the end-to-end machine learning process, from experimentation and training to deployment and monitoring.
Recent Developments in Machine Learning
The field of machine learning has seen many exciting developments in recent years, including the use of deep learning algorithms, the emergence of reinforcement learning, and the growth of transfer learning. These developments have enabled machines to perform complex tasks with impressive accuracy and efficiency.
The Role of MLOps in the Machine Learning Process
As the field of machine learning continues to evolve, the role of MLOps is becoming increasingly important. MLOps helps to ensure that machine learning models are developed, deployed, and maintained in a reliable, efficient, and scalable manner.
Some key ways in which MLOps supports the machine learning process include:
Automating the training, testing, and deployment of machine learning models.
Providing tools and infrastructure for managing the machine learning lifecycle.
Ensuring that machine learning models are well-documented, reproducible, and auditable.
Monitoring machine learning models in production and providing alerts when performance deviates from expected behavior.
By adopting MLOps practices, organizations can optimize the use of machine learning and unlock the full potential of this powerful technology.