PyTorch is now officially 2.0
US Copyright Office launches New Artificial Intelligence Initiative, GPT => General Purpose Technology?
Articles
The US Copyright Office launches a New Artificial Intelligence Initiative. This initiative directly tries to address copyright claims in generative AI technologies and to build a point of view on how to handle some of the copyright challenges and problems with regard to use in creative work.
Public listening sessions are scheduled as follows:
Literary Works on Wednesday, April 19, from 1:00 p.m. to 4:00 p.m. eastern time
Visual Works on Tuesday, May 2, from 1:00 p.m. to 4:00 p.m. eastern time
Audiovisual Works on Wednesday, May 17, from 1:00 p.m. to 4:00 p.m. eastern time
Music and Sound Recordings on Wednesday, May 31, from 1:00 p.m. to 4:00 p.m. eastern time
OpenAI published an interesting paper that examines how ChatGPT can impact the US labor force. Their findings are rather interesting; I recommend checking the paper out. They are also positing GPT as an interesting acronym:
Our analysis indicates that the impacts of LLMs like GPT-4 are likely to be pervasive. While LLMs have consistently improved in capabilities over time, their growing economic effect is expected to persist and increase even if we halt the development of new capabilities today. We also find that the potential impact of LLMs expands significantly when we take into account the development of complementary technologies. Collectively, these characteristics imply that Generative Pre-trained Transformers (GPTs) are general-purpose technologies (GPTs) (Bresnahan and Trajtenberg, 1995; Lipsey et al., 2005).
If this trend continues, instead of expanding the acronym as Generative Pre-trained Transformer, we might be calling these models General Purpose Technologies in the future.
PyTorch announced 2.0 in a blog post. PyTorch 2.0 comes with `torch.compile`, which delivers significant performance improvements out of the box. These gains come from the following components:
torch.compile is the main API for PyTorch 2.0; it wraps your PyTorch model and returns a compiled model. It is a fully additive (and optional) feature, and hence 2.0 is 100% backward compatible by definition.
As an underpinning technology of torch.compile, TorchInductor on NVIDIA and AMD GPUs relies on the OpenAI Triton deep learning compiler to generate performant code and hide low-level hardware details. OpenAI Triton-generated kernels achieve performance that’s on par with hand-written kernels and specialized CUDA libraries such as cuBLAS.
TorchDynamo captures PyTorch programs safely using Python Frame Evaluation Hooks, a significant innovation resulting from five years of R&D into safe graph capture.
AOTAutograd overloads PyTorch’s autograd engine as a tracing autodiff for generating ahead-of-time backward traces.
PrimTorch canonicalizes ~2000+ PyTorch operators down to a closed set of ~250 primitive operators that developers can target to build a complete PyTorch backend. This substantially lowers the barrier of writing a PyTorch feature or backend.
TorchInductor is a deep learning compiler that generates fast code for multiple accelerators and backends. For NVIDIA and AMD GPUs, it uses OpenAI Triton as a key building block. For Intel CPUs, it generates C++ code using multithreading, vectorized instructions, and offloading appropriate operations to mkldnn when possible.
Stanford wrote a post on new developments in foundation models (especially focusing on LLMs). They call out four main trends:
Trend 1: Deployment on the Rise
After the launch of ChatGPT, many companies such as Duolingo, Stripe, and Morgan Stanley have been adopting LLM-like mechanisms in their products. With platforms like HuggingFace/Gradio, it is very easy to test out a model's endpoints and integrate them into a product.
The way these models are developed also makes it very easy to build in isolation and then integrate with other product surfaces.
Trend 2: Worsening Transparency
Open licenses and other mechanisms that make models available for anyone to use are being given less and less importance, and most model launches are no longer open.
Trend 3: Massive Influx of Funding
A lot of startups are raising a lot of funding, no surprise there.
Trend 4: Demand for Policy
As these models become more and more important, there is a growing appetite among policymakers to regulate these technologies.
Stanford HAI focuses more on both the transparency and policy areas. It is natural for them to highlight these concerns, to prevent the AI spring from turning into another AI winter.
Libraries
audio-diffusion-pytorch is a fully featured audio diffusion library for PyTorch. It includes models for unconditional audio generation, text-conditional audio generation, diffusion autoencoding, upsampling, and vocoding. The provided models are waveform-based; however, the U-Net (built using `a-unet`), `DiffusionModel`, diffusion method, and diffusion samplers are generic to any dimension and highly customizable to work on other formats.

Code Alpaca is a project that aims to build and share an instruction-following LLaMA model for code generation. The repo is fully based on Stanford Alpaca and only changes the data used for training; the training approach is the same.
Namex is a simple utility to separate the implementation of your Python package and its public API.
Instead of letting users access every symbol in your `.py` files, Namex lets you create an allowlist of public symbols. You have full control over what they are named and under what path they are exposed, without having to change where the code is actually located.

BEAR is a new BEnchmark framework for video Action Recognition.
BEAR is a collection of 18 video datasets grouped into 5 categories (anomaly, gesture, daily, sports, and instructional), which covers a diverse set of real-world applications. With BEAR, one can thoroughly evaluate 6 common spatiotemporal models pre-trained by both supervised and self-supervised learning. BEAR can serve as a fair and challenging evaluation benchmark to gain insights on building next-generation spatiotemporal learners.
MidJourney-Styles-and-Keywords-Reference is a reference containing styles and keywords that you can use with MidJourney. There are also pages showing resolution comparisons and image weights to help understand various aspects of the model.
RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly compute the hidden state for the "RNN" mode.
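To see why a single hidden state suffices, here is a toy linear recurrence in the same spirit (an illustrative sketch with an assumed decay factor `w`, not RWKV's exact time-mixing formula):

```python
import numpy as np

def rnn_mode_step(state, k_t, v_t, w=0.9):
    """One attention-free step: only the state at position t is needed
    to compute the output and the state at position t+1."""
    num, den = state
    num = w * num + np.exp(k_t) * v_t  # decayed, key-weighted sum of values
    den = w * den + np.exp(k_t)        # decayed normalizer
    return num / den, (num, den)       # output and updated O(1)-size state

d = 4
state = (np.zeros(d), np.zeros(d))
for t in range(10):                    # process a sequence token by token
    k_t, v_t = np.random.randn(d), np.random.randn(d)
    out, state = rnn_mode_step(state, k_t, v_t)
print(out.shape)  # (4,)
```

Unlike self-attention, the cost per token here is constant: no keys or values from earlier positions are revisited, which is the property that lets RWKV run as an RNN at inference time while still training in parallel like a GPT.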
nanoPALM, inspired by nanoGPT, is the simplest, fastest repository for training/finetuning small to medium-sized PaLM models.
It is trained on OpenWebText, using ~213M params and running on a single NVIDIA 3090 GPU for 100,000 iterations (~26 hours).