Natural Language Assessment opens up a new era of evaluation for exams and interviews

Google upgrades Open Images to v7, Farama Foundation will support Gymnasium

Nov 07, 2022

The Farama Foundation announced its incorporation in a blog post. The foundation’s main aims are to:
- standardize the major existing open source reinforcement learning (RL) libraries
- maintain and house them for the long term
- ensure their reproducibility, performance, and feature quality

The foundation is intended to work much like the Linux Foundation or the Apache Foundation, but for reinforcement learning libraries. Its core projects are PettingZoo and Gymnasium, which cover the environment layer of reinforcement learning. Beyond those two, it has already begun officially maintaining several other popular benchmark environments: MAgent2, D4RL, Minigrid (formerly gym-minigrid), Miniworld (formerly gym-miniworld), MiniWoB++ and MicroRTS/MicroRTS-Py (formerly gym-microrts). You can view these and all of the foundation’s other projects on its projects page.

Google described in a blog post how it performs Natural Language Assessment (NLA), in which a language model assesses the answer given to a question.

This is very exciting because it could automate the evaluation of many questions in education with accuracy comparable to that of grading multiple-choice questions.

Historically, multiple-choice questions have been very popular because they are very easy to evaluate. However, they lack nuance and are hard to evaluate at a granular level, since the outcome of each question is binary (either right or wrong). With NLA, a student’s answer can be analyzed along multiple dimensions and axes.

Interviews are another good use case for this solution. An interviewer evaluates an interviewee through a finite set of questions and checks each answer against a number of expectations. In the example diagram from the blog post, the question expects a clear definition of what Hogwarts is and what type of student profile it attracts.

The student captures part of the answer by saying that it is a school, but fails to say what type of school it is. The answer is also not confident, so it does not signal in-depth knowledge of the topic.
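The idea of checking one answer against several expectations, rather than producing a single right/wrong result, can be sketched in a few lines of Python. This is only an illustration: the `Expectation` class, the `assess` function, and the sample answer are hypothetical, and a simple keyword match stands in for the language model that NLA actually relies on.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Expectation:
    """One thing the question expects the answer to cover (hypothetical)."""
    name: str
    keywords: List[str]  # keyword match is a crude stand-in for a language model


def assess(answer: str, expectations: List[Expectation]) -> Dict[str, bool]:
    """Return one result per expectation instead of a single pass/fail."""
    text = answer.lower()
    return {e.name: any(k in text for k in e.keywords) for e in expectations}


# Rubric loosely modeled on the Hogwarts example described above.
rubric = [
    Expectation("says it is a school", ["school"]),
    Expectation("says what type of school", ["witchcraft", "wizardry", "magic"]),
]

# Made-up student answer, used only for illustration.
result = assess("I think it is some kind of school?", rubric)
print(result)
# {'says it is a school': True, 'says what type of school': False}
```

The per-expectation dictionary is the point of the sketch: the student gets credit for the part that was covered and a pointer to the part that was missed.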

Google also released a tool called “Interview Warmup” on its website to demonstrate a similar application that analyzes answers to interview questions.

Google released v7 of Open Images, its popular image dataset, in a blog post.

Open Images V7 expands the Open Images dataset even further with a new annotation type called point-level labels, and includes a new all-in-one visualization tool that allows better exploration of the rich data available.

The paper goes into much more detail about the tool, the changes in the new version, and how it improves on previous versions in terms of classes and class augmentation.

It has a nice explore section, a web-based UI tool that lets you look at the bounding boxes more closely and interact with various classes such as “human arm”. If you have use cases for image labeling, or labeling in general for your ML tasks, you can get a good amount of inspiration. You can download the images here.

Libraries

Ibis-datasette integrates the Ibis and Datasette libraries.
Gymnasium is a standard API for reinforcement learning and a diverse set of reference environments.
- Classic Control - These are classic reinforcement learning environments based on real-world problems and physics.
- Box2D - These environments all involve toy games based around physics control, using box2d-based physics and PyGame-based rendering.
- Toy Text - These environments are designed to be extremely simple, with small discrete state and action spaces, and hence easy to learn. As a result, they are suitable for debugging implementations of reinforcement learning algorithms.
- MuJoCo - Physics-engine-based environments with multi-joint control, which are more complex than the Box2D environments.
- Atari - A set of 57 Atari 2600 environments simulated through Stella and the Arcade Learning Environment that have a high range of complexity for agents to learn.
- Third-party - A number of environments have been created that are compatible with the Gymnasium API. Be aware of the version that the software was created for and use the apply_api_compatibility argument of gymnasium.make if necessary.
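To give a feel for what a "standard API" means here, the following is a minimal sketch of the interaction loop in pure Python. It does not import Gymnasium; `CoinFlipEnv` and `run_episode` are hypothetical names that only mimic the shape of the interface, where `reset` returns an observation and an info dict and `step` returns an observation, a reward, `terminated`, `truncated`, and an info dict.

```python
import random
from typing import Any, Dict, Optional, Tuple


class CoinFlipEnv:
    """Hypothetical toy environment that mimics the reset/step shape."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0
        self._coin = 0

    def reset(self, seed: Optional[int] = None) -> Tuple[int, Dict[str, Any]]:
        # Reseed only when a seed is given, then start a new episode.
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        self._coin = self._rng.randint(0, 1)
        return self._coin, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        # Reward 1.0 when the action matches the coin that was last observed.
        reward = 1.0 if action == self._coin else 0.0
        self._steps += 1
        self._coin = self._rng.randint(0, 1)
        terminated = False  # this toy task has no terminal state
        truncated = self._steps >= self.max_steps  # time limit reached
        return self._coin, reward, terminated, truncated, {}


def run_episode(env: CoinFlipEnv, seed: int) -> float:
    """Run one episode with a policy that simply copies the observation."""
    obs, info = env.reset(seed=seed)
    total = 0.0
    done = False
    while not done:
        action = obs  # the observation is the coin, so copying it is optimal
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        done = terminated or truncated
    return total


total_reward = run_episode(CoinFlipEnv(max_steps=5), seed=0)
print(total_reward)  # 5.0, one point per step for five steps
```

Because every environment exposes the same two methods, the `run_episode` loop would not need to change when the toy environment is swapped for a more complex one.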
Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you are able to combine them with other sources of computation or knowledge.
LangChain is aimed at assisting in the development of those types of applications. It aims to create:
1. a comprehensive collection of pieces you would ever want to combine
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains
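The three goals above can be illustrated with a small pure-Python sketch. It is not LangChain's API: `PIECES`, `run_chain`, `save_chain`, and `load_chain` are hypothetical names, and `fake_llm` is a stand-in that formats a string instead of calling a real model.

```python
import json
from typing import Callable, Dict, List

# 1. A collection of named pieces; each piece maps text to text (hypothetical).
PIECES: Dict[str, Callable[[str], str]] = {
    "strip": str.strip,
    "make_prompt": lambda question: f"Answer briefly: {question}",
    # Stand-in for a model call; no real LLM is involved.
    "fake_llm": lambda prompt: f"[model output for: {prompt}]",
}


def run_chain(step_names: List[str], text: str) -> str:
    """2. Combine pieces by feeding each piece's output into the next."""
    for name in step_names:
        text = PIECES[name](text)
    return text


def save_chain(step_names: List[str]) -> str:
    """3. Serialize a chain as JSON so it can be stored or shared."""
    return json.dumps({"steps": step_names})


def load_chain(serialized: str) -> List[str]:
    """Rebuild the list of step names from its JSON form."""
    return json.loads(serialized)["steps"]


chain = ["strip", "make_prompt", "fake_llm"]
restored = load_chain(save_chain(chain))
output = run_chain(restored, "  What is Gymnasium?  ")
print(output)  # [model output for: Answer briefly: What is Gymnasium?]
```

Storing only the step names keeps the saved form small and readable, at the cost of requiring the same registry of pieces wherever the chain is loaded.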
NN-SVG is a tool for creating Neural Network (NN) architecture drawings parametrically rather than manually. It also provides the ability to export those drawings to Scalable Vector Graphics (SVG) files, suitable for inclusion in academic papers or web pages.
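As a rough idea of what "parametric" drawing means, the sketch below generates an SVG string for a fully connected network from a list of layer sizes. It is not NN-SVG's code; the `nn_svg` function, its parameters, and the layout rule are assumptions made for illustration.

```python
from typing import List, Tuple


def nn_svg(layer_sizes: List[int], width: int = 400, height: int = 300,
           radius: int = 10) -> str:
    """Return an SVG string for a fully connected network (hypothetical layout)."""

    def position(layer: int, node: int) -> Tuple[float, float]:
        # Spread layers evenly across the width and nodes evenly down the height.
        x = width * (layer + 1) / (len(layer_sizes) + 1)
        y = height * (node + 1) / (layer_sizes[layer] + 1)
        return x, y

    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{width}" height="{height}">']

    # Draw edges first so that the node circles are painted on top of them.
    for layer in range(len(layer_sizes) - 1):
        for a in range(layer_sizes[layer]):
            for b in range(layer_sizes[layer + 1]):
                x1, y1 = position(layer, a)
                x2, y2 = position(layer + 1, b)
                parts.append(f'<line x1="{x1:.1f}" y1="{y1:.1f}" '
                             f'x2="{x2:.1f}" y2="{y2:.1f}" stroke="gray"/>')

    # Draw one circle per node.
    for layer, size in enumerate(layer_sizes):
        for node in range(size):
            cx, cy = position(layer, node)
            parts.append(f'<circle cx="{cx:.1f}" cy="{cy:.1f}" r="{radius}" '
                         f'fill="white" stroke="black"/>')

    parts.append("</svg>")
    return "\n".join(parts)


svg = nn_svg([3, 4, 2])
print(svg.count("<circle"), svg.count("<line"))  # 9 20
```

Changing the architecture is then a matter of changing the list, for example `nn_svg([8, 16, 16, 4])`, and the returned string can be written to a `.svg` file.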

MLOps Newsletter
