OpenAI - Accelerating Artificial General Intelligence
OpenAI is an artificial intelligence research laboratory consisting of OpenAI LP and its parent organization, the non-profit OpenAI Inc. The company, a competitor to Alphabet's DeepMind, conducts research in the field of artificial intelligence (AI) with the stated goal of promoting and developing friendly AI in a way that benefits humanity as a whole. The organization was founded in San Francisco in late 2015 by Elon Musk, Sam Altman, and others, who collectively pledged US$1 billion. Musk resigned from the board in February 2018 but remained a donor. In 2019, OpenAI LP received a US$1 billion investment from Microsoft. In June 2020, OpenAI announced GPT-3, a language model trained on hundreds of billions of words from the Internet. It also announced that an associated API, named simply "the API", would form the heart of its first commercial product. GPT-3 is aimed at natural language question answering, but it can also translate between languages and coherently generate improvised text.
Jukebox is a neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artistic styles. OpenAI released the model weights and code, along with a tool to explore the generated samples. Automatic music generation dates back more than half a century. A prominent approach is to generate music symbolically in the form of a piano roll, which specifies the timing, pitch, velocity, and instrument of each note to be played. This has led to impressive results like producing Bach chorales, polyphonic music with multiple instruments, and minute-long musical pieces.
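The piano-roll representation mentioned above can be pictured as a time-by-pitch grid. The following is a minimal sketch of that data structure; the function name, time resolution, and note format are illustrative assumptions, not taken from any particular music-generation codebase.

```python
import numpy as np

# A piano roll is a 2-D grid: rows are timesteps, columns are MIDI pitches
# (0-127). Cell values hold note velocity, with 0 meaning silence.
STEPS_PER_SECOND = 16   # assumed time resolution (illustrative)
NUM_PITCHES = 128       # standard MIDI pitch range

def make_piano_roll(notes, duration_sec):
    """notes: list of (start_sec, end_sec, pitch, velocity) tuples."""
    roll = np.zeros((duration_sec * STEPS_PER_SECOND, NUM_PITCHES),
                    dtype=np.uint8)
    for start, end, pitch, velocity in notes:
        # Mark the note as active for every timestep it sounds.
        roll[int(start * STEPS_PER_SECOND):int(end * STEPS_PER_SECOND),
             pitch] = velocity
    return roll

# Middle C (MIDI pitch 60) held for the first second at velocity 80:
roll = make_piano_roll([(0.0, 1.0, 60, 80)], duration_sec=2)
```

Because each cell only records which note sounds and how loudly, this representation captures structure compactly but discards timbre and vocal detail, which is exactly the limitation the next paragraph describes.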
But symbolic generators have limitations—they cannot capture human voices or many of the more subtle timbres, dynamics, and expressivity that are essential to music. A different approach is to model music directly as raw audio. Generating music at the audio level is challenging because the sequences are very long. A typical 4-minute song at CD quality (44.1 kHz, 16-bit) has over 10 million timesteps. For comparison, GPT-2's context spanned about 1,000 timesteps and OpenAI Five took tens of thousands of timesteps per game. Thus, to learn the high-level semantics of music, a model would have to deal with extremely long-range dependencies.
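The "over 10 million timesteps" figure follows directly from the sample rate: at CD quality, every second of audio is 44,100 samples, and each sample is one timestep for a raw-audio model. A quick check:

```python
# Back-of-the-envelope check of the sequence length quoted above.
SAMPLE_RATE = 44_100       # CD-quality audio: samples per second
SONG_SECONDS = 4 * 60      # a typical 4-minute song

timesteps = SAMPLE_RATE * SONG_SECONDS
print(timesteps)           # 10,584,000 samples, i.e. over 10 million timesteps
```

That is roughly four orders of magnitude longer than GPT-2's ~1,000-token context, which is why raw-audio models need special strategies (such as compressing audio into coarser codes) to capture long-range structure.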
MuseNet is a deep neural network that can generate 4-minute musical compositions with 10 different instruments, and can combine styles from country to Mozart to the Beatles. MuseNet was not explicitly programmed with an understanding of music, but instead discovered patterns of harmony, rhythm, and style by learning to predict the next token in hundreds of thousands of MIDI files. MuseNet uses the same general-purpose unsupervised technology as GPT-2: a large-scale transformer model trained to predict the next token in a sequence, whether audio or text. MuseNet uses the recompute and optimized kernels of Sparse Transformer to train a 72-layer network with 24 attention heads—with full attention over a context of 4,096 tokens. This long context may be one reason why it is able to remember long-term structure in a piece.
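The generation procedure described here—repeatedly predicting the next token given a bounded context window—can be sketched in a few lines. This is an illustrative toy, not MuseNet's actual code: the `model` callable, greedy sampling, and helper names are assumptions; only the 4,096-token context size comes from the text above.

```python
CONTEXT_LEN = 4096  # MuseNet's reported full-attention context

def sample_next(model, tokens):
    """Pick the next token greedily from the model's distribution."""
    window = tokens[-CONTEXT_LEN:]   # truncate history to the attention context
    probs = model(window)            # model: context -> probabilities over vocab
    return max(range(len(probs)), key=probs.__getitem__)

def generate(model, prompt, n_steps):
    """Autoregressive loop: each new token is appended and fed back in."""
    tokens = list(prompt)
    for _ in range(n_steps):
        tokens.append(sample_next(model, tokens))
    return tokens

# Toy stand-in "model" over a 4-token vocabulary that always prefers
# the token one higher than the last one seen (purely for demonstration).
toy_model = lambda w: [1.0 if i == (w[-1] + 1) % 4 else 0.0 for i in range(4)]
print(generate(toy_model, [0], 3))   # [0, 1, 2, 3]
```

The same loop works whether the tokens encode text, MIDI events, or compressed audio; only the trained model and the vocabulary change.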
OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which they mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity. The company will attempt to directly build safe and beneficial AGI, but will also consider its mission fulfilled if its work aids others in achieving this outcome.