Cue the elevator music

Meta's new AI-powered tool is like Dall-E but for music

The text-to-audio model adds to a growing catalogue of AI-generated music software

Meta, like all other tech giants, is investing heavily in generative AI.
Photo: Dado Ruvic (Reuters)

Imagine AI one day composing quality music and sounds. It’s a reality that may be closer than you think. Today (Aug. 2), Meta announced it has released three open-source generative AI models (MusicGen, AudioGen, and EnCodec) that allow users to generate music as well as sound effects, from a dog barking to cars honking to footsteps on a wooden floor, with only a text prompt. The models, which are trained on Meta-owned and licensed music as well as public sound effects, are currently available for research purposes.

Generative AI models have wowed the world with their ability to create content that looks and sounds as if a human made it. Of course, how good the outputs are is debatable. Listen to the sample audio for the text prompt “pop dance track with catchy melodies, tropical percussions, and upbeat rhythms, perfect for the beach,” and it sounds like something you would hear in an elevator.

Music is arguably the most challenging type of audio to create because it involves long sequences of data, according to Meta. In contrast, text-based generative AI models like Meta’s LLaMA work with much shorter data sequences. AI, so far, has not been able to “fully grasp the expressive nuances and stylistic elements found in music,” the company wrote in a blog post. Meta said that anyone can build on its models to create better sound generators.

The release of the text-to-audio tool comes as record labels and artists voice concerns over AI models mimicking musicians.

What is Meta doing with generative AI?

Like all other tech giants, Meta has been investing in generative AI. In February, the company publicly released LLaMA, its own large language model (LLM) for research. Last month, the company revealed the latest version of its LLM, Llama 2, which is free for research and commercial use and available via the Azure cloud-computing service.

Meta is reportedly preparing to launch a range of AI-powered chatbots, likely powered by Llama, that exhibit different personalities, according to the Financial Times. Unlike Microsoft, which has focused on generative AI tools for enterprise customers, Meta is developing AI tools for users of its social media platforms. The chatbots are reportedly expected to take the form of different characters that can provide recommendations or new search functions in human-sounding conversations. This could boost user engagement for the social media giant and give Meta an opportunity to collect new data for targeted content and ads, according to the Financial Times.