NVIDIA introduces new artificial intelligence capable of creating music and modifying sounds

Nvidia has unexpectedly introduced Fugatto, a new AI model that can generate voices, songs and sound effects from text descriptions. It goes a step further than earlier generative audio tools by also letting you edit existing audio.

The company claims that Fugatto is “the world’s most flexible music machine.” Nvidia has been investing in this area for years and has developed various models that produce music or voices from text, but none of them have been perfect.

Nvidia says extensive training gives Fugatto an unprecedented level of personalization. The model has 2.5 billion parameters and was trained on open-source data using DGX servers with 32 H100 accelerators.

Nvidia uses a new technique called ComposableART to combine text instructions that the model had only encountered separately during training. As a result, Fugatto can understand user requests and generate new audio clips without simply repeating the data it was trained on.

Fugatto builds combinations from the user's text prompt. If you ask it for an audio clip with the sound of rain, birds in the background and an explosion at the end, it blends what it learned in training to produce the clip in seconds.

You can then quickly edit the result by modifying the text description, adding new effects or removing others. In demonstrating the AI, Nvidia also showed other functions, such as uploading audio from the device and isolating vocal and instrument tracks.

Nvidia has announced the final version of its audio-editing AI, but it has not confirmed when it will reach all users, or even whether it will remain merely an experiment.

CoreTech Pulse

