Inside the creation of DBRX, the world's most powerful open source AI model


Last Monday, about a dozen engineers and executives from data science and AI company Databricks gathered in conference rooms connected via Zoom to find out if they had succeeded in creating a top-notch artificial intelligence language model. The team spent months and about $10 million training DBRX, a large language model similar in design to OpenAI's ChatGPT. But they wouldn't know just how powerful their creation was until the results of the final test of its abilities came back.

“We've gone through everything,” Jonathan Frankle, chief neural network architect at Databricks and leader of the team that built DBRX, finally told the team, which responded with whoops, cheers, and applause emojis. Frankle usually stays away from caffeine but was sipping an iced latte after pulling an all-nighter to write up the results.

Databricks will release DBRX under an open source license, allowing others to build on top of its work. Frankle shared data showing that across a dozen or so benchmarks measuring an AI model's ability to answer general knowledge questions, perform reading comprehension, solve tricky logic puzzles, and generate high-quality code, DBRX outperformed every other open source model available.


AI decision makers: Jonathan Frankle, Naveen Rao, Ali Ghodsi, and Hanlin Tang. Photograph: Gabriela Hasbun

It outperformed Meta's Llama 2 and Mistral's Mixtral, two of the most popular open source AI models available today. “Yes!” shouted Databricks CEO Ali Ghodsi when the scores came out. “Wait, did we beat Elon's thing?” Frankle replied that DBRX had indeed outperformed the Grok AI model recently released by Musk's xAI, adding, “I will consider it a success if we get a mean tweet from him.”

To the team's surprise, DBRX also came within a few points of GPT-4, OpenAI's closed model that powers ChatGPT and is widely considered the pinnacle of machine intelligence. “We've set a new state of the art for open source LLMs,” Frankle said with a big smile.

Building Blocks

By open-sourcing DBRX, Databricks is adding momentum to a movement that challenges the secretive approach most major companies have taken to the current generative AI boom. OpenAI and Google closely guard the code for their GPT-4 and Gemini large language models, but some rivals, notably Meta, have released their models for others to use, arguing that putting the technology into more hands will spur innovation among researchers, entrepreneurs, startups, and established businesses.

Databricks says it also wanted to open up about the work involved in creating its open source model, something Meta has not done for some key details about the creation of its Llama 2 model. The company will release a blog post detailing the effort, and it also invited WIRED to spend time with Databricks engineers as they made key decisions during the final stages of the multimillion-dollar process of training DBRX. That access offers a glimpse of how complex and challenging it is to build a leading AI model, but also of how recent innovations in the field promise to bring down costs. With capable open source models like DBRX now available, AI development shows no sign of slowing down any time soon.

Ali Farhadi, CEO of the Allen Institute for AI, says greater transparency around the building and training of AI models is badly needed. The field has become increasingly secretive in recent years as companies have sought an edge over competitors. Openness is especially important, he says, when concerns arise about the risks that advanced AI models could pose. “I'm very happy to see any effort at openness,” Farhadi says. “I do believe a significant portion of the market will move toward open models. We need more of this.”