An AI-generated robot sitting in front of a computer, responding to customer service tickets.

AI21 CEO says transformers not right for AI agents due to error perpetuation

11 Oct 2024, 20:03 by Emilia David · VentureBeat

As more enterprise organizations look to the so-called agentic future, one barrier may be how AI models are built. For enterprise AI developer AI21, the answer is clear: the industry needs to look to other model architectures to enable more efficient AI agents.

Ori Goshen, AI21 CEO, said in an interview with VentureBeat that the Transformer, the most popular model architecture, has limitations that would make a multi-agent ecosystem difficult.

“One trend I’m seeing is the rise of architectures that aren’t Transformers, and these alternative architectures will be more efficient,” Goshen said. “The architecture is computationally expensive, meaning the longer the context it handles, the slower and more costly it becomes. Agents need to call LLMs multiple times, often with extensive context at each step, which makes the transformer architecture a bottleneck.”
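Goshen's point about context length can be made concrete with a rough cost model. The Python sketch below is an illustration, not AI21's analysis. It assumes that attention work grows with the square of the context length, that an alternative architecture's work grows linearly, and that an agent re-sends a growing context at every step. The function names, the 1,000-token starting context and the 500 tokens added per step are hypothetical, and the model ignores constant factors and optimizations such as key-value caching.

```python
def attention_cost(context_len: int) -> int:
    """Toy cost of full self-attention: every token interacts with every token."""
    return context_len * context_len


def linear_cost(context_len: int) -> int:
    """Toy cost of a model whose work per token does not depend on history length."""
    return context_len


def agent_run_cost(step_cost, base_context: int, tokens_added_per_step: int, steps: int) -> int:
    """Total toy cost of an agent that calls the model once per step,
    sending a context that grows by a fixed number of tokens each time."""
    total = 0
    context = base_context
    for _ in range(steps):
        total += step_cost(context)
        context += tokens_added_per_step
    return total


# Hypothetical scenario: 10 agent steps, starting at 1,000 tokens, +500 per step.
quadratic_total = agent_run_cost(attention_cost, 1_000, 500, 10)
linear_total = agent_run_cost(linear_cost, 1_000, 500, 10)

print(f"quadratic model: {quadratic_total:,} units")
print(f"linear model:    {linear_total:,} units")
```

Under these assumptions, the ten-step run totals 126,250,000 units for the quadratic model and 32,500 for the linear one. The units are arbitrary, so the figures matter less than the trend: the quadratic total grows much faster as steps and context accumulate.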

AI21, which focuses on developing enterprise AI solutions, has made the case before that Transformers should be an option for model architecture but not the default. It is developing foundation models using its Jamba architecture, short for Joint Attention and Mamba architecture. Jamba builds on the Mamba architecture, developed by researchers from Princeton University and Carnegie Mellon University, which can offer faster inference times and longer context.

Goshen said alternative architectures, like Mamba and Jamba, can often make agentic structures more efficient and, most importantly, affordable. For him, Mamba-based models have better memory performance, which would make agents, particularly agents that connect to other models, work better.

He attributes the fact that AI agents are only now gaining popularity, and that most agents have not yet gone into production, to the reliance on LLMs built with transformers.

“The main reason agents are not in production mode yet is reliability or the lack of reliability,” Goshen said. “Since LLMs are inherently stochastic, additional elements will need to be integrated to ensure the level of reliability required.”
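A back-of-the-envelope calculation shows why reliability matters so much for agents that chain many model calls. The Python sketch below is an illustration rather than anything Goshen described: it assumes each step succeeds independently with the same probability, and that a retry wrapper backed by a perfect checker stands in for the "additional elements" he mentions. Both function names and all the numbers are hypothetical.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that every step in a chain succeeds,
    assuming steps fail independently of one another."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


def step_success_with_retries(per_step_success: float, attempts: int) -> float:
    """Per-step success when a (hypothetical) perfect checker allows
    up to `attempts` independent tries at the same step."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return 1.0 - (1.0 - per_step_success) ** attempts


# Hypothetical agent: 10 chained steps, each 95% reliable on its own.
plain = chain_success_probability(0.95, 10)
checked = chain_success_probability(step_success_with_retries(0.95, 2), 10)

print(f"no checks:          {plain:.3f}")
print(f"one retry per step: {checked:.3f}")
```

With these made-up numbers, a ten-step chain finishes cleanly about 60% of the time without checks and about 98% of the time with a single verified retry per step. Real failures are rarely independent and real checkers are imperfect, so this is best read as an intuition for why small per-step error rates add up.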

Enterprise agents are growing in popularity

AI agents emerged as one of the biggest trends in enterprise AI this year. Several companies launched AI agents and platforms to make it easy to build agents.

ServiceNow announced updates to its Now Assist AI platform, including a library of AI agents for customers. Salesforce has its own stable of agents, called Agentforce, while Slack has begun allowing users to integrate agents from Salesforce, Cohere, Workday, Asana, Adobe and more.

Goshen believes that this trend will become even more popular with the right mix of models and model architectures.

“Some use cases that we see now, like question and answers from a chatbot, are basically glorified search,” he said. “I think real intelligence is in connecting and retrieving different information from sources.”

Goshen added that AI21 is in the process of developing offerings around AI agents.

Other architectures vying for attention

Goshen strongly supports alternative architectures like Mamba and AI21’s Jamba, mainly because he believes transformer models are too expensive and unwieldy to run.

Instead of the attention mechanism that forms the backbone of transformer models, Mamba uses a selective state space model, which can prioritize different data and assign weights to inputs, optimize memory usage, and make efficient use of a GPU’s processing power.
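One way to picture the difference from attention is a recurrence that carries a fixed-size state forward, rather than looking back over every previous token. The Python sketch below is a deliberately simplified, scalar toy in the spirit of selective state space models, not Mamba's actual implementation: the function names and the `w_delta` and `b_delta` parameters are hypothetical, and a real model learns its parameters and operates on vectors.

```python
import math


def softplus(z: float) -> float:
    """Smooth, always-positive function used here to produce a step size."""
    return math.log1p(math.exp(z))


def selective_scan(inputs, w_delta: float = 1.0, b_delta: float = 0.0):
    """Toy input-dependent linear recurrence with a single-number state.

    For each input x, a step size is computed from x itself, so the input
    decides how much of the old state is kept and how much of x is written.
    Memory use stays constant no matter how long the sequence is.
    """
    state = 0.0
    outputs = []
    for x in inputs:
        delta = softplus(w_delta * x + b_delta)  # input-dependent step size
        decay = math.exp(-delta)                 # fraction of old state kept
        state = decay * state + (1.0 - decay) * x
        outputs.append(state)
    return outputs


# Hypothetical input: a constant signal of 1.0 repeated 50 times.
trace = selective_scan([1.0] * 50)
print(f"first output: {trace[0]:.4f}, last output: {trace[-1]:.4f}")
```

In this toy, the state is one number regardless of sequence length, whereas an attention layer keeps keys and values for every earlier token. That constant-size memory is the intuition behind the claims of faster inference and longer context, although production models involve far more than this sketch shows.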

Mamba is growing in popularity. Other open-source and open-weight AI developers have begun releasing Mamba-based models in the past few months. Mistral released Codestral Mamba 7B in July, and in August, Falcon came out with its own Mamba-based model, Falcon Mamba 7B.

However, the transformer architecture has become the default, if not standard, choice when developing foundation models. OpenAI’s GPT is, of course, a transformer model—it’s literally in its name—but so are most other popular models.

Goshen said that, ultimately, enterprises want whichever approach is more reliable. But organizations must also be wary of flashy demos promising to solve many of their problems.

“We’re at the phase where charismatic demos are easy to do, but we’re closer to that than to the product phase,” Goshen said. “It’s okay to use enterprise AI for research, but it’s not yet at the point where enterprises can use it to inform decisions.”