Nvidia Has a New Way to Prevent A.I. Chatbots From 'Hallucinating' Wrong Facts

  • Nvidia announced new software on Tuesday that will help software makers prevent AI models from stating incorrect facts, talking about harmful subjects, or opening up security holes.
  • The software, called NeMo Guardrails, is one example of how the AI industry is scrambling to address the "hallucination" issue with the latest generation of so-called large language models.

Nvidia announced new software on Tuesday that will help software makers prevent AI models from stating incorrect facts, talking about harmful subjects, or opening up security holes.

The software, called NeMo Guardrails, is one example of how the artificial intelligence industry is scrambling to address the "hallucination" issue with the latest generation of large language models. The issue is a major blocking point for businesses.

Large language models, like GPT from Microsoft-backed OpenAI and LaMDA from Google, are trained on terabytes of data to create programs that can spit out blocks of text that read like a human wrote them. But they also have a tendency to make things up, which is often called "hallucination" by practitioners. Early applications for the technology, such as summarizing documents or answering basic questions, need to minimize hallucinations in order to be useful.

Nvidia's new software can do this by adding guardrails to prevent the software from addressing topics that it shouldn't. NeMo Guardrails can force an LLM chatbot to talk about a specific topic, head off toxic content, and prevent LLM systems from executing harmful commands on a computer.

"You can write a script that says, if someone talks about this topic, no matter what, respond this way," said Jonathan Cohen, Nvidia vice president of applied research. "You don't have to trust that a language model will follow a prompt or follow your instructions. It's actually hard coded in the execution logic of the guardrail system what will happen."

The announcement also highlights Nvidia's strategy to maintain its lead in the market for AI chips by simultaneously developing critical software for machine learning.

Nvidia supplies the graphics processors that are needed by the thousands to train and deploy software like ChatGPT. Nvidia has more than 95% of the market for AI chips, according to analysts, but competition is rising.

How it works

NeMo Guardrails is a layer of software that sits between the user and the large language model or other AI tools. It heads off bad prompts before they reach the model and bad outcomes before the model spits them out.

Nvidia proposed a customer service chatbot as one possible use case. Developers could use Nvidia's software to prevent it from talking about off-topic subjects or getting "off the rails," which raises the possibility of a nonsensical or even toxic response.

"If you have a customer service chatbot, designed to talk about your products, you probably don't want it to answer questions about our competitors," said Nvidia's Cohen. "You want to monitor the conversation. And if that happens, you steer the conversation back to the topics you prefer."
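As a rough illustration of the idea Cohen describes, and not Nvidia's actual code or Colang syntax, a guardrail layer can be sketched in a few lines of Python. Everything named here (the keyword list, the canned reply, `guarded_chat`, and the stand-in `fake_model`) is hypothetical, and simple keyword matching is only a placeholder for however a real system detects topics.

```python
from typing import Callable

# Hypothetical blocked topics and canned reply; not taken from NeMo Guardrails.
BLOCKED_KEYWORDS = {"competitor", "competitors", "rival", "rivals"}
CANNED_REPLY = "I can only help with questions about our products."


def guarded_chat(user_message: str, model: Callable[[str], str]) -> str:
    """Return a fixed reply for blocked topics; otherwise defer to the model."""
    # Lowercase each word and drop trailing punctuation before matching.
    words = {w.strip(".,!?").lower() for w in user_message.split()}
    if words & BLOCKED_KEYWORDS:
        # The rule fires before the model is called, so the response
        # does not depend on the model following instructions.
        return CANNED_REPLY
    return model(user_message)


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption for this sketch)."""
    return f"MODEL ANSWER: {prompt}"


if __name__ == "__main__":
    print(guarded_chat("How do your competitors compare?", fake_model))
    print(guarded_chat("What is the warranty?", fake_model))
```

The point of the design is that the decision lives in ordinary program logic wrapped around the model, rather than in the prompt.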

Nvidia offered another example of a chatbot that answered internal corporate human resources questions. In this example, Nvidia was able to add "guardrails" so the ChatGPT-based bot wouldn't answer questions about the example company's financial performance or access private data about other employees.

The software can also detect hallucination by asking a second LLM to fact-check the first LLM's answer. It then returns "I don't know" if the answers don't match.
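The fact-checking step could look something like the sketch below. It is an assumption-laden toy, not Nvidia's implementation: `checked_answer`, `normalize`, and the two model callables are hypothetical, and comparing normalized strings is a crude stand-in for whatever comparison a real system performs.

```python
from typing import Callable

FALLBACK = "I don't know"


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(text.lower().split()).strip(" .")


def checked_answer(
    question: str,
    primary: Callable[[str], str],
    checker: Callable[[str], str],
) -> str:
    """Return the primary answer only if a second model agrees with it."""
    first = primary(question)
    second = checker(question)
    if normalize(first) == normalize(second):
        return first
    # The two models disagree, so decline to answer.
    return FALLBACK


if __name__ == "__main__":
    # Stand-in models (assumptions for this sketch).
    print(checked_answer("Capital of France?", lambda q: "Paris.", lambda q: "paris"))
    print(checked_answer("Capital of France?", lambda q: "Paris.", lambda q: "Lyon"))
```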

Nvidia also said Monday that the guardrails software helps with security, and can force LLMs to interact only with third-party software on an allowed list.
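An allowed list of this kind can be pictured with a short Python sketch. The tool names, the `TOOLS` registry, and `run_tool` are all hypothetical and exist only to show the check; nothing here reflects how NeMo Guardrails is configured.

```python
from typing import Callable, Dict

# Hypothetical registry of third-party tools the model might request.
TOOLS: Dict[str, Callable[[str], str]] = {
    "order_status": lambda arg: f"status for {arg}",
    "delete_files": lambda arg: f"deleted {arg}",
}

# Only tools named here may be run on the model's behalf.
ALLOWED_TOOLS = {"order_status"}


def run_tool(tool_name: str, argument: str) -> str:
    """Run a tool requested by the model only if it is on the allowed list."""
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool_name!r} is not on the allowed list")
    return TOOLS[tool_name](argument)


if __name__ == "__main__":
    print(run_tool("order_status", "A123"))
    try:
        run_tool("delete_files", "/tmp/data")
    except PermissionError as err:
        print(f"blocked: {err}")
```

Checking the name before looking up the tool means a request for anything off the list is refused whether or not such a tool exists.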

NeMo Guardrails is open source, is offered through Nvidia services, and can be used in commercial applications. Programmers will use the Colang programming language to write custom rules for the AI model, Nvidia said.

Other AI companies, including Google and OpenAI, have used a method called reinforcement learning from human feedback to prevent harmful outputs from LLM applications. This method uses human testers who create data about which answers are acceptable and which are not, and then trains the AI model using that data.

Nvidia is increasingly turning its attention to AI as it currently dominates the chips used to create the technology. It is riding the AI wave that has made it the biggest gainer in the S&P 500 so far in 2023, with the stock rising 85% as of Monday.

Correction: Programmers will use the Colang programming language to write custom rules for the AI model, Nvidia said. An earlier version misstated the name of the language.

Copyright CNBC