While generative AI technologies are beneficial, it is good practice to have guardrails in place so that incorrect facts, meanings, or inferences are not passed on to the consumer of the content. Large Language Models (LLMs) tend to hallucinate, so it is important to have checks and balances in place to catch this behavior.
Consider the following example, where we asked a fact-based sports question to one of the leading LLMs in the industry. Correct facts are shown in green, incorrect or hallucinated information in red, and borderline facts in orange.
While the model generated some valid facts, such as the match date, the venue, and the main controversy during the game, it produced several invalid facts as well:
- There was no solid start for India: India lost Sidhu at a team score of 8 runs.
- The collapse started after Sachin Tendulkar's dismissal at a team score of 98, not after Sidhu's.
- Tendulkar was out stumped, not LBW. Also, there was no Indian umpire standing in that match, so Venkat could not have been the umpire.
- After resumption, the match was interrupted again and India did fall short, but the match was awarded to Sri Lanka by default, as technically India was not all out.
There is also one fact in orange, which is a borderline case and a matter of interpretation: rather than posting a challenging total of 252 runs, Sri Lanka posted a total of 251 runs; 252 was the target.
Those of you who are interested may use the following links to review the actual events and facts:
- https://www.espncricinfo.com/series/wills-world-cup-1995-96-60981/india-vs-sri-lanka-1st-semi-final-65190/full-scorecard
- https://sportstar.thehindu.com/cricket/india-sri-lanka-1996-will-world-cup-semifinal-match-report-eden-gardens-crowd-trouble/article26522248.ece
Focused and Contextualized Content
Using LLMs can sometimes resemble handling an unbridled horse. They have great power to speed up information retrieval and content generation, but at the same time they can produce content that is incorrect or biased. This problem can be addressed with a strategy in which the LLM is fed contextualized information that has been vetted as correct by humans; this method is broadly called Retrieval-Augmented Generation (RAG). Additionally, custom checks can be applied to ensure that the generated information is safe to deliver to the consumer, as illustrated in the sketch below.
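To make the idea concrete, here is a minimal, hypothetical sketch of the RAG-with-guardrails pattern described above. The document store, the retrieve_context helper, the call_llm stub, and the answer_is_grounded check are all illustrative placeholders, not part of any specific product or library; a real system would plug in an actual vector store, an LLM API, and more robust safety checks.

```python
# Minimal, illustrative sketch of Retrieval-Augmented Generation (RAG)
# with a simple output guardrail. All names here (retrieve_context,
# call_llm, answer_is_grounded) are hypothetical placeholders.

from dataclasses import dataclass


@dataclass
class Document:
    source: str
    text: str


# A tiny, human-vetted "knowledge base" standing in for a real vector store.
VETTED_DOCS = [
    Document(
        source="espncricinfo.com",
        text="In the 1996 World Cup semi-final at Eden Gardens, Sri Lanka "
             "scored 251 runs, setting India a target of 252.",
    ),
    Document(
        source="sportstar.thehindu.com",
        text="Sachin Tendulkar was out stumped for 65; the match was later "
             "awarded to Sri Lanka by default after crowd trouble.",
    ),
]


def retrieve_context(question: str, docs: list[Document], top_k: int = 2) -> list[Document]:
    """Naive keyword-overlap retrieval; a real system would use embeddings."""
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.text.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM call (e.g. a hosted chat-completion API)."""
    return "Sri Lanka posted 251 runs and the match was awarded to Sri Lanka by default."


def answer_is_grounded(answer: str, context: list[Document]) -> bool:
    """Crude guardrail: require substantial word overlap between the answer
    and the retrieved, human-vetted context before releasing it."""
    context_words = set(" ".join(d.text.lower() for d in context).split())
    answer_words = set(answer.lower().split())
    overlap = len(answer_words & context_words) / max(len(answer_words), 1)
    return overlap >= 0.5


def answer_question(question: str) -> str:
    context = retrieve_context(question, VETTED_DOCS)
    context_block = "\n".join(f"[{d.source}] {d.text}" for d in context)
    prompt = (
        "Answer strictly using the context below. If the context is "
        f"insufficient, say so.\n\nContext:\n{context_block}\n\nQuestion: {question}"
    )
    answer = call_llm(prompt)
    if not answer_is_grounded(answer, context):
        return "I don't have enough vetted information to answer that reliably."
    return answer


if __name__ == "__main__":
    print(answer_question(
        "What happened in the 1996 World Cup semi-final between India and Sri Lanka?"
    ))
```

In production, the retrieval step would typically use embedding-based similarity over a vector store, and the guardrail could combine source attribution, fact-checking against the retrieved documents, and a second verification model; the word-overlap check above is only a stand-in for those ideas.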
If you are looking for a custom LLM implementation with safeguards in place over your own data sources, please contact us for a personalized demo.