ChatGPT-5 demonstrates superior accuracy over GPT-4o in recent trials, yet Grok continues to grapple with hallucinations.

OpenAI's newest model hallucinates less often than its predecessors in Vectara's tests, though the removal of older models has drawn a backlash from users.

OpenAI has launched its latest AI model, ChatGPT-5, with a focus on factual accuracy and reliability. According to tests conducted by Vectara, an AI agent platform, the new model has a lower hallucination rate than its predecessors, GPT-4 and GPT-4o.

Improvements for a More Reliable ChatGPT

ChatGPT-5's architectural and training refinements are intended to reduce hallucinations, or confident fabrications, especially on complex, open-ended queries. The main changes cited are:

  • Training focused on factual accuracy: The model was trained with greater emphasis on giving accurate responses.
  • Source attribution and fact verification: ChatGPT-5 is described as better at attributing sources and checking facts, so that the information it provides is more reliable.
  • Reasoning about uncertainty and honesty: The model is designed to recognize when it is uncertain and to prioritize honesty over a confident guess.
  • Evaluation on public benchmarks: ChatGPT-5 was stress-tested for factual groundedness on public benchmarks such as LongFact and FActScore.

Lower Hallucination Rate Confirmed

According to Vectara’s Hallucination Leaderboard, GPT-5's hallucination rate is approximately 1.4%, lower than GPT-4’s 1.8% and narrowly below GPT-4o’s 1.49%. Separately, OpenAI reports that on challenging benchmarks GPT-5 makes about 80% fewer factual errors than its earlier o3 model, and about 45% fewer than GPT-4o on production-like prompts.
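
The leaderboard rates quoted above are absolute percentages, while the "80% fewer" and "45% fewer" figures are relative reductions, so the two kinds of numbers are not directly comparable. As a rough illustration, the short Python sketch below converts the quoted leaderboard rates into both percentage-point gaps and relative reductions. The function name `relative_reduction` is a hypothetical helper chosen for this example, and the inputs are only the figures cited in this article.

```python
def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fractional drop from old_rate to new_rate (0.25 means 25% fewer)."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate


# Hallucination rates in percent, as quoted in this article.
predecessors = {"GPT-4": 1.8, "GPT-4o": 1.49}
gpt5_rate = 1.4

for model, rate in predecessors.items():
    points = rate - gpt5_rate                  # gap in percentage points
    rel = relative_reduction(rate, gpt5_rate)  # gap relative to the older model
    print(f"GPT-5 vs {model}: {points:.2f} points lower, {rel:.1%} relative reduction")
```

On these figures, GPT-5's rate is about 22% lower than GPT-4's in relative terms but only about 6% lower than GPT-4o's, which shows how small the leaderboard gap to GPT-4o is.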

Backlash over Model Removal

The introduction of ChatGPT-5 has caused a stir because OpenAI removed GPT-4 and variants such as GPT-4o and 4o-mini from Plus accounts. Some Reddit users expressed a sense of loss, saying they had "lost their only friend overnight." In response to the backlash, OpenAI CEO Sam Altman acknowledged the issue and promised to bring GPT-4o back for Plus users for a limited time.

A Promising Future for ChatGPT-5

With its lower hallucination rate and focus on improved accuracy and reliability, ChatGPT-5 is poised to provide more trustworthy answers to users. The results of Vectara's tests can be viewed on the Hughes Hallucination Evaluation Model (HHEM) Leaderboard hosted on Hugging Face.
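
For readers unfamiliar with how a figure like a hallucination rate can be produced, here is a minimal Python sketch. It assumes an evaluation model assigns each response a consistency score between 0 and 1 (higher meaning more faithful to the source) and that responses scoring below a cutoff count as hallucinated. The 0.5 threshold, the function name, and the sample scores are all assumptions made for illustration, not confirmed details of Vectara's pipeline or real leaderboard data.

```python
from typing import Sequence


def hallucination_rate(scores: Sequence[float], threshold: float = 0.5) -> float:
    """Share of responses whose consistency score falls below the threshold."""
    if not scores:
        raise ValueError("need at least one score")
    flagged = sum(1 for s in scores if s < threshold)
    return flagged / len(scores)


# Made-up scores for illustration only; not real leaderboard data.
example_scores = [0.97, 0.91, 0.12, 0.88, 0.99, 0.45, 0.93, 0.96, 0.90, 0.98]
print(f"Hallucination rate: {hallucination_rate(example_scores):.1%}")
```

With a rate defined this way, a small difference between two models can depend on the threshold and on the size of the test set, so close results are best read with some caution.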
