Navigating the Challenges of Non-Deterministic AI in Enterprise Software

As generative AI becomes embedded in more software products and workflows, those products inherit the characteristics of the large language models (LLMs) underneath them. This shift has raised concerns about reliability, because these models are inherently non-deterministic and often produce different responses to identical inputs. That trait is both a feature and a challenge, particularly in enterprise environments where consistency and accuracy are paramount.

The non-deterministic nature of LLMs means that errors can propagate, especially when reasoning models and AI agents are involved. According to Dan Lines, COO of LinearB, “Ultimately, any kind of probabilistic model is sometimes going to be wrong. These kinds of inconsistencies that are drawn from the absence of a well-structured world model are always going to be present at the core of a lot of the systems that we’re working with and systems that we’re reasoning about.”

Understanding the Non-Determinism of LLMs

LLMs are designed to be “dream machines,” capable of generating novel and unexpected outputs. However, when these outputs are factually incorrect, they become problematic. In enterprise software, where reliability is crucial, understanding and mitigating these errors is essential. Daniel Loreto, CEO of Jetify, highlights the difficulty in predicting LLM behavior, emphasizing the need for tools and processes to ensure desired system performance.

Enterprise applications rely heavily on trust, which is built on authorized access, high availability, and idempotency. For GenAI processes, accuracy is an additional critical factor. Tariq Shaukat, CEO of Sonar, notes, “A lot of the real success stories that I hear about are apps that have relatively little downside if it goes down for a couple of minutes or there’s a minor security breach or something like that.”

Addressing Hallucinations and Enhancing Accuracy

One common issue with LLMs is “hallucinations,” where the model generates inaccurate or irrelevant information. Retrieval-augmented generation (RAG) is a typical approach to grounding responses in factual data, yet even RAG systems can falter. Amr Awadallah, CEO of Vectara, points out, “Even when you ground LLMs, 1 out of every 20 tokens coming out might be completely wrong, completely off topic, or not true.”

To mitigate these issues, additional guardrails on prompts and responses are necessary. Maryam Ashoori, Head of Product at watsonx.ai, IBM, explains the importance of filtering on both input and output sides to prevent harmful or inappropriate content from being generated.
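Filtering on both sides of a model call can be sketched in a few lines. This is a minimal illustration, not IBM's or any vendor's implementation: the regex block lists, the `guarded_call` wrapper, and the `fake_model` stand-in are all hypothetical, and a production system would more likely use a dedicated moderation model or policy service than hand-written patterns.

```python
import re
from typing import Callable

# Hypothetical block lists, for illustration only.
BLOCKED_INPUT_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]
BLOCKED_OUTPUT_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # text shaped like a US Social Security number
]

REFUSAL = "Sorry, I can't help with that request."


def guarded_call(prompt: str, model: Callable[[str], str]) -> str:
    """Check the prompt, call the model, then check the response."""
    # Input-side guardrail: reject suspicious prompts before the model sees them.
    if any(p.search(prompt) for p in BLOCKED_INPUT_PATTERNS):
        return REFUSAL
    response = model(prompt)
    # Output-side guardrail: withhold responses that match a blocked pattern.
    if any(p.search(response) for p in BLOCKED_OUTPUT_PATTERNS):
        return REFUSAL
    return response


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: any str -> str callable works)."""
    if "ssn" in prompt.lower():
        return "The SSN is 123-45-6789."
    return f"Echo: {prompt}"
```

Because the wrapper accepts any callable that maps a prompt string to a response string, the same checks can sit in front of different models without changing the filters.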

Implementing Observability and Monitoring

Observability for LLMs is crucial for understanding and correcting errors. Abby Kearns, CTO of Alembic, highlights the need to reinvent traditional tooling for machine workloads. Standard software telemetry such as logs and stack traces shows how a system is performing, but LLMs call for additional measures: hallucination rates, factual consistency, and bias.
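One simple signal that could feed such monitoring is how often repeated runs of the same prompt agree with each other. The sketch below is an assumption-laden proxy, not a hallucination detector: `agreement_rate` is a hypothetical helper that compares responses by exact match after lowercasing and collapsing whitespace, so two differently worded but equivalent answers would count as disagreeing.

```python
from collections import Counter
from typing import Iterable


def agreement_rate(responses: Iterable[str]) -> float:
    """Fraction of responses that match the most common (normalized) response."""
    # Normalize case and whitespace so trivial formatting differences are ignored.
    normalized = [" ".join(r.lower().split()) for r in responses]
    if not normalized:
        raise ValueError("need at least one response")
    _, top_count = Counter(normalized).most_common(1)[0]
    return top_count / len(normalized)


# Four samples for the same prompt; three agree after normalization.
samples = ["Paris", "paris ", "Lyon", "Paris"]
rate = agreement_rate(samples)  # 0.75
```

A low rate does not prove an answer is wrong, and a high rate does not prove it is right; the number is only useful as a trend to watch alongside other checks.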

Mark Doble, CEO of Alexi, suggests using multiple models to evaluate outputs, akin to an “LLM-as-judge” approach. Having several models check the same output can make the result more reliable, since an error has to get past more than one evaluator.
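A panel of judges can be sketched as a majority vote. This is an illustration under stated assumptions, not Alexi's method: the `Judge` type, `panel_approves`, and the three rule-based judges are hypothetical stand-ins, and in practice each judge would be a call to a separate model with its own grading prompt.

```python
from typing import Callable, Sequence

# A judge takes (question, answer) and returns True if it approves the answer.
Judge = Callable[[str, str], bool]


def panel_approves(question: str, answer: str,
                   judges: Sequence[Judge], threshold: float = 0.5) -> bool:
    """Approve the answer only if more than `threshold` of the judges approve."""
    if not judges:
        raise ValueError("need at least one judge")
    votes = [judge(question, answer) for judge in judges]
    return sum(votes) / len(votes) > threshold


# Rule-based stand-ins for real model-backed judges (assumptions for the sketch).
def cites_source(question: str, answer: str) -> bool:
    return "source:" in answer.lower()


def is_concise(question: str, answer: str) -> bool:
    return len(answer.split()) <= 50


def is_on_topic(question: str, answer: str) -> bool:
    # Approve if any longer word from the question appears in the answer.
    keywords = [w for w in question.lower().split() if len(w) > 3]
    return any(w in answer.lower() for w in keywords)


JUDGES = [cites_source, is_concise, is_on_topic]
```

With these stand-ins, an answer that is short, on topic, and cites a source passes, while an off-topic answer that only satisfies the length check is rejected by two of the three judges.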

Ensuring Reliability in AI Workflows

Incorporating determinism into AI workflows is essential for enterprise applications. Jeremy Edberg, CEO of DBOS, emphasizes the importance of durable execution technologies that save progress within workflows, thereby preventing costly failures. “We’ve always had a cost to downtime, right? Now, though, it’s getting much more important because AI is non-deterministic,” he says.

Qian Li, cofounder of DBOS, advocates for checkpointing applications to ensure progress is saved, reducing the need for repeated prompts and minimizing the risk of varied responses.
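The checkpointing idea can be illustrated with a small sketch. This is not the DBOS API: `CheckpointedWorkflow`, its JSON file format, and the step names are hypothetical, and the sketch assumes every step returns a string. Each step's result is written to disk as soon as it completes, so a restarted workflow reuses the saved response instead of prompting the model again and possibly getting a different answer.

```python
import json
from pathlib import Path
from typing import Callable


class CheckpointedWorkflow:
    """Stores each step's result in a JSON file so completed steps are not re-run."""

    def __init__(self, path: Path):
        self.path = path
        # Load earlier progress if a checkpoint file already exists.
        self.state = json.loads(path.read_text()) if path.exists() else {}

    def step(self, name: str, fn: Callable[[], str]) -> str:
        # A step that already completed returns its saved result; fn is not called.
        if name in self.state:
            return self.state[name]
        result = fn()
        self.state[name] = result
        # Persist immediately so a crash after this point does not lose the result.
        self.path.write_text(json.dumps(self.state))
        return result
```

A caller might run `wf.step("summarize", lambda: call_llm(prompt))`, where `call_llm` is whatever model client the application uses. If the process crashes after that step and restarts with the same checkpoint file, the summary comes from disk rather than from a fresh model call.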

Ultimately, while LLMs offer powerful capabilities, they also introduce complexity and risk. For personal projects, the non-determinism of AI can be intriguing and even delightful. However, for enterprise software, reliability and trust are non-negotiable. As Raj Patel, AI transformation lead at Holistic AI, aptly puts it, “Trust is key. I think trust takes years to build, seconds to break, and then a fair bit to recover.”
