AI Model Reliability

Reliability can have many meanings. Here I use it to mean the ability of an AI model, given any input or prompt, to generate an output that meets our definition of success.

Success, in turn, is expressed as a reward function: the model succeeds to the extent that it maximizes that function.

This forces us to define the variables of that reward function so that it can be maximized. Hence, we need to define the KPIs to focus on in order to improve the reliability of the model itself.
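As a minimal sketch of this idea, the reward can be written as a weighted sum of KPI scores. The KPI names (`accuracy`, `format_validity`, `latency_score`) and their weights are hypothetical placeholders chosen for illustration, not the KPIs of any particular model.

```python
from typing import Dict

# Hypothetical KPI weights (an assumption for illustration); they sum to 1.
KPI_WEIGHTS: Dict[str, float] = {
    "accuracy": 0.6,
    "format_validity": 0.3,
    "latency_score": 0.1,
}


def reward(kpi_scores: Dict[str, float],
           weights: Dict[str, float] = KPI_WEIGHTS) -> float:
    """Weighted sum of KPI scores, each expected to lie in [0, 1]."""
    missing = set(weights) - set(kpi_scores)
    if missing:
        raise ValueError(f"missing KPI scores: {sorted(missing)}")
    return sum(weights[name] * kpi_scores[name] for name in weights)


# Example: a perfect answer in the right format that was slow to produce.
example = {"accuracy": 1.0, "format_validity": 1.0, "latency_score": 0.5}
print(round(reward(example), 2))
```

The weights are where the business decides which KPI matters most; changing them changes what "success" means for the model.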

Maximizing the model's reliability is key to building a model, and a business, that can be distributed at scale. Whether the business is B2B or B2C, reliability is fundamental to scaling a company's distribution. The goal is to spend time selling the solution without having to worry about the average quality of the model's outputs.

Not all industries need 100% reliability. Some, such as the legal or medical fields, where decisions have a greater impact, need reliability at or close to 100%; after all, they deal with people's lives. Others can afford less reliability today: 85% is acceptable in many industries.

What's next for companies? The foundation model segment is crowded with giants like OpenAI, Google, Anthropic, and Meta, and the money and scale needed to train models properly are out of reach for most businesses. Companies should therefore focus on the reliability of their outputs by running tests and evals at scale. As a matter of fact, companies in YC today mainly spend their time doing exactly that.
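Running evals in this sense can be sketched as sending a set of test prompts through the model and reporting the fraction of outputs that pass a success check. In the sketch below, `run_model`, the test cases, and the `TARGET` value are hypothetical stand-ins; the 0.85 target simply echoes the 85% figure used above as an example.

```python
from typing import Callable, List, Tuple

# A test case pairs a prompt with a check that decides whether the output counts as success.
Case = Tuple[str, Callable[[str], bool]]


def reliability(model: Callable[[str], str], cases: List[Case]) -> float:
    """Fraction of test cases whose output passes its success check."""
    if not cases:
        raise ValueError("need at least one test case")
    passed = sum(1 for prompt, is_success in cases if is_success(model(prompt)))
    return passed / len(cases)


# Hypothetical stand-in for a real model call (an assumption for illustration).
def run_model(prompt: str) -> str:
    return prompt.upper()


cases: List[Case] = [
    ("hello", lambda out: out == "HELLO"),
    ("abc", lambda out: out.isupper()),
    ("123", lambda out: out.isupper()),  # fails: digits have no uppercase form
    ("ok", lambda out: len(out) == 2),
]

TARGET = 0.85  # hypothetical reliability target
score = reliability(run_model, cases)
print(f"reliability={score:.2f}, meets target: {score >= TARGET}")
```

In practice the test set would be far larger and the checks would encode the KPIs the business cares about, but the shape of the loop stays the same.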

Along with distribution, this is what companies should concentrate on today.
