Model-based Testing Examples

iPhone 17e vs. iPhone 17: I compared the two models to decide which has the better value

Apple's new iPhone 17e is shaping up to be a great midrange device, but how does it stack up against the base iPhone 17?

Communications of the ACM

Measuring What Matters in Large Language Model Performance

As large language models (LLMs) gain momentum worldwide, there’s a growing need for reliable ways to measure their performance. Benchmarks that evaluate LLM outputs allow developers to track ...

Search Engine Land

OpenAI starts testing ChatGPT ads

OpenAI confirmed today that it’s rolling out its first live test of ads in ChatGPT, showing sponsored messages directly inside the app for select users. The details. The ads will appear in a clearly ...

marktechpost

A Coding Implementation to Establish Rigorous Prompt Versioning and Regression Testing Workflows for Large Language Models using MLflow

In this tutorial, we show how we treat prompts as first-class, versioned artifacts and apply rigorous regression testing to large language model behavior using MLflow. We design an evaluation pipeline ...

ministryoftesting.com

The future of testing: Autonomous agents, ethical AI, and human oversight

The role of the tester has never been static! From the personal touch of verification to automated regressions, Quality Assurance (QA), and now Quality Engineering, software testing has evolved ...

National Academies of Sciences, Engineering, and Medicine

DOE Should Develop AI-Based Foundation Models Fused with Traditional Computational Methods to Bring Paradigm Shift to Scientific Discovery

WASHINGTON — A new report from the National Academies of Sciences, Engineering, and Medicine examines how the U.S. Department of Energy could use foundation models for scientific research, and finds ...

Skyrocket Your Test Coverage With Model-Based Testing Using TestCompass

Gemini 3 Just Scored 100% On A Critical Test All Other AI Models Fail

Microsoft starts testing AI model that could escalate competition with OpenAI