ARIMA Model in Python with Code Explanation

19 large language models for safety or danger

These new models are specially trained to recognize when an LLM is potentially going off the rails. If they don’t like how an interaction is going, they have the power to stop it. Of course, every ...

Communications of the ACM

Measuring What Matters in Large Language Model Performance

As large language models (LLMs) gain momentum worldwide, there’s a growing need for reliable ways to measure their performance. Benchmarks that evaluate LLM outputs allow developers to track ...
