Model Evaluation

News

Your AI models are failing in production—Here’s how to fix model selection - VentureBeat

Super excited that our second reward model evaluation is out. It's substantially harder, much cleaner, and well correlated with downstream PPO/BoN sampling. Happy hillclimbing!

