Viewing Evaluation Results

After the evaluation run completes, the session grid gains evaluator-specific columns that display the average score for each session or trace. Click a score to drill down into detailed results. For example, selecting a Trajectory Evaluation score reveals which paths the agent followed and which it missed.

  • In the Sessions tab, both session-level and trace-level evaluation results are visible.

  • In the Traces tab, only trace-level results are shown.

These results help you understand how agents behave, making it easier to debug issues, check quality, and improve performance based on real usage.
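
To make the averaged columns more concrete, the sketch below shows one way per-evaluator scores could roll up into trace-level and session-level averages. The record layout, the field names (`session_id`, `trace_id`, `evaluator`, `score`), and the `average_scores` helper are all hypothetical and chosen for illustration; they are not the product's data model or API, and the product may compute session-level results differently.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical evaluation records: one row per (session, trace, evaluator) score.
# These field names are assumptions, not an actual export format.
records = [
    {"session_id": "s1", "trace_id": "t1", "evaluator": "trajectory", "score": 1.0},
    {"session_id": "s1", "trace_id": "t2", "evaluator": "trajectory", "score": 0.5},
    {"session_id": "s2", "trace_id": "t3", "evaluator": "trajectory", "score": 0.0},
]


def average_scores(rows, key):
    """Average scores per (rows[key], evaluator) pair."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row[key], row["evaluator"])].append(row["score"])
    return {group: mean(scores) for group, scores in grouped.items()}


# One average per session and evaluator, as in a session-level column.
session_avg = average_scores(records, "session_id")

# One average per trace and evaluator, as in a trace-level column.
trace_avg = average_scores(records, "trace_id")
```

In this sketch, a low session average with one low trace average underneath it points to the specific trace worth opening first.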