Stability
SDK Usage
Learn how to evaluate this metric programmatically in the Trustwise SDK Documentation.
The Trustwise Stability metric measures how similar an AI agent's responses are when given the same or similar inputs multiple times. It gives higher scores when responses stay consistent, even if asked by different personas or worded differently. This helps identify if an agent changes its answers unexpectedly.
FAQs
Why is Stability important?
If your agent gives inconsistent answers to reworded or similar questions, users may lose confidence in its reliability. Stability helps identify such inconsistencies.
How many responses should I provide?
At least two semantically similar responses are required.