Usage ================= This section provides comprehensive documentation on how to use the Trustwise SDK, including basic setup, configuration, and detailed examples for evaluating different metrics. .. note:: For understanding the concepts of Trustwise metrics, please refer to the `Trustwise Metrics Documentation `_. Basic Setup ----------- Here's how to set up and configure the Trustwise SDK: .. code-block:: python import os from trustwise.sdk import TrustwiseSDK from trustwise.sdk.config import TrustwiseConfig # Configure using environment variables os.environ["TW_API_KEY"] = "your-api-key" os.environ["TW_BASE_URL"] = "https://api.trustwise.ai" config = TrustwiseConfig() # Or configure directly # config = TrustwiseConfig( # api_key="your-api-key", # base_url="https://api.trustwise.ai" # ) # Initialize the SDK trustwise = TrustwiseSDK(config) Evaluating Metrics ------------------ Below are example requests and responses for evaluating different metrics using the SDK. Each metric exposes an `evaluate()` endpoint. We will use faithfulness as an example. Faithfulness Metric ~~~~~~~~~~~~~~~~~~~ Copy and run this example to evaluate faithfulness: .. code-block:: python # Evaluate faithfulness faithfulness_result = trustwise.metrics.faithfulness.evaluate( query="What is the capital of France?", response="The capital of France is Paris.", context=[ {"node_id": "1", "node_score": 1.0, "node_text": "Paris is the capital of France."} ] ) print(faithfulness_result) **Output:** .. code-block:: text score=99.971924 facts=[Fact(statement='The capital of France is Paris.', label='Safe', prob=0.9997192, sentence_span=[0, 30])] Stability Metric ~~~~~~~~~~~~~~~~ Copy and run this example to evaluate stability: .. code-block:: python # Evaluate stability stability_result = trustwise.metrics.stability.evaluate( responses=[ "The capital of France is Paris.", "Paris is the capital of France.", "The capital of France is Paris." ] ) print(stability_result) **Output:** .. code-block:: text min=80 avg=87 .. note:: The :class:`~trustwise.sdk.types.StabilityResponse` provides both direct string representation and JSON serialization methods for flexible integration with different workflows. Refusal Metric ~~~~~~~~~~~~~~ Copy and run this example to evaluate refusal: .. code-block:: python # Evaluate refusal refusal_result = trustwise.metrics.refusal.evaluate( query="What is the capital of France?", response="The capital of France is Paris." ) print(refusal_result) **Output:** .. code-block:: text score=5 .. note:: The :class:`~trustwise.sdk.types.RefusalResponse` provides both direct string representation and JSON serialization methods for flexible integration with different workflows. Completion Metric ~~~~~~~~~~~~~~~~~ Copy and run this example to evaluate completion: .. code-block:: python # Evaluate completion completion_result = trustwise.metrics.completion.evaluate( query="What is the capital of France?", response="The capital of France is Paris." ) print(completion_result) **Output:** .. code-block:: text score=99 .. note:: The :class:`~trustwise.sdk.types.CompletionResponse` provides both direct string representation and JSON serialization methods for flexible integration with different workflows. Adherence Metric ~~~~~~~~~~~~~~~~ Copy and run this example to evaluate adherence: .. code-block:: python # Evaluate adherence adherence_result = trustwise.metrics.adherence.evaluate( policy="Always answer in French.", response="La capitale de la France est Paris." ) print(adherence_result) **Output:** .. code-block:: text score=95 .. note:: The :class:`~trustwise.sdk.types.AdherenceResponse` provides both direct string representation and JSON serialization methods for flexible integration with different workflows. Async Usage ----------- The SDK provides a fully asynchronous interface for high-throughput or concurrent evaluation. The async API mirrors the synchronous API, but requires `await` and an event loop. .. code-block:: python import asyncio from trustwise.sdk import TrustwiseSDKAsync from trustwise.sdk.config import TrustwiseConfig async def main(): config = TrustwiseConfig() trustwise = TrustwiseSDKAsync(config) result = await trustwise.metrics.faithfulness.evaluate( query="What is the capital of France?", response="The capital of France is Paris.", context=[{"node_id": "1", "node_score": 1.0, "node_text": "Paris is the capital of France."}] ) print(result) asyncio.run(main()) .. note:: All metrics are available in both synchronous and asynchronous forms. For async, use ``TrustwiseSDKAsync`` and ``await`` the ``evaluate()`` methods. Working with JSON Output ------------------------ .. code-block:: python # Convert to JSON format for serialization json_output = faithfulness_result.to_json(indent=2) print(json_output) **Output:** .. code-block:: json { "score": 99.971924, "facts": [ { "statement": "The capital of France is Paris.", "label": "Safe", "prob": 0.9997192, "sentence_span": [0, 30] } ] } Working with Python Dict Output ------------------------------- .. code-block:: python # Convert to Python dict for programmatic access dict_output = faithfulness_result.to_dict() print(dict_output) **Output:** .. code-block:: python {'score': 99.971924, 'facts': [{'statement': 'The capital of France is Paris.', 'label': 'Safe', 'prob': 0.9997192, 'sentence_span': [0, 30]}]} Working with Result Properties ------------------------------ .. code-block:: python # Access individual properties print(f"Faithfulness score: {faithfulness_result.score}") for fact in faithfulness_result.facts: print(f"Fact: {fact.statement}, Label: {fact.label}, Probability: {fact.prob}, Span: {fact.sentence_span}") **Output:** .. code-block:: text Faithfulness score: 99.971924 Fact: The capital of France is Paris., Label: Safe, Probability: 0.9997192, Span: [0, 30] .. note:: The :class:`~trustwise.sdk.types.FaithfulnessResponse` (and all other response objects) provides both direct string representation and JSON serialization methods for flexible integration with different workflows. Guardrails (Experimental) ------------------------- .. code-block:: python # Create a multi-metric guardrail guardrail = trustwise.guardrails( thresholds={ "faithfulness": 0.8, "answer_relevancy": 0.7, "clarity": 0.7 }, block_on_failure=True ) # Evaluate with multiple metrics evaluation = guardrail.evaluate( query="What is the capital of France?", response="The capital of France is Paris.", context=[ {"node_id": "1", "node_score": 1.0, "node_text": "Paris is the capital of France."} ] ) print(evaluation) **Output:** .. code-block:: console passed=True blocked=False results={'faithfulness': {'passed': True, 'result': FaithfulnessResponse(score=99.971924, facts=[Fact(statement='The capital of France is Paris.', label='Safe', prob=0.9997192, sentence_span=[0, 30])])}, 'answer_relevancy': {'passed': True, 'result': AnswerRelevancyResponse(score=96.38003, generated_question='What is the capital of France?')}, 'clarity': {'passed': True, 'result': ClarityResponse(score=73.84502)}} **JSON Response:** .. code-block:: python json_output = evaluation.to_json(indent=2) print(json_output) **Output:** .. code-block:: json { "passed": true, "blocked": false, "results": { "faithfulness": { "passed": true, "result": { "score": 99.971924, "facts": [ { "statement": "The capital of France is Paris.", "label": "Safe", "prob": 0.9997192, "sentence_span": [ 0, 30 ] } ] } }, "answer_relevancy": { "passed": true, "result": { "score": 96.38003, "generated_question": "What is the capital of France?" } }, "clarity": { "passed": true, "result": { "score": 73.84502 } } } }