Summarization
The Summarization metric judges whether a response is supported by the provided context. The response is split into sentences, each sentence is checked against the context, and the per-sentence scores are aggregated into a single Summarization score.
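The split, check, and aggregate flow can be sketched in a few lines of Python. This is only an illustration, not the metric's actual implementation: the function names are hypothetical, the token-overlap check is a toy stand-in for whatever check the metric really performs, and averaging is an assumed aggregation rule, since the text above does not say how scores are combined.

```python
import re
from statistics import mean


def split_sentences(text: str) -> list[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def _tokens(text: str) -> set[str]:
    # Lowercased alphanumeric tokens, used only by the toy check below.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(sentence: str, context: str) -> float:
    # Toy stand-in for the real support check (assumption):
    # the fraction of the sentence's tokens that also appear in the context.
    sentence_tokens = _tokens(sentence)
    if not sentence_tokens:
        return 0.0
    return len(sentence_tokens & _tokens(context)) / len(sentence_tokens)


def summarization_score(response: str, context: str) -> float:
    # Score each sentence against the context, then aggregate.
    # The mean is an assumed aggregation rule.
    sentences = split_sentences(response)
    if not sentences:
        return 0.0
    return mean(support_score(s, context) for s in sentences)


context = "The Eiffel Tower is in Paris. It was completed in 1889."
response = "The Eiffel Tower is in Paris. It was built by aliens."
print(summarization_score(response, context))
```

In this toy example the first sentence is fully covered by the context and the second only partly, so the aggregated score lands between the two per-sentence scores.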
FAQs
How do Faithfulness and Summarization differ?
Both metrics judge whether a response is supported by the provided context; they differ in granularity. Faithfulness verifies the response at the 'statement level', which is very fine-grained. In contrast, Summarization works at the 'sentence level', which is coarser but faster.
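The granularity difference can be seen by counting the units each approach would have to check. The sketch below is a rough illustration under stated assumptions: the function names are hypothetical, and splitting sentences on "and" or ";" is a crude stand-in for statement extraction, which the text above does not describe.

```python
import re


def sentence_units(text: str) -> list[str]:
    # Sentence-level units: naive split after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def statement_units(text: str) -> list[str]:
    # Statement-level units (hypothetical stand-in): break each sentence
    # into clauses on ";" or the word "and".
    units = []
    for sentence in sentence_units(text):
        clauses = re.split(r"\s*;\s*|\s+and\s+", sentence.rstrip(".!?"))
        units.extend(c for c in clauses if c)
    return units


response = "The tower is in Paris and it opened in 1889. It is made of iron."
print(len(sentence_units(response)), "sentence-level checks")
print(len(statement_units(response)), "statement-level checks")
```

Fewer units mean fewer checks against the context, which is consistent with the sentence-level approach being faster, while the statement-level approach can flag a single unsupported claim inside an otherwise supported sentence.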