Leaderboard

This page tracks benchmark results for HybridRAG-Bench.

Summary

  • The metric columns (EM, F1, Faithfulness) can be adapted to match your final evaluation protocol.

  • Higher is better unless a column explicitly states lower-is-better.

  • Update the table by editing docs/source/_static/leaderboard.csv.
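As a rough illustration of working with the CSV behind the table, the sketch below parses leaderboard rows and reassigns ranks. It assumes the CSV header uses the same column names as the table on this page, and that ranking is by F1 in descending order; both are assumptions, not something this page specifies. The helper names load_rows and rerank are hypothetical.

```python
import csv
import io

# Assumed layout: header names mirror the table columns on this page.
SAMPLE_CSV = """Rank,Date,Model,Method,Dataset,EM,F1,Faithfulness,Notes
1,2026-02-12,Llama-3.3-70B,HybridRAG,movie,0.000,0.000,0.000,placeholder
2,2026-02-12,Llama-3.3-70B,KG-RAG,movie,0.000,0.000,0.000,placeholder
"""


def load_rows(text):
    """Parse leaderboard CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def rerank(rows, key="F1"):
    """Return copies of the rows sorted by `key` (higher first), with Rank renumbered from 1.

    Sorting by F1 is an assumption; swap `key` for whichever metric your protocol ranks on.
    Python's sort is stable, so tied rows keep their original relative order.
    """
    ordered = sorted(rows, key=lambda row: float(row[key]), reverse=True)
    return [{**row, "Rank": str(i)} for i, row in enumerate(ordered, start=1)]


if __name__ == "__main__":
    for row in rerank(load_rows(SAMPLE_CSV)):
        print(row["Rank"], row["Method"], row["F1"])
```

To work against the real file, read docs/source/_static/leaderboard.csv and pass its text to load_rows in place of SAMPLE_CSV.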

Current Results

Rank  Date        Model          Method     Dataset  EM     F1     Faithfulness  Notes
1     2026-02-12  Llama-3.3-70B  HybridRAG  movie    0.000  0.000  0.000         placeholder
2     2026-02-12  Llama-3.3-70B  KG-RAG     movie    0.000  0.000  0.000         placeholder
3     2026-02-12  Llama-3.3-70B  RAG        movie    0.000  0.000  0.000         placeholder
4     2026-02-12  Llama-3.3-70B  IO         movie    0.000  0.000  0.000         placeholder

Submission Format

Use this schema when adding new results:

  • Date: YYYY-MM-DD

  • Model: Model name and size

  • Method: IO, RAG, KG-RAG, HybridRAG, etc.

  • Dataset: Evaluated split/domain

  • EM: Exact match

  • F1: Token-level F1

  • Faithfulness: Attribution/grounding consistency

  • Notes: Optional details (retriever, hops, or config)
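The schema above can be sketched in code as follows. This is a minimal illustration, not the benchmark's official scorer: the token normalization (lowercasing, splitting on non-alphanumeric characters), the assumption that scores lie in [0, 1], and the three-decimal formatting (copied from the table) are all assumptions, and the names tokens, exact_match, token_f1, and make_row are hypothetical.

```python
import re
from collections import Counter
from datetime import date

FIELDS = ["Date", "Model", "Method", "Dataset", "EM", "F1", "Faithfulness", "Notes"]


def tokens(text):
    """Lowercase and split on non-alphanumeric characters (assumed normalization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return 1.0 if tokens(prediction) == tokens(reference) else 0.0


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall, counting repeated tokens."""
    pred, ref = tokens(prediction), tokens(reference)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def make_row(day, model, method, dataset, em, f1, faithfulness, notes=""):
    """Build one submission row, checking the date format and the score range."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", day):
        raise ValueError(f"Date must be YYYY-MM-DD, got {day!r}")
    date.fromisoformat(day)  # raises ValueError for impossible dates such as 2026-02-30
    scores = {"EM": em, "F1": f1, "Faithfulness": faithfulness}
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:  # assumed range; adjust if your metric differs
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    row = {"Date": day, "Model": model, "Method": method, "Dataset": dataset}
    row.update({name: f"{value:.3f}" for name, value in scores.items()})
    row["Notes"] = notes
    return row
```

In practice, EM and F1 would be averaged over every question in the evaluated split before being passed to make_row. The returned dict keeps its keys in the order of FIELDS, so it can be written out with csv.DictWriter.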