# FinanceBench-LLM: Domain-Adapted Financial QA
Built with NVIDIA NIM and NeMo Customizer (LoRA fine-tuning), and evaluated with LLM-as-a-Judge on the FinanceBench dataset.
Powered by NVIDIA NIM | NVIDIA DLI "Evaluation and Light Customization of LLMs" course workflow
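NeMo Customizer trains small low-rank adapters rather than updating the full model weights. As a rough illustration of the LoRA idea only (this is not the NeMo Customizer API, and the matrices below are toy values), the effective weight is the frozen base weight plus a scaled low-rank product:

```python
# LoRA: instead of updating the full weight matrix W (d_out x d_in),
# train two small matrices B (d_out x r) and A (r x d_in) with r << d,
# then serve W_eff = W + (alpha / r) * (B @ A).

def matmul(X, Y):
    """Plain-Python matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_effective_weight(W, A, B, alpha):
    """Merge a LoRA update into a base weight matrix."""
    r = len(A)                  # rank = number of rows of A
    scale = alpha / r
    delta = matmul(B, A)        # (d_out x d_in) low-rank update
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

# Toy example: d_out = d_in = 2, rank r = 1
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[0.5, 0.5]]                # r x d_in
B = [[2.0], [0.0]]              # d_out x r
W_eff = lora_effective_weight(W, A, B, alpha=1.0)
print(W_eff)  # → [[2.0, 1.0], [0.0, 1.0]]
```

Only A and B are trained (2·d·r parameters instead of d², for rank r much smaller than d), which is why the customization step is cheap enough to run per-domain.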
## Sample Questions

Query the LoRA fine-tuned financial QA model directly, or start from one of the example questions below; an SEC filing excerpt can optionally be supplied as context.
| Financial Question | Optional Context (SEC filing excerpt) |
|---|---|
## Full Evaluation: Base vs. ICL vs. LoRA Fine-tuned
| Model | Exact Match | F1 Score | Faithfulness | Correctness | Conciseness | Elo |
|---|---|---|---|---|---|---|
| Base (Llama-3.1-8B) | 0.23 | 0.41 | 3.2 / 5 | 2.8 / 5 | 3.5 / 5 | 835 |
| ICL (5-shot) | 0.34 | 0.56 | 3.9 / 5 | 3.6 / 5 | 3.8 / 5 | 1023 |
| LoRA Fine-tuned | 0.52 | 0.71 | 4.4 / 5 | 4.2 / 5 | 4.1 / 5 | 1142 |
### Key Findings
- LoRA fine-tuning achieves the best results on every metric (+126% Exact Match over base, 0.23 → 0.52)
- ICL (5-shot) delivers a sizable improvement at zero training cost (+48% Exact Match, 0.23 → 0.34)
- Correctness (2.8 → 4.2) and Faithfulness (3.2 → 4.4) show the largest judge-score gaps between the base and LoRA models
- Elo ratings from 1,000 pairwise comparisons confirm the ordering LoRA > ICL > Base
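The Elo ranking above works like chess ratings: each pairwise judge verdict nudges the two models' scores toward the observed outcome. A minimal sketch with K-factor 32; the starting ratings and the three outcomes below are hypothetical, not the project's actual comparison log:

```python
K = 32  # K-factor: maximum rating change per comparison

def expected(r_a, r_b):
    """Expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a, r_b, score_a):
    """score_a: 1.0 if A wins, 0.5 for a tie, 0.0 if A loses."""
    e_a = expected(r_a, r_b)
    return (r_a + K * (score_a - e_a),
            r_b + K * ((1.0 - score_a) - (1.0 - e_a)))

# Hypothetical comparison outcomes (winner, loser); all models start at 1000.
ratings = {"base": 1000.0, "icl": 1000.0, "lora": 1000.0}
for winner, loser in [("lora", "base"), ("lora", "icl"), ("icl", "base")]:
    ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser], 1.0)
print(ratings)  # lora ends highest, base lowest
```

With K = 32, an upset win against an evenly rated opponent moves each rating by 16 points; over many comparisons the ratings converge to a stable ordering.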
### Methodology
- Automated metrics: Exact Match and token-level F1 (SQuAD-style answer matching)
- LLM-as-a-Judge: Llama-3.1-70B evaluates correctness, faithfulness, and conciseness (1-5 scale)
- Elo ranking: pairwise comparisons derived from judge scores, updated with K-factor = 32
- Dataset: PatronusAI/financebench (150+ real 10-K/10-Q QA pairs)
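The exact answer normalization used for Exact Match and token-level F1 isn't specified in this README, so the sketch below assumes a common SQuAD-style scheme (lowercasing, light punctuation stripping, whitespace tokenization), keeping `$`, `.`, and `%` since financial answers are often dollar figures and percentages:

```python
import re
from collections import Counter

def normalize(text):
    # Lowercase, drop punctuation except $ . % (common in financial answers),
    # and tokenize on whitespace. This normalization is an assumption.
    return re.sub(r"[^a-z0-9$.%\s]", " ", text.lower()).split()

def exact_match(pred, gold):
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))

def token_f1(pred, gold):
    """Harmonic mean of token-level precision and recall."""
    p, g = normalize(pred), normalize(gold)
    common = Counter(p) & Counter(g)          # per-token overlap counts
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(token_f1("Net revenue was $14.2 billion", "$14.2 billion"))  # ≈ 0.571
```

Token-level F1 gives partial credit when a verbose prediction contains the gold answer, which is why the F1 column in the table above sits well above Exact Match for every model.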
## Side-by-Side: Base vs. ICL vs. LoRA
Enter a question to see how each model configuration responds. Pre-cached comparisons are available for sample questions.
Built with: NVIDIA NIM | NeMo Customizer | Hugging Face Transformers + PEFT | GitHub | NVIDIA DLI Course