

KodeXv0.1: A Family of State-of-the-Art Financial Large Language Models

  • Writer: Kodex AI
  • Aug 28, 2024
  • 1 min read

While powerful, current cutting-edge LLMs may not meet the needs of specialized sectors. We introduce KodeXv0.1, a family of large language models that surpass GPT-4 in financial question answering. We start from the Llama 3.1 8B and 70B variants and adapt them to finance with a custom training regime. We collect publicly available financial documents, such as earnings calls and business reports, and use them to create a high-quality synthetic dataset of Context-Question-Answer triplets that closely mirror real-world tasks. Using this dataset, we perform RAG-aware 4-bit LoRA instruction tuning of the Llama 3.1 base variants to produce KodeX-8Bv0.1 and KodeX-70Bv0.1. We then evaluate the models on FinanceBench, FinQABench, and a withheld test set. Our results show that KodeX-8Bv0.1 is more reliable in financial contexts than other models, surpassing them by up to 9.24% and even outperforming GPT-4 by up to 7.07%. KodeX-70Bv0.1 improves on this further, exceeding GPT-4's performance on every benchmark tested.
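To make the Context-Question-Answer triplet idea concrete, here is a minimal sketch of how one triplet might be turned into a RAG-style instruction-tuning example, where the retrieved context is placed in the prompt and only the answer is the training target. The `Triplet` class, its field names, the prompt template, and `to_training_example` are all hypothetical illustrations; the post does not specify the actual data format or template used for KodeXv0.1.

```python
from dataclasses import dataclass


@dataclass
class Triplet:
    """One Context-Question-Answer example (field names are assumptions)."""
    context: str
    question: str
    answer: str


# Hypothetical prompt template; the real training format is not given in the post.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "### Context\n{context}\n\n"
    "### Question\n{question}\n\n"
    "### Answer\n"
)


def to_training_example(t: Triplet) -> dict:
    """Split a triplet into the prompt the model sees and the completion it learns."""
    prompt = PROMPT_TEMPLATE.format(
        context=t.context.strip(),
        question=t.question.strip(),
    )
    return {"prompt": prompt, "completion": t.answer.strip()}


# Placeholder text only; no real financial data is used here.
example = to_training_example(
    Triplet(
        context="Placeholder excerpt from an earnings call transcript.",
        question="Placeholder question about the excerpt.",
        answer="Placeholder answer grounded in the excerpt.",
    )
)
print(example["prompt"] + example["completion"])
```

Keeping the prompt and the completion separate is one common design choice, since it lets a fine-tuning pipeline compute the loss on the answer tokens only. Whether KodeXv0.1 was trained this way is not stated.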


Co-Authors

Neel Rajani, PhD Candidate in Responsible NLP, University of Edinburgh; BSc in Computing Science, University of Glasgow

Lilli Kiessling, MSc Candidate in Computational Neuroscience, Bernstein Center for Computational Neuroscience (BCCN) & Technische Universität Berlin; BSc in Physics, Technische Universität Berlin

Aleksandr Ogaltsov, Data Scientist (AI & ML), Kodex AI

Claus Lang, CTO & Co-Founder, Kodex AI



