Thursday, March 27, 2025

Fin-R1: Using Large Language Models For Financial Reasoning

Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning.

Fin-R1, a large language model for complex financial reasoning, was created and released publicly by FinStep.AI and the SUFE-AIFLM-Lab at the School of Statistics and Data Science, Shanghai University of Finance and Economics. It is based on Qwen2.5-7B-Instruct and fine-tuned on high-quality, verifiable financial questions, attaining state-of-the-art (SOTA) performance on several financial benchmarks.

Model Applications  

Fin-R1 is a large language model (LLM) with a lightweight 7B-parameter architecture, created especially for financial reasoning. The model goes through a two-stage training procedure, Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL), on high-quality chain-of-thought data designed specifically for financial reasoning scenarios, while drastically lowering deployment costs compared with larger models.
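Since Fin-R1 is a standard 7B causal language model, it can presumably be served with the Hugging Face transformers library. The minimal sketch below assumes a repository id of SUFE-AIFLM-Lab/Fin-R1; verify the exact id on the project's GitHub or Hugging Face page before use.

```python
# Minimal inference sketch; the repo id is an assumption, not confirmed here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SUFE-AIFLM-Lab/Fin-R1"  # hypothetical id; check the project page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "A bond pays a 5% annual coupon on a 1000 face value and matures in 3 years. If the yield is 4%, what is its price?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate the chain-of-thought answer and decode only the new tokens.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```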

[Figure: Fin-R1 model applications. Image credit: GitHub]

This approach markedly improves the model's capacity to carry out intricate financial reasoning, providing a strong basis of theoretical support, business rules, decision logic, and technical execution for financial applications. Consequently, Fin-R1 offers robust support for core financial business scenarios in trusts, securities, banking, and insurance.

Financial Code  

Financial code is programming code used in the financial industry for a variety of financial models, algorithms, and analytical tasks, ranging from basic financial computations to intricate derivatives pricing, risk assessment, and portfolio optimization.
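As a concrete, purely illustrative example of financial code (not part of Fin-R1 itself), a Black-Scholes pricer for a European call option fits in a few lines:

```python
from math import exp, log, sqrt
from statistics import NormalDist

def black_scholes_call(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call option.
    S: spot price, K: strike, T: years to expiry,
    r: risk-free rate, sigma: annualized volatility."""
    N = NormalDist().cdf  # standard normal CDF
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)

# Example: spot 100, strike 105, 1 year to expiry, 3% rate, 20% volatility.
print(round(black_scholes_call(100, 105, 1.0, 0.03, 0.20), 2))
```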

Financial Calculations

Financial calculations entail the quantitative analysis and computation of a variety of financial problems using numerical methods and mathematical models, addressing real-world financial questions and providing a scientific foundation for financial decisions.
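For instance, a net present value (NPV) calculation discounts a stream of future cash flows at a required rate of return; a minimal illustration:

```python
def npv(rate: float, cashflows: list[float]) -> float:
    """Net present value: cashflows[0] occurs today,
    cashflows[t] occurs t periods from now."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# Invest 1000 today; receive 400 per year for 3 years; discount at 8%.
print(round(npv(0.08, [-1000, 400, 400, 400]), 2))  # ≈ 30.84, so the project adds value
```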

Financial Security, Compliance

Financial security and compliance focus on preventing financial crimes and guaranteeing regulatory compliance, and also assist businesses in setting up strong compliance management systems.

Intelligent Risk Control

Compared to conventional approaches, intelligent risk control offers greater efficiency, accuracy, and real-time capabilities by identifying and managing financial risks using AI and big data.

ESG Analysis

To ensure that investments yield financial returns while fostering sustainable development, ESG analysis assesses a company’s environmental, social, and governance performance.

Overall Workflow  

The team built a data distillation framework based on DeepSeek-R1, closely adhering to its standard data processing settings. To improve the quality of the financial data, they employed a two-stage data screening process, producing separate SFT and RL datasets. They then trained the financial reasoning model Fin-R1 from Qwen2.5-7B-Instruct with supervised fine-tuning (SFT) and reinforcement learning (GRPO), improving accuracy and generalization on financial reasoning tasks.

[Figure: Overall workflow of the financial reasoning model Fin-R1. Image credit: GitHub]

Data Construction

The team used DeepSeek-R1 (full version) to distill and screen several datasets (FinCorpus, Ant_Finance, FinPEE, FinCUGE, FinanceIQ, Finance-Instruct-500K, FinQA, TFNS, ConvFinQA, and FinanceQT) in order to transfer DeepSeek-R1's reasoning capabilities to financial scenarios and meet the need for high-quality financial reasoning data.

This led to the creation of Fin-R1-Data, a high-quality chain-of-thought (CoT) dataset with over 60k items that covers multifaceted financial knowledge in both Chinese and English, separated into four modules to accommodate different core financial scenarios. In a novel dual-round scoring system for reasoning chains, the team first evaluated answer correctness using rule matching and Qwen2.5-72B-Instruct, and then assessed the consistency of the reasoning logic and the compliance of financial terminology.
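A minimal sketch of such a dual-round filter is shown below; the judge_llm helper, prompts, and rule-matching logic are hypothetical stand-ins for the scoring setup described above, not the exact pipeline used to build Fin-R1-Data.

```python
import re

def rule_match(answer: str, gold: str) -> bool:
    """Round 1a: cheap rule-based check, e.g., comparing extracted numbers."""
    nums = lambda s: re.findall(r"-?\d+(?:\.\d+)?", s)
    return bool(nums(gold)) and nums(answer) == nums(gold)

def dual_round_filter(item: dict, judge_llm) -> bool:
    """Keep a distilled CoT sample only if the answer is correct (round 1)
    and the reasoning chain is logically consistent with compliant
    terminology (round 2). `judge_llm` is a hypothetical callable that
    sends a prompt to a scoring model (e.g., Qwen2.5-72B-Instruct)
    and returns its text reply."""
    if not rule_match(item["answer"], item["gold"]):
        # Fall back to an LLM check for correct but non-literal answers.
        verdict = judge_llm(
            f"Question: {item['question']}\nGold: {item['gold']}\n"
            f"Answer: {item['answer']}\nIs the answer correct? yes/no"
        )
        if "yes" not in verdict.lower():
            return False
    verdict = judge_llm(
        f"Reasoning chain:\n{item['cot']}\n"
        "Is the reasoning logically consistent and is financial "
        "terminology used correctly? yes/no"
    )
    return "yes" in verdict.lower()
```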

Fin-R1-Data Data Distribution

The four modules of Fin-R1-Data, financial code, financial knowledge, non-reasoning business knowledge, and reasoning business knowledge, cover multifaceted financial competence in both Chinese and English and support basic banking, securities, and trust scenarios.

Fine-tuning and Training

Two-Step Procedure

Through two-stage fine-tuning of Qwen2.5-7B-Instruct, the team created the financial reasoning large language model Fin-R1 for complicated reasoning tasks in the financial arena. Using high-quality financial reasoning data, they first built up the model's preliminary financial reasoning capabilities through Supervised Fine-Tuning (SFT). Then, using reinforcement learning based on the GRPO (Group Relative Policy Optimization) algorithm with combined format and accuracy rewards, they significantly enhanced the model's accuracy and generalization on financial reasoning tasks.

  • Step One: Integration of Reasoning Skills

The team used the financial datasets ConvFinQA and FinQA to perform supervised fine-tuning on Qwen2.5-7B-Instruct, targeting complicated reasoning in financial tasks. One iteration of fine-tuning fixed the problem of general-purpose models giving incorrect answers on financial reasoning tasks, guaranteeing that the model comprehends and manages intricate financial reasoning problems (a condensed SFT setup is sketched after this list).

  • Step Two: Optimization of Reinforcement Learning

After giving the model sophisticated reasoning abilities, the team used the GRPO algorithm as the fundamental framework, optimizing output format and accuracy through a dual-reward system. Furthermore, to reduce the possibility of bias in regex-based rewards, the developers implemented a Model-Based Verifier that uses Qwen2.5-Max for response evaluation. This produces reward signals that are more accurate and consistent, improving the efficacy and stability of reinforcement learning.
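A condensed version of Step One, using the Hugging Face TRL library, might look like the sketch below; the dataset file, chat format, and hyperparameters are illustrative assumptions, not the authors' published recipe.

```python
# SFT sketch with Hugging Face TRL; paths and hyperparameters are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file of chat-formatted CoT examples distilled from
# FinQA / ConvFinQA, e.g. {"messages": [{"role": "user", ...}, ...]}.
train_dataset = load_dataset("json", data_files="fin_cot_sft.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    train_dataset=train_dataset,
    args=SFTConfig(
        output_dir="fin-r1-sft",
        num_train_epochs=1,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
    ),
)
trainer.train()
```

For Step Two, the dual-reward scheme with a model-based fallback verifier can be sketched as follows; the think/answer tag format, reward weights, and the `verifier` callable (standing in for a Qwen2.5-Max judge) are assumptions, not the confirmed configuration.

```python
import re

# Expected output template: reasoning inside <think>, final answer inside <answer>.
THINK_ANSWER = re.compile(
    r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)

def format_reward(completion: str) -> float:
    """1.0 if the output follows the think/answer template, else 0.0."""
    return 1.0 if THINK_ANSWER.match(completion.strip()) else 0.0

def accuracy_reward(completion: str, gold: str, verifier) -> float:
    """Exact match first; fall back to a model-based verifier to catch
    semantically correct answers that a regex comparison would miss."""
    m = THINK_ANSWER.match(completion.strip())
    if m is None:
        return 0.0
    answer = m.group(1).strip()
    if answer == gold.strip():
        return 1.0
    # `verifier` is a hypothetical callable wrapping an LLM judge
    # (e.g., Qwen2.5-Max) that returns True/False.
    return 1.0 if verifier(answer, gold) else 0.0

def total_reward(completion: str, gold: str, verifier) -> float:
    # Equal weighting of the two rewards is an assumption.
    return format_reward(completion) + accuracy_reward(completion, gold, verifier)
```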

[Figure: Optimization of reinforcement learning. Image credit: GitHub]

Model Evaluation Results

The team evaluated the model on a benchmark covering several different financial scenarios. The findings demonstrate that Fin-R1-SFT, fine-tuned with supervision (SFT) alone, performs better than the base model in financial scenarios but still trails DeepSeek-R1.

Therefore, the team used reinforcement learning (RL) to further train Fin-R1-SFT. With only 7B lightweight parameters, the resulting Fin-R1 performs remarkably well, placing second with an average score of 75.2 and outperforming all other same-scale models. It is just 3.0 points behind DeepSeek-R1 and 6.0 points ahead of the 70B-parameter DeepSeek-R1-Distill-Llama-70B (69.2). Furthermore, Fin-R1 tops two important tests, FinQA (76.0) and ConvFinQA (85.0), exhibiting superior skills in both financial reasoning and non-reasoning contexts.

| Model | Parameters | FinQA | ConvFinQA | Ant_Finance | TFNS | Finance-Instruct-500k | Average |
|---|---|---|---|---|---|---|---|
| DeepSeek-R1 | 671B | 71.0 | 82.0 | 90.0 | 78.0 | 70.0 | 78.2 |
| Fin-R1 | 7B | 76.0 | 85.0 | 81.0 | 71.0 | 62.9 | 75.2 |
| Qwen-2.5-32B-Instruct | 32B | 72.0 | 78.0 | 84.0 | 77.0 | 58.0 | 73.8 |
| DeepSeek-R1-Distill-Qwen-32B | 32B | 70.0 | 72.0 | 87.0 | 79.0 | 54.0 | 72.4 |
| Fin-R1-SFT | 7B | 73.0 | 81.0 | 76.0 | 68.0 | 61.0 | 71.9 |
| Qwen-2.5-14B-Instruct | 14B | 68.0 | 77.0 | 84.0 | 72.0 | 56.0 | 71.4 |
| DeepSeek-R1-Distill-Llama-70B | 70B | 68.0 | 74.0 | 84.0 | 62.0 | 56.0 | 69.2 |
| DeepSeek-R1-Distill-Qwen-14B | 14B | 62.0 | 73.0 | 82.0 | 65.0 | 49.0 | 66.2 |
| Qwen-2.5-7B-Instruct | 7B | 60.0 | 66.0 | 85.0 | 68.0 | 49.0 | 65.6 |
| DeepSeek-R1-Distill-Qwen-7B | 7B | 55.0 | 62.0 | 71.0 | 60.0 | 42.0 | 58.0 |
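The Average column is the unweighted mean of the five benchmark scores; for example, for Fin-R1's row:

```python
# Unweighted mean of Fin-R1's scores on the five benchmarks in the table:
# FinQA, ConvFinQA, Ant_Finance, TFNS, Finance-Instruct-500k.
scores = [76.0, 85.0, 81.0, 71.0, 62.9]
print(round(sum(scores) / len(scores), 1))  # 75.2
```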