Thursday, March 27, 2025

Fin-R1: Using Large Language Models For Financial Reasoning

Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning.

Fin-R1, a large language model for complex financial reasoning, was created and released publicly by FinStep.AI and the SUFE-AIFLM-Lab at the School of Statistics and Data Science, Shanghai University of Finance and Economics. It is based on Qwen2.5-7B-Instruct and fine-tuned on high-quality, verifiable financial questions, attaining state-of-the-art (SOTA) performance on several financial benchmarks.

Model Applications  

Fin-R1 is a large language model (LLM) with a lightweight 7B-parameter architecture, created especially for financial reasoning. The model goes through a two-stage training procedure, Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL), on high-quality chain-of-thought data designed specifically for financial reasoning scenarios, while drastically lowering deployment costs compared with larger models.
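Since Fin-R1 is a standard 7B causal language model, it can presumably be served with the Hugging Face transformers library. The minimal sketch below assumes a repository id of SUFE-AIFLM-Lab/Fin-R1; verify the exact id on the project's GitHub or Hugging Face page before use.

```python
# Minimal inference sketch; the repo id is an assumption, not confirmed here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SUFE-AIFLM-Lab/Fin-R1"  # hypothetical id; check the project page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "A bond pays a 5% annual coupon on a 1000 face value and matures in 3 years. If the yield is 4%, what is its price?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate the chain-of-thought answer and decode only the new tokens.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```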

[Figure: Fin-R1 model applications. Image credit: GitHub]

This approach markedly improves the model's capacity to carry out intricate financial reasoning, providing a strong basis of theoretical support, business rules, decision logic, and technical execution for financial applications. Consequently, Fin-R1 offers robust support for core financial business scenarios in trusts, securities, banking, and insurance.

Financial Code  

Financial code is programming code used in the financial industry for a variety of financial models, algorithms, and analytical tasks, ranging from basic financial computations to intricate derivatives pricing, risk assessment, and portfolio optimization.
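As a concrete, purely illustrative example of financial code (not part of Fin-R1 itself), a Black-Scholes pricer for a European call option fits in a few lines:

```python
from math import exp, log, sqrt
from statistics import NormalDist

def black_scholes_call(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call option.
    S: spot price, K: strike, T: years to expiry,
    r: risk-free rate, sigma: annualized volatility."""
    N = NormalDist().cdf  # standard normal CDF
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)

# Example: spot 100, strike 105, 1 year to expiry, 3% rate, 20% volatility.
print(round(black_scholes_call(100, 105, 1.0, 0.03, 0.20), 2))
```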

Financial Calculations

Financial calculations entail the quantitative analysis and computation of a variety of financial problems using numerical methods and mathematical models, addressing real-world financial questions and providing a scientific foundation for financial decisions.
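For instance, a net present value (NPV) calculation discounts a stream of future cash flows at a required rate of return; a minimal illustration:

```python
def npv(rate: float, cashflows: list[float]) -> float:
    """Net present value: cashflows[0] occurs today,
    cashflows[t] occurs t periods from now."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# Invest 1000 today; receive 400 per year for 3 years; discount at 8%.
print(round(npv(0.08, [-1000, 400, 400, 400]), 2))  # ≈ 30.84, so the project adds value
```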

Financial Security, Compliance

Financial security and compliance focus on preventing financial crimes and guaranteeing regulatory compliance, and also assist businesses in setting up strong compliance management systems.

Intelligent Risk Control

Compared to conventional approaches, intelligent risk control offers greater efficiency, accuracy, and real-time capabilities by identifying and managing financial risks using AI and big data.

ESG Analysis

To ensure that investments yield financial returns while fostering sustainable development, ESG analysis assesses a company’s environmental, social, and governance performance.

Overall Workflow  

The team built a data distillation framework based on DeepSeek-R1, closely adhering to its standard data processing settings. To improve the quality of the financial data, they employed a two-stage data screening process, producing separate SFT and RL datasets. They then trained the financial reasoning model Fin-R1 from Qwen2.5-7B-Instruct with supervised fine-tuning (SFT) and reinforcement learning (GRPO), improving accuracy and generalization on financial reasoning tasks.

[Figure: Overall workflow of the financial reasoning model Fin-R1. Image credit: GitHub]

Data Construction

The team used DeepSeek-R1 (full version) to distill and screen several datasets (FinCorpus, Ant_Finance, FinPEE, FinCUGE, FinanceIQ, Finance-Instruct-500K, FinQA, TFNS, ConvFinQA, and FinanceQT) in order to transfer DeepSeek-R1's reasoning capabilities to financial scenarios and meet the need for high-quality financial reasoning data.

This led to the creation of Fin-R1-Data, a high-quality chain-of-thought (CoT) dataset with over 60k items that covers multifaceted financial knowledge in both Chinese and English, separated into four modules to accommodate different core financial scenarios. In a novel dual-round scoring system for reasoning chains, the team first evaluated answer correctness using rule matching and Qwen2.5-72B-Instruct, and then assessed the consistency of the reasoning logic and the compliance of financial terminology.
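A minimal sketch of such a dual-round filter is shown below; the judge_llm helper, prompts, and rule-matching logic are hypothetical stand-ins for the scoring setup described above, not the exact pipeline used to build Fin-R1-Data.

```python
import re

def rule_match(answer: str, gold: str) -> bool:
    """Round 1a: cheap rule-based check, e.g., comparing extracted numbers."""
    nums = lambda s: re.findall(r"-?\d+(?:\.\d+)?", s)
    return bool(nums(gold)) and nums(answer) == nums(gold)

def dual_round_filter(item: dict, judge_llm) -> bool:
    """Keep a distilled CoT sample only if the answer is correct (round 1)
    and the reasoning chain is logically consistent with compliant
    terminology (round 2). `judge_llm` is a hypothetical callable that
    sends a prompt to a scoring model (e.g., Qwen2.5-72B-Instruct)
    and returns its text reply."""
    if not rule_match(item["answer"], item["gold"]):
        # Fall back to an LLM check for correct but non-literal answers.
        verdict = judge_llm(
            f"Question: {item['question']}\nGold: {item['gold']}\n"
            f"Answer: {item['answer']}\nIs the answer correct? yes/no"
        )
        if "yes" not in verdict.lower():
            return False
    verdict = judge_llm(
        f"Reasoning chain:\n{item['cot']}\n"
        "Is the reasoning logically consistent and is financial "
        "terminology used correctly? yes/no"
    )
    return "yes" in verdict.lower()
```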

Fin-R1-Data Data Distribution

The four modules of Fin-R1-Data, financial code, financial knowledge, non-reasoning business knowledge, and reasoning business knowledge, cover multifaceted financial competence in both Chinese and English and support basic banking, securities, and trust scenarios.

Fine-tuning and Training

Two-Step Procedure

Through two-stage fine-tuning of Qwen2.5-7B-Instruct, the team created the financial reasoning large language model Fin-R1 for complicated reasoning tasks in the financial arena. Using high-quality financial reasoning data, they first built up the model's preliminary financial reasoning capabilities through Supervised Fine-Tuning (SFT). Then, using reinforcement learning based on the GRPO (Group Relative Policy Optimization) algorithm with combined format and accuracy rewards, they significantly enhanced the model's accuracy and generalization on financial reasoning tasks.

  • Step One: Integration of Reasoning Skills

The team used the financial datasets ConvFinQA and FinQA to perform supervised fine-tuning on Qwen2.5-7B-Instruct, targeting complicated reasoning in financial tasks. One iteration of fine-tuning fixed the problem of general-purpose models giving incorrect answers on financial reasoning tasks, guaranteeing that the model comprehends and manages intricate financial reasoning problems (a condensed SFT setup is sketched after this list).

  • Step Two: Optimization of Reinforcement Learning

After giving the model sophisticated reasoning abilities, the team used the GRPO algorithm as the fundamental framework, optimizing output format and accuracy through a dual-reward system. Furthermore, to reduce the possibility of bias in regex-based rewards, the developers implemented a Model-Based Verifier that uses Qwen2.5-Max for response evaluation. This produces reward signals that are more accurate and consistent, improving the efficacy and stability of reinforcement learning.
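A condensed version of Step One, using the Hugging Face TRL library, might look like the sketch below; the dataset file, chat format, and hyperparameters are illustrative assumptions, not the authors' published recipe.

```python
# SFT sketch with Hugging Face TRL; paths and hyperparameters are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file of chat-formatted CoT examples distilled from
# FinQA / ConvFinQA, e.g. {"messages": [{"role": "user", ...}, ...]}.
train_dataset = load_dataset("json", data_files="fin_cot_sft.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    train_dataset=train_dataset,
    args=SFTConfig(
        output_dir="fin-r1-sft",
        num_train_epochs=1,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
    ),
)
trainer.train()
```

For Step Two, the dual-reward scheme with a model-based fallback verifier can be sketched as follows; the think/answer tag format, reward weights, and the `verifier` callable (standing in for a Qwen2.5-Max judge) are assumptions, not the confirmed configuration.

```python
import re

# Expected output template: reasoning inside <think>, final answer inside <answer>.
THINK_ANSWER = re.compile(
    r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)

def format_reward(completion: str) -> float:
    """1.0 if the output follows the think/answer template, else 0.0."""
    return 1.0 if THINK_ANSWER.match(completion.strip()) else 0.0

def accuracy_reward(completion: str, gold: str, verifier) -> float:
    """Exact match first; fall back to a model-based verifier to catch
    semantically correct answers that a regex comparison would miss."""
    m = THINK_ANSWER.match(completion.strip())
    if m is None:
        return 0.0
    answer = m.group(1).strip()
    if answer == gold.strip():
        return 1.0
    # `verifier` is a hypothetical callable wrapping an LLM judge
    # (e.g., Qwen2.5-Max) that returns True/False.
    return 1.0 if verifier(answer, gold) else 0.0

def total_reward(completion: str, gold: str, verifier) -> float:
    # Equal weighting of the two rewards is an assumption.
    return format_reward(completion) + accuracy_reward(completion, gold, verifier)
```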

[Figure: Optimization of reinforcement learning. Image credit: GitHub]

Model Evaluation Results

The team evaluated the model on a benchmark covering several different financial scenarios. The findings demonstrate that Fin-R1-SFT, fine-tuned with supervision (SFT) alone, performs better than the base model in financial scenarios but still trails DeepSeek-R1.

Therefore, the team used reinforcement learning (RL) to further train Fin-R1-SFT. With only 7B lightweight parameters, the resulting Fin-R1 performs remarkably well, placing second with an average score of 75.2 and outperforming all other same-scale models. It is just 3.0 points behind DeepSeek-R1 and 6.0 points ahead of the 70B-parameter DeepSeek-R1-Distill-Llama-70B (69.2). Furthermore, Fin-R1 tops two important tests, FinQA (76.0) and ConvFinQA (85.0), exhibiting superior skills in both financial reasoning and non-reasoning contexts.

| Model | Parameters | FinQA | ConvFinQA | Ant_Finance | TFNS | Finance-Instruct-500k | Average |
|---|---|---|---|---|---|---|---|
| DeepSeek-R1 | 671B | 71.0 | 82.0 | 90.0 | 78.0 | 70.0 | 78.2 |
| Fin-R1 | 7B | 76.0 | 85.0 | 81.0 | 71.0 | 62.9 | 75.2 |
| Qwen-2.5-32B-Instruct | 32B | 72.0 | 78.0 | 84.0 | 77.0 | 58.0 | 73.8 |
| DeepSeek-R1-Distill-Qwen-32B | 32B | 70.0 | 72.0 | 87.0 | 79.0 | 54.0 | 72.4 |
| Fin-R1-SFT | 7B | 73.0 | 81.0 | 76.0 | 68.0 | 61.0 | 71.9 |
| Qwen-2.5-14B-Instruct | 14B | 68.0 | 77.0 | 84.0 | 72.0 | 56.0 | 71.4 |
| DeepSeek-R1-Distill-Llama-70B | 70B | 68.0 | 74.0 | 84.0 | 62.0 | 56.0 | 69.2 |
| DeepSeek-R1-Distill-Qwen-14B | 14B | 62.0 | 73.0 | 82.0 | 65.0 | 49.0 | 66.2 |
| Qwen-2.5-7B-Instruct | 7B | 60.0 | 66.0 | 85.0 | 68.0 | 49.0 | 65.6 |
| DeepSeek-R1-Distill-Qwen-7B | 7B | 55.0 | 62.0 | 71.0 | 60.0 | 42.0 | 58.0 |
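The Average column is the unweighted mean of the five benchmark scores; for example, for Fin-R1's row:

```python
# Unweighted mean of Fin-R1's scores on the five benchmarks in the table:
# FinQA, ConvFinQA, Ant_Finance, TFNS, Finance-Instruct-500k.
scores = [76.0, 85.0, 81.0, 71.0, 62.9]
print(round(sum(scores) / len(scores), 1))  # 75.2
```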