FinBoardBench: Benchmarking Dynamic Wealth Management and Strategic Financial Reasoning of LLMs via Board Game Simulations

Xuesi Hu¹,²,*, Peng Wang¹,*, Jinpeng Miao¹, Xilin Tao¹, Caiwei Li³,
Yue Ma¹, Jie He¹, Qiancheng Zhang², Yuntao Zou⁴, Dagang Li¹,†

¹ Macau University of Science and Technology
² Anhui University
³ University of Macau
⁴ Huazhong University of Science and Technology
* Indicates equal contribution.
† Indicates corresponding author.

arXiv Code (Available Soon)

Abstract

Large language models (LLMs) have recently achieved strong performance on static financial reasoning and simple dynamic trading tasks. However, existing static financial benchmarks are insufficient to assess the dynamic wealth management and financial decision-making capabilities of LLMs in real-world environments. To bridge this gap, we present FinBoardBench, an evaluation suite based on three classic financial board games: Cashflow, Acquire, and Monopoly. FinBoardBench assesses a comprehensive set of financial skills, including personal cash flow management with debt balancing, corporate investment and acquisition forecasting, and competitive trade negotiation with asset auctions. Our experiments with 9 advanced LLMs reveal that, although the models exhibit basic long-term planning and investment logic, they fail to leverage complex interactions for profit, and their strong static reasoning performance does not translate into successful dynamic decision-making. Notably, they tend to prioritize immediate asset acquisition over maintaining sufficient liquidity, which leaves them vulnerable to financial crises triggered by random events. We hope that FinBoardBench can serve as a valuable reference for building more intelligent LLM-based decision-making systems.

Fig. 1. Illustration of FinBoardBench.

Video Demonstration

Selected Clips from Three Games.

Key Conclusion

We evaluate 9 advanced LLMs in FinBoardBench: GPT-5.4, Gemini-3.1-Pro Preview, GLM-5.1, HY3 Preview, DeepSeek V4 Pro, Doubao-2.0-Pro, Kimi K2.6, Qwen-3.6-Plus, and Mimo V2.5 Pro. Our key conclusions are as follows:

LLMs exhibit basic long-term planning and management logic, which is enough to play financial games, but they lack the awareness and ability to use complex interactions to generate profits.
LLMs prioritize immediate asset acquisition over maintaining sufficient liquidity, which leaves them vulnerable to financial crises caused by random events.
Advanced LLMs perform well on static financial benchmarks, but this capability does not translate into effective financial decision-making, and the models consequently fail in dynamic financial games.
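
The liquidity conclusion can be illustrated with a toy simulation. The sketch below is not the benchmark's game logic: the `Player` class, the `play` function, and all numbers (salary, asset price, yield, shock sizes) are hypothetical values chosen only to show how a player who spends down to zero cash can be bankrupted by one unexpected expense, while a player who keeps a cash reserve survives the same event.

```python
from dataclasses import dataclass


@dataclass
class Player:
    cash: float
    passive_income: float = 0.0


def play(reserve, shocks, salary=1000.0, asset_price=2000.0,
         asset_yield=100.0, start_cash=2000.0):
    """Run one toy game and return (survived, player).

    Each turn: collect income, buy assets while cash stays at or above
    `reserve`, then pay that turn's unexpected expense (the "shock").
    All parameters are illustrative assumptions, not benchmark rules.
    """
    player = Player(cash=start_cash)
    for shock in shocks:
        # 1. Collect salary plus income from assets already owned.
        player.cash += salary + player.passive_income
        # 2. Buy assets as long as the reserve would remain intact.
        while player.cash - asset_price >= reserve:
            player.cash -= asset_price
            player.passive_income += asset_yield
        # 3. A random event charges an expense after the purchase.
        player.cash -= shock
        if player.cash < 0:
            return False, player  # bankrupt: cannot cover the expense
    return True, player


shocks = [0.0, 0.0, 1500.0, 0.0]  # one large expense on turn 3

greedy_ok, greedy = play(reserve=0.0, shocks=shocks)
prudent_ok, prudent = play(reserve=1500.0, shocks=shocks)

print("greedy survived:", greedy_ok, "cash:", greedy.cash)
print("prudent survived:", prudent_ok, "cash:", prudent.cash)
```

With these assumed numbers, the greedy player owns more assets by turn 3 but cannot cover the expense, whereas the player holding a reserve absorbs it and keeps playing.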

Examples

Here are game examples in FinBoardBench.

Fig. 2. Cashflow.

Fig. 3. Acquire.

Fig. 4. Monopoly.

BibTeX

@article{
          coming soon...
        }