FinBoardBench: Benchmarking Dynamic Wealth Management and Strategic Financial Reasoning of LLMs via Board Game Simulations
Abstract
Large language models (LLMs) have recently achieved strong performance on static financial reasoning and simple dynamic trading tasks. However, existing static financial benchmarks are insufficient to assess the dynamic wealth management and financial decision-making capabilities of LLMs in real-world environments. To bridge this gap, we present FinBoardBench, an evaluation suite based on three classic financial board games: Cashflow, Acquire, and Monopoly. FinBoardBench assesses a comprehensive set of financial skills, including personal cash flow management with debt balancing, corporate investment and acquisition forecasting, and competitive trade negotiation with asset auctions. Our experiments with 9 advanced LLMs reveal that, although these models exhibit basic long-term planning and investment logic, they fail to leverage complex interactions for profit, and their strong static reasoning performance does not translate into successful dynamic decision-making. Notably, they tend to prioritize immediate asset acquisition over maintaining sufficient liquidity, which leaves them vulnerable to financial crises triggered by random events. We hope that FinBoardBench can serve as a valuable reference for building more intelligent LLM-based decision-making systems.
Fig. 1. Illustration of FinBoardBench.
Key Conclusions
We evaluate 9 advanced LLMs in FinBoardBench: GPT-5.4, Gemini-3.1-Pro Preview, GLM-5.1, HY3 Preview, DeepSeek V4 Pro, Doubao-2.0-Pro, Kimi K2.6, Qwen-3.6-Plus, and Mimo V2.5 Pro. Our key conclusions are as follows:
- LLMs exhibit basic long-term planning and management logic, which is enough for them to play financial games, but they lack both the awareness and the ability to exploit complex interactions for profit.
- LLMs prioritize immediate asset acquisition over maintaining sufficient liquidity, which leaves them vulnerable to financial crises caused by random events.
- The advanced LLMs perform well on static financial benchmarks, but this capability does not translate into effective financial decision-making, which leads to failure in dynamic financial games.
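The liquidity finding above can be illustrated with a toy simulation. The sketch below is not taken from FinBoardBench or from the rules of any of the three games: the `Player` class, the `play` function, the `reserve` parameter, and all numeric values (salary, asset price, asset income, shock sizes) are hypothetical choices made purely for illustration. It compares a policy that buys assets whenever it can afford them against one that only buys while keeping a cash reserve, under the same fixed schedule of unexpected expenses.

```python
from dataclasses import dataclass


@dataclass
class Player:
    cash: float
    income: float = 0.0      # passive income per turn from owned assets
    bankrupt: bool = False


def play(reserve, shocks, start_cash=1000.0, salary=200.0,
         asset_price=500.0, asset_income=50.0):
    """Simulate one player under a fixed expense schedule.

    reserve: minimum cash the policy insists on keeping after a purchase
             (0 means "buy whenever affordable").
    shocks:  shocks[t] is the unexpected expense due at the end of turn t.
    All default values are hypothetical, not FinBoardBench parameters.
    """
    player = Player(cash=start_cash)
    for shock in shocks:
        # Collect salary plus passive income at the start of the turn.
        player.cash += salary + player.income
        # Buy assets for as long as the policy's reserve is respected.
        while player.cash - asset_price >= reserve:
            player.cash -= asset_price
            player.income += asset_income
        # An expense larger than the cash on hand ends the game.
        if shock > player.cash:
            player.bankrupt = True
            return player
        player.cash -= shock
    return player


if __name__ == "__main__":
    schedule = [0.0, 0.0, 600.0, 0.0]   # one large expense on the third turn
    greedy = play(reserve=0.0, shocks=schedule)
    cautious = play(reserve=600.0, shocks=schedule)
    print("greedy bankrupt:", greedy.bankrupt)
    print("cautious bankrupt:", cautious.bankrupt, "cash:", cautious.cash)
```

In this toy setting the acquisition-first policy accumulates more passive income early but cannot cover the expense when it arrives, while the reserve-keeping policy survives with fewer assets. The real games involve far richer dynamics (opponents, negotiation, auctions), so this only conveys the basic trade-off.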
Examples
The following figures show example games from FinBoardBench.
Fig. 2. Cashflow.
Fig. 3. Acquire.
Fig. 4. Monopoly.
BibTeX
@article{
coming soon...
}