FinBoardBench: Benchmarking Dynamic Wealth Management and Strategic Financial Reasoning of LLMs via Board Game Simulations
Abstract
Large language models (LLMs) have recently achieved strong performance on static financial reasoning and simple dynamic trading tasks. However, current financial evaluations largely separate quantitative reasoning from dynamic constraints and social interactions, which limits their ability to assess comprehensive financial decision-making. To bridge this gap, we propose FinBoardBench, an evaluation suite built on three financial board games: Cash Flow, Acquire, and Monopoly. The benchmark evaluates a broad range of financial skills, from personal cash flow management and corporate investment forecasting to open-ended trade negotiation. Experimental results show that LLMs exhibit basic long-term planning and investment logic but struggle with liquidity constraints under uncertainty. Specifically, multiple LLMs often prioritize immediate asset acquisition or debt repayment over liquidity reserves, leaving them vulnerable to financial crises triggered by random events. By exposing these behavioral gaps in controlled environments, FinBoardBench facilitates the development of more robust, strategically capable, and socially interactive financial agents.
Fig. 1. Illustration of FinBoardBench.
Key Conclusions
- LLMs exhibit basic long-term planning and management logic, which lets them play dynamic financial games, but they lack the awareness and ability to exploit complex interactions for profit.
- LLMs prioritize immediate asset acquisition or debt repayment over maintaining sufficient liquidity, which leaves them vulnerable to financial crises caused by random events.
- LLMs perform strongly on static benchmarks but cannot manage cash flow effectively enough to participate in dynamic financial investment activities.
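The liquidity failure described in the second conclusion can be illustrated with a toy model. The sketch below is not FinBoardBench code and does not follow the rules of Cash Flow, Acquire, or Monopoly. The function name `turns_survived`, the salary, the asset yield, and the expense sequence are all hypothetical values chosen for illustration. An agent that invests all of its cash each turn fails at the first unexpected expense, while an agent that holds a cash reserve absorbs the same expenses.

```python
def turns_survived(reserve, shocks, salary=200.0, start_cash=1000.0, asset_yield=0.02):
    """Count the turns an agent completes before an expense exceeds its cash.

    Toy model with hypothetical parameters (not the benchmark's rules):
    each turn the agent earns a salary plus income from its assets, invests
    all cash above `reserve`, and then must pay that turn's expense in cash.
    """
    cash, assets = start_cash, 0.0
    for turn, expense in enumerate(shocks):
        cash += salary + asset_yield * assets
        # Invest everything above the liquidity reserve.
        invest = max(0.0, cash - reserve)
        cash -= invest
        assets += invest
        # Assets are illiquid here: an expense larger than cash is a crisis.
        if expense > cash:
            return turn
        cash -= expense
    return len(shocks)


# Hypothetical sequence of random-event expenses, one per turn.
shocks = [0, 0, 500, 0, 500, 500]

print(turns_survived(reserve=0.0, shocks=shocks))     # prints 2: fails at the first expense
print(turns_survived(reserve=1000.0, shocks=shocks))  # prints 6: survives every turn
```

In this toy setting the zero-reserve agent accumulates more assets but cannot use them to pay an expense, which mirrors the trade-off between acquisition and liquidity noted above.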
Examples
The figures below show example games from FinBoardBench.
Fig. 2. Cash Flow.
Fig. 3. Acquire.
Fig. 4. Monopoly.
BibTeX
@article{
coming soon...
}