May 2025
This paper presents a deep reinforcement learning framework for strategy generation and profit forecasting from large-scale economic behavior data. The framework combines perturbation-based augmentation, backward return estimation, and policy-stabilization mechanisms to model and optimize complex, dynamic behavior sequences robustly. Experiments on four distinct behavior data subsets show consistent improvements over representative baseline models on the key metrics: total profit gain, average reward, policy stability, and profit–price correlation. On the sales feedback dataset, the framework achieved a total profit gain of 0.37, an average reward of 4.85, a low action standard deviation of 0.37, and a correlation score of R² = 0.91. In the overall benchmark comparison, the model attained a precision of 0.92 and a recall of 0.89, reflecting reliable strategy response and predictive consistency. These results suggest that the proposed method can handle decision-making scenarios involving sparse feedback, heterogeneous behavior, and temporal volatility, and that it has potential for generalization and practical use.
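The abstract does not define backward return estimation. As a minimal sketch, assuming the term refers to the standard discounted return computed by sweeping an episode from its last step to its first, it could look like the following. The function name `backward_returns` and the discount factor `gamma` are illustrative assumptions and do not come from the paper.

```python
from typing import List


def backward_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1}, computed back to front.

    Assumption: this is the textbook discounted-return recursion, used here
    only to illustrate the idea; the paper's estimator may differ.
    """
    returns = [0.0] * len(rewards)
    running = 0.0  # G_{t+1}; zero beyond the final step of the episode
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical three-step episode with sparse rewards.
example = backward_returns([1.0, 0.0, 2.0], gamma=0.5)
print(example)
```

Sweeping backward lets a single pass propagate a late reward to every earlier step, which is one reason this form is common when feedback is sparse.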