无论是机构投资者还是个人投资者,都迫切需要探索能够适应非平稳、低信噪比市场的自主模型。本研究旨在探索量化投资组合管理中的两个独特挑战:(1)表示的难度和(2)环境的复杂性。在这项研究中,我们提出了一种基于马尔可夫决策过程模型的深度强化学习模型,包括用于执行策略优化的深度学习方法,称为 SwanTrader。为了从两个不同的角度实现投资组合管理过程的更好决策,即基于市场观察的时间模式分析和稳健性信息捕获,我们建议在我们的模型中使用一个最佳的深度学习网络,该网络包含一个堆叠稀疏去噪自动编码器 (SSDAE) 和一个基于长短期记忆的自动编码器 (LSTM-AE)。COVID-19 时期的研究结果表明,与四种标准机器学习模型和两种最先进的 Sharpe 强化学习模型相比,使用两种深度学习模型的建议模型具有更好的结果和诱人的性能配置文件比率、卡尔马比率以及 beta 和 alpha 值。此外,我们分析了哪些深度学习模型和奖励函数在优化代理的管理决策方面最有效。我们为投资者建议的模型的结果可以帮助降低投资损失的风险,并帮助他们做出正确的决定。COVID-19 时期的研究结果表明,与四种标准机器学习模型和两种最先进的 Sharpe 强化学习模型相比,使用两种深度学习模型的建议模型具有更好的结果和诱人的性能配置文件比率、卡尔马比率以及 beta 和 alpha 值。此外,我们分析了哪些深度学习模型和奖励函数在优化代理的管理决策方面最有效。我们为投资者建议的模型的结果可以帮助降低投资损失的风险,并帮助他们做出正确的决定。COVID-19 时期的研究结果表明,与四种标准机器学习模型和两种最先进的 Sharpe 强化学习模型相比,使用两种深度学习模型的建议模型具有更好的结果和诱人的性能配置文件比率、卡尔马比率以及 beta 和 alpha 值。此外,我们分析了哪些深度学习模型和奖励函数在优化代理的管理决策方面最有效。我们为投资者建议的模型的结果可以帮助降低投资损失的风险,并帮助他们做出正确的决定。
Whether for institutional investors or individual investors, there is an urgent need to explore autonomous models that can adapt to the non-stationary, low-signal-to-noise markets. This research aims to explore the two unique challenges in quantitative portfolio management: (1) the difficulty of representation and (2) the complexity of environments. In this research, we suggest a Markov decision process model-based deep reinforcement learning model including deep learning methods to perform strategy optimization, called SwanTrader. To achieve better decisions of the portfolio-management process from two different perspectives, i.e., the temporal patterns analysis and robustness information capture based on market observations, we suggest an optimal deep learning network in our model that incorporates a stacked sparse denoising autoencoder (SSDAE) and a long–short-term-memory-based autoencoder (LSTM-AE). The findings in times of COVID-19 show that the suggested model using two deep learning models gives better results with an alluring performance profile in comparison with four standard machine learning models and two state-of-the-art reinforcement learning models in terms of Sharpe ratio, Calmar ratio, and beta and alpha values. Furthermore, we analyzed which deep learning models and reward functions were most effective in optimizing the agent’s management decisions. The results of our suggested model for investors can assist in reducing the risk of investment loss as well as help them to make sound decisions.