Deep reinforcement learning for option pricing and hedging under dynamic expectile risk measures

Marzban, Saeed; Delage, Erick; Li, Jonathan Y.

Recently, equal risk pricing, a framework for fair derivative pricing, was extended to consider dynamic risk measures. However, all current implementations either employ a static risk measure that violates time consistency, or are based on traditional dynamic programming solution schemes that are impracticable in problems with a large number of underlying assets (due to the curse of dimensionality) or with incomplete asset dynamics information. In this paper, we extend for the first time a well-known off-policy deterministic actor-critic deep reinforcement learning (ACRL) algorithm to the problem of solving a risk-averse Markov decision process that models risk using a time-consistent recursive expectile risk measure. This new ACRL algorithm allows us to identify high-quality time-consistent hedging policies (and equal risk prices) for options, such as basket options, that cannot be handled using traditional methods, or in contexts where only historical trajectories of the underlying assets are available. Our numerical experiments, which involve both a simple vanilla option and a more exotic basket option, confirm that the new ACRL algorithm can produce: 1) in simple environments, nearly optimal hedging policies and highly accurate prices, simultaneously for a range of maturities; 2) in complex environments, good-quality policies and prices using a reasonable amount of computing resources; and 3) overall, hedging strategies that actually outperform the strategies produced using static risk measures when the risk is evaluated at later points in time.
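To make the notion of a recursive expectile risk measure concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a discrete loss distribution, takes the tau-expectile to be the root of the usual asymmetric first-order condition tau * E[(X - q)+] = (1 - tau) * E[(q - X)+], and treats tau > 0.5 as risk-averse for losses. The function names and the two-period tree are hypothetical and chosen only for illustration.

```python
def expectile(values, probs, tau, tol=1e-12, max_iter=200):
    """tau-expectile of a discrete random variable, found by bisection.

    The expectile q solves tau * E[(X - q)+] = (1 - tau) * E[(q - X)+].
    The left side minus the right side is decreasing in q, so bisection
    on [min(values), max(values)] brackets the root.
    """
    def gap(q):
        upside = sum(p * max(x - q, 0.0) for x, p in zip(values, probs))
        downside = sum(p * max(q - x, 0.0) for x, p in zip(values, probs))
        return tau * upside - (1.0 - tau) * downside

    lo, hi = min(values), max(values)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid  # root lies above mid
        else:
            hi = mid  # root lies at or below mid
    return 0.5 * (lo + hi)


def nested_expectile(tree, tau):
    """Two-period recursive expectile (illustrative assumption).

    `tree` is a list of (branch_prob, (terminal_losses, terminal_probs)).
    The risk is evaluated backward: first the conditional expectile of the
    terminal loss in each time-1 branch, then the expectile of those values.
    """
    conditional = [expectile(losses, probs, tau) for _, (losses, probs) in tree]
    branch_probs = [p for p, _ in tree]
    return expectile(conditional, branch_probs, tau)


# Hypothetical two-period loss tree with two equally likely branches.
example_tree = [
    (0.5, ([0.0, 10.0], [0.5, 0.5])),
    (0.5, ([10.0, 20.0], [0.5, 0.5])),
]
dynamic_risk = nested_expectile(example_tree, tau=0.9)

# Static counterpart: one expectile applied to the terminal loss directly.
static_risk = expectile([0.0, 10.0, 10.0, 20.0], [0.25] * 4, tau=0.9)
```

In this toy tree the nested evaluation and the single static expectile of the terminal loss give different numbers, which illustrates why a risk measure applied once to the terminal loss need not agree with one that is re-evaluated at intermediate dates.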

Published December 2021, 22 pages

This cahier was revised in August 2022

Research Axis

Axis 1: Data valuation for decision making

Research application

Economy and finance

Publication

Oct 2023

Deep reinforcement learning for option pricing and hedging under dynamic expectile risk measures

Saeed Marzban, Erick Delage, and Jonathan Y. Li

Quantitative Finance, 23(10), 1411–1430, 2023

Document

G2181R.pdf (1 MB)

GERAD

G-2021-81
