Approaching Exposure Therapy as a Sequential Decision-making Problem


  • Aral Cimcim, University of Vienna



Exposure Therapy (ET) is a variant of cognitive behavioral therapy (CBT) whose theoretical foundations have been widely explored in psychology for the treatment of anxiety disorders. ET treatment regimens rely on correcting the maladaptive beliefs [1] of patients and adjusting the therapy conditions accordingly. However, the learning-based experimental designs [1] in ET have not yet been studied in detail as a means of improving therapy protocols, which remain suboptimal in practice. Theory of Mind (ToM), which refers to the human ability to infer the intentions and beliefs of others, is a principal constituent of human cognition [2]. In the machine learning (ML) field, ToM has been extended through meta-learning [2] for multi-agent reinforcement learning (MARL) [3] to discover latent variables in simulation environments. Disruptions in ToM parameters, such as the failure of a belief update function, typically result in aberrant behavior [1]. ET protocols are therefore essentially founded on finding solutions to sequential decision-making problems. In the psychology literature, reinforcement learning (RL) has been characterized as (i) habitual and (ii) goal-directed learning [3]. The two types differ in the environmental feedback that humans use to adapt and update their state-action pairs [3]. In ML, the corresponding behavior is formulated in terms of (i) model-free and (ii) model-based RL algorithms, respectively.
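To make the distinction concrete, the two updates can be written in their standard textbook forms [3]. These are generic illustrations and not equations specific to this study; \alpha denotes a step size, \gamma a discount factor, and \hat{p} a learned model of the environment.

```latex
% Model-free (Q-learning): update from a single sampled transition (s, a, r, s')
Q(s,a) \leftarrow Q(s,a) + \alpha \Big[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \Big]

% Model-based: back up through a learned model \hat{p} of the environment
Q(s,a) \leftarrow \sum_{s',\,r} \hat{p}(s', r \mid s, a) \Big[ r + \gamma \max_{a'} Q(s',a') \Big]
```

The model-free update needs only experienced transitions, whereas the model-based backup plans over outcomes predicted by \hat{p}.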


Our in silico MARL study is grounded in model-based and model-free RL algorithms, which will be deployed in an RL therapist agent for adaptive design optimization (ADO). The therapist agent will function as a stand-in for the human therapist in agent-to-agent therapy sessions. The actions of the patient agents will be parameterized by the prediction error (PE) estimates of a mathematical model such as the Rescorla-Wagner model (RWM), while the actions of the therapist agent will be investigated using algorithms such as Dyna-Q or DQN [3] for learning control policies.
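A minimal sketch of how such a setup could look, under loudly stated assumptions: the patient agent holds a single threat expectancy updated by the Rescorla-Wagner rule, and the therapist agent chooses an exposure intensity with tabular Dyna-Q [3]. The avoidance rule, the reward, the intensity-dependent learning rates, and all names (`ALPHAS`, `patient_step`, `dyna_q`) are hypothetical choices made for illustration and are not taken from the study design.

```python
import random
from collections import defaultdict

ALPHAS = [0.05, 0.15, 0.30]  # hypothetical RW learning rate per exposure intensity
N_BINS = 5                   # therapist observes the expectancy in coarse bins


def rescorla_wagner(v, outcome, alpha):
    """One RW update: move expectancy v toward the outcome by alpha * PE."""
    pe = outcome - v  # prediction error
    return v + alpha * pe, pe


def to_state(v):
    """Discretize an expectancy in [0, 1] into one of N_BINS therapist states."""
    return min(int(v * N_BINS), N_BINS - 1)


def patient_step(v, action, rng):
    """Toy patient (assumption): intense exposure at high expectancy may be avoided."""
    p_avoid = 0.8 * v * action / (len(ALPHAS) - 1)
    if rng.random() < p_avoid:
        return v, -0.1  # session avoided: no learning, small penalty
    # Feared outcome does not occur (outcome = 0), so the expectancy decreases.
    v_new, _ = rescorla_wagner(v, 0.0, ALPHAS[action])
    return v_new, v - v_new  # reward = reduction in threat expectancy


def dyna_q(episodes=200, steps=20, planning=10, lr=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Dyna-Q: direct Q-learning plus planning from a learned model."""
    rng = random.Random(seed)
    actions = range(len(ALPHAS))
    q = defaultdict(float)  # Q[(state, action)]
    model = {}              # (state, action) -> last observed (reward, next_state)
    for _ in range(episodes):
        v = 0.9  # each simulated patient starts with a high threat expectancy
        for _ in range(steps):
            s = to_state(v)
            if rng.random() < eps:  # epsilon-greedy exploration
                a = rng.randrange(len(ALPHAS))
            else:
                a = max(actions, key=lambda b: q[(s, b)])
            v, r = patient_step(v, a, rng)
            s2 = to_state(v)
            # Direct (model-free) update from real experience.
            best = max(q[(s2, b)] for b in actions)
            q[(s, a)] += lr * (r + gamma * best - q[(s, a)])
            # Model learning: remember the last outcome seen for (s, a).
            model[(s, a)] = (r, s2)
            # Planning: replay simulated transitions drawn from the model.
            for _ in range(planning):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                best = max(q[(ps2, b)] for b in actions)
                q[(ps, pa)] += lr * (pr + gamma * best - q[(ps, pa)])
    return q


if __name__ == "__main__":
    q = dyna_q()
    for s in range(N_BINS):
        greedy = max(range(len(ALPHAS)), key=lambda a: q[(s, a)])
        print(f"expectancy bin {s}: greedy exposure intensity {greedy}")
```

Because the toy patient is stochastic while the learned model stores only the last observed outcome per state-action pair, this sketch inherits the deterministic-model simplification of basic Dyna-Q; a DQN therapist would replace the table with a function approximator.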

Expected Results

We hypothesize that an RL therapist agent with ToM capabilities [3] will learn to accurately predict the mental states of patient agents and fine-tune the therapy environment accordingly. The results of the ML methods described above are expected to yield insights for optimizing ET and for allocating quantifiable intervention components to affected populations.


[1] M. Moutoussis, N. Shahar, T. U. Hauser, and R. J. Dolan, “Computation in Psychotherapy, or How Computational Psychiatry Can Aid Learning-Based Psychological Therapies,” Computational Psychiatry, vol. 2, Feb. 2018.

[2] N. C. Rabinowitz, F. Perbet, H. F. Song, C. Zhang, S. M. A. Eslami, and M. Botvinick, “Machine Theory of Mind.” arXiv, Mar. 12, 2018.

[3] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press, 2018.