TY - JOUR
T1 - Differential reinforcement encoding along the hippocampal long axis helps resolve the explore–exploit dilemma
AU - Dombrovski, Alexandre Y.
AU - Luna, Beatriz
AU - Hallquist, Michael N.
N1 - Funding Information:
This work was funded by K01 MH097091, R01 MH067924, and R01 MH10095 from the National Institute of Mental Health. The authors thank Jiazhou Chen (data processing), as well as Kai Hwang and Rajpreet Chahal (data collection). The authors also thank Vishnu Murty and Brad Wyble for helpful comments on an earlier draft of the manuscript.
Publisher Copyright:
© 2020, The Author(s).
PY - 2020/12/1
Y1 - 2020/12/1
AB - When making decisions, should one exploit known good options or explore potentially better alternatives? Exploration of spatially unstructured options depends on the neocortex, striatum, and amygdala. In natural environments, however, better options often cluster together, forming structured value distributions. The hippocampus binds reward information into allocentric cognitive maps to support navigation and foraging in such spaces. Here we report that human posterior hippocampus (PH) invigorates exploration while anterior hippocampus (AH) supports the transition to exploitation on a reinforcement learning task with a spatially structured reward function. These dynamics depend on differential reinforcement representations in the PH and AH. Whereas local reward prediction error signals are early and phasic in the PH tail, global value maximum signals are delayed and sustained in the AH body. AH compresses reinforcement information across episodes, updating the location and prominence of the value maximum and displaying goal cell-like ramping activity when navigating toward it.
UR - http://www.scopus.com/inward/record.url?scp=85093919124&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85093919124&partnerID=8YFLogxK
U2 - 10.1038/s41467-020-18864-0
DO - 10.1038/s41467-020-18864-0
M3 - Article
C2 - 33106508
AN - SCOPUS:85093919124
SN - 2041-1723
VL - 11
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 5407
ER -