Discussion and input from the members of J.L.’s laboratory are warmly acknowledged.
An important and pervasive idea in the psychology of decision making and choice is
that there is more than one class of possible strategy for acting. A key division is between forms of reflective control, which depend on the more or less explicit consideration of prospective courses of action and their consequent outcomes, and forms of reflexive control, a term we use in a restricted sense to describe how retrospective experience with good and bad outcomes sculpts present choice. This apparent dichotomy is so intuitively obvious that it has been realized in many slightly different and only partly compatible ways (Dickinson, 1985, Kahneman, 2011 and Stanovich and West, 2002). Here, we single out one particular strand that has arguably been the most fecund in cognitive and theoretical neuroscience, providing a set of behaviorally rigorous criteria for separating out the two classes of control. In turn, this has led to a set of important studies into the partly distinct neural realizations of these classes and thence to an understanding of their computational and statistical characteristics. The latter
provides a normative rationale for their coexistence as offering efficient solutions to the demands of complex and changing environments and has also underpinned the design and interpretation of a collection of targeted empirical studies. We review the evolution of this strand by considering five generations of studies. We use the term “generation” as a frame of reference for our discussion and apply a
liberal semantic license in our use of the term, using it to describe a sequential evolution of ideas, as opposed to an orderly sequence in epochs of time. The zeroth generation represents some of the earliest intellectual battles in psychology between advocates of cognitive maps and stimulus-response theories (Thorndike, 1911 and Tolman, 1948). The fallout from this debate was a first generation of behaviorally rigorous studies of goal-directed and habitual instrumental control, which in turn provided the foundation for investigation of their neural realizations (Balleine and Dickinson, 1998, Balleine, 2005, Dickinson and Balleine, 2002 and Killcross and Coutureau, 2003). In the second generation, these paradigms were carefully adapted for human neuroimaging studies, validating and amplifying the results from rodents (Tanaka et al., 2008, Liljeholm et al., 2011, Tricomi et al., 2009 and Valentin et al., 2007). In the third and fourth generations, an analysis of the two forms of control in terms of model-based and model-free reinforcement learning (Doya, 1999, Doya et al., 2002, Sutton and Barto, 1998 and Daw et al., 2005) was used to realize new tasks and to provide powerful methods for interpreting the ensuing results.