ECC3800 Essay

Abstract

Richard Bellman's main contribution to the field of economics was the Bellman equation. It is a necessary condition for optimality that underpins the method known as dynamic programming. The equation revolutionised game theory and is used extensively in recursive economics to calculate expected cumulative reward over time. There has been recent discussion about the usefulness of the equation, given its computational difficulties and its assumptions, such as perfect knowledge and rationality. However, combined with other mathematical techniques and with the rise of heterodox economics, the equation has become even more widely used.

Assignment

Richard Bellman's key idea is the Bellman equation. It states that the value of a decision problem at a given time period equals the maximum, over the choices available in that period, of the payoff from those choices plus the value of the remaining decision problem that results from them. It is a necessary condition for optimality in the optimisation method of dynamic programming. Writing the value of a decision in this way breaks a multi-period planning problem into a series of simpler problems. The Bellman equation can be used to calculate an agent's expected reward if it follows the optimal policy, and to study the optimal behaviour of agents such as firms and consumers.

There are three components to the Bellman equation. The objective function describes the goal of the optimisation problem, such as utility maximisation or cost minimisation. Because dynamic programming reduces a multi-period problem to simpler steps at different time periods, it is necessary to track how the problem evolves over time. The information about the situation in any time period is called the state variable.
As an example, today's cost of labour would be one of many state variables for a company deciding how far to expand its operations. Control variables are the factors chosen in each time period that influence the state variable. As an example, the amount of labour hired is one of many control variables that will influence the state variable, the cost of labour.1

The Bellman equation demonstrates that an optimisation problem can be stated in recursive form, relating the value function in one period to the value function in the next. A finite-horizon problem can then be solved by backward induction: starting from the last time period, work backwards until the decision rule for the first time period is derived. The Bellman equation can be expressed as:

V(s) = max_a [R(s, a) + γ V(s')]

where:
V(s) is the value of state s;
a is an action;
R(s, a) is the immediate reward for taking action a in state s;
γ is a discount factor, typically between 0 and 1;
V(s') is the value of the next state s', which is reached by taking action a in state s.

1 Bellman, R.E. (2003) [1957]. Dynamic Programming. Dover.

The discount factor γ balances the importance of immediate rewards against future rewards. A higher discount factor (closer to 1) places more weight on future rewards, while a lower discount factor (closer to 0) places more weight on immediate rewards.

The way the Bellman equation is used has been modified by contemporary debate. As economics moved away from the neoclassical assumptions of perfectly rational and knowledgeable agents, some fundamental criticisms of the Bellman equation arose:
A) It assumes that the agent knows the state of the world with certainty; in other words, the agent has perfect knowledge of their environment.
B) It assumes that the environment in which the agent operates is stationary, i.e. the reward/utility function and probabilities do not change over time.
C) It assumes agents can perfectly optimise their behaviour even under high stress or rapidly changing conditions; that is, it assumes that an agent can formulate an optimal solution to every problem.
D) It assumes that there is a single decision-maker trying to maximise their own utility, even though many economic settings involve multiple decision-makers interacting with each other.
E) It relies on the agent making optimal decisions at every step.
F) Using dynamic programming to solve concrete problems is complicated by informational difficulties, such as choosing the unobservable discount rate. There are also computational issues, arising from the volume of possible actions and state variables, which have not been resolved.

These criticisms led economists to use other tools in conjunction with the Bellman equation, modifying the way it is used. Some of the major tools used in tandem with the Bellman equation include:
A) Stochastic programming: Stochastic dynamic programming deals with problems where the current and future states are random, so it is used where the decision-maker faces uncertainty. The Bellman equation can be modified to incorporate uncertainty by taking the expectation over the possible future states of the world.2,3
B) Game theory: The Bellman equation can be modified to incorporate the interaction of multiple decision-makers by formulating the dynamic optimisation problem as a sequential game.4

The Bellman equation, and the dynamic programming method for solving optimisation problems that follows from it, has been transformative and is used by many contemporary economists. Lars Ljungqvist and Thomas Sargent use dynamic programming to study many theoretical questions in monetary policy, fiscal policy, taxation, economic growth and labour economics.5 Avinash Dixit and Robert Pindyck also used it to analyse capital budgeting theoretically.6

2 Başar, Tamer; Olsder, George J. (1999). Stochastic Dynamic Games: Foundations and Applications. SIAM.
3 Stokey, Nancy; Lucas, Robert E.; Prescott, Edward (1989). Recursive Methods in Economic Dynamics. Harvard University Press.
4 Stokey, Nancy; Lucas, Robert E.; Prescott, Edward (1989). Recursive Methods in Economic Dynamics. Harvard University Press.
5 Ljungqvist, Lars; Sargent, Thomas (2012). Recursive Macroeconomic Theory (3rd ed.). MIT Press.
6 Dixit, Avinash; Pindyck, Robert (1994). Investment under Uncertainty. Princeton University Press.

Accounting for irrationality in economic decision making is difficult, as expressing irrational behaviour in mathematical formalism makes modelling complex or even impossible. Though the Bellman equation has furthered our understanding of the “rational” agent, progress in modelling actual economic agents accurately has been significantly slower. Even so, emerging heterodox fields such as neuroeconomics (the study of the relationship between the nervous system and economic decision making), which are more grounded in the analysis of behaviour, have made great use of the Bellman equation, reinforcing its applicability and its openness to differing interpretations.

One use of the Bellman equation in neuroeconomics is to model the neural basis of intertemporal choice. Researchers have used it to develop models of how the brain represents and evaluates the value of intertemporal choices, including how the brain's response to such choices changes with development and how it is affected by neurological disorders. In a study by Montague et al. (2004), functional magnetic resonance imaging (fMRI) was used to measure participants' brain activity during intertemporal choices. The researchers found that activity in the ventromedial prefrontal cortex (VMPFC) was correlated with the value of the participants' intertemporal choices. They developed a computational model of intertemporal choice based on the Bellman equation and found that the model could accurately predict activity in the VMPFC.
This suggests that the VMPFC may play a role in representing and evaluating the value of intertemporal choices.7

7 Montague, P. R., Hyman, S. E., & Cohen, J. D. (2004). Computational dopamine and the mesolimbic system. Neuron, 43(2), 759-767.

REFERENCE LIST

1. Bellman, R.E. (2003) [1957]. Dynamic Programming. Dover.
2. Başar, Tamer; Olsder, George J. (1999). Stochastic Dynamic Games: Foundations and Applications. SIAM.
3. Stokey, Nancy; Lucas, Robert E.; Prescott, Edward (1989). Recursive Methods in Economic Dynamics. Harvard University Press.
4. Ljungqvist, Lars; Sargent, Thomas (2012). Recursive Macroeconomic Theory (3rd ed.). MIT Press.
5. Dixit, Avinash; Pindyck, Robert (1994). Investment under Uncertainty. Princeton University Press.
6. Montague, P. R., Hyman, S. E., & Cohen, J. D. (2004). Computational dopamine and the mesolimbic system. Neuron, 43(2), 759-767.
7. Hammond, Peter J. Rationality in Economics. https://web.stanford.edu/~hammond/ratEcon.pdf
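
As a supplementary illustration of the recursion V(s) = max_a [R(s, a) + γ V(s')] discussed in the essay, the following is a minimal sketch rather than a definitive implementation. It assumes a hypothetical "cake-eating" problem that does not appear in the essay: the state s is the number of cake units left, the action a is how many units to eat now, the reward is the square root of a, and the next state is s - a. All function names, the cake size and the reward function are assumptions chosen for illustration.

```python
import math


def value_iteration(n_states, actions, reward, transition, gamma,
                    tol=1e-10, max_iter=10_000):
    """Repeat V(s) <- max_a [R(s, a) + gamma * V(s')] until V stops changing."""
    V = [0.0] * n_states
    for _ in range(max_iter):
        new_V = [
            max(reward(s, a) + gamma * V[transition(s, a)] for a in actions(s))
            for s in range(n_states)
        ]
        if max(abs(new - old) for new, old in zip(new_V, V)) < tol:
            return new_V
        V = new_V
    return V


def greedy_policy(V, n_states, actions, reward, transition, gamma):
    """For each state, pick the action that attains the max in the Bellman equation."""
    return [
        max(actions(s), key=lambda a: reward(s, a) + gamma * V[transition(s, a)])
        for s in range(n_states)
    ]


# Hypothetical cake-eating problem (an assumption, not taken from the essay).
CAKE_SIZE = 3
N_STATES = CAKE_SIZE + 1  # states 0, 1, 2, 3 = units of cake remaining


def actions(s):
    return range(s + 1)  # eat anywhere from 0 units up to everything left


def reward(s, a):
    return math.sqrt(a)  # assumed concave utility of eating a units now


def transition(s, a):
    return s - a  # the control (eating) determines the next state


def solve(gamma):
    """Return the value function and the optimal amount to eat in each state."""
    V = value_iteration(N_STATES, actions, reward, transition, gamma)
    policy = greedy_policy(V, N_STATES, actions, reward, transition, gamma)
    return V, policy


if __name__ == "__main__":
    for gamma in (0.9, 0.3):
        V, policy = solve(gamma)
        print(f"gamma={gamma}: V={V}, units eaten per state={policy}")
```

Under these assumptions, the agent with the higher discount factor (γ = 0.9) spreads consumption over several periods, while the agent with the lower discount factor (γ = 0.3) eats the whole cake immediately. This is consistent with a higher γ placing more weight on future rewards.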