- About Us
- Helpful Links
AMS 556, Dynamic Programming
Stochastic and deterministic multistage optimization problems. Stochastic path problems. Principle of optimality. Recursive and functional equations. Method of successive approximations and policy iteration. Applications to finance, economics, inventory control, maintenance, inspection, and replacement problems.
Prerequisite: AMS 507
3 credits, ABCF grading
Text: Lecture Notes available on Blackboard
1) Understanding of finite-horizon dynamic programming problems and general principles
of sequential optimization:
* Being able to formulate dynamic programming problems;
* Understanding of definitions of various classes of policies;
* Familiarity with definitions of optimal and nearly optimal policies;
* Knowledge of basic transformation and representation methods such as equivalence of general and memoryless policies, equivalence of randomized and mixed policies, and sufficiency of pure Markov policies;
* Understanding of the principle of optimality and optimality equations.
2) Demonstrate abilities to analyze infinite-horizon problems with total reward criteria:
* Formulate properties of positive, negative, discounted, and convergent dynamic programming models;
* Being able to formulate necessary and sufficient conditions for optimality;
* Ability to solve problems by using value iterations and linear programming;
* Ability to produce numerical solutions by using PCs.
3) Demonstrate abilities to analyze infinite-horizon problems with average-reward
* Solving problems with finite state and action sets by using linear programming and policy iteration methods;
* Analyzing problems with infinite-state spaces;
* Knowledge of optimality equations and inequalities;
* Solving problems with multiple criteria and constraints by linear programming.
4) Being familiar with major application areas:
* Modeling and optimization of production and service operations including inventory and queueing control;
* Applications to quantitative finance;
* Statistical applications (multi-armed bandits and optimal selection);
* Neuro-dynamic programming and other methods for approximations.