Introductory Control Theory
CS 659
Kris Hauser

Control Theory
• The use of feedback to regulate a signal
• [Block diagram: the desired signal xd and the plant's signal x form the error e = x - xd (by convention, xd = 0); the controller maps the error to a control input u; the plant evolves as x' = f(x,u)]

What might we be interested in?
• Controls engineering
  • Produce a policy u(x,t), given a description of the plant, that achieves good performance
• Verifying theoretical properties
  • Convergence, stability, optimality of a given policy u(x,t)

Agenda
• PID control
• LTI multivariate systems & LQR control
• Nonlinear control & Lyapunov functions
• Control is a huge topic, and we won't dive into much detail

Model-free vs. model-based
• Two general philosophies:
  • Model-free: do not require a dynamics model to be provided
  • Model-based: do use a dynamics model during computation
• Model-free methods:
  • Simpler
  • Tend to require much more manual tuning to perform well
• Model-based methods:
  • Can achieve good performance (optimal w.r.t. some cost function)
  • Are more complicated to implement
  • Require reasonably good models (system-specific knowledge)
  • Calibration: build a model using measurements before behaving
  • Adaptive control: "learn" parameters of the model online from sensors

PID control
• Proportional-Integral-Derivative controller
• A workhorse of 1D control systems
• Model-free

Proportional term
• u(t) = -Kp x(t), where Kp is the proportional gain
• The negative sign pushes x back toward 0, assuming a positive control input drives x in the positive direction
• [Plot: x(t) decaying toward 0 under P control]

Integral term
• u(t) = -Kp x(t) - Ki I(t), where Ki is the integral gain
• I(t) = ∫0^t x(s) ds (accumulation of errors)
• Residual steady-state errors are driven asymptotically to 0
• [Plot: x(t) under PI control]

Instability
• For a 2nd-order system (one that carries momentum), P control alone can diverge
• [Plot: x(t) oscillating with growing amplitude]

Derivative term
• u(t) = -Kp x(t) - Kd x'(t), where Kd is the derivative gain

Putting it all together
• u(t) = -Kp x(t) - Ki I(t) - Kd x'(t)
• I(t) = ∫0^t x(s) ds

Parameter tuning
• [Figure: behavior under different gain settings]

Example: Damped Harmonic Oscillator
• Second-order time-invariant linear system with a PID controller
• x''(t) = A x(t) + B x'(t) + C + D u(x, x', t)
• For what starting conditions and gains is this stable and convergent?

Stability and Convergence
• A system is stable if errors stay bounded
• A system is convergent if errors → 0

Example: Damped Harmonic Oscillator
• x'' = A x + B x' + C + D u(x, x')
• PID controller: u = -Kp x - Kd x' - Ki I
• Substituting: x'' = (A - D Kp) x + (B - D Kd) x' + C - D Ki I

Homogeneous solution
• Unstable if A - D Kp > 0
• Natural frequency ω0 = sqrt(D Kp - A)
• Damping ratio ζ = (D Kd - B) / (2 ω0)
• If ζ > 1, overdamped
• If ζ < 1, underdamped (oscillates)

Example: Trajectory following
• Say a trajectory xdes(t) has been designed
  • E.g., a rocket's ascent, a steering path for a car, a plane's landing
• Apply PID control to the tracking error:
  • u(t) = Kp (xdes(t) - x(t)) + Ki I(t) + Kd (x'des(t) - x'(t))
  • I(t) = ∫0^t (xdes(s) - x(s)) ds
• The designer of xdes needs to be knowledgeable about the controller's behavior!
• [Plot: x(t) tracking xdes(t); a code sketch follows]
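To make the trajectory-following law concrete, here is a minimal simulation sketch. The double-integrator plant x'' = u and all gain values are illustrative assumptions, not from the slides; a real system would need the tuning workflow described next.

```python
import numpy as np

def simulate_pid_tracking(x_des, dx_des, Kp=20.0, Ki=5.0, Kd=4.0,
                          dt=0.01, T=5.0):
    """Tracks x_des(t) with the PID law u = Kp*e + Ki*I + Kd*e'.

    The plant is a toy double integrator x'' = u; the gains are
    illustrative, not tuned for any particular system.
    """
    x, dx, I = 0.0, 0.0, 0.0
    xs = []
    for t in np.arange(0.0, T, dt):
        e = x_des(t) - x            # position error
        de = dx_des(t) - dx         # velocity error
        I += e * dt                 # accumulated (integral) error
        u = Kp * e + Ki * I + Kd * de
        dx += u * dt                # Euler-integrate the toy plant x'' = u
        x += dx * dt
        xs.append(x)
    return np.array(xs)

# Track a sinusoidal reference trajectory
traj = simulate_pid_tracking(x_des=lambda t: np.sin(t),
                             dx_des=lambda t: np.cos(t))
```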
Controller Tuning Workflow
• Hypothesize a control policy
• Analysis:
  • Assume a model
  • Assume disturbances to be handled
  • Test performance either through mathematical analysis or through simulation
• Go back and redesign the control policy
• Mathematical techniques give you more insight to improve the redesign, but require more work

Multivariate Systems
• x' = f(x, u)
• x ∈ X ⊆ R^n
• u ∈ U ⊆ R^m
• Because m ≠ n in general, and the variables are coupled, this is not as easy as setting up n independent PID controllers

Linear Time-Invariant Systems
• Linear: x' = f(x, u, t) = A(t) x + B(t) u
• LTI: x' = f(x, u) = A x + B u
• Nonlinear systems can sometimes be approximated by linearization

Convergence of LTI systems
• x' = A x + B u
• Let u = -K x
• Then x' = (A - BK) x
• The eigenvalues λi of (A - BK) determine convergence
• Each λi may be complex
• Convergence requires every λi to have strictly negative real part (a zero real part gives at best marginal stability)

Linear Quadratic Regulator
• x' = A x + B u
• Objective: minimize the quadratic cost ∫0^∞ (x^T Q x + u^T R u) dt over an infinite horizon
  • x^T Q x: error term
  • u^T R u: "effort" penalization

Closed-form LQR solution
• Closed-form solution u = -K x, with K = R^-1 B^T P
• Where P is a symmetric matrix that solves the algebraic Riccati equation
  • A^T P + P A - P B R^-1 B^T P + Q = 0
• Derivation: calculus of variations
• Packages are available for finding the solution (see the sketch at the end of this section)

Nonlinear Control
• General case: x' = f(x, u)
• Two questions:
  • Analysis: how to prove convergence and stability for a given u(x)?
  • Synthesis: how to find u(t) that optimizes some cost function?

Toy Nonlinear Systems
• Cart-pole
• Mountain car
• Acrobot

Proving convergence & stability with Lyapunov functions
• Let u = u(x)
• Then x' = f(x, u(x)) = g(x)
• Conjecture a Lyapunov function V(x):
  • V(x) = 0 at the origin x = 0
  • V(x) > 0 for all x ≠ 0 in a neighborhood of the origin

Proving stability with Lyapunov functions
• Idea: prove that d/dt V(x) ≤ 0 under the dynamics x' = g(x) around the origin
• [Plots: V(x), g(x), and d/dt V(x) over time]

Proving convergence with Lyapunov functions
• Idea: prove that d/dt V(x) < 0 under the dynamics x' = g(x) around the origin
• By the chain rule, d/dt V(x) = dV/dx · dx/dt = ∇V(x)^T g(x) < 0
• [Plots: V(x), g(x), and d/dt V(x) over time]

How does one construct a suitable Lyapunov function?
• Typically some form of energy (e.g., KE + PE)
• Some art involved (a numeric spot-check appears at the end of this section)

Direct policy synthesis: Optimal control
• Input: cost function J(x), estimated dynamics f(x, u), finite state/control spaces X, U
• Two basic classes:
  • Trajectory optimization: hypothesize a control sequence u(t), simulate to get x(t), perform optimization to improve u(t), repeat
    • Output: optimal trajectory u(t) (in practice, only a locally optimal solution is found)
  • Dynamic programming: discretize the state and control spaces, form a discrete search problem, and solve it
    • Output: optimal policy u(x) across all of X

Discrete Search example
• Split X, U into cells x1, …, xn and u1, …, um
• Build the transition function xj ≈ xi + f(xi, uk)·dt for all i, k (one integration step of the dynamics)
• This gives a state machine with cost dt·J(xi) for staying in state i
• Find u(xi) that minimizes the sum of total costs
• Value iteration: repeated dynamic programming over V(xi) = sum of total future costs (see the sketch at the end of this section)
• [Figure: value function for a 1-joint acrobot]

Receding Horizon Control (aka model predictive control)
• At each step, optimize over a finite horizon, execute the first control, then re-plan
• [Diagram: plans over horizon 1 through horizon h feeding the controller]

Controller Hooks in RobotSim
• Given a loaded WorldModel
• sim = Simulator(world)
• c = sim.getController(0)
• By default: a trajectory queue + PID controller (a usage sketch follows this section)
  • c.setMilestone(qdes) - moves smoothly to qdes
  • c.addMilestone(q1), c.addMilestone(q2), … - appends a list of milestones and smoothly interpolates between them
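The "packages available" remark for LQR can be made concrete with SciPy's Riccati solver. A minimal sketch, assuming a double-integrator plant and illustrative Q, R weights (none of these values come from the slides):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator: x = (position, velocity), x' = Ax + Bu
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 0.1])   # error penalty x^T Q x
R = np.array([[0.01]])    # effort penalty u^T R u

# Solve the algebraic Riccati equation A^T P + P A - P B R^-1 B^T P + Q = 0
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P   # closed-form gain, u = -K x

# Convergence check: all eigenvalues of (A - B K) must have negative real part
eigs = np.linalg.eigvals(A - B @ K)
assert np.all(eigs.real < 0)
```

Increasing R relative to Q penalizes effort more heavily, yielding smaller gains and slower convergence.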
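As a complement to the Lyapunov slides, here is a numeric spot-check on an assumed example: a damped linear oscillator with its mechanical energy as the candidate V. Note that this only verifies d/dt V(x) ≤ 0 (stability), since dV/dt vanishes wherever x2 = 0; proving convergence requires a finer argument.

```python
import numpy as np

# Closed-loop dynamics g(x) for a damped oscillator:
# x1' = x2, x2' = -k*x1 - c*x2   (k, c > 0 chosen arbitrarily here)
k, c = 2.0, 0.5
def g(x):
    return np.array([x[1], -k * x[0] - c * x[1]])

# Candidate Lyapunov function: energy V = (1/2) k x1^2 + (1/2) x2^2
def V(x):
    return 0.5 * k * x[0]**2 + 0.5 * x[1]**2

def dVdt(x):
    grad = np.array([k * x[0], x[1]])   # gradient dV/dx
    return grad @ g(x)                  # chain rule: dV/dx . g(x)

# Spot-check d/dt V <= 0 on random samples near the origin
# (analytically, dV/dt = -c * x2^2; tolerance covers float roundoff)
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(1000, 2))
assert all(dVdt(x) <= 1e-12 for x in samples)
```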
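The discrete-search procedure lends itself to a short value-iteration sketch. The transition and cost tables are assumed to come from a prior discretization of X and U (not shown), and convergence assumes the grid contains a zero-cost absorbing goal cell; nothing here is specific to the acrobot example.

```python
import numpy as np

def value_iteration(transitions, cost, dt=0.1, iters=200):
    """Value iteration over a discretized system.

    transitions[i][k] = index j of the successor cell reached by applying
    control u_k from cell x_i for one timestep; cost[i] = J(x_i).
    Assumes a zero-cost goal cell that transitions to itself, so that
    V(x_i) = sum of total future costs is finite.
    """
    n = len(cost)
    V = np.zeros(n)
    policy = np.zeros(n, dtype=int)
    for _ in range(iters):
        for i in range(n):
            # Bellman backup: cost of staying plus best successor value
            best_k = min(range(len(transitions[i])),
                         key=lambda k: V[transitions[i][k]])
            policy[i] = best_k
            V[i] = dt * cost[i] + V[transitions[i][best_k]]
    return V, policy
```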
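A hypothetical usage sketch of the default trajectory-queue controller above, assuming the klampt Python package and a world file containing at least one robot (the file name and target configuration are made up):

```python
from klampt import WorldModel, Simulator

world = WorldModel()
world.readFile("my_world.xml")   # hypothetical world file

sim = Simulator(world)
c = sim.getController(0)         # controller for robot 0

# Queue two milestones; the default trajectory queue + PID controller
# interpolates smoothly between them.
q_start = world.robot(0).getConfig()
q_goal = [qi + 0.1 for qi in q_start]   # illustrative target configuration
c.setMilestone(q_start)
c.addMilestone(q_goal)

while sim.getTime() < 5.0:
    sim.simulate(0.01)           # advance the physics by dt = 0.01 s
```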
Controller Hooks in RobotSim (cont.)
• Can override this behavior to get a manual control loop. At every time step:
  • Read q, dq with c.getSensedConfig(), c.getSensedVelocity()
  • For torque commands:
    • Compute u(q, dq, t)
    • Send the torque command via c.setTorque(u)
  • OR, for PID commands:
    • Compute qdes(q, dq, t), dqdes(q, dq, t)
    • Send the PID command via c.setPIDCommand(qdes, dqdes)
  • (A sketch of this loop appears at the end of these notes.)

Next class
• Motion planning
• Principles, Ch. 2, 5.1, 6.1
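Returning to the manual control loop above: a minimal sketch using the PID-command variant, again assuming the klampt package; the world file and the sinusoidal reference for joint 0 are purely illustrative.

```python
import math
from klampt import WorldModel, Simulator

world = WorldModel()
world.readFile("my_world.xml")    # hypothetical world file
sim = Simulator(world)
c = sim.getController(0)

dt = 0.01
while sim.getTime() < 5.0:
    q = c.getSensedConfig()       # sensed joint positions
    dq = c.getSensedVelocity()    # sensed joint velocities
    # Illustrative desired state: wave joint 0, hold the rest where sensed
    qdes = list(q)
    qdes[0] = 0.5 * math.sin(sim.getTime())
    dqdes = [0.0] * len(dq)
    c.setPIDCommand(qdes, dqdes)  # hand (qdes, dqdes) to the low-level PID
    sim.simulate(dt)              # step the physics forward
```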