
Adaptive Dynamic Programming for Control: Algorithms and Stability

By Huaguang Zhang, Derong Liu, Yanhong Luo, Ding Wang

There are many methods of stable controller design for nonlinear systems. In seeking to go beyond the minimum requirement of stability, Adaptive Dynamic Programming in Discrete Time approaches the challenging topic of optimal control for nonlinear systems using the tools of adaptive dynamic programming (ADP). The range of systems treated is broad; affine, switched, singularly perturbed and time-delay nonlinear systems are discussed, as are the uses of neural networks and techniques of value and policy iteration. The text features three main aspects of ADP in which the methods proposed for stabilization and for tracking and games benefit from the incorporation of optimal control methods:
• infinite-horizon control, for which the difficulty of solving partial differential Hamilton–Jacobi–Bellman equations directly is overcome, and proof is provided that the iterative value-function updating sequence converges to the infimum of all the value functions obtained by admissible control law sequences;
• finite-horizon control, implemented in discrete-time nonlinear systems, showing the reader how to obtain suboptimal control solutions within a fixed number of control steps, with results more easily applied in real systems than those usually obtained from infinite-horizon control;
• nonlinear games, for which a pair of mixed optimal policies is derived for solving games both when the saddle point does not exist and, when it does, avoiding the existence conditions of the saddle point.
Non-zero-sum games are studied in the context of a single-network scheme in which policies are obtained that guarantee system stability and minimize the individual performance function, yielding a Nash equilibrium.
In order to make the coverage suitable for the student as well as for the expert reader, Adaptive Dynamic Programming in Discrete Time:
• establishes the fundamental theory involved clearly, with each chapter devoted to a clearly identifiable control paradigm;
• demonstrates convergence proofs of the ADP algorithms to deepen understanding of the derivation of stability and convergence with the iterative computational methods used; and
• shows how ADP methods can be put to use both in simulation and in real applications.
This text will be of considerable interest to researchers interested in optimal control and its applications in operations research, applied mathematics, computational intelligence and engineering. Graduate students working in control and operations research will also find the ideas presented here to be a source of powerful methods for furthering their research.



Similar system theory books

System Identification: Theory for the User

Lennart Ljung's System Identification: Theory for the User is a complete, coherent description of the theory, methodology, and practice of system identification. This completely revised second edition introduces subspace methods, methods that utilize frequency-domain data, and general nonlinear black-box methods, including neural networks and neuro-fuzzy modeling.

Software Engineering for Experimental Robotics (Springer Tracts in Advanced Robotics)

This book reports on the concepts and ideas discussed at the well-attended ICRA 2005 Workshop on "Principles and Practice of Software Development in Robotics", held in Barcelona, Spain, on April 18, 2005. It collects contributions that describe the state of the art in software development for the robotics domain.

Phase Transitions (Primers in Complex Systems)

Phase transitions, the changes between different states of organization in a complex system, have long helped to explain physics concepts, such as why water freezes into a solid or boils to become a gas. How might phase transitions shed light on important problems in biological and ecological complex systems?

Modeling Conflict Dynamics with Spatio-temporal Data

This authored monograph presents the use of dynamic spatiotemporal modeling tools for the identification of complex underlying processes in conflict, such as diffusion, relocation, heterogeneous escalation, and volatility. The authors use ideas from statistics, signal processing, and ecology, and provide a predictive framework that can assimilate data and give confidence estimates on the predictions.

Extra info for Adaptive Dynamic Programming for Control: Algorithms and Stability

Example text

Such a function is easy to find, and one example is the hyperbolic tangent function ϕ(·) = tanh(·). It should be noticed that, by the definition above, W(u(i)) is ensured to be positive definite because ϕ^{-1}(·) is a monotonic odd function and R is positive definite. According to Bellman's principle of optimality, the optimal value function J^*(x) should satisfy the following HJB equation:

J^*(x(k)) = \min_{u(\cdot)} \sum_{i=k}^{\infty} \Big[ x^T(i) Q x(i) + 2 \int_0^{u(i)} \varphi^{-T}(\bar{U}^{-1} s)\, \bar{U} R \,\mathrm{d}s \Big]
          = \min_{u(k)} \Big\{ x^T(k) Q x(k) + 2 \int_0^{u(k)} \varphi^{-T}(\bar{U}^{-1} s)\, \bar{U} R \,\mathrm{d}s + J^*(x(k+1)) \Big\}.
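As a quick numerical check of the positive-definiteness claim, here is a minimal sketch (mine, not the book's code) for the scalar case with ϕ = tanh, so ϕ^{-1} = arctanh; the names u_bar, R and W below are illustrative placeholders for the saturation bound, the input weight and the penalty term.

import numpy as np
from scipy.integrate import quad

u_bar = 1.0   # illustrative saturation bound (scalar stand-in for U-bar)
R = 2.0       # illustrative positive weight (scalar stand-in for the matrix R)

def W(u):
    # Non-quadratic penalty W(u) = 2 * integral_0^u arctanh(s/u_bar) * u_bar * R ds;
    # positive for u != 0 because arctanh is a monotonic odd function and R > 0.
    integrand = lambda s: np.arctanh(s / u_bar) * u_bar * R
    value, _ = quad(integrand, 0.0, u)
    return 2.0 * value

for u in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(f"u = {u:+.1f}  ->  W(u) = {W(u):.4f}")   # W(0) = 0, W(u) > 0 otherwise

The printed values are symmetric in u and vanish only at u = 0, which is exactly the positive-definiteness property used above.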

10. Stop. As stated in the last subsection, the iterative algorithm converges, with λ_i(x) → λ^*(x) and the control sequence v_i(x) → u^*(x) as i → ∞. In practical applications, however, we cannot run the iteration until i → ∞. Instead, we run the algorithm for a maximum number of iterations i_max, or until a pre-specified accuracy ε_0 is reached, to test the convergence of the algorithm. In the above procedure, there are two levels of loops. The outer loop starts from Step 3 and ends at Step 8. There are two inner loops, in Steps 5 and 6, respectively.
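A schematic version of this two-level structure, with the stopping rule based on i_max and ε_0, might look like the sketch below; the update functions, V0 and the toy usage are hypothetical placeholders rather than the book's implementation, and the inner-loop computations of Steps 5 and 6 are hidden behind them.

import numpy as np

def adp_iteration(update_policy, update_value, V0, i_max=100, eps0=1e-6):
    # Outer loop (Steps 3-8): alternate the two inner-loop computations
    # (Steps 5 and 6) until i_max iterations or the accuracy eps0 is reached.
    V, policy = V0, None
    for i in range(i_max):
        policy = update_policy(V)             # inner loop: improve the control law v_i
        V_next = update_value(V, policy)      # inner loop: update the value function
        if np.max(np.abs(V_next - V)) < eps0: # pre-specified accuracy test
            return V_next, policy
        V = V_next
    return V, policy                          # iteration cap i_max reached

# Toy usage: the "value update" is a simple contraction, so V converges to 2.
V_final, _ = adp_iteration(lambda V: None,
                           lambda V, p: 0.5 * V + 1.0,
                           V0=np.zeros(4))
print(V_final)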

Vi → J^* as i → ∞. Since the control law at each iteration is computed from the corresponding value function, we can conclude that the control law sequence {vi} converges to the optimal control law u^* as i → ∞. It should be mentioned that the value function Vi(x) we constructed is a new function, different from the ordinary cost function. As shown earlier, for any x(k) ∈ Ω, the function sequence {Vi(x(k))} is a nondecreasing sequence whose values increase subject to an upper bound. This is in contrast to results such as [5], where the value functions are constructed as a nonincreasing sequence with a lower bound.
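The monotone-convergence argument sketched in this excerpt can be summarized as follows (my restatement in LaTeX, using the notation of the excerpt; the upper bound J^* follows from the nondecreasing property together with the stated limit):

% Restatement of the excerpt's convergence argument (not a quotation from the book).
\begin{align*}
  &\text{Monotonicity:} && V_{i+1}(x(k)) \ge V_i(x(k)), && \forall x(k) \in \Omega,\; i \ge 0,\\
  &\text{Boundedness:}  && V_i(x(k)) \le J^*(x(k)),     && \forall x(k) \in \Omega,\; i \ge 0,\\
  &\text{Convergence:}  && \lim_{i \to \infty} V_i(x(k)) = J^*(x(k)),
                        \quad \lim_{i \to \infty} v_i(x(k)) = u^*(x(k)).
\end{align*}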

