### Michael Estrada

### Toward the Control of Non-Linear, Non-Minimum Phase Systems via Feedback Linearization and Reinforcement Learning
EECS Department

University of California, Berkeley

Technical Report No. UCB/EECS-2021-54

May 12, 2021

### http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-54.pdf

The control of non-linear systems has historically been a difficult task, due primarily to the need for additional analysis of the specific system being controlled in the absence of the standard techniques of linear systems theory. While feedback linearization can be employed to bring the techniques of linear systems theory to bear on non-linear systems, error in the dynamic model used to construct the controller can lead to erroneous behavior. Similarly, in the case of a non-minimum phase system, even an exactly linearizing controller can render the zero dynamics of the system unstable; a condition that must be actively considered and accounted for in the design of both the controller and the path to be followed. In this work, we present three example cases of non-linear, and later, non-minimum phase systems. The first, strictly non-linear example is that of racing a car. The technique of Learning Feedback Linearization is used to learn an exactly linearizing controller for a 2D racecar, represented by a dynamically extended bicycle model. A preprocessing network is further posited to convert the outputs of the linearizing controller into a separate set of inputs for the racecar. The latter two examples, both demonstrating non-minimum phase behavior, are Planar Vertical Take-Off and Landing aircraft and the control of a two-wheeled bicycle. In both cases, these problems are formulated and posed as Reinforcement Learning problems. Our solution for these latter examples attempts to learn how to correct a nominal path, provided as input to the algorithm, in an effort to maintain the stability of the zero dynamics. We report our progress towards these goals: demonstrative example cases of all three systems are shown, and the results of these experiments are discussed in depth. Finally, possible avenues of future work are discussed.
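To illustrate the core idea of exact feedback linearization that the abstract refers to, here is a minimal toy sketch (not taken from the report, which treats a dynamically extended bicycle model): for a scalar system x' = a·x² + u, the input u = −a·x² + v cancels the nonlinearity exactly, leaving the linear system x' = v, which ordinary linear state feedback v = −k·x then stabilizes. If the model term a·x² is wrong, the cancellation is inexact — the model-mismatch issue the abstract raises.

```python
# Toy sketch of exact feedback linearization (illustrative only; the
# report's systems and controllers are more involved).
# True dynamics: x' = a*x**2 + u. The linearizing input u = -a*x**2 + v
# cancels the nonlinearity, so the closed loop behaves as x' = v.

def simulate(x0, a=2.0, k=5.0, dt=1e-3, steps=5000):
    """Euler-integrate the closed loop and return the final state."""
    x = x0
    for _ in range(steps):
        v = -k * x                   # linear controller for the linearized system
        u = -a * x * x + v           # cancel the nonlinear term exactly
        x += dt * (a * x * x + u)    # step the true dynamics
    return x

# With exact cancellation the closed loop is x' = -k*x, so the state
# decays to the origin from any initial condition.
print(abs(simulate(x0=1.0)) < 1e-3)
```

Because the cancellation is exact here, the loop is linear regardless of the size of a; replacing a in the controller (but not the plant) with an inaccurate estimate would demonstrate the modeling-error failure mode the abstract describes.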

**Advisor:** S. Shankar Sastry

BibTeX citation:

@mastersthesis{Estrada:EECS-2021-54, Author = {Estrada, Michael}, Title = {Toward the Control of Non-Linear, Non-Minimum Phase Systems via Feedback Linearization and Reinforcement Learning}, School = {EECS Department, University of California, Berkeley}, Year = {2021}, Month = {May}, URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-54.html}, Number = {UCB/EECS-2021-54}, Abstract = {The control of non-linear systems has historically been a difficult task due primarily to the need for additional analysis of the specific system being controlled in absence of the standard techniques of linear systems theory. While Feedback Linearization can be employed to leverage the techniques of linear systems theory for non-linear systems, error in the dynamic model used to construct the controller can lead to erroneous behavior. Similarly, in the case of a non-minimum phase system, even an exactly accurate linearizing controller can render the zero dynamics of the system unstable; a condition that must be actively considered and accounted for in the design of the controller and path to be followed. In this work, we present three example cases of non-linear, and later, non-minimum phase systems. The first, strictly non-linear example is that of racing a car. The technique of Learning Feedback Linearization is used to learn an exactly linearizing controller for a 2D racecar, represented by a dynamically extended bicycle model. A preprocessing network is further posited to convert the outputs of the linearizing controller to a separate set of inputs for the racecar. The latter two examples, both demonstrating non-minimum phase behavior, are those of Planar Vertical Take-Off and Landing in aircraft, and the control of a two-wheeled bicycle. In both cases, these problems are formulated and posed as Reinforcement Learning problems. The formulation of our solution for the latter examples attempts to learn how to correct a nominal path, provided as input to the algorithm, in an effort to maintain the stability of the zero dynamics. Our progress towards these goals is reported; some demonstrative example cases of all three systems are shown, and the progress and results of these experiments are discussed in depth. Finally, possible avenues of future work are discussed.} }

EndNote citation:

%0 Thesis %A Estrada, Michael %T Toward the Control of Non-Linear, Non-Minimum Phase Systems via Feedback Linearization and Reinforcement Learning %I EECS Department, University of California, Berkeley %D 2021 %8 May 12 %@ UCB/EECS-2021-54 %U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-54.html %F Estrada:EECS-2021-54