Model reference adaptive impedance control for physical human-robot interaction


Control Theory and Technology, 2016, Issue 1

Bakur ALQAUDI (1,2,3), Hamidreza MODARES (1,2), Isura RANATUNGA (1,2), Shaikh M. TOUSIF (1,2), Frank L. LEWIS (1,2,4), Dan O. POPA (1,2)

1. The University of Texas at Arlington Research Institute (UTARI), Fort Worth, TX 76118, U.S.A.

2. Department of Electrical Engineering, College of Engineering, The University of Texas at Arlington, Arlington, TX 76019, U.S.A.

3. Electronics and Instrumentation Engineering Technology Department, Yanbu Industrial College, P.O. Box 30436, 41912, Yanbu al-Sinaiyah, K.S.A.

4. State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang, Liaoning 110819, China

Received 18 December 2015; revised 7 January 2016; accepted 7 January 2016


This paper presents a novel enhanced human-robot interaction system based on model reference adaptive control. The presented method delivers guaranteed stability and task performance and has two control loops. A robot-specific inner loop, which is a neuroadaptive controller, learns the robot dynamics online and makes the robot respond like a prescribed impedance model. This loop uses no task information, including no prescribed trajectory. A task-specific outer loop takes into account the human operator dynamics and adapts the prescribed robot impedance model so that the combined human-robot system has desirable characteristics for task performance. This design is based on model reference adaptive control, but of a nonstandard form. The net result is a controller with both adaptive impedance characteristics and assistive inputs that augment the human operator to provide improved task performance of the human-robot team. Simulations verify the performance of the proposed controller in a repetitive point-to-point motion task. Actual experimental implementations on a PR2 robot further corroborate the effectiveness of the approach.

Keywords: Human-robot interaction, model reference adaptive control, model reference neuroadaptive control, impedance control

1 Introduction

Physical human-robot interaction (HRI) and cooperation have become significantly more important in recent years and are now a major focus of the robotics and control community. Empirical evidence suggests that human operators prefer physically embodied interactions over virtual or remote teleconference interactions [1]. Unlike ordinary industrial robotics, where the environment is structured and known, in HRI systems the robots interact with humans who have very different skills and capabilities. It is therefore of paramount importance for robots to adjust themselves to the skill level and capability of the human and to compensate for possible human mistakes due to fatigue, stress, etc.

Control of industrial robots has often focused on following a desired trajectory in a well-known and structured environment. For robot manipulators with unknown nonlinear dynamics, modeling inaccuracies, and disturbances, nonlinear adaptive robot controllers have often been designed based on computed torque control [2] and/or feedback linearization [3,4] to yield guaranteed trajectory following. Adaptive control using neural networks (NNs) has been successfully employed for control of uncertain robot systems in the literature. These adaptive control methods, however, do not consider the interaction between the robot and the environment or the human. When the robot is in contact with an object or a human, it must be able to control not only positions, but also forces.

Impedance control has been widely studied in robotics as a control technique to perform robotic contact tasks. The purpose of impedance control is to provide stable tracking during robot contact with the external environment [5-9] by regulating the mechanical impedance response of a robot to a desired reaction according to a given task. In trajectory following, the important feature is the tracking error dynamics. Therefore, impedance control in these applications has focused on making the tracking error dynamics behave like a prescribed impedance model [9-15]. Adaptive impedance control can be used to guarantee stable contact with unknown environments and to specify the desirable response of the robot to an external force profile. This can potentially be used to regulate the interactions between a robot and a human operator while dynamically performing a task. Various considerations have been taken into account to tune the impedance parameters. In [11], an adaptive impedance feedforward term was used based on task requirements. In [13], adaptive controllers based on neural networks were designed in which the error dynamics parameters were tuned to become closer to a prescribed error dynamics model.

Most existing adaptive NN-based controllers and adaptive impedance controllers focus on tracking error dynamics, and/or make the tracking error dynamics have a prescribed impedance characteristic. Moreover, the control torques derived in most of the existing literature depend on the prescribed impedance model parameters. The objective of trajectory following with error dynamics having prescribed impedance properties often restricts the applications of these approaches in human-robot interactive systems. Modern human-robot interactive systems must be capable of performing a wide range of tasks. Applications in industry, the military, aerospace and the gaming industry focus on semi-autonomous features of robotic systems in interacting with humans. This requires that task-specific controls include the effects of both the robot dynamics and the human dynamics, and their interactions. In this setting, trajectory following design for robot torque controllers is not suitable and limits system performance to a narrow range of tasks. In HRI systems, trajectory tracking objectives cannot be implemented solely by the inner robot control loop, because the human dynamics must be included in task trajectory following objectives.

This paper is motivated by human factors studies and, as opposed to most existing results, does not design a robot torque controller for trajectory following. The purpose of this paper is (1) to avoid the need for the human to learn robot-specific models, so that he can focus on the task, and (2) to adapt the robot performance to assist the human-robot system in performing the task. The contributions of this paper are as follows. An inner-loop torque controller is first designed to make the robot behave like a prescribed impedance model from the human force input to the robot motion coordinates. This means the human does not need to learn an inverse dynamics model to compensate for robot nonlinearities, and it is a completely different philosophy from making the trajectory error dynamics follow a prescribed impedance model [11]. Then, a task-specific outer-loop controller is designed, taking into account the human transfer characteristics, to tune the robot impedance model to assist the human in effectively performing the task. The outer-loop task-specific controller is designed to make the combined transfer function of the human and the robot resemble a desirable performance model based on task requirements. Techniques from model reference adaptive control are modified to accommodate the fact that the tunable impedance model appears after the plant, not before as in standard model reference adaptive control (MRAC). This task control loop incorporates a human dynamics system identifier. Adaptive tuning algorithms are given for the robot impedance model parameters, and proofs of performance are formally presented. Novel extensions to MRAC are made in the design of both the robot-specific inner-loop controller and the task-specific outer-loop controller.

This paper is organized as follows. Section 2 provides an overview of the design philosophy in this paper. Section 3 designs a neural network adaptive torque controller that makes the robot dynamics appear like a prescribed robot impedance model. This design is not based on trajectory following. In Section 4, an outer-loop controller is designed using a novel MRAC structure that takes into account both the human dynamics model and the prescribed robot impedance model to ensure the effective performance of a task. Adaptive methods are given for tuning the robot impedance model to assist the human in the performance of the task. Section 5 gives simulation results, and implementation results on a PR2 robot are given in Section 6.

2 Structure of adaptive human-robot interaction

In this section, we preview the overall control architecture developed in this paper. Two control loops are designed. These control loops are motivated by human factors studies [16-19] that show a human operator learns two components in performing tasks with a robotic system. He learns a robot-specific inverse dynamics model to compensate for the nonlinearities of the robot. This appears to occur in the cerebellum, where supervised learning is used to learn the environment [16]. Simultaneously, he learns a task-specific feedback control component that is particular to the successful performance of the task. Some recent work in adaptive impedance control follows this approach of a robot-specific impedance control inner-loop design followed by a task-specific outer-loop design that includes the human dynamics [20].

In this paper, a robot-specific inner loop is first designed to make the robot dynamics from the human operator input to the robot motion appear as a prescribed robot impedance model. The robot-specific inner-loop controller appears in Fig. 1 and is developed in Section 3. The objective in this loop is to design the controller torque $\tau$ to make the error between the robot position $q$ and the prescribed impedance model position $q_m$ go to zero; that is, to design $\tau$ so that $e_m = q - q_m$ goes to zero. The input to both the robot and the impedance model is the human torque $\tau_h$. This is not the same as the bulk of the work in robot impedance control [6] and neural network adaptive control [9-22], which is directed towards making a robot follow a prescribed trajectory and causing the trajectory error dynamics to follow a prescribed impedance model [11]. In our approach, no trajectory information and no information about the prescribed impedance model is needed for the inner-loop design. This leaves the freedom to incorporate all task information in an outer-loop design. It will be seen that the robot torque input does not depend on the impedance model parameters. This is in contrast to other adaptive impedance control approaches, which have a trajectory following objective [11].

Fig.1 Inner-loop robot-specific model reference neuroadaptive control.

An outer task-specific loop is next designed that considers the human operator dynamics. All task performance details are relegated to this outer-loop design. The task-specific outer-loop design is shown in Fig. 2 and developed in Section 4. The objective is to tune the robot impedance model, by designing the control input $u$ as described later, so that the position of the impedance model tracks the position $q_r$ of a reference model. This is a novel form of MRAC, of a different sort than that in Fig. 1. The application of MRAC must be modified, since the tunable robot impedance model appears after the unknown human plant model, not before it as in standard MRAC design. Human-robot interactive systems can perform a variety of quite general tasks. In this paper, we consider the task to be following a desired trajectory, as in point-to-point motion control by a human operator [20,21]. The task reference input $u_c(t)$ in Fig. 2 is then interpreted as the desired task trajectory to be followed by the combined human-robot system. The outer-loop design has two components. An assistive input is generated that helps the human in task performance, and the prescribed robot impedance model in Fig. 1 is adapted to enhance the human's task performance. This design must take into account the unknown human dynamics as well as the desired overall dynamics of the human-robot system, which depend on the task.

Fig.2 Outer-loop task-specific MRAC for adaptive human-robot interaction.

3 Inner-loop control design

In this section, the inner-loop torque controller for the robot manipulator shown in Fig. 1 is derived to make the robot dynamics from human operator input to robot motion appear like a prescribed robot impedance model. A neural network approximator is used to compensate for the unknown nonlinear robot dynamics. We call this approach neuroadaptive control. The detailed result of this design is shown in Fig. 3. No task trajectory information is needed in this design, so this work is different from most existing work in robot control and neural network control [23].

Fig.3 Model reference neuroadaptive controller.

3.1 Robot impedance model and model-following error dynamics

In this section, we formulate a novel control objective for an inner-loop robot controller that does not involve trajectory tracking.

The robot dynamics equation is adapted from [2]

where $q \in \mathbb{R}^n$ are the robot positions, $M(q)$ is the inertia matrix, $V(q,\dot{q})$ represents the Coriolis/centripetal forces, $G(q)$ is the gravity vector, and $F(\dot{q})$ is the friction term. The disturbance is $\tau_d \in \mathbb{R}^n$ and the human operator input is $\tau_h$. The control torque $\tau$ is to be designed to fulfil the control objective outlined above and detailed below.
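A form of (1) consistent with these symbol definitions and with the standard rigid-body manipulator model can be written as follows. Two points are assumptions here rather than something the text states: the sign convention (disturbance on the left, control and human inputs on the right), and the reading of $V(q,\dot{q})$ as a matrix that multiplies $\dot{q}$.

```latex
\[
M(q)\ddot{q} + V(q,\dot{q})\dot{q} + F(\dot{q}) + G(q) + \tau_d = \tau + \tau_h
\tag{1}
\]
```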

Equation (1) can be considered as being either in joint space or in Cartesian operational space. If it is in joint space, the inputs $\tau_d$, $\tau_h$ are torques. If it is in Cartesian space, the inputs $\tau_d$, $\tau_h$ are forces. Forces $f$ and torques $\tau$ are related by $\tau = J^T f$, where $J$ is the robot Jacobian matrix. The Cartesian inertia, Coriolis/centripetal, friction and gravity terms are likewise determined from their joint space counterparts by using the Jacobian matrix, according to standard techniques [2].

Select the prescribed robot impedance model whose dynamics are to be followed by the robot as

where $q_m(t)$ is the model trajectory, $M_m$ is the desired mass matrix, $D_m$ is the desired damping matrix, and $K_m$ is the desired spring constant matrix. The impedance parameters $M_m$, $D_m$, and $K_m$ will be designed in Section 4 in an outer task-specific loop that takes into account both the human operator dynamics and the task objectives.
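To make the role of the impedance parameters concrete, the following minimal sketch integrates a single-degree-of-freedom impedance model of the usual mass-damper-spring form $M_m\ddot{q}_m + D_m\dot{q}_m + K_m q_m = \tau_h$. That form is assumed here from the parameter definitions above, and the numerical values and the function name `simulate_impedance` are purely illustrative. The model produces $q_m$, $\dot{q}_m$ and $\ddot{q}_m$ from the human input alone, which is what the inner loop consumes.

```python
def simulate_impedance(m_m, d_m, k_m, tau_h, t_end=10.0, dt=1e-3):
    """Integrate m_m*q'' + d_m*q' + k_m*q = tau_h(t) from rest.

    Returns a list of (t, q_m, qd_m, qdd_m) samples.
    The second-order form is an assumption about equation (2).
    """
    q, qd = 0.0, 0.0
    samples = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        qdd = (tau_h(t) - d_m * qd - k_m * q) / m_m
        samples.append((t, q, qd, qdd))
        qd += dt * qdd  # semi-implicit Euler: update velocity first
        q += dt * qd    # then position, using the updated velocity
    return samples


# Hypothetical values: unit mass, critically damped, constant human force.
samples = simulate_impedance(1.0, 4.0, 4.0, lambda t: 2.0)
t_last, q_last, qd_last, _ = samples[-1]
print(f"q_m at t={t_last:.2f}s: {q_last:.4f}  (static value tau_h/K_m = {2.0 / 4.0})")
```

With a constant force the model settles at $\tau_h / K_m$, so a stiffer $K_m$ yields less motion for the same human effort, and $D_m$ and $M_m$ shape how quickly it gets there.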

Robot-loop control design objective. Design a robot torque controller that makes the robot dynamics (1) from the human input $\tau_h$ to the manipulator motion $q(t)$ behave like the prescribed impedance model (2). To this end, define the model-following error

and the sliding mode error

where $\Lambda$ is a symmetric, positive definite design parameter matrix. Since (4) is a stable system, considering $r(t)$ as input and $e_m(t)$ as output, the control torque $\tau$ in (1) is now designed to guarantee that $r(t)$ is bounded. This guarantees a bounded model-following error $e_m(t)$.
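The boundedness argument can be made explicit. Taking the model-following error as $e_m = q - q_m$ (as stated in Section 2) and assuming the usual filtered-error definition for (4), the error obeys a stable linear filter driven by $r$, and a standard bound follows for symmetric positive definite $\Lambda$:

```latex
\begin{align*}
e_m &= q - q_m, \\
r &= \dot{e}_m + \Lambda e_m
  \quad\Longleftrightarrow\quad
  \dot{e}_m = -\Lambda e_m + r, \\
\|e_m(t)\| &\le e^{-\lambda_{\min}(\Lambda)\,t}\,\|e_m(0)\|
  + \frac{1}{\lambda_{\min}(\Lambda)}\,\sup_{0\le s\le t}\|r(s)\| .
\end{align*}
```

A larger $\Lambda$ therefore tightens the bound on $e_m$ for a given bound on $r$.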

Using (1), (3), and (4), the dynamics of the sliding mode error are given by

is a nonlinear function of robot parameters which is assumed unknown. It is important to note that $f(x)$ does not depend on the impedance model parameters $M_m$, $D_m$, and $K_m$ in (2). This is in contrast to impedance control robot controllers that have a trajectory following objective [11], where a tracking error is used instead of the model-following error (3).
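One way to see why the impedance parameters drop out is to carry the substitution through. The sketch below assumes the robot model $M\ddot{q} + V\dot{q} + F + G + \tau_d = \tau + \tau_h$, the error $e_m = q - q_m$, and $r = \dot{e}_m + \Lambda e_m$; the paper's own numbered equations may arrange the signs differently.

```latex
\begin{align*}
\dot{q} &= r + \dot{q}_m - \Lambda e_m, \qquad
\ddot{q} = \dot{r} + \ddot{q}_m - \Lambda \dot{e}_m, \\
M(q)\,\dot{r} &= -V(q,\dot{q})\,r - f(x) + \tau + \tau_h - \tau_d, \\
f(x) &= M(q)\bigl(\ddot{q}_m - \Lambda \dot{e}_m\bigr)
      + V(q,\dot{q})\bigl(\dot{q}_m - \Lambda e_m\bigr)
      + F(\dot{q}) + G(q).
\end{align*}
```

Under these assumptions $f(x)$ involves only the robot terms $M$, $V$, $F$, $G$ and the model signals $q_m$, $\dot{q}_m$, $\ddot{q}_m$. The matrices $M_m$, $D_m$, $K_m$ never appear, because the impedance model enters only through the trajectory it generates.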

3.2 Neuroadaptive model-following controller

In this section, a control structure is given which uses a neural network (NN) to approximate the unknown function $f(x)$ in (7) and guarantees the stability of the model-following error (3). As a result, the robot dynamics (1) with human input $\tau_h$ appear as the prescribed impedance model (2). We call this a neuroadaptive model-following controller. The use of NNs in robot control is a standard approach used by many prior works [23]. In contrast to almost all these standard approaches, there is no trajectory-following objective here, so a desired reference trajectory is not needed by the neuroadaptive controller.

To provide an approximation for the unknown function $f(x)$ in (7), a neural network (NN) is introduced. According to the NN approximation property [24-30], the nonlinear function in (7) can be approximated by

where $W$ and $V$ are unknown ideal NN weights, $\sigma(\cdot)$ is a vector of activation functions, and $x$ is the NN input vector. It is known that the NN approximation error $\varepsilon$ is bounded on a compact set. Assume the ideal weights are bounded by a constant positive scalar $Z_B$ according to

with $\|\cdot\|_F$ the Frobenius norm. Define the matrix $\hat{Z}$ commensurately with the definition of $Z$.
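For reference, the two-layer NN approximation property and weight bound described here are usually written as below. This is the common form in the NN control literature and is assumed, not confirmed, to match (8), (9) and (11); in particular the block-diagonal definition of $Z$ is an assumption.

```latex
\begin{align*}
f(x) &= W^{T}\sigma\!\left(V^{T}x\right) + \varepsilon, \tag{8}\\
\|Z\|_{F} &\le Z_{B}, \qquad Z = \operatorname{diag}\{W,\,V\}, \tag{9}\\
\hat{f}(x) &= \hat{W}^{T}\sigma\!\left(\hat{V}^{T}x\right), \qquad
\hat{Z} = \operatorname{diag}\{\hat{W},\,\hat{V}\}. \tag{11}
\end{align*}
```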

To make the model-following error defined in (3) stable, and consequently make the robot dynamics (1) behave like the prescribed impedance model, the control torque is designed as

where $K_v r$ is a proportional-plus-derivative loop with $K_v = K_v^T$ a gain matrix, and

is the NN approximation for the unknown function $f(x)$, and

with $K_z > 0$ a scalar gain, is a robustifying signal that compensates for unmodeled and unstructured disturbances.

It is shown in Theorem 1 how to tune the NN weights $\hat{V}$ and $\hat{W}$ such that the control torque in (10) makes the model-following error (3) bounded, and consequently the robot dynamics (1) from human input $\tau_h$ to the output $q(t)$ behave like the prescribed robot impedance model (2).

Remark 1. The structure of the robot controller designed here is given in Fig. 3. Two points are important to note. First, this controller guarantees model-following behavior of the robot dynamics (1) given the prescribed robot impedance model (2), based on the model-following error (3). There is no objective for tracking a desired trajectory, in contrast to almost all existing work in robot control [23]. Second, the impedance model parameters $M_m$, $D_m$, and $K_m$ do not appear in the control law (10) or in the function $f(x)$ in (7), so the NN does not need to identify the already-known impedance model parameters. This is reflected in Fig. 3, where the prescribed impedance model (2) does not appear. This is in contrast to the work on adaptive impedance control based on trajectory tracking error dynamics [11]. As a result, the approach given here cleanly decouples the robot-specific control design from the task-specific control design, which is given in the next section. This is in keeping with human factors studies [17], which indicate that the human learns two control components in task performance, one to compensate for nonlinear robot dynamics and one to assure task performance.

4 Outer-loop model reference adaptive HRI controller

In this paper, the task-specific outer-loop controller is designed using extensions of model reference adaptive control. The pioneering research work on model reference adaptive control (MRAC) was carried out during the 1960s by H. P. Whitaker, P. V. Osburn and A. Kezer. Initial work in MRAC depended on gradient descent algorithms, including the MIT rule [31]. More rigorous Lyapunov designs for MRAC were proposed by P. C. Parks [32]. In [33-38], general approaches to MRAC design and its applications were developed. Seminal work was done by [37] and others.

The objective in this section is to design the human-robot interaction task-specific controller in Fig. 2 that takes into account the human dynamics, which are unknown, and the task objectives. The detailed result is in Fig. 4. It will be seen that this task-loop controller performs two functions. It adapts the parameters of the robot impedance model (2) so that the task performance of the human-robot system is improved, and it also provides assistive inputs that enhance the human's task performance. No robot-specific information is needed in the task-loop design presented in this section. This decoupling of control objectives goes along with the human factors studies in [17].

Fig.4 Overall system of model reference adaptive control.

4.1 Model reference adaptive control(MRAC)formulation of adaptive HRI

The problem of adapting the robot impedance model in Fig. 2 to assist the human in performing a task is now formulated as a nonstandard MRAC problem. The challenge to be overcome is that the tunable compensator is the prescribed robot impedance model in Figs. 2 and 4, which occurs after the unknown plant (the human dynamics), not before, as in standard MRAC. This problem is overcome by adding a system identifier for the human dynamics.

The prescribed robot impedance model (2) has $q_m(t) \in \mathbb{R}^n$, with $n$ the number of degrees of freedom of the robot. It is assumed here that the robot dynamics (1) are in Cartesian task space, so that for $n = 6$ degrees of freedom, the vector $q(t)$ has three position components and three angular rotation components [2]. Regarding the human transfer characteristic, it is known from human factors neurocognitive studies [17] that, in human-robot interactive task performance, the human adapts to compensate for robot dynamics nonlinearities and also learns task-specific controls. However, after learning, it has been observed that the expert operator exhibits the transfer characteristics of a simple linear model with a time delay. For many tasks, this human operator model is a first-order linear system of the form [21]

In this section, it is assumed that the human transfer matrices $A$ and $B$ are unknown. It is observed in human task studies that there is a reaction time delay $\tau$ that is independent of the particular operator once the task has been learned, and is almost constant at 0.4 s [20,21]. It can therefore be compensated for, so that, without loss of generality, the delay $\tau$ can be taken as zero in (13) by shifting the measured time signals.

Regarding the task reference model in Fig. 2, it is further observed in human-robot interactive task learning studies that the human operator adapts to make the overall transfer characteristic of the human-robot system appear as a simple linear first-order system with high bandwidth. This is known as the crossover model. Specifically [21], the skilled operator in a man-machine system adapts his own dynamics to make the total system transfer characteristic of the human-plus-robot remain unchanged over wide variations in the robot dynamics. The total human-robot transfer characteristic is therefore prescribed here as the task reference model

with prescribed matrices $A_m$ and $B_m$. These parameters are selected based on the specific task.

The class of tasks depicted in Fig. 2 includes trajectory-following tasks where the human operates the robot to follow a prescribed trajectory. This includes point-to-point motion tasks in force fields as studied in [20,21]. This class of tasks can be considered as having a model-following objective based on the overall task reference model (14), given the unknown human dynamics (13) and the robot response detailed by the robot impedance model (2).

Remark 2. It is noted that if the task is trajectory following by the man-machine system, the parameters of the task reference model (14) should be selected so that $A_m = B_m$. This gives a low-frequency gain of 1, so that the trajectory is followed with zero steady-state error. The matrix $A_m$ should be selected based on the desired transient response characteristics of the man-machine system. This choice of task reference model does not restrict the objective to following constant trajectories. If the trajectory is time varying, a suitable choice of the time-constant matrix will still result in good trajectory following.
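The unity-gain property in Remark 2 can be checked with a short scalar calculation. The sketch below assumes the task reference model has the first-order form $\dot{q}_r = -a_m q_r + b_m u_c$ (consistent with the scalar models quoted in the simulation section) and uses its closed-form step response; the function names are illustrative.

```python
import math


def dc_gain(a, b):
    """Low-frequency gain of x' = -a*x + b*u (a > 0)."""
    return b / a


def step_response(a, b, t, u=1.0):
    """Closed-form response of x' = -a*x + b*u, x(0) = 0, constant input u."""
    return (b / a) * (1.0 - math.exp(-a * t)) * u


A_M = B_M = 12.0  # values quoted for the task reference model in Section 5.1

print("DC gain with A_m = B_m:", dc_gain(A_M, B_M))
print("time constant 1/A_m [s]:", 1.0 / A_M)
print("response after one time constant:", step_response(A_M, B_M, 1.0 / A_M))
print("DC gain if B_m were halved:", dc_gain(A_M, B_M / 2.0))
```

Choosing $A_m = B_m$ fixes the steady-state value at the commanded point, while the magnitude of $A_m$ alone sets how fast the reference model gets there.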

In Section 3, it was assumed that the prescribed robot impedance model (2) is of second order. However, the model does not appear in the control design given in Theorem 1. Only the model motion trajectory $q_m$, $\dot{q}_m$, and $\ddot{q}_m$ is needed in the design of the robot-specific controller there. Therefore, in this section we take a nominal prescribed robot impedance model as

where $A_n$ and $B_n$ are initial nominal matrices. It is shown in the following how the overall prescribed impedance model will be changed and tuned by the MRAC design to assist the human in performing a task.

Based on the above and referring to Figs. 2 and 4, consider the dynamics for the human, nominal robot impedance model, and task reference model, given respectively by

Here, the prescribed task trajectory is $u_c(t)$, and an MRAC control law is to be designed for the control input $u(t)$ in (17).
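The three first-order systems can be written out in a form that matches the scalar instances quoted in the simulation section ($\dot{\tau}_h = -\tau_h + 0.5u_c$, $\dot{q}_m = -3q_m + 3u$, $\dot{q}_r = -12q_r + 12u_c$) and the later statements that $-A$ and $-A_m$ are stable. The matrix form below is inferred from those instances and should be read as an assumed reconstruction of (16)-(18).

```latex
\begin{align*}
\dot{\tau}_h &= -A\,\tau_h + B\,u_c, \tag{16}\\
\dot{q}_m &= -A_n\,q_m + B_n\,u, \tag{17}\\
\dot{q}_r &= -A_m\,q_r + B_m\,u_c. \tag{18}
\end{align*}
```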

4.2 Adaptive impedance control and human assistive inputs using Lyapunov design

Given this setup, the basic concept of model reference adaptive control (MRAC) [33-37] can be used in this section to approach the design of the task loop of Fig. 2. The dynamics for the human, robot impedance model, and task reference model are given respectively by (16)-(18).

Unfortunately, applying MRAC to this problem is complicated by the fact that in standard MRAC, the tunable controller appears before the unknown plant dynamics and provides its control input so that the plant has the transfer characteristics of the reference model. By contrast, in adaptive impedance control for human-robot interaction (Fig. 2), the tunable impedance model occurs after the unknown human dynamics. This causes some complications and requires the introduction of a system identifier for the human dynamics. The overall setup for adaptive HRI using the MRAC approach is given in Fig. 4. The approach given here provides a formal model-following stability proof using Lyapunov techniques, and formalizes the human dynamics identifier approach used in [21].
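For contrast with the nonstandard structure used in this paper, the sketch below shows the standard arrangement in its simplest textbook form: a scalar Lyapunov-based MRAC in which the tunable gains sit in front of an unknown first-order plant. It is not the controller of Theorem 1. The plant and reference values are borrowed from the nominal impedance model and task reference model quoted in Section 5.1 purely for illustration, and the adaptation gain, function names and square-wave command are assumptions.

```python
def square_wave(t, period=4.0):
    """Point-to-point command: 1 for the first half of each period, else 0."""
    return 1.0 if (t % period) < period / 2.0 else 0.0


def run_standard_mrac(a=3.0, b=3.0, a_m=12.0, b_m=12.0, gamma=10.0,
                      t_end=40.0, dt=1e-3, period=4.0):
    """Textbook scalar MRAC: plant y' = -a*y + b*u, reference y_r' = -a_m*y_r + b_m*u_c.

    The controller u = k_r*u_c + k_y*y does not use a or b, only sign(b) > 0.
    Returns (mse_first_cycle, mse_last_cycle, k_r, k_y).
    """
    y = y_r = 0.0
    k_r = k_y = 0.0
    n_cycle = int(round(period / dt))
    sq_err = []
    for i in range(int(round(t_end / dt))):
        u_c = square_wave(i * dt, period)
        e = y - y_r                      # model-following error
        sq_err.append(e * e)
        u = k_r * u_c + k_y * y          # tunable gains placed before the plant
        # Lyapunov-based adaptation laws (forward-Euler discretization)
        k_r += dt * (-gamma * e * u_c)
        k_y += dt * (-gamma * e * y)
        y, y_r = (y + dt * (-a * y + b * u),
                  y_r + dt * (-a_m * y_r + b_m * u_c))
    mse_first = sum(sq_err[:n_cycle]) / n_cycle
    mse_last = sum(sq_err[-n_cycle:]) / n_cycle
    return mse_first, mse_last, k_r, k_y


mse_first, mse_last, k_r, k_y = run_standard_mrac()
print(f"mean squared error, first cycle: {mse_first:.5f}")
print(f"mean squared error, last cycle:  {mse_last:.5f}")
print(f"adapted gains: k_r = {k_r:.3f}, k_y = {k_y:.3f}  (matching values: 4, -3)")
```

In this standard setting the adapted gains directly shape the plant input. In the human-robot problem the unknown element (the human) comes first and cannot be driven by the adapted parameters, which is why the paper adds a human dynamics identifier.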

Task-loop control design objective. Design an MRAC for the control input $u(t)$ so that the combined human-robot transfer function is equal to the prescribed task reference model (18). See Fig. 2.

It will be seen that the MRAC for $u(t)$ has two components. One component tunes the parameters of the robot impedance model (17). The robot impedance model (17) then provides the model reference trajectory $q_m$, $\dot{q}_m$, and $\ddot{q}_m$ used in the inner-loop torque controller of Fig. 1 and Theorem 1, through the sliding mode error (4) and the NN input vector $x$. The second component of the MRAC provides assistive inputs that augment the operator's output $\tau_h(t)$ to enhance his task performance. See the comments at the end of Theorem 1.

The human transfer function (16) is unknown. Therefore, a system identifier is introduced as

for the human response. Define the human response estimation error between $\tau_h$ and the identifier output. Then, the estimation error dynamics become

where the identifier parameter errors are the differences between the true and estimated human model matrices. Now, consider the control law

where $\theta_1$, $\theta_2$, $\theta_3$ and $\theta_4$ are tunable matrices of appropriate dimension. The overall system is illustrated in Fig. 4. To derive tuning laws for the parameters $\theta_1$, $\theta_2$, $\theta_3$, $\theta_4$, and the identifier parameters such that the control objective is achieved, define the model-following output error as

Substituting the control law (21) into this equation and manipulating yields the model-following error dynamics as

The next result provides tuning laws for the control parameters in (21), the human dynamics identifier (19), and the neural network weights for the inner-loop controller in (10) that make the overall human-robot system behave like the prescribed reference model (14).

Theorem 1. Consider the prescribed impedance model (2) and the robot dynamics (1) with control input (10) for the inner-loop controller. Consider the unknown human dynamics (16), the robot impedance model (17), and the outer-loop control input (21). Tune the NN weights in the inner-loop controller (10) as

where $F$ and $G$ are symmetric positive definite matrices and $\kappa > 0$ is a small design parameter. Tune the outer-loop control parameters in (21) according to

with $P_m > 0$ and $P_h > 0$, and the parameters in the human system identifier (19) according to

Then, the inner-loop model-following error $e_m(t)$, the outer-loop model-following error $e(t)$, and the human response estimation error are bounded, so that the product of the human dynamics and robot dynamics follows the task reference model (18).

Proof. Define a Lyapunov function as

with the weight estimation errors defined accordingly. Differentiating this Lyapunov function and using (24) yields

Since $-A_m$ is Hurwitz, there exists a $Q_m > 0$ such that

The robot manipulator dynamics (1) are assumed to be unknown, and therefore the function $f$ in (7) is unknown and is approximated online by (11). Then, the closed-loop filtered error dynamics (6) become

Substituting (25), (26), (32) and (33) into (30) gives

Noting that, since $-A$ is stable, there exists a $Q_h > 0$ such that $A^T P_h + P_h A = Q_h$, and using the properties of the trace together with the tuning rules for the inner- and outer-loop controllers, gives

where $K_{v\min}$ is the minimum singular value of $K_v$ in the last inequality. $\dot{L}$ is negative as long as the term in braces is positive. Defining $C_3 = Z_B + C_1/\kappa$ and completing the square yields

which is guaranteed positive as long as either

This result provides a method for tuning the robot impedance model (17) to provide a desired model reference output $q_r(t)$ such that the human (16) plus robot impedance model follows the prescribed task reference model (18). This output is sent as $q_m(t)$ to the inner robot control loop in Fig. 3 to compute the inner-loop model-following error (3). The human input in Fig. 3 and in (10) is $\tau_h(t)$. These relationships are shown in Figs. 2 and 4.

It is interesting to examine the operation of the control input (21). After convergence of the human system identifier, the identifier output matches the human response. Then, the closed-loop robot impedance model is

Therefore, the control parameter $\theta_2$ modifies the robot impedance model time constant, whereas the control parameters $\theta_1$, $\theta_3$, and $\theta_4$ provide a proportional-plus-derivative controller that augments the human force signal $\tau_h$. This can be viewed as an assistive term that aids the human so that task performance is improved. In fact, it is observed in [21] that the expert human operator, after learning to accomplish a task, incorporates a PD controller that seems to come from a task model learned in the cerebellum [17].

5 Simulation

In this section, the results from simulating the proposed controllers on a 2-link robotic arm in MATLAB are presented. The 2-link robot arm is a revolute-revolute planar arm described in [2, Example 3.2-2]. First, the simulation results are shown for the outer-loop controller in Figs. 2 and 4 that adapts the parameters of the prescribed robot impedance model. Next, the simulation results are shown for the inner-loop model reference neuroadaptive controller in Figs. 1 and 3.
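For readers who want to reproduce the plant, the following sketch codes the widely used dynamics of a two-link revolute-revolute planar arm with point masses at the link ends and joint angles measured from the horizontal. It is assumed, not verified here, that this matches the cited example in [2]; the parameter values are those listed in Section 5.2, and the function names are illustrative. Friction is omitted.

```python
import math

# Arm parameters as listed in Section 5.2
M1, M2 = 0.8, 2.3    # link masses [kg]
L1, L2 = 1.0, 1.0    # link lengths [m]
G0 = 9.8             # gravitational acceleration [m/s^2]


def inertia_matrix(q):
    """Inertia matrix M(q) of the two-link planar arm (assumed standard form)."""
    c2 = math.cos(q[1])
    m11 = (M1 + M2) * L1 ** 2 + M2 * L2 ** 2 + 2.0 * M2 * L1 * L2 * c2
    m12 = M2 * L2 ** 2 + M2 * L1 * L2 * c2
    m22 = M2 * L2 ** 2
    return [[m11, m12], [m12, m22]]


def coriolis_vector(q, qd):
    """Coriolis/centripetal force vector V(q, qd)."""
    s2 = math.sin(q[1])
    return [-M2 * L1 * L2 * (2.0 * qd[0] * qd[1] + qd[1] ** 2) * s2,
            M2 * L1 * L2 * qd[0] ** 2 * s2]


def gravity_vector(q):
    """Gravity vector G(q), angles measured from the horizontal."""
    c1 = math.cos(q[0])
    c12 = math.cos(q[0] + q[1])
    return [(M1 + M2) * G0 * L1 * c1 + M2 * G0 * L2 * c12,
            M2 * G0 * L2 * c12]


# Arm stretched out horizontally: largest inertia and largest gravity load.
print("M(0,0) =", inertia_matrix([0.0, 0.0]))
print("G(0,0) =", gravity_vector([0.0, 0.0]))
```

In the paper these terms are treated as unknown and lumped into $f(x)$; they are needed only to simulate the robot, not to implement the controller.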

5.1 Outer-loop simulation

This simulation is for the outer task loop shown in Figs. 2 and 4. In this simulation, the prescribed robot impedance model (2) is chosen to have $q_m(t) \in \mathbb{R}^2$, with $n = 2$ the number of degrees of freedom of the robot. It is assumed here that the robot dynamics (1) are in Cartesian task space. The nominal robot impedance model (17) is chosen for each degree of freedom as $\dot{q}_m = -3q_m + 3u$. These nominal time constant and gain parameters are modified through the action of the adaptive control (21). See the discussion after Theorem 1. The task reference model (18) is taken as $\dot{q}_r = -12q_r + 12u_c$, where $u_c$ is the desired trajectory to be reached in a point-to-point motion task. The human dynamics model (16) is chosen as $\dot{\tau}_h = -\tau_h + 0.5u_c$. This model is unknown to the controller, and the human system identifier model (19) is designed to adaptively identify the human in the loop.
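A quick open-loop calculation shows why adaptation is needed with these numbers. The sketch below cascades the human model into the nominal impedance model under the simplifying assumption $u = \tau_h$ (no assistive terms and no tuning, which is not the control law (21)) and compares the result with the task reference model for a unit step command. The function name and step sizes are illustrative.

```python
def simulate_open_loop(t_end=10.0, dt=1e-3, u_c=1.0):
    """Forward-Euler simulation of the three scalar models for a constant u_c.

    Assumption: the impedance model is driven directly by tau_h (u = tau_h).
    Returns the final values (tau_h, q_m, q_r).
    """
    tau_h = q_m = q_r = 0.0
    for _ in range(int(round(t_end / dt))):
        d_tau_h = -1.0 * tau_h + 0.5 * u_c   # human model (16), Section 5.1 values
        d_q_m = -3.0 * q_m + 3.0 * tau_h     # nominal impedance model (17), u = tau_h
        d_q_r = -12.0 * q_r + 12.0 * u_c     # task reference model (18)
        tau_h += dt * d_tau_h
        q_m += dt * d_q_m
        q_r += dt * d_q_r
    return tau_h, q_m, q_r


tau_h_final, q_m_final, q_r_final = simulate_open_loop()
print(f"human output:          {tau_h_final:.4f}")
print(f"nominal impedance out: {q_m_final:.4f}")
print(f"task reference output: {q_r_final:.4f}")
```

Without adaptation the human-plus-nominal-model cascade settles at about half the commanded point and responds with a time constant near 1 s, whereas the reference model reaches the target with a time constant of 1/12 s. Closing that gap is the job of the tuned parameters and assistive inputs.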

The performance of the outer task loop MRAC in Fig. 4 is shown in Figs. 5-7. A square wave is selected for the task reference input $u_c(t)$. This is interpreted as a point-to-point motion task where the human-robot system is required to cycle from one point to another repetitively. Fig. 5 shows the output of the robot impedance model $q_m(t)$ and the output $q_r(t)$ of the task reference model. It is seen that $q_m(t)$ closely follows $q_r(t)$, with performance improving after several cycles. This shows the adaptive improvement of the controller as the robot impedance model is tuned and the assistive inputs to the human are learned. See the discussion after Theorem 1. The effectiveness of the human system identifier is revealed in Fig. 6, which shows the output of the human transfer function $\tau_h(t)$ and the output of the human identifier, which follows $\tau_h(t)$ more closely with each motion cycle. The convergence of the human identifier parameters to the actual human model parameters is shown in Fig. 7.

Fig.5 Robot impedance model output $q_m(t)$ and prescribed task reference output $q_r(t)$.

Fig.6 Human output $\tau_h(t)$ and human identifier output $\hat{\tau}_h(t)$.

Fig.7 Parameter convergence of adaptive human identifier model.

5.2 Inner-loop simulation

This simulation is for the inner robot control loop of Figs. 1 and 3. The outer-loop design just described generates the human operator signal $\tau_h(t)$ and the robot impedance model trajectory $q_m(t)$. Two parallel outer loops were used, one for each joint of the 2-link robot arm simulated here.

The robot dynamics (1) used for this simulation are those of the 2-link revolute-revolute planar robot arm described in [2, Example 3.2-2]. The arm parameters are selected as $m_1=0.8$ kg, $m_2=2.3$ kg, $l_1=1$ m, $l_2=1$ m, and $g=9.8$ m/s$^2$. The controller parameters used in Theorem 1 were $K_v=I_2$, $\Lambda=5I_2$, $F=100I_2$, $G=20I_2$, $\kappa=0.07$, $K_z=5$, and $Z_B=100$, where $I_2$ is the $2\times 2$ identity matrix. A two-layer neural network was used with 10 inputs, including a constant bias input, 20 hidden-layer neurons, and 2 outputs. The sigmoid function was used for the activation functions. The weights of the network were randomly initialized.
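For readers who want to reproduce the plant, the arm can be sketched with the standard point-mass model of a two-link planar elbow arm, $M(q)\ddot{q}+V(q,\dot{q})+G(q)=\tau$. That this is the exact form of [2, Example 3.2-2] is an assumption here, and friction and disturbance terms are omitted. The masses, lengths, and gravity constant are those quoted above; the function names are hypothetical.

```python
import numpy as np

M1, M2 = 0.8, 2.3   # link masses [kg] (from the text)
L1, L2 = 1.0, 1.0   # link lengths [m] (from the text)
GRAV = 9.8          # gravitational acceleration [m/s^2] (from the text)


def inertia_matrix(q):
    """Inertia matrix M(q) of the point-mass two-link planar arm."""
    c2 = np.cos(q[1])
    m11 = (M1 + M2) * L1**2 + M2 * L2**2 + 2.0 * M2 * L1 * L2 * c2
    m12 = M2 * L2**2 + M2 * L1 * L2 * c2
    m22 = M2 * L2**2
    return np.array([[m11, m12], [m12, m22]])


def coriolis_vector(q, qd):
    """Coriolis/centripetal vector V(q, qd)."""
    s2 = np.sin(q[1])
    return np.array([
        -M2 * L1 * L2 * (2.0 * qd[0] * qd[1] + qd[1]**2) * s2,
        M2 * L1 * L2 * qd[0]**2 * s2,
    ])


def gravity_vector(q):
    """Gravity vector G(q), with joint angles measured from the horizontal."""
    c1, c12 = np.cos(q[0]), np.cos(q[0] + q[1])
    return np.array([
        (M1 + M2) * GRAV * L1 * c1 + M2 * GRAV * L2 * c12,
        M2 * GRAV * L2 * c12,
    ])


def forward_dynamics(q, qd, tau):
    """Joint accelerations from M(q) qdd + V(q, qd) + G(q) = tau."""
    rhs = tau - coriolis_vector(q, qd) - gravity_vector(q)
    return np.linalg.solve(inertia_matrix(q), rhs)
```

A simulation would integrate `forward_dynamics` with the control torque supplied by the inner-loop controller. The neuroadaptive controller itself is not shown here, since it is designed precisely so that it does not need these model terms.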

The simulation results for both links are shown in Fig. 8, where $q_{1d}(t)$ and $q_{2d}(t)$ denote the two components of the task trajectory $u_c(t)$. It is observed that, after a short transient learning period of a few cycles of the square-wave task trajectory, the motion $q_m(t)$ generated by the robot impedance model and the robot motion $q(t)$ essentially coincide. This verifies the performance of the model reference neuroadaptive controller in making the robot arm behave like the robot impedance model.

Fig.8 Inner-loop simulation.(a)Joint angles for Joint 1.(b)Joint angles for Joint 2.(c)Joint velocity for Joint 1.(d)Joint velocity for Joint 2.

Overall performance of the proposed controller. It can be seen from the simulation results that the two controllers, the inner robot loop and the outer task loop, achieve the objectives of the design. The outer loop assists the human in achieving the task by providing two assistive components and tuning the robot impedance model. The robot-specific inner-loop controller compensates for the robot nonlinearities and makes the robot behave like this robot impedance model.

6 Experimental case study

In this section, a case study of a practical experiment to evaluate the controllers of the human-robot interaction system is presented. The experiments were conducted at the University of Texas at Arlington Research Institute on a PR2 robot. Fig. 9 shows the experimental layout and Fig. 10 shows the PR2 robot. The controller was implemented using the real-time controller manager framework of the PR2 in ROS Groovy. The real-time loop on the PR2 runs at 1000 Hz and communicates with the sensors and actuators over an EtherCAT network. Human force is measured using an ATI Mini40 force/torque (FT) sensor attached between the gripper and forearm of the PR2.

The experiment involves the seven-degree-of-freedom arm of the PR2 robot in a point-to-point (PTP) motion task. PTP manipulation is an increasingly popular task, both in the game industry and in industrial applications.

In this experiment a human applies a force on the right arm of the PR2 to follow the PTP motion trajectory, as shown in Fig. 9. The experiment is set up with a human operator and the PR2 arm across from each other, as seen in Fig. 10. The human operator was asked to hold the gripper of the PR2 and perform PTP motion between point A and point B along the $y$ axis. The human is assumed to be working in open loop, without considering visual feedback of the current location and the target location of the gripper. The desired target location to be reached is switched every 5 seconds.

Fig.9 Experiment layout.

Fig.10 PR2 robot at UTARI.

The controller parameters used were $K_v=5I_6$, $\Lambda=20I_6$, $F=100I_6$, $G=200I_6$, $\kappa=0.3$, $K_z=0.001$, and $Z_B=100$, where $I_6$ is the $6\times 6$ identity matrix. A two-layer neural network was used with 35 inputs, including the bias input, 10 hidden-layer neurons, and 7 outputs. The sigmoid function was used for the activation functions. The weights of the network were randomly initialized.
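The network dimensions quoted above can be made concrete with a short forward-pass sketch. It assumes a common two-layer form, $y=W^{T}\sigma(V^{T}x)$, with a logistic sigmoid and the bias entering as a constant first input. The weight initialization range is an arbitrary choice, and which 34 measured signals feed the network is not stated in this section, so a placeholder vector is used. The weight tuning laws of Theorem 1 are not shown.

```python
import numpy as np

N_INPUTS, N_HIDDEN, N_OUTPUTS = 35, 10, 7  # sizes quoted in the text


def sigmoid(z):
    """Logistic sigmoid (assumed form of the activation)."""
    return 1.0 / (1.0 + np.exp(-z))


def init_weights(rng, scale=0.1):
    """Random initial weights; the uniform range is an assumption."""
    V = rng.uniform(-scale, scale, size=(N_INPUTS, N_HIDDEN))   # first layer
    W = rng.uniform(-scale, scale, size=(N_HIDDEN, N_OUTPUTS))  # second layer
    return V, W


def nn_output(V, W, signals):
    """Evaluate y = W^T sigmoid(V^T x), with x = [1, signals]."""
    x = np.concatenate(([1.0], signals))  # constant bias input comes first
    assert x.shape == (N_INPUTS,)
    return W.T @ sigmoid(V.T @ x)


rng = np.random.default_rng(0)
V, W = init_weights(rng)
y = nn_output(V, W, np.zeros(N_INPUTS - 1))  # placeholder input signals
print(y.shape)  # (7,)
```

In this form the network has 35 x 10 first-layer and 10 x 7 second-layer weights, i.e. 420 adjustable parameters, which is small enough to update inside a 1000 Hz control loop.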

The result for the whole human-robot interaction system is shown in Fig. 11. The task trajectory (dashed line) gives the target point locations, which switch every 5 seconds. The task reference model output is shown with a dotted line, the robot impedance model output with a dash-dot line, and the real robot output with a solid line. It is seen that the inner-loop neuroadaptive controller makes the robot (solid line) follow the robot impedance model output (dash-dot line), and the outer-loop MRAC makes the human-robot interactive team follow the prescribed task model. This is accomplished after a short transient learning time, during which the adaptation mechanism tunes the whole system in the first 6 seconds. There is a small time delay of 0.4 s due to the human reaction time.

Fig.11 Simulation of the human-robot interactive system.

7 Conclusions

This paper presents a novel method of enhancing human-robot interaction based on model reference adaptive control. The method presented delivers guaranteed stability and task performance and has two control loops. A robot-specific inner loop is a model reference neuroadaptive controller that learns the robot dynamics online and makes the robot respond like a prescribed impedance model. This loop uses no task information, including no prescribed trajectory. A task-specific outer loop takes into account the human operator dynamics and adapts the prescribed robot impedance model so that the combined human-robot system has desirable characteristics for task performance. This design is also based on model reference adaptive control, but of a nonstandard form. The net result is a controller with both adaptive impedance characteristics and assistive inputs that augment the human operator to provide improved task performance of the human-robot team. Simulations verify the performance of the proposed controller in a repetitive point-to-point motion task. Actual experimental implementations on a PR2 robot further corroborate the effectiveness of the approach.

[1]J.Wainer,D.J.Feil-Seifer,D.A.Shell,et al.The role of physical embodiment in human-robot interaction.The 15th IEEE International Symposium on Robot and Human Interactive Communication,Hatfield:IEEE,2006:117-122.

[2]F.L.Lewis,D.Dawson,M.Abdallah,et al.Robot Manipulator Control:Theory and Practice.Boca Raton:CRC Press,2003.

[3]J.J.E.Slotine,W.Li.Applied Nonlinear Control,Englewood Cliffs:Prentice Hall,1991.

[4]M.Jamshidi,B.J.Oh,H.Seraji.Two adaptive control structures of robot manipulators.Journal of Intelligent and Robotic Systems,1992,6(2/3):203-218.

[5]N.Hogan.Impedance control:an approach to manipulation.Proceedings of the American Control Conference,San Diego:IEEE,1984:304-313.

[6]R.Anderson,M.W.Spong.Hybrid impedance control of robotic manipulators.IEEE Journal on Robotics and Automation,1988,4(5):549-556.

[7]H.Kawasaki,R.Taniuchi.Adaptive control for robotic manipulators executing multilateral constrained task.Asian Journal of Control,2003,5(1):1-11.

[8]H.Wu,W.Xu,C.Cai.Adaptive impedance control in robotic cell injection system.The 17th International Conference on Methods and Models in Automation and Robotics,Miedzyzdrojie:IEEE,2012:268-275.

[9]S.S.Ge,C.C.Hang,T.H.Lee,et al.Stable Adaptive Neural Network Control.Boston:Kluwer Academic,2001.

[10]M.K.Vukobratovic,A.G.Rodic,Y.Ekalo.Impedance control as a particular case of the unified approach to the control of robots interacting with a dynamic known environment.Journal of Intelligent&Robotic Systems,1992,18(2):191-204.

[11]E.Gribovskaya,A.Kheddar,A.Billard.Motion learning and adaptive impedance for robot control during physical interaction with humans.IEEE International Conference on Robotics and Automation,Shanghai:IEEE,2011:4326-4332.

[12]L.Huang,S.S.Ge,T.H.Lee.Neural network adaptive impedance control of constrained robots.IEEE International Symposium on Intelligent Control,Vancouver:IEEE,2002:615-619.

[13]C.Wang,Y.Li,S.Ge,et al.Continuous critic learning for robot control in physical human-robot interaction.Proceedings of the 13th International Conference on Control,Automation and Systems,Gwangju:IEEE,2013:833-838.

[14]Y.Li,S.S.Ge,C.Yang.Impedance control for multi-point human-robot interaction.Proceedings of the 8th Asian Control Conference,Kaohsiung:IEEE,2011:1187-1192.

[15]T.Tsuji,Y.Tanaka.Tracking control properties of human-robotic systems based on impedance control.IEEE Transactions on Systems,Man and Cybernetics-Part A:Systems and Humans,2005,35(4):523-535.

[16]K.Doya,H.Kimura,M.Kawato.Neural mechanisms of learning and control.IEEE Control Systems Magazine,2001,21(4):42-54.

[17]D.Wolpert,M.Miall,R.Chris,et al.Internal models in the cerebellum.Trends in Cognitive Sciences,1998,2(9):338-347.

[18]D.Kleinman,L.Baron,S.Baron,et al.An optimal control model of human response-Part I:Theory and validation.Automatica,1970,6(3):357-369.

[19]R.C.Miall,D.J.Weir,D.M.Wolpert,et al.Is the cerebellum a smith predictor.Journal of Motor Behavior,1993,25(3):203-216.

[20]S.Suzuki,K.Kurihara,K.Furuta,et al.Variable dynamic assist control on haptic system for human adaptive mechatronics.Proceedings of the 44th IEEE Conference on Decision Control/European Control Conference,New York:IEEE,2005:4596-4600.

[21]S.Suzuki,K.Furuta.Adaptive impedance control to enhance human skill on a haptic interface system.Journal of Control Science and Engineering,2012:DOI 10.1155/2012/365067.

[22]F.C.Chen,H.K.Khalil.Adaptive control of nonlinear systems using neural networks.International Journal of Control,1992,55(6):1299-1317.

[23]F.L.Lewis,S.Jagannathan,A.Yesildirek.Neural Network Control of Robot Manipulators and Nonlinear Systems.London:Taylor and Francis,1992.

[24]S.S.Ge,T.H.Lee,C.J.Harris.Adaptive Neural Network Control of Robotic Manipulators.Singapore:World Scientific,1998.

[25]G.A.Christodoulou,M.A.Rovithakis.Adaptive control of unknown plants using dynamical neural networks.IEEE Transactions on Systems,Man,and Cybernetics-Part A:Systems and Humans,1994,24(3):400-412.

[26]A.Yeşildirek,F.L.Lewis.Feedback linearization using neural networks.Automatica,1995,31(11):1659-1664.

[27]M.M.Polycarpou.Stable adaptive neural control scheme for nonlinear systems.IEEE Transactions on Automatic Control,1996,14(3):447-451.

[28]A.S.Poznyak,W.Yu,E.N.Sanchez,et al.Nonlinear adaptive trajectory tracking using dynamic neural networks.IEEE Transactions on Neural Networks,1996,10(6):1402-1411.

[29]G.A.Rovithakis.Performance of a neural adaptive tracking controller for multi-input nonlinear dynamical systems.IEEE Transactions on Systems,Man,Cybernetics-Part A:Systems and Humans,2000,30(6):720-730.

[30]R.Hunt,K.J.Zbikowski.Neural Adaptive Control Technology.Singapore:World Scientific,1996.

[31]P.V.Osburn,H.P.Whitaker,A.Kezer.New Developments in the Design of Model Reference Adaptive Control Systems.Easton:Institute of Aeronautical Sciences,1961.

[32]P.Parks.Lyapunov redesign of model reference adaptive control systems.IEEE Transactions on Automatic Control,1966,11(3):362-367.

[33]K.J.Åström,B.Wittenmark.A survey of adaptive control applications.Proceedings of the 34th IEEE Conference on Decision and Control,New Orleans:IEEE,1995:649-654.

[34]J.Aseltine,A.Mancini,C.Sarture.A survey of adaptive control systems.IEEE Transactions on Automatic Control,1958,6(1):102-108.

[35]H.Unbehauen.Adaptive dual control systems:A survey.Adaptive Systems for Signal Processing,Communications,and Control Symposium,Lake Louise:IEEE,2000:171-180.

[36]N.M.Filatov,H.Unbehauen.Survey of adaptive dual control methods.IEEE Proceedings Control Theory and Applications,2000,147(1):118-128.

[37]K.J.Åström,B.Wittenmark.Adaptive Control.India:Pearson Education,2001.

[38]I.D.Landau.A survey of model reference adaptive techniques theory and applications.Automatica,1974,10(4):353-379.

DOI 10.1007/s11768-016-5138-2

Corresponding author.

E-mail:balqaudi@yic.edu.sa.

The work was supported by the National Science Foundation(No.IIS-1208623),the Office of Naval Research grant(No.N00014-13-1-0562),the AFOSR(Air Force Office of Scientific Research)EOARD(European Office of Aerospace Research and Development)grant(No.13-3055),the U.S.Army Research Office grant(No.W911NF-11-D-0001).

Bakur ALQAUDI received the B.Sc.degree in Electronics Communication and Electrical Automation from the Yanbu Industrial College,Yanbu,Saudi Arabia,and the M.Sc.degree in Electrical Engineering,focusing on Biorobotics,Control and Cybernetics,from Rochester Institute of Technology,Rochester,NY,U.S.A.,in 2008 and 2012,respectively.He is currently pursuing the Ph.D.degree with the University of Texas at Arlington,Arlington,TX,U.S.A.He joined Yanbu Industrial College as an Instructor,from 2008 to 2009,and received the King’s scholarship for Gas and Petroleum track in 2009.His current research interests include physical human-robot interaction,adaptive control,reinforcement learning,robotics,and cognitive-psychological inspired learning and control.E-mail:balqaudi@yic.edu.sa.

Hamidreza MODARES received the B.Sc.degree from the University of Tehran,Tehran,Iran,and the M.S.degree from the Shahrood University of Technology,Shahrud,Iran,in 2004 and 2006,respectively.He is currently pursuing the Ph.D.degree with the University of Texas at Arlington,Arlington,TX,U.S.A.He joined the Shahrood University of Technology as a University Lecturer,from 2006 to 2009.Since 2012,he has been a Research Assistant with the University of Texas at Arlington Research Institute,Fort Worth,TX,U.S.A.His current research interests include optimal control,reinforcement learning,distributed control,robotics,and pattern recognition.E-mail:modares@uta.edu.

Isura RANATUNGA(S’09)received the B.Sc.degree from the University of Texas at Arlington,Arlington,TX,U.S.A.,in 2010,where he is currently pursuing the Ph.D.degree with a focus on robotics and automation,both in Electrical Engineering.He is a Graduate Research Assistant with the University of Texas at Arlington Research Institute,Fort Worth,TX,U.S.A.,and the Next Generation Systems Research Group.His current research interests include force control,physical human-robot interaction,bipedal walking,adaptive robot control,and autonomous navigation.E-mail:isura.ranatunga@mavs.uta.edu.

Shaikh M.TOUSIF received the B.Sc.degree in Electrical and Electronic Engineering from American International University-Bangladesh,Bangladesh in 2009,and the M.Sc.degree in Electrical Engineering from University of Texas at Arlington in 2014.From 2009 to 2012 he was a lecturer in the Department of Electrical Engineering in American International University-Bangladesh and responsible for teaching many engineering courses.He is currently pursuing his Ph.D.at University of Texas at Arlington,Texas,U.S.A.E-mail:shaikh.tousif@mavs.uta.edu.

Frank L.Lewis(S’70-M’81-SM’86-F’94)received the B.Sc.degree in Physics and Electrical Engineering and the M.S.E.E.degree,both from Rice University,Houston,TX,U.S.A.,the M.Sc.degree in Aeronautical Engineering from the University of West Florida,Pensacola,FL,U.S.A.,and the Ph.D.degree from the Georgia Institute of Technology,Atlanta,GA,U.S.A.He is a University of Texas at Arlington Distinguished Scholar Professor,a Teaching Professor,and the Moncrief-O’Donnell Chair with the University of Texas at Arlington Research Institute,Fort Worth,TX,U.S.A.He is the Qian Ren Thousand Talents Consulting Professor with Northeastern University,Shenyang,China.He is a Distinguished Visiting Professor with the Nanjing University of Science and Technology,Nanjing,China,and the Project 111 Professor with Northeastern University.His current research interests include feedback control,intelligent systems,cooperative control systems,and nonlinear systems.He has authored numerous journal special issues,journal papers,20 books,including Optimal Control,Aircraft Control,Optimal Estimation,and Robot Manipulator Control,which are used as university textbooks worldwide and he holds six U.S.patents.Dr.Lewis was a recipient of the Fulbright Research Award,the National Science Foundation Research Initiation Grant,the American Society for Engineering Education Terman Award,the International Neural Network Society Gabor Award,the U.K.Institute of Measurement and Control Honeywell Field Engineering Medal,the IEEE Computational Intelligence Society Neural Networks Pioneer Award,the Outstanding Service Award from Dallas IEEE Section,and selected as an Engineer of the Year by the Fort Worth IEEE Section.He was listed in Fort Worth Business Press Top 200 Leaders in Manufacturing and Texas Regents Outstanding Teaching Award in 2013.He is a PE of Texas and a U.K.Chartered Engineer.He is a member of the National Academy of Inventors and a fellow of International Federation of Automatic Control and the 
U.K.Institute of Measurement and Control.He is a Founding Member of the Board of Governors of the Mediterranean Control Association.E-mail:lewis@uta.edu.

Dan O.POPA(M’93)received the B.A.degree in Engineering,Mathematics,and Computer Science and the M.S.degree in Engineering,both from Dartmouth College,Hanover,NH,U.S.A.,and the Ph.D.degree in Electrical,Computer and Systems Engineering from Rensselaer Polytechnic Institute(RPI),Troy,NY,U.S.A.,in 1998,focusing on control and motion planning for nonholonomic systems and robots.He is an Associate Professor with the Department of Electrical Engineering,University of Texas at Arlington,and the Head of the Next Generation Systems Research Group.He joined the Center for Automation Technologies at RPI,where he was a Research Scientist until 2004,for over 20 industry-sponsored projects.He was an Affiliated Faculty Member of the University of Texas at Arlington Research Institute,Fort Worth,TX,U.S.A.,and a Founding Member of the Texas Microfactory Initiative,in 2004.His current research interests include the simulation,control,packaging of microsystems,the design of precision robotic assembly systems,and control and adaptation aspects of human-robot interaction.He has authored over 100 refereed publications.Dr.Popa was a recipient of several prestigious awards,including the University of Texas Regents Outstanding Teaching Award.He serves as an Associate Editor of the IEEE Transaction On Automation Science and Engineering and the Journal of Micro and Bio Robotics(Springer).He is an active member of the IEEE Robotics and Automation Society Conference Activities Board,the IEEE Committee on Micro-Nano Robotics,and the ASME Committee on Micro-Nano Systems and a member of ASME.E-mail:popa@uta.edu.
