WORKSHOP ON MACHINE LEARNING IN CONTROL (LEAC)

Technical Program Schedule

Time (CET, UTC+2) | Time (PDT, UTC-7) | Speaker/Authors | Title

08:15 - 08:30 | 23:15 - 23:30 (-1 day) |  | Opening Remarks
08:30 - 09:15 | 23:30 (-1 day) - 00:15 | Radu Grosu | Neural circuit policies
09:15 - 09:30 | 00:15 - 00:30 | Jorge Val Ledesma | Feature Selection for Reinforcement Learning Control with Periodic Disturbances
09:30 - 10:15 | 00:30 - 01:15 | Cristina Seceleanu | Reinforcement Learning for Mission Plan Synthesis of Autonomous Vehicles
10:15 - 10:30 | 01:15 - 01:30 | Victor Bolbot, Gerasimos Theotokatos | Automatically generating collision scenarios for testing ship collision avoidance system using sampling techniques

BREAK

10:45 - 11:30 | 01:45 - 02:30 | Michael Fisher | Verifying Autonomous Systems - We need help from you!
11:30 - 11:45 | 02:30 - 02:45 | Aysegul Kivilcim | A Safe Reinforcement Learning Algorithm to Control of Covid-19 Spread
11:45 - 12:30 | 02:45 - 03:30 | Henk Blom | Sociotechnical Safety Verification of Future Air Traffic Designs

BREAK

13:30 - 14:15 | 04:30 - 05:15 | Joost-Pieter Katoen | Learning Controllers under Partial Observations
14:15 - 14:30 | 05:15 - 05:30 | Swantje Plambeck, Jakob Schyga, Johannes Hinckeldeyn, Jochen Kreutzfeldt, Görschwin Fey | Automata Learning for Automated Test Generation of Real Time Localization Systems
14:30 - 14:45 | 05:30 - 05:45 | Rahul Misra | Multi-agent Reinforcement Learning for solving Non-zero sum Dynamic games

BREAK

15:45 - 16:30 | 06:45 - 07:30 | Akshay Rajhans | Engineering Learning-Enabled Cyber-Physical Systems: Challenges and Opportunities
16:30 - 17:15 | 07:30 - 08:15 | Maryam Kamgarpour | A framework for safe exploration and learning, and its applications to control

BREAK

18:00 - 18:45 | 09:00 - 09:45 | Claire Tomlin | Safe Learning in Robotics
18:45 - 19:00 | 09:45 - 10:00 | Christina Selby, Paul Wood, Jaime A. Arribas Starkey-El, Tamim Sookoor | Estimating the Confidence of RL-based Controllers
19:00 - 19:45 | 10:00 - 10:45 | Draguna Vrabie | Differentiable Predictive Control

BREAK

20:15 - 21:00 | 11:15 - 12:00 | Aaron Ames | Learning for Safety-Critical Control
21:00 - 21:15 | 12:00 - 12:15 | Ján Drgoňa, Elliott Skomski, Soumya Vasisht, Aaron Tuor, Draguna Vrabie | Stability Analysis of Deep Neural Dynamical Systems
21:15 - 22:00 | 12:15 - 13:00 | Sayan Mitra | Interfaces for combining models and data for verification and synthesis


Details of the Technical Program

Radu Grosu

TITLE:  Neural circuit policies

ABSTRACT:

A central goal of artificial intelligence is to design algorithms that are both generalisable and interpretable. We combine brain-inspired neural computation principles and scalable deep learning architectures to design compact neural controllers for task-specific compartments of a full-stack autonomous vehicle control system. We show that a single algorithm with 19 control neurons, connecting 32 encapsulated input features to outputs by 253 synapses, learns to map high-dimensional inputs into steering commands. This system shows superior generalisability, interpretability and robustness compared with orders-of-magnitude larger black-box learning systems. The obtained neural agents enable high-fidelity autonomy for task-specific parts of a complex autonomous system.
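The compact-controller idea above can be illustrated with a tiny sketch. This is not the authors' NCP architecture or code; the sparse wiring mask, dimensions (apart from the 19 control neurons mentioned in the talk), and time constant are illustrative assumptions. It shows a sparsely wired continuous-time recurrent cell mapping a feature vector to a bounded steering command.

```python
import numpy as np

# Illustrative sketch (NOT the authors' NCP architecture): a tiny
# continuous-time recurrent cell with a sparse synapse mask, mapping
# input features to a single steering command.
rng = np.random.default_rng(0)
n_in, n_hid = 8, 19                      # 19 "control neurons", as in the talk
# Sparse synapses: most potential connections are masked out.
mask = (rng.random((n_hid, n_in + n_hid)) < 0.3).astype(float)
W = rng.standard_normal((n_hid, n_in + n_hid)) * mask * 0.1
tau = 1.0                                # hypothetical shared time constant
w_out = rng.standard_normal(n_hid) * 0.1

def step(h, x, dt=0.05):
    """One Euler step of dh/dt = (tanh(W @ [x; h]) - h) / tau."""
    pre = np.tanh(W @ np.concatenate([x, h]))
    return h + dt * (pre - h) / tau

h = np.zeros(n_hid)
for _ in range(100):                     # roll the cell on a constant input
    h = step(h, np.ones(n_in))
steering = float(np.tanh(w_out @ h))     # bounded steering command in [-1, 1]
print(-1.0 <= steering <= 1.0)
```

In a trained system the weights would be learned end-to-end; here they are random, so only the structure (sparsity, continuous-time state, bounded output) is meaningful.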

Jorge Val Ledesma

TITLE:  Feature Selection for Reinforcement Learning Control with Periodic Disturbances

ABSTRACT:

This work presents a reinforcement learning algorithm for the control of industrial applications that, in some cases, lack adequate experimental conditions, so the measured data may cause the approximation method to fail. The method selects the approximation parameters, which are updated iteratively based on the quality of the data; this analysis is performed with a Singular Value Decomposition of the measured data. In this way, the method discards the features or basis functions that do not contribute information to the estimation. The method is validated in a laboratory setup that emulates a water distribution network with periodic disturbances.
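The SVD-based screening described above can be sketched as follows. This is an illustration of the general idea, not the authors' code; the feature matrix, contribution measure, and threshold are assumptions. Each candidate feature is ranked by its singular-value-weighted energy in the data, and weak features are dropped.

```python
import numpy as np

# Hedged sketch of SVD-based feature selection (names and threshold are
# illustrative, not the authors' method): rank candidate basis features
# by their contribution to the singular directions of the measured data.
rng = np.random.default_rng(1)
t = np.linspace(0, 10, 200)
# Feature matrix: two informative periodic features, one near-zero noise feature.
Phi = np.column_stack([np.sin(2 * np.pi * t),
                       np.cos(2 * np.pi * t),
                       1e-6 * rng.standard_normal(t.size)])

U, s, Vt = np.linalg.svd(Phi, full_matrices=False)
# Contribution of feature j: energy of row j of V^T's transpose, weighted
# by the normalized singular values.
contrib = (Vt.T ** 2) @ (s / s.sum())
keep = contrib > 1e-3                    # threshold is an assumption
print(keep)                              # the noise feature is discarded
```

The estimation would then proceed with `Phi[:, keep]` only, so directions the data cannot support never enter the parameter update.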

Cristina Seceleanu

TITLE:  Reinforcement Learning for Mission Plan Synthesis of Autonomous Vehicles

ABSTRACT:

Computing a mission plan for an autonomous vehicle consists of path planning and task scheduling, two outstanding problems in the design of multiple autonomous vehicles. Both problems can be solved by exhaustive search techniques such as model checking and algorithmic game theory. However, model checking suffers from the infamous state-space-explosion problem, which makes it inefficient when the number of vehicles is large, as is often the case in realistic scenarios. In this talk, we show how reinforcement learning can help model checking overcome these limitations, such that mission plans can be synthesized for a larger number of vehicles than model checking alone can feasibly handle. Instead of exhaustively exploring the state space, the reinforcement-learning-based method randomly samples the state space within a time frame and then uses these samples to train the vehicle models so that their behavior satisfies the requirements of the tasks. Since not every state of the model needs to be traversed, state-space explosion is avoided. Additionally, the method guarantees the correctness and completeness of the synthesis results. The method is implemented in UPPAAL STRATEGO and integrated with a toolset to facilitate path planning and task scheduling in industrial use cases. We also discuss the strengths and weaknesses of using reinforcement learning for synthesizing strategies of autonomous vehicles.
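The sample-instead-of-exhaust idea can be made concrete with a toy example. This is not UPPAAL STRATEGO or the authors' method; the grid world, reward shaping, and hyperparameters are invented for illustration. Tabular Q-learning synthesizes a path plan by sampling episodes rather than enumerating the state space.

```python
import numpy as np

# Toy illustration (not UPPAAL STRATEGO): Q-learning synthesizes a plan
# reaching a goal cell on a 4x4 grid by sampling trajectories instead of
# exhaustively exploring all states.
rng = np.random.default_rng(2)
N, goal = 4, (3, 3)
actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # up, down, left, right
Q = np.zeros((N, N, 4))

def move(s, a):
    return (min(max(s[0] + actions[a][0], 0), N - 1),
            min(max(s[1] + actions[a][1], 0), N - 1))

for _ in range(2000):                              # sampled episodes
    s = (0, 0)
    for _ in range(50):
        a = int(rng.integers(4)) if rng.random() < 0.2 else int(Q[s].argmax())
        nxt = move(s, a)
        r = 1.0 if nxt == goal else -0.01          # small step penalty (assumed)
        Q[s][a] += 0.5 * (r + 0.95 * Q[nxt].max() - Q[s][a])
        s = nxt
        if s == goal:
            break

# Greedy rollout of the learned plan from the start state.
s, plan = (0, 0), []
while s != goal and len(plan) < 10:
    a = int(Q[s].argmax())
    plan.append(a)
    s = move(s, a)
print(s == goal)   # a plan reaching the goal was synthesized from samples
```

Unlike model checking, nothing here enumerates all 16 states times all action sequences; correctness guarantees of the kind the abstract mentions require the additional machinery the talk describes.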

Victor Bolbot, Gerasimos Theotokatos

TITLE:  Automatically generating collision scenarios for testing ship collision avoidance system using sampling techniques

ABSTRACT:

We live in an era when autonomous systems are being designed and introduced in the maritime domain. A critical system on autonomous ships is the collision avoidance system, which is responsible for safe vessel navigation. A great challenge lies in identifying encounter scenarios that can be fed into the testing of the collision avoidance system while ensuring situational coverage. The aim of this paper is to propose an automatic way of developing hazardous scenarios for testing the ship collision avoidance system. In the suggested methodology, sampling techniques are used to develop encounter situations; geometrical metrics are then used to determine whether a condition is hazardous or not. The effectiveness of the approach is investigated using a number of sampling techniques. The results demonstrate that the Sobol quasi-random sequence gives more robust results in identifying scenarios, even though Latin hypercube and random sampling can at times outperform it. Based on the findings, suggestions for further enhancement and automatic development of testing scenarios are provided.
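The sampling-plus-geometric-metric pipeline can be sketched with `scipy.stats.qmc`. The parameter ranges, own-ship kinematics, and hazard threshold below are illustrative assumptions, not the paper's values: Sobol samples of encounter parameters are screened by a distance-at-closest-point-of-approach (DCPA) metric.

```python
import numpy as np
from scipy.stats import qmc

# Sketch of scenario generation (parameter ranges and the 0.5 nm hazard
# threshold are assumptions, not the paper's values): sample encounter
# parameters with a Sobol sequence, flag hazardous ones via DCPA.
sampler = qmc.Sobol(d=4, scramble=True, seed=0)
u = sampler.random_base2(m=6)                      # 64 low-discrepancy samples
# Columns: other ship's relative x, y (nm), speed (kn), heading (rad).
low, high = [-5.0, -5.0, 5.0, 0.0], [5.0, 5.0, 20.0, 2 * np.pi]
params = qmc.scale(u, low, high)

def dcpa(p):
    """Distance at closest point of approach to own ship (assumed at the
    origin, heading north at 10 kn), from straight-line kinematics."""
    x, y, v, hdg = p
    rel_v = np.array([v * np.sin(hdg), v * np.cos(hdg) - 10.0])
    rel_p = np.array([x, y])
    if np.allclose(rel_v, 0):
        return float(np.linalg.norm(rel_p))
    t = max(0.0, -rel_p @ rel_v / (rel_v @ rel_v))  # time of closest approach
    return float(np.linalg.norm(rel_p + t * rel_v))

hazardous = [p for p in params if dcpa(p) < 0.5]
print(len(hazardous))                               # count of flagged scenarios
```

Swapping `qmc.Sobol` for `qmc.LatinHypercube` or plain uniform sampling reproduces the comparison the abstract describes.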

Michael Fisher

TITLE:  Verifying Autonomous Systems - We need help from you!

ABSTRACT:

I work on the verification of autonomous systems, such as robots and vehicles. Within these hybrid autonomous system architectures we use a range of verification techniques, for a variety of components, each with differing requirements. In addition, the autonomous systems themselves may have representations of each component’s capabilities. Both of these aspects require more help from the developers of learning components and control components. Otherwise, both the verification and the system’s self-awareness will be weak.

Aysegul Kivilcim

TITLE:  A Safe Reinforcement Learning Algorithm to Control of Covid-19 Spread

ABSTRACT:

In this work, we aim to control the spread of Covid-19 when the dynamics of the disease model are not fully known and some states are constrained due to the limited capacity of intensive care units (ICUs). To cope with this problem, we have proposed a safe reinforcement learning algorithm. During the application of the safe reinforcement learning algorithm, we have faced several challenges, such as adding constraints to the algorithm, formulating a practicable reward function, and finding appropriate basis functions for the reward function. The constraint problem is solved with the help of barrier certificates. We will also use an agent-based model together with reinforcement learning to provide a better control tool by taking advantage of the agent-based model. We aim to provide a more time-efficient control algorithm by applying reinforcement learning with the agent-based model.
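The constrained setting above can be sketched with a toy SIR model. This is not the authors' algorithm or a barrier certificate; the model parameters, the cap on the infected fraction, and the one-step safety rule are illustrative assumptions showing how a state constraint shapes the control choice.

```python
# Toy sketch of the constrained setting (NOT the authors' safe RL
# algorithm): an SIR model where a mitigation level u in [0, 1] scales
# the contact rate, and a barrier-style one-step rule keeps the infected
# fraction below an assumed ICU-driven cap.
beta, gamma, dt = 0.3, 0.1, 1.0
i_cap = 0.1                                   # assumed safe bound on I

def step(s, i, u):
    new_inf = (1 - u) * beta * s * i * dt     # mitigation reduces transmission
    rec = gamma * i * dt
    return s - new_inf, i + new_inf - rec

s, i = 0.99, 0.01
trace = []
for _ in range(300):
    # Pick the least restrictive mitigation whose one-step prediction
    # stays below the cap (an illustration, not a certificate).
    u = 1.0
    for cand in (0.0, 0.25, 0.5, 0.75, 1.0):
        if step(s, i, cand)[1] <= i_cap:
            u = cand
            break
    s, i = step(s, i, u)
    trace.append(i)

print(max(trace) <= i_cap)   # the constraint is never violated
```

A barrier certificate would establish this invariance analytically rather than by one-step prediction; the RL component would then optimize the reward among the certified-safe inputs.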

Henk Blom

TITLE:  Sociotechnical Safety Verification of Future Air Traffic Designs

ABSTRACT:

Commercial aviation critically depends on air traffic management (ATM). ATM forms a complex Cyber Physical Human System (CPHS) that involves dynamic interactions under uncertainties (e.g. weather) between distributed human decision makers (e.g. pilots and air traffic controllers), dynamical systems (e.g. aircraft control and decision-support systems) and infrastructure (e.g. airports). Over decades, this complex CPHS has evolved to its current form through systematic learning from incident and accident investigations. This has made current ATM resilient and safe under a broad spectrum of disturbances. Novel technology and growing demands in commercial air transport form strong drivers for the design and implementation of significant changes in ATM’s complex CPHS. Due to dynamic interactions between distributed CPHS entities, such changes may trigger unforeseen emergent behaviour. The challenge is to identify such behaviour already during the early design phase of future ATM, and to feed this performance learning back into the CPHS design.

This presentation gives an overview of the state of the art in sociotechnical safety verification of future ATM designs. Key questions to be addressed are:

- Which are the safety-relevant agents and their interactions?

- How to learn and capture various non-nominal disturbances?

- How to address all of this through formal reach probability analysis?

The approaches are illustrated for an example CPHS design of future ATM.
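The notion of reach probability mentioned above can be illustrated with a minimal Monte Carlo estimate. The dynamics, noise level, and unsafe threshold below are invented for the sketch and have nothing to do with any real ATM design.

```python
import numpy as np

# Minimal illustration of reach probability analysis (dynamics, noise,
# and threshold are invented for the sketch): estimate by Monte Carlo
# the probability that a disturbed trajectory enters an unsafe set.
rng = np.random.default_rng(3)

def reach_probability(n_runs=10_000, horizon=50):
    hits = 0
    for _ in range(n_runs):
        x = 0.0                                          # deviation from track
        for _ in range(horizon):
            x = 0.95 * x + 0.1 * rng.standard_normal()   # stable, noisy dynamics
            if abs(x) > 1.0:                             # assumed unsafe threshold
                hits += 1
                break
    return hits / n_runs

p = reach_probability()
print(0.0 <= p <= 1.0)
```

For the rare events relevant to aviation safety, plain Monte Carlo is far too slow, which is why the formal and sequential estimation techniques the talk surveys are needed.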

Joost-Pieter Katoen

TITLE:  Learning Controllers under Partial Observations

ABSTRACT:

Synthesizing controllers under partial information is hard, and in many cases undecidable. We will present automated synthesis techniques for learning randomized finite-state controllers for partially observable Markov chains by exploiting parameter synthesis in parametric Markov chains. We will present the core algorithmic ideas and present first experimental results.
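The parameter-synthesis idea can be illustrated on a tiny hand-made parametric Markov chain. This is not the tool's input format or the talk's algorithm; the chain, the parameter's role, and the 0.8 requirement are assumptions. The reachability probability is a function of a parameter p, and we sweep p for values meeting the requirement.

```python
import numpy as np

# Illustrative sketch of parameter synthesis (a tiny hand-made parametric
# Markov chain; the 0.8 requirement is an assumption): the probability of
# reaching the goal state is a function of a controller parameter p.
def reach_prob(p):
    # Transient states 0 and 1; goal and fail are absorbing.
    # From 0: to 1 w.p. p, to fail w.p. 1-p.
    # From 1: to goal w.p. p, back to 0 w.p. 1-p.
    A = np.array([[1.0, -p],
                  [-(1 - p), 1.0]])           # (I - Q) over the transient states
    b = np.array([0.0, p])                    # one-step probability of hitting goal
    x = np.linalg.solve(A, b)
    return float(x[0])                        # reach probability from state 0

ps = np.linspace(0.0, 1.0, 101)
ok = [p for p in ps if reach_prob(p) >= 0.8]  # parameters meeting the requirement
print(round(min(ok), 2))                      # smallest parameter meeting the spec
```

In the talk's setting, p would be a randomized-controller probability in a parametric Markov chain induced by a POMDP, and the synthesis tools search this parameter space far more cleverly than a grid sweep.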

 

Swantje Plambeck, Jakob Schyga, Johannes Hinckeldeyn, Jochen Kreutzfeldt, Görschwin Fey

TITLE:  Automata Learning for Automated Test Generation of Real Time Localization Systems

ABSTRACT:

Cyber-Physical Systems (CPSs) are often black-box systems for which no exact model exists. Automata learning makes it possible to build abstract models of CPSs and is used in several scenarios, e.g. simulation, monitoring, and test-case generation. Real-time localization systems (RTLSs) are an example of particularly complex and often safety-critical CPSs. We present a procedure for automatic test-case generation with automata learning and apply this approach in a case study to a localization system.

Akshay Rajhans

TITLE:  Engineering Learning-Enabled Cyber-Physical Systems: Challenges and Opportunities

ABSTRACT:

In today’s technology landscape, ubiquitous sensors are generating ever larger amounts of data, pervasive network connectivity is making this data available at rapidly increasing speeds and size, and proliferation of compute power is enabling increasingly sophisticated computational applications. As a result, learning algorithms that leverage these reams of data and data-intensive compute resources are providing a unique value proposition. These trends are challenging the status quo in traditional system development approaches. Yet, can the CPS community really transition from model-based to data-based approaches as the premise of the workshop suggests, or do we need a combination of the two approaches? How can Model-Based Design principles, tools, and workflows enable engineers to design, implement, and operate these complex CPS and IoT systems in today’s data-driven age? Let us review some challenges and opportunities for research and practice in engineering such learning-enabled complex systems as I present a perspective from the vantage point of a research scientist in industry. 

Rahul Misra

TITLE:  Multi-agent Reinforcement Learning for solving Non-zero sum Dynamic games

ABSTRACT:

In recent years, Multi-agent Reinforcement Learning (MARL) has been applied to solving Non-Zero-Sum Dynamic Games (NZSDGs) with some empirical success. The aim of this work is to analyze the approximate dynamic programming approach for solving NZSDGs in the setting of discrete-time dynamical systems defined by difference equations. A function approximator is used for approximating the state-action Q-functions. The function approximator is linear in its parameters and estimates Q-functions via least-squares regression. Simulation results on a water network are also presented.
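The linear-in-parameters, least-squares Q-function step can be sketched for a single agent. The scalar dynamics, features, and costs below are invented for illustration (the multi-agent, non-zero-sum structure of the work is not reproduced here); it shows fitted-Q iteration where each update is a least-squares regression.

```python
import numpy as np

# Single-agent sketch of the approximation step (features, dynamics, and
# costs are invented; the paper's multi-agent game structure is omitted):
# a Q-function linear in its parameters, fit by least-squares regression.
rng = np.random.default_rng(4)
actions = np.array([-1.0, 0.0, 1.0])

def phi(s, a):
    return np.array([1.0, s, a, s * a, s * s, a * a])   # linear-in-parameter features

# Sampled transitions of x' = 0.9 x + a + noise, reward -(x')^2.
S = rng.uniform(-2, 2, 500)
A = actions[rng.integers(3, size=500)]
S2 = 0.9 * S + A + 0.1 * rng.standard_normal(500)
R = -(S2 ** 2)

w = np.zeros(6)
for _ in range(20):                                     # fitted-Q iterations
    targets = R + 0.9 * np.max(
        [[phi(s2, a) @ w for a in actions] for s2 in S2], axis=1)
    X = np.array([phi(s, a) for s, a in zip(S, A)])
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)     # least-squares regression

greedy = float(actions[int(np.argmax([phi(1.5, a) @ w for a in actions]))])
print(greedy)   # the greedy action should push the state toward zero
```

In the game setting, each agent maintains such a Q-function and the targets come from the other agents' current policies rather than a single max.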

Maryam Kamgarpour

TITLE:  A framework for safe exploration and learning, and its applications to control

ABSTRACT:

With the increasing deployment of data-driven and learning-based approaches in safety-critical control systems, it is ever more important to ensure the safety of these approaches. I will discuss our proposed framework for safe exploration and learning in an optimization problem. This framework helps in understanding the fundamental challenges of safe learning. I will illustrate our solution approach on control problems and discuss open directions.
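One common formalization of safe exploration can be sketched with a Lipschitz argument. This illustrates the general idea only, not the speaker's framework: a candidate point is certified safe when an already-evaluated safe point guarantees it through an assumed Lipschitz constant, and the learner queries only certified points.

```python
# Sketch of safe exploration under a known Lipschitz bound (the general
# idea, NOT the speaker's framework; constraint and constant are assumed).
L = 1.0                                      # assumed Lipschitz constant

def constraint(x):                           # unknown to the learner; safe iff >= 0
    return 2.0 - abs(x)

evals = {0.0: constraint(0.0)}               # known-safe starting point
candidates = [round(0.5 * k, 1) for k in range(-10, 11)]   # grid on [-5, 5]

for _ in range(20):                          # expansion rounds
    for x in candidates:
        if x in evals:
            continue
        # Certified safe if some evaluated point y guarantees
        # constraint(x) >= constraint(y) - L*|x - y| >= 0.
        if any(v - L * abs(x - y) >= 0 for y, v in evals.items()):
            evals[x] = constraint(x)

explored = sorted(evals)
print(all(constraint(x) >= 0 for x in explored))   # never queried an unsafe point
print(explored[0], explored[-1])                   # frontier of the certified set
```

The learner provably never evaluates an unsafe point, but it can also never certify anything outside the reachable safe region; that tension is exactly the exploration challenge the talk addresses.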

Claire Tomlin

TITLE:  Safe Learning in Robotics

ABSTRACT:

I will discuss how model-based reachability tools and real-time data may be combined to achieve learning with safety guarantees.  Then I will discuss cases which don't fall into this framework, like learning-based perception in navigation, and how we might move towards developing assurances of safety in these cases.

Christina Selby, Paul Wood, Jaime A. Arribas Starkey-El, Tamim Sookoor

TITLE:  Estimating the Confidence of RL-based Controllers

ABSTRACT:

Deep Reinforcement Learning (RL) algorithms are emerging as candidates for controllers in complex real-time systems such as traffic light management infrastructures. These algorithms can suffer failures due to over- and underfitting and extrapolation to regions of operation that were not part of the training data. The Simplex architecture, an approach that switches between a high-performance complex controller and a simpler verifiable controller, is widely used to ensure the safe operation of fault-tolerant systems. When the complex controller is RL-based, the decision module switching between the two controllers needs to know when to switch over to the simpler controller. To aid in this decision, we developed a Safety Controller Indicator Score that combines a confidence estimate with an impact score to inform the decision module that the RL agent is degraded, enabling safer system operation.
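A Simplex-style decision module built on such a score can be sketched as follows. The weighted-sum formula and threshold are assumptions for illustration, not the authors' Safety Controller Indicator Score.

```python
# Illustrative Simplex-style decision module (the score formula and
# threshold are assumptions, NOT the authors' exact indicator score):
# combine a confidence estimate for the RL controller with an impact
# score, and fall back to the verified baseline when the score is low.
def indicator_score(confidence, impact, w_conf=0.7, w_impact=0.3):
    """Weighted combination in [0, 1]; weights are illustrative."""
    return w_conf * confidence + w_impact * (1.0 - impact)

def select_controller(confidence, impact, threshold=0.6):
    score = indicator_score(confidence, impact)
    return "rl" if score >= threshold else "baseline"

print(select_controller(confidence=0.9, impact=0.2))   # prints "rl"
print(select_controller(confidence=0.3, impact=0.9))   # prints "baseline"
```

The interesting part in practice is producing the `confidence` input itself (e.g. from ensemble disagreement or out-of-distribution detection), which is what the talk focuses on.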

Draguna Vrabie

TITLE:  Differentiable Predictive Control

ABSTRACT:

I will introduce differentiable predictive control, a data-driven predictive control approach that uses physics-informed deep learning representations for modeling unknown dynamic systems and synthesizing control policies. Differentiable Predictive Control provides an approximate data-driven solution to the explicit Model Predictive Control problem, as a scalable alternative to multiparametric programming solvers. The neural control policy is optimized by automatic differentiation of an MPC-inspired loss function through a differentiable closed-loop system model. This approach can optimize adaptive neural control policies for time-varying references while satisfying state and input constraints, without requiring a pre-existing MPC controller. I will illustrate the methodology through application examples in energy-efficient buildings.
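The core mechanism, differentiating an MPC-like loss through an unrolled closed-loop model, can be shown on a scalar toy problem. The system, cost weights, and warm start are assumptions, and finite differences stand in for automatic differentiation; the policy is a single gain rather than a neural network.

```python
# Toy sketch of differentiable predictive control (scalar linear system;
# finite differences stand in for autodiff; a single gain stands in for
# the neural policy): descend an MPC-inspired loss computed through the
# unrolled closed-loop model x+ = a*x + b*u.
a, b, horizon = 1.2, 1.0, 20

def closed_loop_loss(k, x0=1.0):
    x, loss = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        loss += x * x + 0.1 * u * u        # MPC-inspired stage cost
        x = a * x + b * u                  # roll the closed loop forward
    return loss

k = 0.5                                    # assumed stabilizing warm start
for _ in range(200):
    eps = 1e-5
    grad = (closed_loop_loss(k + eps) - closed_loop_loss(k - eps)) / (2 * eps)
    k -= 0.01 * grad                       # gradient step on the policy parameter

print(abs(a - b * k) < 1.0)                # learned gain closes the loop stably
```

In the full method the gain is replaced by a neural policy conditioned on references and constraints, and a deep-learning framework's autodiff replaces the finite differences.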

Aaron Ames

TITLE:  Learning for Safety-Critical Control

ABSTRACT:

Safety is critical on a wide variety of dynamic robotic systems. Yet, when deploying controllers that have guarantees of safety on these systems, uncertainties in the model and environment can violate these guarantees in practice. This talk will approach safety-critical control from the perspective of control barrier functions (CBFs), describing the basic theory and existing applications of this optimization-based control methodology. To enable guarantees on CBF controllers realized in practice, we will present an approach that fuses learning with CBFs. Experimental results on robotic systems will be used to illustrate the approach.
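The CBF mechanism can be shown in one dimension, where the safety-filtering QP has a closed-form solution. The dynamics, barrier, and class-K gain below are a textbook-style toy, not the talk's robotic systems.

```python
# Minimal control barrier function sketch (a 1-D toy, not the talk's
# robotic systems): for integrator dynamics x' = u and safe set
# h(x) = x >= 0, the CBF-QP  min (u - u_des)^2  s.t.  u >= -alpha*h(x)
# reduces to projecting the desired input onto the constraint.
alpha = 1.0                                # assumed class-K gain

def cbf_controller(x, u_des):
    h = x                                  # barrier: safe iff h(x) >= 0
    u_min = -alpha * h                     # CBF condition for x' = u
    return max(u_des, u_min)               # scalar QP solved in closed form

x, dt = 1.0, 0.01
for _ in range(1000):
    u = cbf_controller(x, u_des=-5.0)      # nominal input pushes toward unsafe set
    x += dt * u

print(x >= 0.0)   # the barrier keeps the state in the safe set
```

The learning component discussed in the talk addresses the gap this toy ignores: when the dynamics used in the CBF condition are uncertain, the constraint itself must be corrected from data.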

Ján Drgoňa, Elliott Skomski, Soumya Vasisht, Aaron Tuor, Draguna Vrabie

TITLE:  Stability Analysis of Deep Neural Dynamical Systems

ABSTRACT:

In this work, we analyze the eigenvalue spectra and stability of discrete-time dynamical systems modeled by deep neural networks. We leverage a characterization of deep neural networks as pointwise affine maps, representing the network’s Jacobians and making them accessible to classical system-analytic methods. As the paper’s main results, we provide necessary and sufficient conditions for local asymptotic stability of deep neural dynamical systems. Further, we identify links between the spectral properties of layer-wise weight parametrizations, different activation functions, and their effect on the overall network’s eigenvalue spectra. We analyze state-space dynamics, attractors, and eigenvalue spectra of neural dynamical systems with varying weight initializations, activation functions, bias terms, and depths. The presented method can aid in the principled design of stable neural dynamics for modeling and optimal control.
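The pointwise-affine characterization can be illustrated for a random ReLU network. The network sizes and weight scale are illustrative, not the paper's models: at a fixed state, the ReLU pattern freezes into a diagonal mask, so the Jacobian is a product of masked weight matrices whose spectral radius governs local stability of x+ = f(x).

```python
import numpy as np

# Sketch of the pointwise-affine view (a random two-layer ReLU network,
# not the paper's models): the local Jacobian is a product of weight
# matrices masked by the activation pattern at the given state.
rng = np.random.default_rng(5)
W1 = 0.2 * rng.standard_normal((6, 4))
W2 = 0.2 * rng.standard_normal((4, 6))

def jacobian_at(x):
    z = W1 @ x
    D = np.diag((z > 0).astype(float))     # local ReLU activation pattern
    return W2 @ D @ W1                     # Jacobian of the pointwise affine map

x = rng.standard_normal(4)
rho = float(np.max(np.abs(np.linalg.eigvals(jacobian_at(x)))))
print(rho)   # local asymptotic stability at x holds iff rho < 1
```

The paper's layer-wise spectral parametrizations effectively bound this product's spectrum by construction rather than checking it state by state.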

Sayan Mitra

TITLE:  Interfaces for combining models and data for verification and synthesis

ABSTRACT:

Successes of deep learning and Software 2.0 are shifting the interfaces for data and model usage in control design and verification. In this talk, I will discuss problem definitions ranging from purely model-based approaches to more recent variations that allow the use of a combination of models and execution data. One of the findings is that adding model knowledge, such as symmetries, can boost data-driven verification performance by a factor of 15. Conversely, using abstractions learned from data, we can expand the reach of model-based verification and synthesis. These findings lead to a new set of questions on data efficiency and robustness in verification and synthesis problems.