University of Electro-Communications e-Bulletin: Innovative automated control systems: Control-theoretic approach for fast online reinforcement learning


University of Electro-Communications publishes the February 2021 issue of UEC e-Bulletin

The February 2021 issue of the UEC e-Bulletin includes an informative video of a UEC researcher describing his activities on innovative control theory for control, reinforcement learning, and power systems.

The research highlights are ‘Innovative automated control systems: Control-theoretic approach for fast online reinforcement learning,’ by Tomonori Sadamoto, and ‘Computing in close proximity: Edge intelligence with deep reinforcement learning,’ by Celimuge Wu.

The News and Events page covers the ‘7th UEC Seminar in ASEAN, 2020 and the 2nd ASEAN – UEC Workshop.’

February 2021 issue of UEC e-Bulletin

Research Highlights

Innovative automated control systems: Control-theoretic approach for fast online reinforcement learning

Reinforcement learning (RL) is an effective way of designing model-free linear quadratic regulators (LQRs) for linear time-invariant networks with unknown state-space models. RL has wide-ranging applications, including industrial automation, self-driving automobiles, power grid systems, and even forecasting stock prices for financial markets.

However, conventional RL can result in unacceptably long learning times when network sizes are large. This can pose a serious challenge for real-time decision-making.

Tomonori Sadamoto at the University of Electro-Communications, Aranya Chakrabortty at North Carolina State University, USA, and Jun-ichi Imura at the Tokyo Institute of Technology have proposed a fast RL algorithm that enables online control of large-scale network systems.

Their approach is to construct a compressed state vector by projecting the measured state through a projection matrix. This matrix is constructed from online measurements of the states so that it captures the dominant controllable subspace of the open-loop network model. Next, an RL controller is learned using the reduced-dimensional state instead of the original state, such that the resultant cost is close to the optimal LQR cost.
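The compression step described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes, purely for illustration, that the projection matrix is taken from the leading left singular vectors of a matrix of measured state snapshots, and it uses a small hypothetical linear system (`A`, `B`, the sizes `n`, `m`, `r`, `T`, and the function names are all assumptions) whose reachable states lie in a low-dimensional subspace.

```python
import numpy as np


def simulate(A, B, x0, inputs):
    """Roll out x[k+1] = A x[k] + B u[k] and return the states as columns."""
    x = x0
    states = [x0]
    for u in inputs:
        x = A @ x + B @ u
        states.append(x)
    return np.column_stack(states)


def projection_from_snapshots(X, r):
    """Return an (r, n) matrix with orthonormal rows spanning the r
    dominant directions of the snapshot matrix X (n states, T samples).
    Assumption: an SVD of measured states stands in for the paper's
    construction of the dominant controllable subspace."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :r].T


rng = np.random.default_rng(0)
n, m, r, T = 20, 2, 4, 200  # hypothetical sizes

# Hypothetical system: dynamics and inputs act only inside an
# r-dimensional subspace spanned by the columns of V.
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = V @ np.diag([0.9, 0.8, 0.7, 0.6]) @ V.T
B = V @ rng.standard_normal((r, m))

# "Online measurements": states driven by random exploratory inputs.
inputs = rng.standard_normal((T, m))
X = simulate(A, B, np.zeros(n), inputs)

P = projection_from_snapshots(X, r)  # projection matrix, shape (r, n)
z = P @ X                            # compressed states, shape (r, T + 1)

# Relative error of reconstructing the full state from the compressed one.
rel_err = np.linalg.norm(X - P.T @ z) / np.linalg.norm(X)
print(P.shape, z.shape, rel_err)
```

In this toy setting a learner would operate on the 4-dimensional `z` instead of the 20-dimensional state, which is where the reduction in learning cost would come from. In a real network the reconstruction error is not zero, which is why the paper treats it as an uncertainty.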

The lower dimensionality of this approach enables a drastic reduction in the computational complexity of learning. Moreover, the stability and optimality of the control performance are theoretically evaluated using robust control theory by treating the dimensionality-reduction error as an uncertainty. Numerical simulations on a 100-dimensional large-scale power grid model showed that the learning speed improved almost 23-fold while maintaining control performance.

The main contribution of the paper is to show how two individually well-known concepts in dynamical system theory and machine learning, namely, model reduction and reinforcement learning, can be combined to construct a highly efficient real-time control design for extreme-scale networks.

Caption: Transient response of frequency deviation of generators in the power system (a) without control, (b) by the optimal controller designed by an existing RL method, and (c) by the controller designed by the proposed algorithm.


Tomonori Sadamoto, Aranya Chakrabortty, and Jun-ichi Imura, “Fast Online Reinforcement Learning Control using State-Space Dimensionality Reduction,” IEEE Transactions on Control of Network Systems (Early Access).

DOI: 10.1109/TCNS.2020.3027780

Computing in close proximity: Edge intelligence with deep reinforcement learning

Mobile edge computing (MEC) is a promising paradigm to improve the quality of computation experience for mobile devices by providing computing capabilities in close proximity. MEC finds applications in homes, factories, and transport modes including trains and airplanes. However, the design of computation offloading policies for an MEC system, specifically, the decision of executing a computation task at the mobile device or at the remote MEC server, should adapt to the network randomness and uncertainties.

Now, Celimuge Wu at the University of Electro-Communications, Tokyo, and colleagues in Finland, USA, and China report on Deep-SARL, a double deep Q-network (DQN)-based online strategic computation offloading algorithm that learns the optimal policy without a priori knowledge of network dynamics (Fig. 1).

The computation offloading problem is modeled as a Markov decision process whose objective is to maximize the long-term utility performance. Each offloading decision is made based on the task queue state, the energy queue state, and the channel qualities between mobile users and base stations. The researchers describe the adoption of a Q-function decomposition technique to enhance the learning performance.
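As a rough illustration of the double DQN idea mentioned above, the sketch below computes the standard double DQN learning target: the online network selects the next action, and a separate target network evaluates it. This is the generic technique only, not Deep-SARL itself, and it does not include the Q-function decomposition. The function name, the two-action layout (execute locally or offload), and all numbers are hypothetical.

```python
import numpy as np


def double_dqn_targets(rewards, q_online_next, q_target_next, gamma, done):
    """Standard double DQN target for a batch of transitions.

    rewards:       (batch,) immediate utilities
    q_online_next: (batch, actions) online-network Q-values at next state
    q_target_next: (batch, actions) target-network Q-values at next state
    gamma:         discount factor
    done:          (batch,) 1.0 where the episode ended, else 0.0
    """
    # The online network chooses the action...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...and the target network supplies that action's value.
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - done) * next_values


# Hypothetical batch of two transitions with two actions each
# (column 0: execute locally, column 1: offload to the MEC server).
rewards = np.array([1.0, 0.0])
q_online_next = np.array([[0.2, 0.5],
                          [0.9, 0.1]])
q_target_next = np.array([[1.0, 2.0],
                          [3.0, 4.0]])
done = np.array([0.0, 1.0])

targets = double_dqn_targets(rewards, q_online_next, q_target_next, 0.9, done)
print(targets)
```

For the first transition the online network prefers action 1, so the target is 1.0 + 0.9 × 2.0 = 2.8. The second transition is terminal, so its target is just the reward. Separating action selection from action evaluation is what reduces the overestimation bias of plain Q-learning targets.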

Numerical experiments based on TensorFlow show that their proposed learning algorithm achieves a significant improvement in computation offloading performance compared with existing baselines, showing an optimal tradeoff among the computation task execution delay, task drops, task queuing delay, task failure penalty, and MEC service payment. Deep-SARL provides a novel and effective approach to facilitate intelligence in edge computing under time-varying network dynamics.

Caption: Deep-SARL-based strategic computation offloading in an MEC system.


Xianfu Chen, Honggang Zhang, Celimuge Wu, Shiwen Mao, Yusheng Ji, and Mehdi Bennis, “Optimized Computation Offloading Performance in Virtual Edge Computing Systems via Deep Reinforcement Learning,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 4005-4018, June 2019.

DOI: 10.1109/JIOT.2018.2876279

Researcher Video Profiles

Tomonori Sadamoto, Assistant Professor, Department of Mechanical Engineering and Intelligent Systems, UEC Tokyo.

Innovative control theory: Bridging the gap between research on control, reinforcement learning, and power systems

Assistant Professor Tomonori Sadamoto is an expert in control theory, currently focusing on integrating control theory with reinforcement learning and power engineering.

Reinforcement learning is a key methodology for controlling large-scale complex systems such as power grids and transportation networks. However, the major learning theories currently in use are unsuitable for real-time control because designers must repeat trials just to acquire data. Instead, it is necessary to develop a methodology that is capable of real-time decision-making.


News and Events

UEC holds the 7th UEC Seminar in ASEAN, 2020 and the 2nd ASEAN – UEC Workshop

On November 21, 2020, the University of Electro-Communications (UEC) held the 7th UEC Seminar in ASEAN, 2020 and the 2nd ASEAN – UEC Workshop on Energy and AI online, in collaboration with Bandung Institute of Technology (ITB), Indonesia, and the ECTI Association.