Thesis defences

PhD Oral Exam - Yuanliang Li, Information and Systems Engineering

Deep Reinforcement Learning-based Automated Penetration Testing for Active Distribution Networks


Date & time
Friday, October 25, 2024
2 p.m. – 5 p.m.
Cost

This event is free

Organization

School of Graduate Studies

Contact

Dolly Grewal

Wheelchair accessible

Yes

When studying for a doctoral degree (PhD), candidates submit a thesis that provides a critical review of the current state of knowledge of the thesis subject as well as the student’s own contributions to the subject. The distinguishing criterion of doctoral graduate research is a significant and original contribution to knowledge.

Once accepted, the candidate presents the thesis orally. This oral exam is open to the public.

Abstract

The smart grid is a highly complex cyber-physical system of heterogeneous components with sensing, control, computation, and communication capabilities. Due to its complexity, high dimensionality, uncertainty, and strong cyber-physical coupling, manually identifying critical vulnerabilities to cyberattacks at the infrastructure level has proven challenging. In the information and communication technology (ICT) industry, penetration testing (PT) has demonstrated its efficacy in pinpointing vulnerabilities within information systems through authorized cyberattacks. Building upon the principles of PT, this study explores effective and efficient PT approaches, based on deep reinforcement learning (DRL) methods, for discovering vulnerabilities in cyber-physical smart grids, with the aim of enhancing their security.

To overcome the poor efficiency of common PT in identifying critical vulnerabilities in smart grids, which is caused by their complex structure and strong cyber-physical coupling, we first propose a DRL-based PT framework and formulate PT as a Markov decision process (MDP) specifically for smart grids. This framework comprehensively considers the cyber-physical coupling, realistic cyberattacks, and the physical impacts on smart grids. As the case study, the framework is applied to model a replay attack scheme on an active distribution network (ADN) with conservation voltage reduction (CVR) control, with the aim of identifying critical attack paths that lead to system voltage violations. Additionally, a co-simulation platform named GridBattleSim is developed specifically for DRL-based PT on ADNs, which integrates dedicated simulators for the different parts of the ADN. The simulation results show the efficacy of DRL-based PT in learning optimal attack paths under varying system conditions and different levels of attack difficulty.
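As a rough illustration of treating PT as an MDP, the following minimal sketch trains an agent to find an attack path on a toy attack graph. It is not the thesis framework or GridBattleSim: tabular Q-learning stands in for DRL, and the graph, node names, rewards, and hyperparameters are all hypothetical.

```python
import random

# Hypothetical attack graph: node -> nodes reachable by one exploit step.
ATTACK_GRAPH = {
    "entry": ["hmi", "historian"],
    "hmi": ["controller"],
    "historian": ["entry"],
    "controller": [],
}
TARGET = "controller"  # stand-in for a state that causes a voltage violation


def step(state, action):
    """Apply one exploit action; return (next_state, reward, done)."""
    next_state = action
    if next_state == TARGET:
        return next_state, 10.0, True
    return next_state, -1.0, False  # each extra step costs the attacker


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning (a simple stand-in for a DRL algorithm)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, acts in ATTACK_GRAPH.items() for a in acts}
    for _ in range(episodes):
        state = "entry"
        for _ in range(20):  # cap on episode length
            actions = ATTACK_GRAPH[state]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            future = 0.0 if done else max(q[(nxt, a)] for a in ATTACK_GRAPH[nxt])
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, start="entry", max_steps=10):
    """Follow the highest-valued action from each state."""
    path = [start]
    state = start
    while state != TARGET and len(path) <= max_steps:
        state = max(ATTACK_GRAPH[state], key=lambda a: q[(state, a)])
        path.append(state)
    return path
```

Calling `greedy_path(train())` returns the learned attack path from the entry node to the target. In a realistic setting the state would also encode physical measurements, which is where a co-simulation platform becomes necessary.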

To overcome the MDP formulation's requirement for full-state observation, which is a limitation in practical PT scenarios, a partially observable Markov decision process (POMDP) formulation is proposed, which allows the PT agent to learn PT policies under partially observable conditions. To solve the POMDP and obtain the optimal PT policy, we apply the physical model of the power grid to estimate its full state from the locally observable data captured by the PT agent, and then transform the POMDP into an MDP that can be solved by DRL.
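The idea of recovering a full state from a local observation through a physical model can be sketched as follows. This is only an illustrative assumption, not the estimator used in the thesis: it uses a linearized voltage-drop approximation on a radial feeder and assumes, strongly, that the line parameters and the power flows on each segment are known to the agent. The function name and data layout are hypothetical.

```python
def estimate_voltages(observed_bus, observed_v, lines):
    """Estimate all bus voltage magnitudes (per unit) on a radial feeder.

    lines[i] = (r, x, p, q) describes the segment between bus i and bus i+1:
    resistance, reactance, and the active/reactive power flowing through it,
    all in per unit. These values are ASSUMED known in this sketch.

    Uses the linearized drop v[i+1] ~= v[i] - (r*p + x*q).
    """
    n = len(lines) + 1
    v = [0.0] * n
    v[observed_bus] = observed_v
    # Propagate downstream from the observed bus.
    for i in range(observed_bus, n - 1):
        r, x, p, q = lines[i]
        v[i + 1] = v[i] - (r * p + x * q)
    # Propagate upstream from the observed bus.
    for i in range(observed_bus - 1, -1, -1):
        r, x, p, q = lines[i]
        v[i] = v[i + 1] + (r * p + x * q)
    return v
```

The estimated voltage vector could then serve as the state input to an MDP-based agent in place of the unavailable full observation.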

Furthermore, to address the sparse reward issue, the poor sampling efficiency, and the poor interpretability of DRL-based PT, a knowledge-informed AutoPT framework (RM-PT) is proposed, which uses reward machines (RMs) to incorporate cybersecurity domain knowledge and decompose PT objectives into a set of subtasks. We use lateral movement in PT as a case study, for which two RMs are designed based on the MITRE ATT&CK knowledge base. The DQRM algorithm is then applied to train the PT policies. The proposed RM-PT is evaluated on the CyberBattleSim platform. The experimental results show that the knowledge-informed PT achieves higher training efficiency than PT without knowledge embedding. Furthermore, RMs that incorporate more detailed domain knowledge exhibit superior PT performance compared to RMs with simpler knowledge.
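To make the reward machine idea concrete, here is a minimal sketch of an RM as a finite-state machine that pays intermediate rewards as subtasks are completed. The states, event names, and reward values are hypothetical and are not the two RMs designed in the thesis. The sketch also omits policy training (for example with DQRM).

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    initial: str
    terminal: set
    transitions: dict  # (rm_state, event) -> (next_rm_state, reward)
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial

    def reset(self):
        self.state = self.initial

    def step(self, event):
        """Advance on a high-level event; unknown events leave the state unchanged."""
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward, self.state in self.terminal


def build_lateral_movement_rm():
    """Hypothetical three-subtask decomposition of lateral movement."""
    return RewardMachine(
        initial="u0",
        terminal={"u3"},
        transitions={
            ("u0", "credential_found"): ("u1", 1.0),
            ("u1", "remote_login"): ("u2", 1.0),
            ("u2", "node_owned"): ("u3", 10.0),
        },
    )


def run_events(rm, events):
    """Feed a sequence of events to the RM and return (total_reward, final_state)."""
    rm.reset()
    total = 0.0
    for event in events:
        reward, done = rm.step(event)
        total += reward
        if done:
            break
    return total, rm.state
```

Because each subtask transition yields its own reward, the agent receives feedback before the final objective is reached, which is how an RM can mitigate sparse rewards. The RM state also indicates which subtask the agent is currently pursuing, which aids interpretability.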

Finally, we discuss future directions of this study in terms of domain knowledge integration for AI-powered PT. We anticipate that the methodologies and findings presented in this study can inspire efforts to secure critical infrastructure and close research gaps in the cybersecurity of smart grids.

© Concordia University