TY - JOUR
T1 - Design Synthesis through a Markov Decision Process and Reinforcement Learning Framework
AU - Ororbia, Maximilian E.
AU - Warn, Gordon P.
N1 - Funding Information:
The authors gratefully acknowledge the support of the Penn State ROCKET Seed Grant. All opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the sponsor.
Publisher Copyright:
© 2021 American Society of Mechanical Engineers (ASME). All rights reserved.
PY - 2022/4
AB - This article presents a framework that mathematically models optimal design synthesis as a Markov Decision Process (MDP) solved with reinforcement learning. In this context, states correspond to specific design configurations, actions correspond to the available alterations modeled after generative design grammars, and the immediate rewards are constructed to reflect the improvement in the altered configuration's performance with respect to the design objective. Because the immediate rewards are generally not known at the outset of optimal design synthesis, reinforcement learning is employed to solve the MDP efficiently. The goal of the reinforcement learning agent is to maximize the cumulative reward and hence synthesize the best-performing, or optimal, design. The framework is demonstrated for the optimization of planar trusses with binary cross-sectional areas, and its utility is investigated with four numerical examples, each with a unique combination of domain, constraint, and external force(s), considering both linear-elastic and elastic-plastic material behaviors. The design solutions obtained with the framework are also compared with those of other methods to demonstrate its efficiency and accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85111150445&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111150445&partnerID=8YFLogxK
DO - 10.1115/1.4051598
M3 - Article
AN - SCOPUS:85111150445
VL - 22
JO - Journal of Computing and Information Science in Engineering
JF - Journal of Computing and Information Science in Engineering
SN - 1530-9827
IS - 2
M1 - 021002
ER -