This article shows that structural design synthesis can be formulated as a sequential decision process in which a sparsely connected seed configuration is altered step by step through discrete actions to generate the best design solution with respect to a specified objective and constraints. Specifically, generative design synthesis is formulated mathematically as a finite Markov Decision Process (MDP). Each state corresponds to a specific structural configuration, each action corresponds to an available alteration of a given configuration, and the immediate reward is constructed to be proportional to the improvement in performance of the altered configuration. Because the immediate rewards are not known at the outset of the process, reinforcement learning is employed to obtain an approximately optimal policy for altering the seed configuration to synthesize the best design solution. The approach is applied to the optimization of planar truss structures, and its utility is investigated through three numerical examples, each with its own domain and constraints.
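The MDP formulation described above can be illustrated with a minimal sketch. It is not the article's implementation. It assumes a toy problem with four hypothetical candidate members, a made-up surrogate `performance` function standing in for a real truss analysis, and tabular Q-learning as one possible reinforcement learning method. All names and numerical values (`GAINS`, `WEIGHT_PENALTY`, `REWARD_SCALE`, `STOP`) are illustrative assumptions.

```python
import random

# Hypothetical stiffness contribution of each candidate member (assumed values).
GAINS = [4.0, 2.0, 1.0, 0.2]
WEIGHT_PENALTY = 2.0   # assumed cost per added member
REWARD_SCALE = 1.0     # reward is proportional to the performance improvement
N = len(GAINS)
STOP = N               # extra action that ends the episode


def performance(state: int) -> float:
    """Surrogate score of a configuration (higher is better).

    `state` is a bitmask: bit i is set when candidate member i is present.
    This stands in for a real structural analysis, which is not modelled here.
    """
    members = [i for i in range(N) if (state >> i) & 1]
    stiffness = 1.0 + sum(GAINS[i] for i in members)
    return -(100.0 / stiffness + WEIGHT_PENALTY * len(members))


def actions(state: int) -> list:
    """Available alterations: add any absent member, or stop."""
    return [i for i in range(N) if not (state >> i) & 1] + [STOP]


def step(state: int, action: int):
    """Apply one action and return (next_state, reward, done)."""
    if action == STOP:
        return state, 0.0, True
    nxt = state | (1 << action)
    reward = REWARD_SCALE * (performance(nxt) - performance(state))
    return nxt, reward, False


def q_learning(episodes=5000, alpha=0.5, gamma=1.0, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(2 ** N) for a in actions(s)}
    for _ in range(episodes):
        state, done = 0, False  # seed configuration: no optional members
        while not done:
            acts = actions(state)
            if rng.random() < epsilon:
                action = rng.choice(acts)
            else:
                action = max(acts, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            future = 0.0 if done else max(q[(nxt, a)] for a in actions(nxt))
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
    return q


def greedy_design(q) -> int:
    """Follow the learned greedy policy from the seed configuration."""
    state, done = 0, False
    while not done:
        action = max(actions(state), key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
    return state


q_table = q_learning()
print(bin(greedy_design(q_table)))
```

Because each reward is the change in performance, the rewards along an episode sum to the difference between the final and seed performance. With a discount factor of 1, maximizing the return therefore amounts to maximizing the performance of the final configuration. In this toy setting the result can be checked by brute force over all 16 configurations; a realistic truss problem would replace `performance` with a structural analysis and would likely need a far larger state space.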