Learning to Locomote with Deep Neural-Network and CPG-based Control in a Soft Snake Robot

[MDP] Markov Decision Process
[POMDP] Partially Observable Markov Decision Process
[Semi-MDP] Semi-Markov Decision Process
[RL] Reinforcement Learning
[MCTS] Monte Carlo Tree Search
[UCT] Upper Confidence Bound 1 applied to Trees
[scLTL] syntactically co-safe LTL
[SSP] Stochastic Shortest Path
[SG(2)] Two-player Stochastic Game
[DOF] degree of freedom
[DOFs] degrees of freedom
[CPG] Central Pattern Generator
[CPGs] Central Pattern Generators
[NN] Neural Network
[SNN] Spiking Neural Net
[R-STDP] Reward-Modulated Spike-Timing-Dependent Plasticity
[GP] Genetic Programming
[PPOC] Proximal Policy Optimization Option-Critics
[DR] Domain Randomization
[BIBO] Bounded-Input, Bounded-Output