We present an analysis for calculating the frequency of out-of-sequence reassembly in DNA shuffling experiments. Out-of-sequence annealing events are undesirable since they typically encode non-functional proteins with missing or repetitive regions. The approach builds on the eShuffle framework for the prediction of crossover formation using equilibrium thermodynamics and complete sequence information to model the reassembly process. An in silico case study of a set of subtilases reveals that, as expected, the presence of significant sequence identity between distant portions of the parental sequences gives rise to out-of-sequence annealing events that upon reassembly generate sequences with missing or repetitive DNA segments. The frequency of these events increases as the fragment length decreases. Interestingly, out-of-sequence annealing events are at a minimum near the annealing temperature of 55°C used in the original DNA shuffling protocol. Neither parental sequence identity nor the number of shuffled parents significantly alters the extent of out-of-sequence reassembly.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Modeling and Simulation
- Biochemistry, Genetics and Molecular Biology (all)
- Immunology and Microbiology (all)
- Agricultural and Biological Sciences (all)
- Applied Mathematics