We introduce a quantitative framework for assessing the generation of crossovers in DNA shuffling experiments. The approach uses free energy calculations and complete sequence information to model the annealing process. Statistics obtained for the annealing events are then combined with a reassembly algorithm to infer crossover allocation in the reassembled sequences. Two quantities are estimated: the fraction of reassembled sequences containing zero, one, two, or more crossovers, and the probability that a given nucleotide position in a reassembled sequence is the site of a crossover event. Comparisons of the predictions against experimental data for five example systems demonstrate good agreement even though no adjustable parameters are used. An in silico case study of a set of 12 subtilases examines the effect of fragmentation length, annealing temperature, sequence identity, and number of shuffled sequences on the number, type, and distribution of crossovers. The study provides computational verification of crossover aggregation in regions of near-perfect sequence identity and of synergistic reassembly in family DNA shuffling.
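To make the first estimated quantity concrete, the following is a minimal sketch of how a distribution over crossover counts could be tabulated. It is not the framework's reassembly algorithm: it assumes, purely for illustration, that every junction between annealed fragments is an independent event with the same crossover probability, so the count follows a binomial distribution. The function name `crossover_count_distribution` and the parameters `n_junctions`, `p_crossover`, and `max_count` are hypothetical and do not come from the paper.

```python
from math import comb


def crossover_count_distribution(n_junctions, p_crossover, max_count=2):
    """Return [P(0), P(1), ..., P(max_count), P(more than max_count)].

    Illustrative assumption (not the paper's model): each of the
    n_junctions fragment junctions is an independent crossover site
    with the same probability p_crossover.
    """
    if n_junctions < 0 or max_count < 0:
        raise ValueError("n_junctions and max_count must be non-negative")
    if not 0.0 <= p_crossover <= 1.0:
        raise ValueError("p_crossover must lie in [0, 1]")

    # Binomial probability of exactly k crossovers among n_junctions junctions.
    # math.comb returns 0 when k > n_junctions, so those terms vanish.
    probs = [
        comb(n_junctions, k)
        * p_crossover ** k
        * (1.0 - p_crossover) ** (n_junctions - k)
        for k in range(max_count + 1)
    ]

    # Remaining probability mass covers every count above max_count.
    tail = max(0.0, 1.0 - sum(probs))
    return probs + [tail]


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the function.
    labels = ["zero", "one", "two", "more than two"]
    for label, prob in zip(labels, crossover_count_distribution(4, 0.5)):
        print(f"{label}: {prob:.4f}")
```

In the framework described above, junction probabilities would instead come from the free-energy-based annealing statistics and would differ from position to position, so a uniform, independent probability is only a stand-in for those statistics.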
|Original language|English (US)|
|Number of pages|6|
|Journal|Proceedings of the National Academy of Sciences of the United States of America|
|State|Published - Mar 13 2001|