CRISPR-Cas systems typically consist of a CRISPR array and <i>cas</i> genes that are organized in one or more operons. However, a substantial fraction of CRISPR arrays are not adjacent to <i>cas</i> genes. Definitive identification of such isolated CRISPR arrays runs into the problem of false positives, with unrelated types of repetitive sequences mimicking CRISPR. We developed a computational pipeline to eliminate false CRISPR predictions and found that up to 25% of the CRISPR arrays in complete bacterial and archaeal genomes are located away from <i>cas</i> genes. Most of the repeats in these isolated arrays are identical to repeats in <i>cas</i>-adjacent CRISPR arrays in the same or closely related genomes, indicating an evolutionary relationship between isolated arrays and arrays in typical CRISPR-<i>cas</i> loci. The spacers in isolated CRISPR arrays show nearly as many matches to viral genomes as spacers from complete CRISPR-<i>cas</i> loci, suggesting that the isolated arrays were either functionally active recently or continue to function. Reconstruction of evolutionary events in closely related bacterial genomes suggests three routes of evolution of isolated CRISPR arrays: (1) loss of <i>cas</i> genes in a CRISPR-<i>cas</i> locus, (2) <i>de novo</i> generation of arrays from off-target spacer integration into sequences resembling the corresponding repeats, and (3) transfer by mobile genetic elements. Both the combination of <i>de novo</i> emerging arrays with <i>cas</i> genes and the regain of <i>cas</i> genes by isolated arrays via recombination likely contribute to functional diversification in CRISPR-Cas evolution.
Ana Moya-Beltrán, Kira S. Makarova, Lillian G. Acuña, Yuri I. Wolf, Paulo C. Covarrubias, Sergey Shmakov, Cristian Silva, Igor Tolstoy, D. Barrie Johnson, Eugene V. Koonin, Raquel Quatrini