Friday, 3 October 2008

evolution - Is there variation of AT/CG ratio along species?

As WYSIWYG pointed out, that ratio is essentially what is called GC-content. In practice, GC-content is reported as $(G+C)/(A+C+G+T)$, expressed as a percentage; i.e., the percentage of the genome that is G or C.
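As a minimal sketch of that formula, the function below counts bases in a plain DNA string. The name `gc_content` and the choice to leave ambiguous bases such as `N` out of the denominator are assumptions made for illustration, not part of any particular tool.

```python
def gc_content(seq: str) -> float:
    """Return GC-content as a percentage: 100 * (G+C) / (A+C+G+T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    # Assumption: only unambiguous bases count towards the total,
    # so characters such as N are ignored.
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


print(gc_content("ATGCGCTA"))  # 50.0
```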



There is vast variation in GC-content, both amongst species and within a given species' own genome. For example, in humans the first intron and exon are generally more GC-rich than the following introns and exons.1 Genes themselves are often found in GC-rich regions;2,11 in particular, CpG islands are found near a large number of (mammalian) promoters.3
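Variation within a single genome is usually looked at by computing GC-content in windows along the sequence. The following is a rough sketch of that idea; the function name `windowed_gc`, the window and step sizes, and the toy sequence are all assumptions chosen for illustration.

```python
def windowed_gc(seq, window=100, step=50):
    """Yield (start, gc_percent) for each full window along seq."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        if acgt:  # skip windows made up only of ambiguous bases
            yield start, 100.0 * gc / acgt


# Toy sequence: an AT-only stretch followed by a GC-only stretch.
toy = "AT" * 10 + "GC" * 10
profile = list(windowed_gc(toy, window=10, step=10))
print(profile)  # [(0, 0.0), (10, 0.0), (20, 100.0), (30, 100.0)]
```

On real data the windows would be much larger (kilobases rather than tens of bases), and the jump between neighbouring windows is what makes GC-rich regions stand out.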



Across species, there can be a big difference. Yeast and Arabidopsis are both around 35%,4,5 whereas Plasmodium falciparum is around 24%;6 Carsonella is even lower, at around 16.5%.14 On the other hand, the plankton Emiliania huxleyi is around 65%.7 We can use these differences to study genomic history. Bacteria often carry genes picked up from all over the place thanks to horizontal gene transfer, and GC-content can be used to differentiate between their own genes and the horizontally acquired ones;8 a good example is the CRISPR-Cas system,9 even in a virus!10
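To illustrate how GC-content can point at horizontally transferred genes, here is a crude sketch that flags genes whose GC-content sits far from the average of all genes in the same genome. The z-score cutoff, the function names, and the toy gene set (including the name `hgt_candidate`) are hypothetical; this is a first-pass heuristic under those assumptions, not the method used in the cited studies.

```python
from statistics import mean, pstdev


def gc_percent(seq):
    """GC-content of seq as a percentage of its A, C, G and T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / (gc + seq.count("A") + seq.count("T"))


def flag_atypical_genes(genes, z_cutoff=2.0):
    """Return names of genes whose GC-content is >= z_cutoff SDs from the mean."""
    gc = {name: gc_percent(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sigma = pstdev(gc.values())
    if sigma == 0:
        return []  # every gene has the same GC-content
    return sorted(n for n, v in gc.items() if abs(v - mu) / sigma >= z_cutoff)


# Hypothetical toy genome: nine genes at 50% GC and one AT-only outlier.
genes = {f"gene{i}": "ATGC" * 5 for i in range(1, 10)}
genes["hgt_candidate"] = "AAAAT" * 4
print(flag_atypical_genes(genes))  # ['hgt_candidate']
```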



Here's a list of a few things genomic GC-content is correlated with:15



  • genome size

  • whether the bacterium is free-living or not

  • the environment

  • aerobiosis

  • nitrogen utilization

In the lab, high GC-content often means a harder region to work with, because G-C base pairs are held together by three hydrogen bonds instead of the two between A and T, and so take more energy to separate;12 anything involving primers can be made more annoying, including (especially, to some) sequencing. There is a theory that high GC-content is an adaptation to high temperatures, protecting against DNA damage, but that is controversial.13,16,17,18
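The effect on primers can be made concrete with the Wallace rule, a common rule of thumb for short oligonucleotides that estimates melting temperature as $T_m = 2(A+T) + 4(G+C)$ in °C. The sketch below is only that rough approximation, with a hypothetical function name and made-up primers; it ignores salt concentration and sequence context.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer melting temperature (°C) with the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    # Each A/T adds about 2 °C, each G/C about 4 °C.
    return 2 * at + 4 * gc


# Two made-up 16-mers of equal length but opposite composition.
print(wallace_tm("ATATATATATATATAT"))  # 32
print(wallace_tm("GCGCGCGCGCGCGCGC"))  # 64
```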
