This thread has been archived.

zisi


[Discussion] The DNA sequence and biological annotation of human chromosome

The DNA sequence and biological annotation of human chromosome 1
S. G. Gregory1,2, K. F. Barlow1, K. E. McLay1, R. Kaul3, D. Swarbreck1, A. Dunham1, C. E. Scott1, K. L. Howe1, K. Woodfine4, C. C. A. Spencer5, M. C. Jones1, C. Gillson1, S. Searle1, Y. Zhou3, F. Kokocinski1, L. McDonald1, R. Evans1, K. Phillips1, A. Atkinson1, R. Cooper1, C. Jones1, R. E. Hall1, T. D. Andrews1, C. Lloyd1, R. Ainscough1, J. P. Almeida1, K. D. Ambrose1, F. Anderson1, R. W. Andrew1, R. I. S. Ashwell1, K. Aubin1, A. K. Babbage1, C. L. Bagguley1, J. Bailey1, H. Beasley1, G. Bethel1, C. P. Bird1, S. Bray-Allen1, J. Y. Brown1, A. J. Brown1, D. Buckley3, J. Burton1, J. Bye1, C. Carder1, J. C. Chapman1, S. Y. Clark1, G. Clarke1, C. Clee1, V. Cobley1, R. E. Collier1, N. Corby1, G. J. Coville1, J. Davies1, R. Deadman1, M. Dunn1, M. Earthrowl1, A. G. Ellington1, H. Errington1, A. Frankish1, J. Frankland1, L. French1, P. Garner1, J. Garnett1, L. Gay1, M. R. J. Ghori1, R. Gibson1, L. M. Gilby1, W. Gillett3, R. J. Glithero1, D. V. Grafham1, C. Griffiths1, S. Griffiths-Jones1, R. Grocock1, S. Hammond1, E. S. I. Harrison1, E. Hart1, E. Haugen3, P. D. Heath1, S. Holmes1, K. Holt1, P. J. Howden1, A. R. Hunt1, S. E. Hunt1, G. Hunter1, J. Isherwood1, R. James3, C. Johnson1, D. Johnson1, A. Joy1, M. Kay1, J. K. Kershaw1, M. Kibukawa3, A. M. Kimberley1, A. King1, A. J. Knights1, H. Lad1, G. Laird1, S. Lawlor1, D. A. Leongamornlert1, D. M. Lloyd1, J. Loveland1, J. Lovell1, M. J. Lush6, R. Lyne1, S. Martin1, M. Mashreghi-Mohammadi1, L. Matthews1, N. S. W. Matthews1, S. McLaren1, S. Milne1, S. Mistry1, M. J. F. Moore1, T. Nickerson1, C. N. O'Dell1, K. Oliver1, A. Palmeiri3, S. A. Palmer1, A. Parker1, D. Patel1, A. V. Pearce1, A. I. Peck1, S. Pelan1, K. Phelps3, B. J. Phillimore1, R. Plumb1, J. Rajan1, C. Raymond3, G. Rouse3, C. Saenphimmachak3, H. K. Sehra1, E. Sheridan1, R. Shownkeen1, S. Sims1, C. D. Skuce1, M. Smith1, C. Steward1, S. Subramanian3, N. Sycamore1, A. Tracey1, A. Tromans1, Z. Van Helmond1, M. Wall1, J. M. Wallis1, S. White1, S. L. Whitehead1, J. E. Wilkinson1, D. L. Willey1, H. Williams1, L. Wilming1, P. W. Wray1, Z. Wu3, A. Coulson1, M. Vaudin1, J. E. Sulston1, R. Durbin1, T. Hubbard1, R. Wooster1, I. Dunham1, N. P. Carter1, G. McVean4, M. T. Ross1, J. Harrow1, M. V. Olson3, S. Beck1, J. Rogers1 and D. R. Bentley1,7

Abstract
The reference sequence for each human chromosome provides the framework for understanding genome function, variation and evolution. Here we report the finished sequence and biological annotation of human chromosome 1. Chromosome 1 is gene-dense, with 3,141 genes and 991 pseudogenes, and many coding sequences overlap. Rearrangements and mutations of chromosome 1 are prevalent in cancer and many other diseases. Patterns of sequence variation reveal signals of recent selection in specific genes that may contribute to human fitness, and also in regions where no function is evident. Fine-scale recombination occurs in hotspots of varying intensity along the sequence, and is enriched near genes. These and other studies of human biology and disease encoded within chromosome 1 are made possible with the highly accurate annotated sequence, as part of the completed set of chromosome sequences that comprise the reference human genome.

The sequence of each human chromosome underpins an extremely broad range of biological, genetic and medical studies. Sequence annotation, the process of gathering all of the available information and relating it to the sequence assembly, is essential to develop our understanding of the information stored in human DNA. Initially, there was a strong focus on annotating genes, which allowed us to define the genetic information that determines biochemical function and to characterize the functional consequence of genetic aberrations. More recently, we have undertaken systematic identification and annotation of single nucleotide polymorphisms (SNPs) on genomic sequence. This has enabled us to measure the genetic diversity of the genome in geographically distinct population groups, to estimate recombination at a new, high level of resolution, and to identify signals of selection that may reveal new functions encoded in the genome. In parallel, reagents provided by chromosome mapping and sequencing have provided the basis for acquiring additional experimental data: for example, on gene expression and replication timing. These data sets may be used to elucidate the mechanisms that are used by the cell to regulate the use of chromosomal sequences (at the level of transcription, epigenetic modification or gross chromosomal behaviour) during replication and cell division.

Chromosome 1 is the largest of the human chromosomes, containing approximately 8% of all human genetic information. Because of its size, we can expect it to be more representative of the human genome than some other chromosomes with respect to genomic landscape and genetic properties. It is medically important: over 350 human diseases are associated with disruptions in the sequence of this chromosome—including cancers, neurological and developmental disorders, and mendelian conditions—for which many of the corresponding genes are unknown. There are also important biological implications of the size of chromosome 1: it is approximately six times longer than the smallest human chromosomes (21, 22 and Y), which raises the question of how all human genetic information is replicated in a coordinated manner before each cell division. This study reports the finished sequence of human chromosome 1, and provides a detailed annotation of the landscape, gene index and sequence variations of the chromosome. Our annotation also brings together information from a wide range of additional genetic and biological studies to describe features such as profiles of recombination, signals of natural selection and replication timing, and their relation to each other along the chromosome sequence. In turn, we show that this level of annotation reveals clues to the location of functionally important sequences that are currently unknown and merit targeted investigation.

Genomic sequence and landscape
We determined the sequence of a set of 2,220 minimally overlapping clones representing the euchromatic portion of chromosome 1 (Supplementary Table S1). The sequence comprises 223,875,858 base pairs (bp) at >99.99% accuracy (ref. 1; Supplementary Table S2); 120,405,438 bp lie in 14 contigs on the short arm (1p) and 103,470,420 bp lie in 13 contigs on the long arm (1q). The sequence reaches telomeric repetitive motifs (TTAGGG)n on both chromosome arms and pericentromeric alpha-satellite sequence at the proximal end of the short arm (1pcen). There are 18 megabases (Mb) of heterochromatin on 1q adjacent to the centromere that has not been sequenced.
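As a quick sanity check on the figures quoted above, the following minimal sketch confirms that the two arm lengths add up to the reported total and shows how the finished sequence is split between 1p and 1q. The constant and function names are illustrative only and do not come from the paper.

```python
# Figures as quoted in the paragraph above (base pairs and contig counts).
SHORT_ARM_BP = 120_405_438    # 1p, in 14 contigs
LONG_ARM_BP = 103_470_420     # 1q, in 13 contigs
REPORTED_TOTAL_BP = 223_875_858


def arm_summary(short_bp: int, long_bp: int) -> dict:
    """Return the combined length and each arm's share of it (hypothetical helper)."""
    total = short_bp + long_bp
    return {
        "total_bp": total,
        "short_arm_share": short_bp / total,
        "long_arm_share": long_bp / total,
    }


summary = arm_summary(SHORT_ARM_BP, LONG_ARM_BP)
print(f"total finished sequence: {summary['total_bp']:,} bp")
print(f"1p share: {summary['short_arm_share']:.1%}")
print(f"1q share: {summary['long_arm_share']:.1%}")
print("matches reported total:", summary["total_bp"] == REPORTED_TOTAL_BP)
```

The shares are of the finished euchromatic sequence only; the 18 Mb of unsequenced heterochromatin on 1q is not included.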

Twenty-six gaps remain after exhaustive screening of bacterial and yeast-derived clone libraries with a combined coverage of 90 genomic equivalents (Supplementary Table S3). Eight gaps are clustered in 1p36 and eight in 1q21.1 (Fig. 1). These regions are GC-rich and contain low-copy repeats, which we believe contribute to the absence of clones in these regions. Seventeen gaps, measured using fluorescent in situ hybridization (FISH) of flanking clones to chromosomal DNA, cover a total of 0.8 Mb (data not shown). By aligning the human contigs to the genome sequences of mouse, rat and chimpanzee, we estimated that the remaining nine gaps total 0.53 Mb (Supplementary Table S2). Therefore, the euchromatic fraction of chromosome 1 is 225.2 Mb, and 99.4% is available as finished sequence.
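The arithmetic behind the 225.2 Mb and 99.4% figures can be reproduced from the numbers quoted in the text. The sketch below is only an illustration: the names are made up for this example, and it assumes 1 Mb = 1,000,000 bp and takes the 223,875,858 bp finished length from the preceding paragraph.

```python
# Values quoted in the text; names are illustrative, not from the paper.
FINISHED_BP = 223_875_858     # finished euchromatic sequence
FISH_GAPS = (17, 0.8)         # (number of gaps, total size in Mb), sized by FISH
ALIGNED_GAPS = (9, 0.53)      # (number of gaps, total size in Mb), sized by alignment


def euchromatic_summary(finished_bp, *gap_groups):
    """Return (gap count, euchromatic size in Mb, percent finished)."""
    finished_mb = finished_bp / 1_000_000   # assumes 1 Mb = 10^6 bp
    gap_count = sum(count for count, _ in gap_groups)
    gap_mb = sum(size for _, size in gap_groups)
    total_mb = finished_mb + gap_mb
    return gap_count, total_mb, 100 * finished_mb / total_mb


gap_count, total_mb, pct_finished = euchromatic_summary(
    FINISHED_BP, FISH_GAPS, ALIGNED_GAPS
)
print(f"gaps: {gap_count}")
print(f"euchromatic size: {total_mb:.1f} Mb")
print(f"finished: {pct_finished:.1f}%")
```

Rounded to one decimal place, this gives the same 26 gaps, 225.2 Mb and 99.4% stated in the text.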
