[Repost] Computational methods for identifying non-coding RNA in genomic sequences
Sequence similarities

As mentioned previously, several ncRNA genes are already known. Searching other genomes for similar sequences can uncover more of them. Several such projects have already located tRNA genes (Lowe and Eddy, 1997), tmRNA genes (Zwieb et al., 1999) and snoRNA genes (Lowe and Eddy, 1999). This strategy lacks generality, however: it cannot discover new families of ncRNA. If new ncRNA genes are found by other means, though, it can be used to determine whether they belong to a new gene family. It is also useful when working with a new genome, to see whether it contains any of the known ncRNA genes.

Comparative genomics

Sequences which code for the same protein in closely related organisms are conserved. The same should hold for regions which encode ncRNA genes. Such genes could therefore be found by examining intergenic regions of closely related organisms and finding regions which are more conserved than their surroundings. Several such "comparative genomics" projects have already proved successful (Rivas et al., 2001; Wassarman et al., 1999; Rivas and Eddy, 2001). This approach becomes especially attractive as the number of available genomes increases.

Transcription signals

ncRNA genes have to be transcribed to produce functional RNA, and are thus surrounded by sequences which regulate transcription. Other specific sequences help regulate translation from RNA to protein; these should therefore be absent from ncRNA gene sequences. New candidate ncRNA genes could thus be found by searching for sequences which are transcribed but not translated. Because transcription and translation signals vary between organisms, this approach has to be flexible about which signal sequences to search for. Such methods have previously been used with success to locate ncRNA genes in E. coli (Argaman et al., 2001) and yeast (Olivas et al., 1997).

Statistical analysis

Statistical analysis of the non-coding areas of a genome can reveal systematic variations, which can then be used to separate ncRNA genes from "junk" DNA. One such variation, in the usage of the nucleotide pair CG in M. jannaschii (Schattner, 2002), has already helped find many more ncRNA genes in that organism. Other such variations could possibly be used as a locating mechanism. One measure of variation could, for instance, be the Shannon entropy (Shannon, 1948). The entropy of a sequence indicates the amount of information available in that region. Since ncRNA genes should be more ordered than other regions, they should show less entropy than "junk" DNA.

Combining criteria

If several of the methods mentioned above indicate the presence of an ncRNA gene in a genomic sequence, this strengthens the belief that the sequence does indeed comprise an ncRNA gene. It should therefore be possible to combine and evaluate the results from the different methods which are developed.

Verification of predicted ncRNA genes

The accuracy of the methods proposed above can be assessed by analysing their ability to find already known ncRNA genes. New ncRNA genes which might be discovered can be verified by hybridisation experiments (Northern blots), which test whether the predicted region is actually transcribed. It would also be possible to do further laboratory studies on any ncRNA genes which seem especially interesting, with the help of different groups within the institute.
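
The Shannon entropy measure suggested under "Statistical analysis" can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the method of any of the cited papers: it computes the entropy of single-nucleotide frequencies in a sliding window and reports windows that fall below a cut-off. The function names, the window and step sizes, and the threshold are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per symbol) of the symbol frequencies in seq."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def window_entropies(seq, window=50, step=10):
    """Return (start, entropy) for every full window sliding along seq.

    window and step are illustrative defaults, not values from the source.
    """
    seq = seq.upper()
    return [
        (start, shannon_entropy(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def low_entropy_windows(seq, window=50, step=10, threshold=1.5):
    """Keep only windows whose entropy is below threshold.

    The threshold is an arbitrary assumption; in practice it would have to
    be calibrated against known ncRNA genes in the genome of interest.
    """
    return [(s, h) for s, h in window_entropies(seq, window, step) if h < threshold]


if __name__ == "__main__":
    # Toy sequence: a maximally mixed stretch followed by a homopolymer run.
    toy = "ACGT" * 5 + "A" * 20
    for start, h in window_entropies(toy, window=20, step=20):
        print(start, round(h, 3))
```

For DNA the single-nucleotide entropy ranges from 0 bits (one repeated base) to 2 bits (all four bases equally frequent). Whether ncRNA genes really show lower entropy than their surroundings is the hypothesis stated above, so any threshold would need to be checked against known ncRNA genes before being trusted.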