
cnlics

木虫 (somewhat well known)

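[Discussion] [Share] Protein structure prediction workflow

I will translate this and post it bit by bit.

The material posted here is something I collected a while ago; it should be from EMBL. I have skimmed through it and it is not yet out of date.

The Word document can be downloaded here:
http://ifile.it/dwzy278

The general workflow for protein structure prediction is shown in the figure below: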


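Contents:

•Relevant experimental data
•Sequence data and preliminary analysis
•Searching sequence databases
•Identifying domains
•Multiple sequence alignment
•Comparative or homology modelling
•Secondary structure prediction
•Fold recognition
•Analysis of folds and alignment of secondary structures
•Aligning sequence to structure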

[ Last edited by cnlics on 2010-9-16 at 08:24 ]

cnlics


Analysis of protein folds and alignment of secondary structure elements
________________________________________
If you have predicted that your protein will adopt a particular fold within the database, then an important thing to consider is which fold your protein belongs to, and which other proteins adopt a similar fold. To find out, look at one of the following databases:
•        SCOP (MRC Cambridge)
•        CATH (University College, London)
•        FSSP (EBI, Cambridge)
•        3Dee (EBI, Cambridge)
•        HOMSTRAD (Biochemistry, Cambridge)
•        VAST (NCBI, USA)
(Note that these databases don't always agree as to what constitutes a similar fold, so I would recommend looking at as many of them as possible).
If your predicted fold has many "relatives", then have a look at what they are. Ask:
•        Do any members show functional similarity to your protein? If there is any functional similarity between your protein and any members of the fold, then you may be able to back up your prediction of fold (possibly by the conservation of active site residues, or the approximate location of active site residues, etc.)
•        Is this fold a superfold? If so, does this superfold contain a supersite? Certain folds show a tendency to bind ligands in a common location, even in the absence of any functional or clear evolutionary relationships. For an explanation of this, please see our work on supersites.
•        Are there core secondary structure elements that should really be present in any member of the fold?
•        Are there non-core secondary structure elements that might not be present in all members of the fold?
Core secondary structure elements, such as those comprising a beta-barrel, should really be present in a fold. If your predicted secondary structures can't be made to match up with what you think is the core of the protein fold, then your prediction of fold may be wrong (but be careful, since your secondary structure prediction may contain errors). You can also use your prediction together with the core secondary structure elements to derive an alignment of predicted and observed secondary structures.
For example, we predicted that the glutamyl tRNA reductases (hemA family) would adopt an alpha-beta barrel fold using a combination of fold recognition and secondary structure prediction methods. We aligned the secondary structures of diverse members of the alpha-beta barrel fold using a structural alignment program, and then aligned our predicted secondary structures to the core (boxed below) secondary structure elements.

In the alignment above, each alpha and beta character refers to an entire secondary structure element. Those that are boxed are core secondary structure elements found in most members of the fold. The alignment of predicted secondary structures to the core elements appears at the bottom of the figure. Note that I have had to delete several alpha helices and beta strands from our prediction to allow for alignment. This is not surprising, because insertions or deletions of secondary structure elements are common across the diverse set of proteins that adopt this fold.
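The idea of deleting extra helices and strands from a prediction until it lines up with the core can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the hemA prediction: each secondary structure element is reduced to one character ('a' for helix, 'b' for strand), and the function name `align_to_core` and both example strings are hypothetical.

```python
def align_to_core(predicted, core):
    """Match core elements, in order, against predicted elements.

    Both arguments are strings with one character per secondary structure
    element ('a' = helix, 'b' = strand). Predicted elements may be skipped
    (deleted); core elements may not. Returns the index in `predicted`
    used for each core element, or None if the core cannot be matched.
    """
    positions = []
    start = 0
    for element in core:
        # Take the leftmost unused predicted element of the right type.
        idx = predicted.find(element, start)
        if idx == -1:
            return None
        positions.append(idx)
        start = idx + 1
    return positions


# Hypothetical example: a short strand-helix-strand-helix core and a
# prediction that contains a few additional elements.
core = "baba"
predicted = "abaabbaba"

used = align_to_core(predicted, core)
print(used)  # [1, 2, 4, 6]
deleted = [i for i in range(len(predicted)) if i not in used]
print(deleted)  # [0, 3, 5, 7, 8]
```

A result of None would suggest that the predicted elements cannot be reconciled with the assumed core, which is the situation the text warns about. Because secondary structure predictions contain errors, such a result is a reason to look again rather than proof that the fold is wrong.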
Post #10, 2010-09-14 01:58:01

cnlics


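Experimental data

Many kinds of experimental data can aid the structure prediction process, including:
•Disulphide bonds, which fix the spatial positions of cysteines
•Spectroscopic data, which can give the secondary structure content of the protein
•Site-directed mutagenesis studies, which can reveal residues in active or binding sites
•Protease cleavage sites and post-translational modifications such as phosphorylation or glycosylation, which indicate residues that must be exposed
•Others
When making a prediction, you must be aware of all of these data. Always ask: is the prediction consistent with the experimental results? If not, the approach needs to be revised.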

[ Last edited by cnlics on 2010-9-14 at 19:31 ]
Post #2, 2010-09-14 01:41:00

cnlics


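Protein sequence data

Some preliminary analysis of the protein sequence is worthwhile. For example, if the protein comes directly from a gene prediction, it may contain multiple domains. More seriously, it may contain regions that are unlikely to be globular or soluble. This flowchart assumes that your protein is soluble, is probably a single domain, and contains no non-globular regions.

Consider the following:
•Is it a transmembrane protein, or does it contain transmembrane segments? There are many methods for predicting these segments, including:

    o TMAP (EMBL)
    o PredictProtein (EMBL/Columbia)
    o TMHMM (CBS, Denmark)
    o TMpred (Baylor College)
    o DAS (Stockholm)

•Does it contain coiled coils? You can predict coiled coils at the COILS server or download the COILS program (it has recently been rewritten; note that the GCG package includes a version of COILS).

•Does the protein contain low-complexity regions? Proteins often contain several polyglutamate or polyserine stretches, which do not predict well. You can check for these with SEG (the GCG package includes a version of SEG).

If any of the above applies, you should break the sequence into pieces, ignore particular segments of the sequence, and so on. This issue is related to the problem of locating domains.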
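As a rough illustration of what a low-complexity check looks for, the sketch below flags windows whose amino acid composition has low Shannon entropy. It is not the SEG algorithm and does not reproduce SEG output; the window length, the threshold and the example sequence are arbitrary assumptions chosen for the demonstration.

```python
import math
from collections import Counter


def window_entropy(segment):
    """Shannon entropy (bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_regions(seq, window=12, threshold=2.2):
    """Return merged half-open (start, end) intervals of low-entropy windows.

    `window` and `threshold` are illustrative values, not SEG defaults.
    """
    regions = []
    for i in range(len(seq) - window + 1):
        if window_entropy(seq[i:i + window]) <= threshold:
            if regions and i <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], i + window)
            else:
                regions.append((i, i + window))
    return regions


# Hypothetical sequence with a polyserine run in the middle.
seq = "ACDEFGHIKLMN" + "S" * 12 + "PQRTVWYACDEF"
for start, end in low_complexity_regions(seq):
    print(start, end, seq[start:end])
```

The flagged intervals are candidates for masking or for splitting the sequence before running the later prediction steps. For real work the dedicated programs named above remain the better choice.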

[ Last edited by cnlics on 2010-9-16 at 08:25 ]
Post #3, 2010-09-14 01:41:58

cnlics


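Searching sequence databases

The obvious first step in analysing any new sequence is to search sequence databases for homologues. Such searches can be run anywhere and on any computer. In addition, many web servers offer them: you type or paste the sequence into the server and receive the results interactively.

There are also many methods for sequence searching; the best known at present is BLAST. Versions that run locally are easy to obtain (from NCBI or Washington University), and many web pages let you compare a protein or DNA sequence against a range of gene or protein sequence databases. To name just a few:
•National Center for Biotechnology Information (USA) Searches
•European Bioinformatics Institute (UK) Searches
•BLAST search through SBASE (domain database; ICGEB, Trieste)
•Many other sites

An important recent advance in sequence comparison is the development of gapped BLAST and PSI-BLAST (position-specific iterated BLAST), both of which make BLAST more sensitive. PSI-BLAST takes the results of a search, builds a profile from them, and then uses that profile to search the database again for further homologues (the process can be repeated until no new sequences are found), so it can detect very distantly related homologues. It is important to compare your protein sequence against the databases with PSI-BLAST, to see whether a known structure exists, before using the methods in the sections below.
Other methods for comparing a single sequence against a database include:
•FASTA package (William Pearson, University of Virginia, USA)
•SCANPS (Geoff Barton, European Bioinformatics Institute, UK)
•BLITZ (Compugen's fast Smith Waterman search)
•Other methods.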

It is also possible to use multiple sequence information to perform more sensitive searches. This involves building a profile from some kind of multiple sequence alignment. A profile gives a score for each type of amino acid at each position in the sequence, and generally makes searches more sensitive. Tools for doing this include:
•PSI-BLAST (NCBI, Washington)
•ProfileScan Server (ISREC, Geneva)
•HMMER, hidden Markov models (Sean Eddy, Washington University)
•Wise package (Ewan Birney, Sanger Centre; for comparing protein against DNA)
•Other methods.
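To make the notion of a profile concrete, here is a minimal sketch that turns a small ungapped alignment into per-position log-odds scores and slides them along a target sequence. It is a toy and not how PSI-BLAST or HMMER work internally: the uniform background frequencies, the pseudocount of 1, the absence of gaps and sequence weighting, and the example alignment are all simplifying assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Build one dict of log-odds scores per column of an alignment.

    `alignment` is a list of equal-length strings. Characters outside
    the 20 standard residues (for example '-') are ignored.
    """
    background = 1.0 / len(AMINO_ACIDS)  # uniform background (assumption)
    profile = []
    for column in zip(*alignment):
        counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_hit(profile, sequence):
    """Slide the profile along the sequence without gaps.

    Returns (score, offset) for the best window, or None if the
    sequence is shorter than the profile.
    """
    width = len(profile)
    best = None
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(col.get(aa, 0.0) for col, aa in zip(profile, window))
        if best is None or score > best[0]:
            best = (score, offset)
    return best


# Hypothetical three-sequence alignment and target sequence.
profile = build_profile(["HFKLG", "HWRIG", "HFKVG"])
print(best_hit(profile, "AAAHFKLGAAA"))
```

Positions that are invariant in the alignment contribute high scores for the conserved residue and negative scores for everything else, while variable positions are more tolerant. That position-specific weighting is what gives profile searches their extra sensitivity over a single-sequence comparison.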

A different approach for incorporating multiple sequence information into a database search is to use a MOTIF. Instead of giving every amino acid some kind of score at every position in an alignment, a motif ignores all but the most invariant positions in an alignment, and just describes the key residues that are conserved and define the family. Sometimes this is called a "signature". For example, "H-[FW]-x-[LIVM]-x-G-x(5)-[LV]-H-x(3)-[DE]" describes a family of DNA binding proteins. It can be translated as "histidine, followed by either a phenylalanine or tryptophan, followed by an amino acid (x), followed by leucine, isoleucine, valine or methionine, followed by any amino acid (x), followed by glycine,... [etc.]".
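A pattern written this way maps quite directly onto a regular expression, which is one simple way to scan your own sequences for it. The sketch below is an assumption-laden convenience, not the PROSITE scanning software: it handles only fixed residues, `x`, `[...]`, `{...}` and repeat counts such as `x(5)` or `x(2,4)`, ignores the `<` and `>` terminal anchors, and the function name `prosite_to_regex` and the test sequence are made up for illustration.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Split off an optional repeat count such as (5) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        body, low, high = match.groups()
        if body == "x":
            regex = "."                       # any amino acid
        elif body.startswith("[") and body.endswith("]"):
            regex = body                      # one of the listed residues
        elif body.startswith("{") and body.endswith("}"):
            regex = "[^" + body[1:-1] + "]"   # anything except these
        else:
            regex = re.escape(body)           # a single fixed residue
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


pattern = "H-[FW]-x-[LIVM]-x-G-x(5)-[LV]-H-x(3)-[DE]"
regex = prosite_to_regex(pattern)
print(regex)  # H[FW].[LIVM].G.{5}[LV]H.{3}[DE]

# Hypothetical sequence constructed to contain one match.
sequence = "MKHFALAGAAAAALHAAADKK"
hit = re.search(regex, sequence)
print(hit.start(), hit.end())  # 2 19
```

Because a motif keeps only the most invariant positions, a match is either present or absent; there is no graded score as there is with a profile.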

PROSITE (ExPASy Geneva) contains a huge number of such patterns, and several sites allow you to search these data:
•ExPASy
•EBI

It is best to search a few different databases in order to find as many homologues as possible. A very important thing to do, and one which is sometimes overlooked, is to compare any new sequence to a database of sequences for which 3D structure information is available. Whether or not your sequence is homologous to a protein of known 3D structure is not obvious in the output from many searches of large sequence databases. Moreover, if the homology is weak, the similarity may not be apparent at all during the search through a larger database.

One last thing to remember is that one can save a lot of time by making use of pre-prepared protein alignments. Many of these alignments are hand edited by experts on the particular protein families, and thus represent probably the best alignment one can get given the data they contain (i.e. they are not always as up to date as the most recent sequence databases). These databases include:
•SMART (Oxford/EMBL)
•PFAM (Sanger Centre/Wash-U/Karolinska Institutet)
•COGS (NCBI)
•PRINTS (UCL/Manchester)
•BLOCKS (Fred Hutchinson Cancer Research Centre, Seattle)
•SBASE (ICGEB, Trieste)

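In general there are many methods for comparing a protein sequence against databases, and these are very useful for identifying domains.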

[ Last edited by cnlics on 2010-9-14 at 19:54 ]
Post #4, 2010-09-14 01:42:52