Views: 2281  |  Replies: 6

ldy2140

Gold Worm (somewhat well-known)

[Discussion] How to batch-retrieve the species definition from GI numbers (4 participants so far)

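Most people working in bioinformatics cannot avoid running BLAST. Even when a BLAST run returns an intimidatingly large number of hits, the results still have to be summarized in an Excel table.
Recently I ran into a real headache. I BLASTed many transporter proteins against the complete microbial database, but the resulting table contains only the GI numbers of the matched organisms. When summarizing the results I want to replace each GI number with species information, for example a string that describes the organism's genetic background, such as the DEFINITION field in GBFF (GenBank flat file) records.
So I decided to use Perl regular-expression substitution and wrote the following program: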
CODE:
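#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::GenBank;

# Replace the "gi|...|" subject column of BLAST tabular output with the
# GenBank description, editing the input files in place.
my $gb = Bio::DB::GenBank->new;
$^I = ".bak";    # in-place edit; originals are kept as *.bak
my %def_for;     # cache: GI number => description

while (<>) {
  if ( /gi\|(\d+)\|/ ) {
    my $gi = $1;
    # Query GenBank only once per GI number
    unless ( exists $def_for{$gi} ) {
      my $seq_obj = eval { $gb->get_Seq_by_gi($gi) };
      $def_for{$gi} = $seq_obj ? $seq_obj->desc : undef;
    }
    # Swap the whole tab-delimited field that contains this GI;
    # lines without a GI, or with a failed lookup, are printed unchanged
    my $def = $def_for{$gi};
    s#\t[^\t]*gi\|$gi\|[^\t]*\t#\t$def\t# if defined $def;
  }
  print;
}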

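However, it runs very slowly and wastes a lot of bandwidth, because the module downloads the entire sequence record for each GI number and only then extracts the definition, so it is very inefficient. I wrote this quick-and-dirty program after spending only a short time learning Perl and BioPerl. Criticism from the experts is welcome.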
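
Whatever lookup service is used, one way to cut the traffic is to collect the distinct GI numbers first and resolve each of them only once, in batches. The Python sketch below illustrates that idea under stated assumptions: `fetch_definitions` is a hypothetical placeholder for whichever batch lookup is available (it is not a BioPerl or Biopython function), and the batch size of 200 is an arbitrary choice.

```python
import re

GI_PATTERN = re.compile(r"gi\|(\d+)\|")


def unique_gis(lines):
    """Return the distinct GI numbers found in the lines, in first-seen order."""
    seen = {}
    for line in lines:
        for gi in GI_PATTERN.findall(line):
            seen.setdefault(gi, None)
    return list(seen)


def batches(items, size=200):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_definition_table(lines, fetch_definitions, size=200):
    """Map GI -> definition, calling the lookup once per batch of unique GIs.

    `fetch_definitions` is a hypothetical callable: it takes a list of GI
    strings and returns a dict {gi: definition} for those it could resolve.
    """
    table = {}
    for batch in batches(unique_gis(lines), size):
        table.update(fetch_definitions(batch))
    return table
```

With a table like this in hand, the substitution itself becomes a purely local pass over the BLAST output.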

[ Last edited by ldy2140 on 2012-8-28 at 21:55 ]
Reach for the stars; you may not get your wish, but you won't get your hands dirty.

semiangle12

Gold Worm (regular writer)

Quoted reply:
Post #5: Originally posted by wizardfan at 2012-08-29 22:54:33
You know my comments on how to deal with high-throughput data analysis: download the GenBank flat file and parse the local file, which can improve the efficiency dramatically.

About your code:
1 ...

Given known GenBank accession numbers, how can I get the GI values included when downloading in batch? The batch download does not offer GI as an option.
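
The local-file approach quoted above could be sketched roughly as follows. This is only an illustration, assuming the standard GenBank flat-file layout (a DEFINITION keyword with indented continuation lines, a VERSION line, and `//` closing each record). It keys the table on accession.version, which the BLAST subject field in this thread also carries (for example `gb|ABJK02000022.1|`), so it does not depend on a GI value being present in the download. The function name is made up for the example.

```python
def definitions_from_flatfile(lines):
    """Yield (accession.version, definition) for each GenBank flat-file record."""
    definition_parts = []
    in_definition = False
    version = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("DEFINITION"):
            # First line of the definition; text follows the keyword
            definition_parts = [line[len("DEFINITION"):].strip()]
            in_definition = True
        elif in_definition and line.startswith(" "):
            # Indented continuation line of a multi-line definition
            definition_parts.append(line.strip())
        else:
            in_definition = False
            if line.startswith("VERSION"):
                fields = line.split()
                version = fields[1] if len(fields) > 1 else None
            elif line.startswith("//"):
                # End of record: emit it if both pieces were found
                if version and definition_parts:
                    yield version, " ".join(definition_parts)
                definition_parts, version = [], None
```

Passing an open file handle as `lines` keeps memory use flat even for large downloads, and `dict(definitions_from_flatfile(handle))` gives a lookup table.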
Post #7, 2017-04-26 10:47:59

libralibra

Supreme Worm (renowned writer)

Cavalry General

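Post an example so we can take a look.

Post one result as it comes out of BLAST (the unprocessed string).
Post one result as you want it (the target string).
For programs written in matlab/VB/python/c++/Java, send a QQ email to: 790404545@qq.com
Post #2, 2012-08-28 22:41:10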

ldy2140

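Gold Worm (somewhat well-known)

Quoted reply:
Post #2: Originally posted by libralibra at 2012-08-28 22:41:10
Post an example so we can take a look.

Post one result as it comes out of BLAST (the unprocessed string).
Post one result as you want it (the target string).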

sp|P23936|LACY_STRTR        gi|169822596|gb|ABJK02000022.1|        61.65        631        241        1        5        634        344705        342813        0.0         714
sp|P23936|LACY_STRTR        gi|223555729|gb|ACGH01000016.1|        57.01        628        260        1        2        619        65439        67322        0.0         692
After replacement:
sp|P23936|LACY_STRTR        Streptococcus infantarius subsp. infantarius ATCC BAA-102 S_infantarius-2.0.1_Cont245, whole genome shotgun sequence.        61.65        631        241        1        5        634        344705        342813        0.0         714
sp|P23936|LACY_STRTR        Lactobacillus buchneri ATCC 11577 contig00018, whole genome shotgun sequence.        57.01        628        260        1        2        619        65439        67322        0.0         692
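
The before/after pair above can be reproduced locally once a GI-to-definition table exists. A minimal Python sketch, assuming such a table is already available as a dict; the function and variable names are illustrative only:

```python
import re

# Matches a subject field such as "gi|169822596|gb|ABJK02000022.1|"
SUBJECT_ID = re.compile(r"gi\|(\d+)\|\S*")


def replace_gi_with_definition(line, definitions):
    """Replace the gi|...| field with its definition; leave unknown GIs as-is."""
    def substitute(match):
        # Fall back to the original field when the GI is not in the table
        return definitions.get(match.group(1), match.group(0))
    return SUBJECT_ID.sub(substitute, line)
```

Using a replacement function instead of a replacement string means characters in the definition are never interpreted as regex escapes, and rows whose GI is missing from the table pass through unchanged.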
Post #3, 2012-08-29 09:30:29

libralibra

Supreme Worm (renowned writer)

Cavalry General

ÄãÊDz»ÊÇÓÐgiºÍdescriptionµÄ¶ÔÓ¦¹ØÏµ,Èç¹ûÓÐ,Ö±½ÓÕýÔòÌæ»»giÄDz¿·Ö¼´¿É.
Èç¹û±ØÐëÈ¥ÍøÂçÉϲé,²é»ØÀ´¿Ï¶¨giÐòºÅºÍdescriptionͬʱÓеÄ,Äã´¦ÀíÍêÁËÔÙдÎļþ
ÉúÎï²»¶®,²»¹ýÓÐbioperl,ËÑÁËÏÂ,Ò²ÓÐbiopython,ÕÕ×ÅÀý×ӸĸÄ,¿ÉÒÔÖ±½Ó´òÓ¡giºÅºÍ¶ÔÓ¦µÄdescription,¿ÉÒÔ¿´¿´.
×¼±¸Ñ§¸ö½Å±¾ÓïÑÔµÄʱºò,¿´¹ýperlºÍpythonµÄÓï·¨,¹û¶ÏÑ¡ÁËpython,perl²»¶®°¡
biopython½Ì³Ì:
http://biopython.org/DIST/docs/tutorial/Tutorial.html

Example code, tested:
CODE:
# Import the modules for interfacing with BLAST and parsing the output
from Bio.Blast import NCBIWWW, NCBIXML

# BLAST the sequence of interest (in this case identified by its GI number)
result_handle = NCBIWWW.qblast("blastn", "nr", "8332116")

# Parse the resulting output
blast_record = NCBIXML.read(result_handle)

# Loop over the alignments, printing the ID part and the description separately
E_VALUE_THRESH = 0.004
for alignment in blast_record.alignments:
    # The title looks like "gi|...|ref|...| description text"
    fields = alignment.title.split()
    print('gi no.: ' + fields[0])
    print('gi-desc: ' + ' '.join(fields[1:]))
    print()
##    for hsp in alignment.hsps:
##        if hsp.expect < E_VALUE_THRESH:
##            print()
##            print('****Alignment****')
##            print('sequence:', alignment.title)
##            print('length:', alignment.length)
##            print('e value:', hsp.expect)
##            print(hsp.query[0:75] + '...')
##            print(hsp.match[0:75] + '...')
##            print(hsp.sbjct[0:75] + '...')

Result: the GI number and the description can be extracted and printed separately:
CODE:
gi no.: gi|224094601|ref|XM_002310151.1|
gi-desc: Populus trichocarpa predicted protein, mRNA

gi no.: gi|359495761|ref|XM_002274845.2|
gi-desc: PREDICTED: Vitis vinifera uncharacterized LOC100267774 (LOC100267774), mRNA

gi no.: gi|349709091|emb|FQ378501.1|
gi-desc: Vitis vinifera clone SS0AEB13YG07

gi no.: gi|255562758|ref|XM_002522339.1|
gi-desc: Ricinus communis COR413-PM2, putative, mRNA

gi no.: gi|358346403|ref|XM_003637210.1|
gi-desc: Medicago truncatula Cold acclimation protein-like protein (MTR_079s1009) mRNA, complete cds

gi no.: gi|358344000|ref|XM_003636035.1|
gi-desc: Medicago truncatula Cold acclimation protein-like protein (MTR_026s0005) mRNA, complete cds

gi no.: gi|356561272|ref|XM_003548859.1|
gi-desc: PREDICTED: Glycine max uncharacterized protein LOC100817084 (LOC100817084), mRNA

gi no.: gi|356502211|ref|XM_003519866.1|
gi-desc: PREDICTED: Glycine max uncharacterized protein LOC100810337 (LOC100810337), mRNA

gi no.: gi|225311746|dbj|AK326681.1|
gi-desc: Solanum lycopersicum cDNA, clone: LEFL2011M15, HTC in fruit

gi no.: gi|255762732|gb|GQ370517.1|
gi-desc: Salvia miltiorrhiza cold acclimation protein (COR) mRNA, complete cds

gi no.: gi|225428595|ref|XM_002284686.1|
gi-desc: PREDICTED: Vitis vinifera uncharacterized LOC100248690 (LOC100248690), mRNA

gi no.: gi|297819785|ref|XM_002877730.1|
gi-desc: Arabidopsis lyrata subsp. lyrata COR413-PM2, mRNA

gi no.: gi|86755971|gb|DQ359747.1|
gi-desc: Chimonanthus praecox cold acclimation protein COR413-PM1 mRNA, complete cds

gi no.: gi|145339339|ref|NM_114943.4|
gi-desc: Arabidopsis thaliana cold-regulated 413-plasma membrane 2 (COR413-PM2) mRNA, complete cds

gi no.: gi|15810634|gb|AY056356.1|
gi-desc: Arabidopsis thaliana putative cold acclimation protein (At3g50830) mRNA, complete cds

gi no.: gi|10121842|gb|AF283005.1|
gi-desc: Arabidopsis thaliana cold acclimation protein WCOR413-like protein beta form mRNA, complete cds

gi no.: gi|13430785|gb|AF360305.1|
gi-desc: Arabidopsis thaliana putative cold acclimation protein (At3g50830) mRNA, complete cds

gi no.: gi|60317457|gb|AY761065.1|
gi-desc: Gossypium barbadense cold-related protein Cor413 (Cor413) mRNA, complete cds

gi no.: gi|255556172|ref|XM_002519075.1|
gi-desc: Ricinus communis COR413-PM2, putative, mRNA

gi no.: gi|156567558|gb|EU077497.1|
gi-desc: Poncirus trifoliata cold acclimation WCOR413-like protein mRNA, complete cds

gi no.: gi|46577795|gb|AY587773.1|
gi-desc: Tamarix androssowii putative stress-responsive protein mRNA, complete cds

gi no.: gi|305690597|gb|HQ010041.1|
gi-desc: Corylus heterophylla COR413-PM1 mRNA, complete cds

gi no.: gi|224105476|ref|XM_002313788.1|
gi-desc: Populus trichocarpa predicted protein, mRNA

gi no.: gi|242389633|emb|FP100664.1|
gi-desc: Phyllostachys edulis cDNA clone: bphylf036p06, full insert sequence

gi no.: gi|242382816|emb|FP092058.1|
gi-desc: Phyllostachys edulis cDNA clone: bphyem114p22, full insert sequence

gi no.: gi|242382391|emb|FP097178.1|
gi-desc: Phyllostachys edulis cDNA clone: bphylf028m11, full insert sequence

gi no.: gi|242381728|emb|FP091375.1|
gi-desc: Phyllostachys edulis cDNA clone: bphyst020e14, full insert sequence

gi no.: gi|238007351|gb|BT084358.1|
gi-desc: Zea mays full-length cDNA clone ZM_BFb0105L06 mRNA, complete cds

gi no.: gi|195636267|gb|EU965484.1|
gi-desc: Zea mays clone 286348 cold acclimation protein COR413-PM1 mRNA, complete cds

gi no.: gi|54652523|gb|BT017742.1|
gi-desc: Zea mays clone EL01N0449E04.c mRNA sequence

gi no.: gi|162459269|ref|NM_001111732.1|
gi-desc: Zea mays LOC542099 (gpm455), mRNA >gi|27902672|gb|AY181208.1| Zea mays cold acclimation protein COR413-PM1 mRNA, complete cds

gi no.: gi|21209119|gb|AY106041.1|
gi-desc: Zea mays PCO103483 mRNA sequence

gi no.: gi|242037992|ref|XM_002466346.1|
gi-desc: Sorghum bicolor hypothetical protein, mRNA

gi no.: gi|255617390|ref|XM_002539789.1|
gi-desc: Ricinus communis COR413-PM2, putative, mRNA

gi no.: gi|30690903|ref|NM_119885.2|
gi-desc: Arabidopsis thaliana cold acclimation protein WCOR413 (AT4G37220) mRNA, complete cds

gi no.: gi|26449888|dbj|AK117399.1|
gi-desc: Arabidopsis thaliana At4g37220 mRNA for putative ap2 cold acclimation protein, complete cds, clone: RAFL16-98-J01

gi no.: gi|226504237|ref|NM_001155133.1|
gi-desc: Zea mays cold acclimation protein COR413-PM1 (LOC100282221), mRNA >gi|195620729|gb|EU960077.1| Zea mays clone 221611 cold acclimation protein COR413-PM1 mRNA, complete cds

gi no.: gi|166359605|gb|EU365626.1|
gi-desc: Thellungiella halophila stress responsive protein (COR) mRNA, complete cds

gi no.: gi|150172175|emb|CU406592.1|
gi-desc: Oryza rufipogon (W1943) cDNA clone: ORW1943C102K01, full insert sequence

gi no.: gi|115455578|ref|NM_001057925.1|
gi-desc: Oryza sativa Japonica Group Os03g0767800 (Os03g0767800) mRNA, complete cds

gi no.: gi|10121844|gb|AF283006.1|
gi-desc: Oryza sativa (japonica cultivar-group) cold acclimation protein WCOR413-like protein mRNA, complete cds

gi no.: gi|32976054|dbj|AK066036.1|
gi-desc: Oryza sativa Japonica Group cDNA clone:J013049B03, full insert sequence

gi no.: gi|32970924|dbj|AK060906.1|
gi-desc: Oryza sativa Japonica Group cDNA clone:001-035-F05, full insert sequence

gi no.: gi|32970018|dbj|AK060000.1|
gi-desc: Oryza sativa Japonica Group cDNA clone:006-301-G09, full insert sequence

gi no.: gi|28973358|gb|BT005584.1|
gi-desc: Arabidopsis thaliana clone U50435 putative cold acclimation protein homolog (At4g37220) mRNA, complete cds

gi no.: gi|326534181|dbj|AK358227.1|
gi-desc: Hordeum vulgare subsp. vulgare mRNA for predicted protein, complete cds, clone: NIASHv1071H11

gi no.: gi|160954667|emb|CU225096.1|
gi-desc: Populus EST from leave

gi no.: gi|160950966|emb|CU229055.1|
gi-desc: Populus EST from severe drought-stressed leaves

gi no.: gi|357114154|ref|XR_137736.1|
gi-desc: PREDICTED: Brachypodium distachyon uncharacterized LOC100844112 (LOC100844112), miscRNA

gi no.: gi|224035946|gb|BT070152.1|
gi-desc: Zea mays full-length cDNA clone ZM_BFc0138N11 mRNA, complete cds
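
A few of the titles listed above (for example the NM_001111732.1 and NM_001155133.1 hits) bundle more than one record, joined by ` >`. If each record is wanted separately, the title can be split before extracting the fields. A small sketch, assuming every entry follows the `gi|number|db|accession| description` pattern seen in this output and that ` >` does not occur inside a description; the names are illustrative:

```python
import re

# gi|<number>|<db>|<accession>| <description>
ENTRY = re.compile(r"gi\|(\d+)\|(\w+)\|([^|\s]+)\|\s*(.*)")


def split_title(title):
    """Split a BLAST hit title into (gi, db, accession, description) tuples."""
    entries = []
    for chunk in title.split(" >"):
        match = ENTRY.match(chunk.strip())
        if match:
            entries.append(match.groups())
    return entries
```

Chunks that do not match the pattern are skipped rather than guessed at, so a title in another format yields an empty list.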

Post #4, 2012-08-29 17:25:33