Four components to a BLAST search
(1) Choose the sequence (query)
(2) Select the BLAST program
(3) Choose the database to search
(4) Choose optional parameters
Step 1: Choose your sequence
The sequence can be entered in FASTA format or as an accession number.
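For readers unfamiliar with FASTA: each record is a header line starting with ">" followed by one or more lines of sequence. A minimal sketch of reading such input might look like the following; the function name `parse_fasta` and the example record are made up for illustration and are not part of BLAST itself.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical query sequence, for illustration only.
example = """>example_query hypothetical sequence
ATGGCGTACGTTAGC
TTGACCGTA
"""
records = parse_fasta(example)
```

Here `records` holds one entry whose sequence is the two sequence lines joined together.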
Step 2: Choose the BLAST program
BLASTN - compares a nucleotide query to a nucleotide database
BLASTP - compares a protein query to a protein database
BLASTX - compares a nucleotide query sequence translated in all reading frames against a protein sequence database
TBLASTN - compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
TBLASTX - compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
Then click “BLAST”
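The choice among the five programs depends only on the type of the query, the type of the database, and whether two nucleotide inputs should be compared at the protein level. As a rough sketch, that decision can be written as a lookup; `choose_blast_program` and its argument names are hypothetical helpers, not part of any BLAST tool.

```python
def choose_blast_program(query_type, database_type, compare_as_protein=False):
    """Return the BLAST program name for a query/database type combination.

    query_type and database_type are "nucleotide" or "protein".
    compare_as_protein only matters for nucleotide vs. nucleotide searches:
    it selects TBLASTX (six-frame translation on both sides) over BLASTN.
    """
    key = (query_type, database_type)
    if key == ("nucleotide", "nucleotide"):
        return "tblastx" if compare_as_protein else "blastn"
    table = {
        ("protein", "protein"): "blastp",
        ("nucleotide", "protein"): "blastx",   # query is translated
        ("protein", "nucleotide"): "tblastn",  # database is translated
    }
    if key not in table:
        raise ValueError(f"unsupported combination: {key}")
    return table[key]


program = choose_blast_program("nucleotide", "protein")
```

With a nucleotide query and a protein database, as in the call above, the function selects BLASTX.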
Step 3: Choose the database
nr = non-redundant (most general database)
dbest = database of expressed sequence tags
dbsts = database of sequence tag sites
gss = genomic survey sequences
htgs = high throughput genomic sequence
BLAST: optional parameters
You can...
• choose the organism to search
• turn filtering on/off
• change the substitution matrix
• change the expect (E) value threshold
• change the word size
• change the output format
Step 4a: Select optional search parameters
About E-value
The E value is the number of alignments with a score of at least S that would be expected to occur by chance in the database that has been searched (e.g., if E = 10, then 10 matches with a score of at least S are expected to be found by chance).
A match will only be reported if its E value falls below the threshold set.
Lower E thresholds are more stringent and yield fewer reported matches.
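To see the effect of the threshold, here is a small sketch that filters a list of hits by E value. The hit names and E values are invented for illustration, and `filter_hits` is a hypothetical helper that follows the "falls below the threshold" wording above.

```python
def filter_hits(hits, e_threshold):
    """Keep only the hits whose E value falls below the threshold."""
    return [(name, e_value) for name, e_value in hits if e_value < e_threshold]


# Made-up hits: (name, E value).
hits = [("hitA", 1e-50), ("hitB", 0.002), ("hitC", 3.5)]

permissive = filter_hits(hits, 10)     # keeps all three hits
moderate = filter_hits(hits, 0.01)     # drops hitC
stringent = filter_hits(hits, 1e-10)   # keeps only hitA
```

Lowering the threshold from 10 to 1e-10 shrinks the reported list from three hits to one.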
The key equation describing an E value is:
$E = Kmn\,e^{-\lambda S}$
This equation is derived from a description of the extreme value distribution.
S = the score
E = the expect value = the number of HSPs (high-scoring segment pairs) expected to occur with a score of at least S
m, n = the lengths of the two sequences being compared (the query and the database)
λ, K = Karlin-Altschul statistical parameters
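As a quick numerical illustration of the equation, the sketch below plugs values into E = K·m·n·e^(−λS). The values of K and λ (written `lam` in the code), the lengths, and the score are arbitrary placeholders chosen for the example, not the parameters of any particular scoring matrix.

```python
import math


def expect_value(score, m, n, K, lam):
    """Compute E = K * m * n * exp(-lam * score)."""
    return K * m * n * math.exp(-lam * score)


# Placeholder values, chosen only for illustration.
K = 0.1
lam = 0.3
m = 250          # query length
n = 1_000_000    # database length
S = 60           # alignment score

E = expect_value(S, m, n, K, lam)
print(f"E = {E:.3g}")
```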
Some properties of the equation $E = Kmn\,e^{-\lambda S}$
The value of E decreases exponentially with increasing S (higher S values correspond to better alignments). Very high scores correspond to very low E values.
The parameter K is a scaling factor for the size of the search space, which is given by the product mn.
For E = 1, one match with a score of at least S is expected to occur by chance. Because E is proportional to the size of the search space, a much larger or smaller database raises or lowers E for the same score accordingly.
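One way to make the database-size effect concrete is to invert the equation and ask what score is needed to reach a given E: S = ln(K·m·n / E) / λ. The sketch below does this for two database sizes; the parameter values are arbitrary placeholders and `score_for_expect` is a hypothetical helper name.

```python
import math


def score_for_expect(target_e, m, n, K, lam):
    """Solve E = K * m * n * exp(-lam * S) for S."""
    return math.log(K * m * n / target_e) / lam


# Placeholder values, chosen only for illustration.
K = 0.1
lam = 0.3
m = 300  # query length

s_small = score_for_expect(1.0, m, 1_000_000, K, lam)
s_large = score_for_expect(1.0, m, 10_000_000, K, lam)

# A tenfold larger database needs ln(10) / lam more score for the same E.
extra_score = s_large - s_small
```

Under these assumptions, the score required for E = 1 grows only logarithmically with database size, which matches the exponential dependence of E on S.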
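What are you planning to align? Nucleotide or protein sequences? What is your goal: checking specificity, or comparing several sequences? Different goals call for different options, so it would help to describe what you need more clearly.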
Hope this helps.
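May I ask what you are trying to accomplish with BLAST?
In short, BLAST performs sequence alignment (for both gene sequences and protein sequences) and then evaluates the similarity between the two aligned sequences.
If you align a sequence of interest against an entire sequence database, you can get a rough idea of its biological properties, such as which family it belongs to.
One last question: do you want to use BLAST, or write a BLAST implementation yourself...?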