Views: 796  |  Replies: 0

emanlee

木虫 (minor reputation)

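[Help] Finding non-redundant sequences in two different FASTA files

Title: Finding non-redundant sequences in two different FASTA files

The first FASTA file, aaa.fa, contains 40,000 nucleotide or amino acid sequences: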
>gi|118600994|ref|NM_001079530.1| Homo sapiens cripto, FRL-1, cryptic family 1B (CFC1B), mRNA
ATGCCAAATACAGCCATGAAGAAAAAGGTGCTGCTGATGGGGAAGAGCGGGTCGGGGAAGACCAGCATGAGGTCGATAATCTT
>gi|57863286|ref|NM_006570.4| Homo sapiens Ras-related GTP binding A (RRAGA), mRNA
ACGCTCTACAAAGCCTGGTCCAGCATCGTCTACCAGCTGATTCCCAACGTTCAGCAGCTGGAGATGAACCTCAGGAATTTTG
>gi|254587897|ref|NM_178495.5| Homo sapiens inositol 1,4,5-trisphosphate receptor
CGCCAATTACATTGCTCGCGACACCCGGCGCCTGGGGGCCACCATTGACGTGGAACACTCCCACGTCCGATTCCTAGGGAACC
>gi|191252813|ref|NM_001128635.1| Homo sapiens RIMS binding protein 3B (RIMBP3B), mRNA
TGGTGCTGAACCTGTGGGACTGTGGCGGTCAGGACACCTTCATGGAAAATTACTTCACCAGCCAGCGAGACAATATCTTCCGTA
>gi|61656209|ref|NM_001013355.1| Homo sapiens olfactory receptor, family 2, subfamily G, member 6 (OR2G6), mRNA
ACGTGGAAGTTTTGATTTACGTGTTTGACGTGGAGAGCCGCGAACTGGAAAAGGACATGCATTATTACCAGTCGTGTCTGGAGG
The second FASTA file, bbb.fa, contains 40,000 nucleotide or amino acid sequences:
>gi|83267870|ref|NM_080431.4| Homo sapiens actin-related protein T2 (ACTRT2), mRNA
CCATCCTCCAGAACTCTCCTGACGCCAAAATCTTCTGCCTGGTGCACAAAATGGATCTGGTTCAGGAGGATCAGCGTGACCTGA
>gi|53828675|ref|NM_001001923.1| Homo sapiens olfactory receptor, family 5, subfamily C, member 1 (OR5C1), mRNA
TTTTTAAAGAGCGAGAGGAAGACCTGAGGCGTCTGTCTCGCCCGCTGGAGTGTGCTTGTTTTCGAACGTCCATCTGGGATGAG
>gi|52627150|ref|NM_001005276.1| Homo sapiens olfactory receptor, family 2, subfamily AE, member 1 (OR2AE1), mRNA
TTTTTAAAGAGCGAGAGGAAGACCTGAGGCGTCTGTCTCGCCCGCTGGAGTGTGCTTGTTTTCGAACGTCCATCTGGGATGAG
>gi|61656211|ref|NM_001013357.1| Homo sapiens olfactory receptor, family 8, subfamily U, member 9 (OR8U9), mRNA
ACGCTCTACAAAGCCTGGTCCAGCATCGTCTACCAGCTGATTCCCAACGTTCAGCAGCTGGAGATGAACCTCAGGAATTTTG
>gi|51871366|ref|NM_001004124.1| Homo sapiens olfactory receptor, family 4, subfamily P, member 4 (OR4P4), mRNA
ATGCCAAATACAGCCATGAAGAAAAAGGTGCTGCTGATGGGGAAGAGCGGGTCGGGGAAGACCAGCATGAGGTCGATAATCTT

We want to find the sequences in the second file, bbb.fa, that overlap with sequences in aaa.fa (overlap-ratio<0.8). How can we do this with a BLAST alignment?
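One possible way to approach the BLAST part is to build a database from aaa.fa, align bbb.fa against it with tabular output, and then compute an overlap ratio for each query. The sketch below rests on several assumptions that are not in the post: "overlap ratio" is taken to mean the aligned span on the query divided by the query length, the database and output names (`aaa_db`, `bbb_vs_aaa.tsv`) are placeholders, and the helper names `best_overlap_ratios` and `split_by_threshold` are made up for illustration.

```python
import csv

# Assumed BLAST+ commands (database and output names are placeholders):
#   makeblastdb -in aaa.fa -dbtype nucl -out aaa_db
#   blastn -query bbb.fa -db aaa_db -out bbb_vs_aaa.tsv \
#          -outfmt "6 qseqid sseqid pident length qstart qend qlen"
# For amino acid sequences, use -dbtype prot and blastp instead.

COLUMNS = ["qseqid", "sseqid", "pident", "length", "qstart", "qend", "qlen"]


def best_overlap_ratios(handle):
    """Return {query_id: highest overlap ratio} from tabular BLAST output.

    Assumption: overlap ratio = aligned span on the query / query length.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        span = abs(int(rec["qend"]) - int(rec["qstart"])) + 1
        ratio = span / int(rec["qlen"])
        qid = rec["qseqid"]
        if ratio > best.get(qid, 0.0):
            best[qid] = ratio
    return best


def split_by_threshold(best, threshold=0.8):
    """Split query IDs into (below threshold, at or above threshold)."""
    below = sorted(q for q, r in best.items() if r < threshold)
    at_or_above = sorted(q for q, r in best.items() if r >= threshold)
    return below, at_or_above
```

To use it, open the BLAST output with `open("bbb_vs_aaa.tsv", newline="")` and pass the handle to `best_overlap_ratios`. Sequences in bbb.fa that produce no hit at all do not appear in the tabular output, so they have to be recovered separately by comparing against the full list of IDs in bbb.fa. Which side of the 0.8 threshold to keep depends on the intended meaning of overlap-ratio<0.8, so the sketch returns both groups.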
Is there ready-made Perl, Python, or C code that can be used directly?
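For the simpler case of exact duplicates, no alignment is needed. In the excerpts above, the sequences under NM_001013357.1 and NM_001004124.1 in bbb.fa are character-for-character identical to sequences in aaa.fa, so a set lookup would already catch them. The following is a minimal sketch using only the Python standard library. The function names `read_fasta` and `split_by_exact_match` are hypothetical, and the approach detects identical sequences only, not partial overlaps.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_exact_match(reference_handle, query_handle):
    """Split query records into (duplicates, unique) by exact sequence identity.

    The comparison is case-insensitive; this is an assumption, not a
    requirement stated in the post.
    """
    seen = {seq.upper() for _, seq in read_fasta(reference_handle)}
    duplicates, unique = [], []
    for header, seq in read_fasta(query_handle):
        if seq.upper() in seen:
            duplicates.append((header, seq))
        else:
            unique.append((header, seq))
    return duplicates, unique
```

A typical call would open aaa.fa as the reference handle and bbb.fa as the query handle. With 40,000 sequences per file, holding the reference sequences in a set should fit comfortably in memory on an ordinary machine.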