xiezhancnu

[Help] New thread: looking for guidance from a Perl expert

http://muchong.com/bbs/viewthread.php?tid=4816445
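This is the one. I need a program that can process more than 3000 genes at once; for the details, please follow the link to the other thread. It is urgent, so I would really appreciate an expert's help.
PS: To be honest, I really do not know how to use references in Perl. Whenever two-dimensional-array-style data is involved I cannot write the code, which is frustrating. Any help is sincerely appreciated. Thanks in advance.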
wizardfan

Quoting reply:
Post #2: Originally posted by wizardfan at 2012-08-13 06:02:26
use strict;
use Data::Dumper;

#create index for names to avoid the situation that two files have different order of gene names
my %idx;
my %idxReverse;
open IN,"pvalue.csv";
my $lin ...

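This program can handle any number of genes and any number of data files (for example, one more file called similarity), as long as there is enough memory. There is one requirement: the input must be in CSV (comma-separated values) format.
For example: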
,DR_2577,DR_A0328,DR_2258,DR_0109,DR_0465,DR_0539,DR_1356,DR_2207,DR_2121,DR_1962
DR_2577,NA,1.71E-08,0.000439,0.026534,1.15E-06,0.712607,0.052184,7.77E-17,2.91E-10,4.74E-17
DR_A0328,NA,NA,0.011085,0.067408,4.92E-07,0.535037,0.019445,4.43E-09,1.27E-07,1.07E-07
DR_2258,NA,NA,NA,0.824648,0.00147,0.054368,0.016167,0.003185,0.000694,0.00236
DR_0109,NA,NA,NA,NA,0.09646,0.161788,0.696339,0.004726,0.007971,0.040283
DR_0465,NA,NA,NA,NA,NA,0.189011,0.012672,1.10E-06,4.67E-06,0.000102
DR_0539,NA,NA,NA,NA,NA,NA,0.034369,0.57409,0.306483,0.573561
DR_1356,NA,NA,NA,NA,NA,NA,NA,0.217671,0.024297,0.381359
DR_2207,NA,NA,NA,NA,NA,NA,NA,NA,6.84E-10,5.93E-13
DR_2121,NA,NA,NA,NA,NA,NA,NA,NA,NA,1.38E-09
DR_1962,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
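
Since the program depends on the CSV layout shown above, it may be worth checking a file's shape before running it. The following is only a minimal sketch: check_csv_shape is a hypothetical helper that is not part of the original program, and it assumes plain comma-separated fields with no quoted commas inside them.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): checks that every data row
# has as many comma-separated fields as the header line.
# Returns a list of problem descriptions; an empty list means the shape is fine.
sub check_csv_shape {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    my $header = <$fh>;
    die "$filename is empty\n" unless defined $header;
    $header =~ s/\r?\n\z//;    # strip Unix or Windows line endings
    # The -1 limit keeps trailing empty fields, so short rows are not hidden.
    my @names    = split /,/, $header, -1;
    my $expected = scalar @names;
    my @problems;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        next if $line eq '';   # ignore blank lines
        my @fields = split /,/, $line, -1;
        my $got    = scalar @fields;
        push @problems, "line $.: expected $expected fields, got $got"
            if $got != $expected;
    }
    close $fh;
    return @problems;
}

# Runs only when a file name is given on the command line.
if (@ARGV) {
    print "$_\n" for check_csv_shape($ARGV[0]);
}
```

Run it with the file name as the only argument, for example with pvalue.csv; no output means every row matched the header width.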
Post #3, 2012-08-13 06:05:12
wizardfan

use strict;
use Data::Dumper;

#create index for names to avoid the situation that two files have different order of gene names
my %idx;        #gene name -> index
my %idxReverse; #index -> gene name
open(my $in, '<', "pvalue.csv") or die "Cannot open pvalue.csv: $!";
my $line = <$in>;
chomp($line);
my @arr = split(",",$line);
my $len = scalar @arr;
#element 0 is the empty top-left cell, so gene names start at 1
for(my $i=1;$i<$len;$i++){
        $idx{$arr[$i]}=$i;
        $idxReverse{$i}=$arr[$i];
}
close $in;
#populate the data structure
my %hash;#values, stored as $hash{smaller index}{larger index}{tag}
my @tags;#the elements to be displayed in the header line after gene1 and gene2
dealOneFile("cor.csv","correlation");
dealOneFile("pvalue.csv","pvalue");
#print the result
#header
print "gene1,gene2";
foreach my $tag(@tags){
        print ",$tag";
}
print "\n";
#values
foreach my $aa(sort {$a<=>$b} keys %hash){
        my %tmp = %{$hash{$aa}};
        foreach my $bb(sort {$a<=>$b} keys %tmp){
                print "$idxReverse{$aa},$idxReverse{$bb}";
                foreach my $tag(@tags){
                        print ",$hash{$aa}{$bb}{$tag}";
                }
                print "\n";
        }
}

#read one upper-triangular CSV matrix and store its values under $tag
sub dealOneFile{
        my $filename = $_[0];
        my $tag = $_[1];
        push(@tags,$tag);
        open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
        my $line = <$fh>;
        chomp($line);
        my @header = split(",",$line);
        shift @header;#drop the empty top-left cell
        my $count = 0;
        while($line=<$fh>){
                chomp($line);
                $count++;
                my @arr = split(",",$line);
                my $current = shift(@arr);#gene name of this row
                my $currentIdx = $idx{$current};
                #@header and @arr are now both 0-based, so starting at $count
                #skips the diagonal; the bound is @header, not $len, to stay in range
                for(my $i=$count;$i<scalar(@header);$i++){
                        my $second = $header[$i];
                        my $secondIdx = $idx{$second};
                        if ($currentIdx<$secondIdx){
                                $hash{$currentIdx}{$secondIdx}{$tag}=$arr[$i];
                        }else{
                                $hash{$secondIdx}{$currentIdx}{$tag}=$arr[$i];
                        }
                }
        }
        close $fh;
}
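
For readers who, like the original poster, find references and two-dimensional data hard, the three-level %hash used in the program can be tried out on its own. This is only an illustrative sketch: the numbers are made-up toy values, and pair_line is a hypothetical helper that does not appear in the program above.

```perl
use strict;
use warnings;

# Toy values, made up for illustration only. They are kept as strings,
# just as they would be after being read from a CSV file.
my %hash;

# Assigning to a deep key creates the intermediate hash references
# automatically (autovivification), so no explicit setup is needed.
$hash{1}{2}{correlation} = '0.91';
$hash{1}{2}{pvalue}      = '1.71E-08';
$hash{1}{3}{correlation} = '-0.40';

# $hash{1} holds a reference to an inner hash; %$row dereferences it.
my $row      = $hash{1};
my @partners = sort { $a <=> $b } keys %$row;    # indices paired with 1

# Hypothetical helper: builds one output line for the pair ($i, $j).
# $data is a reference to the outer hash; missing values become "NA".
sub pair_line {
    my ($data, $i, $j, @tags) = @_;
    my @cells = ($i, $j);
    foreach my $tag (@tags) {
        my $value = $data->{$i}{$j}{$tag};
        push @cells, defined $value ? $value : "NA";
    }
    return join(",", @cells);
}

# Passing \%hash hands the sub a reference instead of copying the hash.
foreach my $j (@partners) {
    print pair_line(\%hash, 1, $j, "correlation", "pvalue"), "\n";
}
```

The same idea scales to the real data: the first two keys identify the gene pair and the third key names the file the value came from.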

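For some reason you keep getting the file names wrong, so please be more careful. Perl is a very basic tool in bioinformatics; take the time to understand it properly rather than posting a question every time you get stuck. Your own skills are what matter most.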
Post #2, 2012-08-13 06:02:26