
Views: 1371  |  Replies: 2

安静的糖粒子

Newcomer (new to the forum)

[Help] Gene expression data analysis is too hard for a molecular biology beginner like me...

The raw gene expression data were extracted using NimbleScan software v 2.4.
The raw data of each gene were presented as the average signal intensity of 21–27 probes.
Then, the raw data in all the arrays were normalized to the medium value.
The t-test approach in the CyberT program (Baldi and Long 2001, Hatfield et al. 2003) was used to determine whether the difference in signal intensity between stressed and control samples was significant (P < 0.01). If the log2 ratio of the signal of WD/control for a particular gene was +1 or more or −1-fold or less while the P-value in the t test was <0.01, the gene was classified as up- or downregulated, respectively.
The reproducibility of the microarray data was presented with the volcano plots produced using Excel software. The categorization of gene expression was performed following hierarchical clustering and K-means clustering methods based on the MeV (Multi Experiment Viewer) software ( Saeed et al. 2003 ).

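I came across this passage in a paper; it describes the methods used to analyze gene expression data. At my current level I can't follow it, so I hope someone can explain it for me, ideally down to the basic technical terms. Many thanks!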

李拓邦

Newcomer (new to the forum)

[Answer] Helper reply

The raw gene expression data were extracted using NimbleScan software v 2.4.
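In other words, the raw expression values were read out with the NimbleScan software.
(Background: you start with a treated sample and a control sample. mRNA is extracted from each and reverse-transcribed into cDNA, which is labeled with a fluorescent dye along the way. In a two-color design the two samples are labeled with Cy3 and Cy5 respectively and hybridized together to one chip, which is then scanned at 532 nm for Cy3 and at 635 nm for Cy5. NimbleScan is not the scanner itself; it is the NimbleGen software that turns the scanned image into signal intensities.)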
The raw data of each gene were presented as the average signal intensity of 21–27 probes.
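That is, the raw value for each gene is the average signal intensity of 21–27 probes. (Each gene is represented on the chip by several probes, and each probe gives its own fluorescence reading.)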
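To make the probe-averaging step concrete, here is a minimal sketch. The gene names and intensities are made up, and only three probes per gene are shown instead of the 21–27 in the paper; this is not how NimbleScan itself is implemented.

```python
from statistics import mean

# Hypothetical probe-level intensities: gene -> list of probe signals.
# The paper's arrays have 21-27 probes per gene; three are shown for brevity.
probe_signals = {
    "geneA": [1200.0, 1350.0, 1280.0],
    "geneB": [310.0, 295.0, 340.0],
}

def gene_level_signal(probes_by_gene):
    """Collapse probe-level intensities to one raw value per gene by averaging."""
    return {gene: mean(values) for gene, values in probes_by_gene.items()}

raw = gene_level_signal(probe_signals)
```

`raw` then holds one number per gene, which is what the later normalization and testing steps work on.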
Then, the raw data in all the arrays were normalized to the medium value. The t-test approach in the CyberT program (Baldi and Long 2001, Hatfield et al. 2003) was used to determine whether the difference in signal intensity between stressed and control samples was significant (P < 0.01). If the log2 ratio of the signal of WD/control for a particular gene was +1 or more or −1-fold or less while the P-value in the t test was <0.01, the gene was classified as up- or downregulated, respectively.
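That is, the raw data from all arrays were first normalized to the median value (the paper writes "medium", presumably meaning the median) so that the arrays can be compared with one another. The t-test in the CyberT program was then used to decide whether the difference in signal intensity between stressed and control samples was significant. A gene was classified as upregulated if its log2 ratio of treated (WD) to control signal was +1 or more, and as downregulated if it was −1 or less, provided the t-test P-value was below 0.01. (After the software extracts the data you first normalize, then compute the ratio of the treated signal to the control signal, for example Cy3 versus Cy5 in a two-color design. This ratio is the expression of the gene in the treated group relative to the control. Taking log2 makes increases and decreases symmetric: a two-fold increase becomes +1 and a two-fold decrease becomes −1. So the sign of the log2 ratio tells you whether the gene is expressed more or less in the treated group than in the control.)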
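The normalization, log2 ratio, and cutoff rule can be sketched as follows. This is only an illustration under assumptions: the intensities and P-values are invented, and the P-values are taken as given (in the paper they come from CyberT, which is not reimplemented here).

```python
import math
from statistics import median

def median_normalize(array_values):
    """Scale one array so that its median intensity becomes 1.0."""
    m = median(array_values.values())
    return {gene: v / m for gene, v in array_values.items()}

def log2_ratio(treated, control):
    """log2 of treated/control: +1 means two-fold up, -1 means two-fold down."""
    return math.log2(treated / control)

def classify(log2fc, p_value, fc_cutoff=1.0, p_cutoff=0.01):
    """Apply the rule from the passage: |log2 ratio| >= 1 and P < 0.01."""
    if p_value >= p_cutoff:
        return "unchanged"
    if log2fc >= fc_cutoff:
        return "up"
    if log2fc <= -fc_cutoff:
        return "down"
    return "unchanged"

# Hypothetical gene-level intensities for one control and one treated array.
control = median_normalize({"g1": 100.0, "g2": 200.0, "g3": 400.0})
treated = median_normalize({"g1": 400.0, "g2": 200.0, "g3": 100.0})

# Hypothetical P-values, standing in for the output of a t-test program.
p_values = {"g1": 0.001, "g2": 0.5, "g3": 0.002}

calls = {
    gene: classify(log2_ratio(treated[gene], control[gene]), p_values[gene])
    for gene in control
}
```

Note that a gene needs both a large enough fold change and a small enough P-value to be called; either condition alone is not sufficient.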
The reproducibility of the microarray data was presented with the volcano plots produced using Excel software. The categorization of gene expression was performed following hierarchical clustering and K-means clustering methods based on the MeV (Multi Experiment Viewer) software ( Saeed et al. 2003 ).
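That is, the reproducibility of the microarray data is shown with volcano plots made in Excel. Genes were grouped by expression pattern using hierarchical clustering and K-means clustering in the MeV software. (You can think of hierarchical clustering as putting genes with similar expression patterns into the same group and dissimilar ones into different groups.)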
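For orientation, a volcano plot conventionally puts the log2 fold change on the x-axis and −log10 of the P-value on the y-axis, and K-means repeatedly assigns each gene to its nearest cluster center and then recomputes the centers. The sketch below shows both ideas on made-up data; it is a toy version with caller-supplied starting centers, not what Excel or MeV actually run.

```python
import math

def volcano_point(log2fc, p_value):
    """Return (x, y) for a volcano plot: x = log2 fold change, y = -log10(P)."""
    return (log2fc, -math.log10(p_value))

def kmeans(profiles, centers, iterations=10):
    """Tiny K-means on expression profiles (lists of log2 ratios).

    `centers` are initial centroids chosen by the caller, so the result
    is deterministic. Returns a dict gene -> cluster index.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labels = {}
    for _ in range(iterations):
        # Assignment step: each gene goes to its nearest centroid.
        labels = {
            gene: min(range(len(centers)), key=lambda k: dist2(p, centers[k]))
            for gene, p in profiles.items()
        }
        # Update step: each centroid becomes the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [profiles[g] for g, lab in labels.items() if lab == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # keep an empty cluster's centroid
        centers = new_centers
    return labels

# Hypothetical log2 ratios of four genes under two stress conditions.
profiles = {
    "g1": [2.0, 1.8],
    "g2": [1.9, 2.1],
    "g3": [-2.0, -1.7],
    "g4": [-1.8, -2.2],
}
clusters = kmeans(profiles, centers=[[1.0, 1.0], [-1.0, -1.0]])
point = volcano_point(2.0, 0.001)
```

Here the two upregulated genes end up in one cluster and the two downregulated genes in the other, which is the kind of grouping the paper reports.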
Post 3, 2015-12-04 02:17:01

xiaodi1989

Bronze member (somewhat known)

[Answer] Helper reply

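RNA expression array chips are outdated by now; if you want to learn data analysis, look at RNA-seq instead. There are videos on Youku.
Put simply, the method in this paper detects each gene with 21–27 probes. The principle is that the probes are complementary to the labeled cDNA, and a fluorescent signal is read where they hybridize. Multiple probes are used because an mRNA can have splicing variants. The simplest way to think about it: each gene is measured 21–27 times independently, so expressed genes give a high signal and unexpressed genes give a low one.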
Post 2, 2015-12-04 00:33:49