I am trying to analyze a GEO dataset (Illumina HumanHT-12 v4), but I cannot find the control profile or the "detection pval" column. Which file should I use? The GEO accession number is GSE93219. Thanks a lot.
Unfortunately the researchers who deposited GSE93219 chose to upload their data in a rather minimalistic form. They uploaded a matrix of (background corrected) intensity values but nothing else. They didn't upload IDAT files, detection p-values or control probe information.
Unfortunately this means that you cannot apply the neqc() normalization method recommended for Illumina BeadChips in the limma User's Guide. It isn't possible to reproduce the authors' own analysis pipeline either, which requires detection p-values, so you could legitimately write to the authors and ask them to provide more information.
Since all you have to use are intensities, I would suggest using the normexp background correction method followed by quantile normalization. You can do that like this.
First, read in the data:
> library(limma)
> Intensities <- read.delim("GSE93219_non-normalized.txt.gz",row.names="ID_REF",skip=4,comment.char="",quote="",check.names=FALSE)
> head(Intensities)
             7677145074_A 7677145074_D 7677145074_F 7677145074_I 7677145074_J
ILMN_1762337   -2.1116250  -2.37869400    -1.513151    4.1281220    2.1620370
ILMN_2055271    3.8533870   8.51458600    -1.531050    6.1021130   -4.8840290
ILMN_1736007   -0.6954489   0.04154839     4.007586   -0.9355530    0.3217204
ILMN_2383229    0.1194849   0.42363610     3.039766    1.5672620   -1.3412360
ILMN_1806310    2.1144340  -1.60486700     5.863817   -0.7480759    7.8510020
ILMN_1779670    3.9005440   2.62810400     1.189531    0.1058746   -1.6286350
             7677145074_L 7677145100_B 7677145100_D 7677145100_E 7677145100_G
ILMN_1762337   -1.8716880    2.8382580    0.5438465   9.08820200    -1.136811
ILMN_2055271   -1.5607060    3.1012740    7.1532370  13.20197000    15.902690
ILMN_1736007   -6.6722180   -0.7286062   -0.3817440   3.20575500    -0.412476
ILMN_2383229    0.2424704   -1.8650650   -4.8001250  -1.24175500     1.938118
ILMN_1806310    8.4159860   -0.3495244    6.0778200  -4.98494900    -5.160500
ILMN_1779670    5.5388950   -5.7164070    8.6839370   0.03834331    -3.216622
             7677145100_J 7677145100_K 7677145100_L
ILMN_1762337     1.414906   -7.0116310   -1.5758850
ILMN_2055271     3.147137    0.2801241   -0.4535244
ILMN_1736007    -4.763370   -2.8862500   -2.7357590
ILMN_2383229    -3.012951   -0.7996641   -2.5958350
ILMN_1806310    -2.477314    1.2336840    0.0526173
ILMN_1779670    -1.621891    0.3306231    2.5410890
> x <- new("EListRaw")
> x$E <- as.matrix(Intensities)
Then background correct and quantile normalize:
> xb <- backgroundCorrect(x,method="normexp",offset=16)
> y <- normalizeBetweenArrays(xb,method="quantile")
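As a quick sanity check on these two steps, the sketch below runs the same calls on a small simulated matrix. The simulated values and the names x_sim, xb_sim and y_sim are illustrative assumptions, not the GSE93219 data. It checks that normexp leaves only positive intensities and that quantile normalization returns finite log2 values with the same distribution on every array.

```r
library(limma)

# Simulated intensities: normal background noise plus exponential signal,
# so some raw values are negative, as in the deposited matrix.
set.seed(1)
n_probes <- 500
n_arrays <- 4
sim <- matrix(rnorm(n_probes * n_arrays, mean = 0, sd = 5) +
              rexp(n_probes * n_arrays, rate = 1/50),
              nrow = n_probes, ncol = n_arrays)

x_sim <- new("EListRaw")
x_sim$E <- sim

# Same two calls as above, applied to the simulated object
xb_sim <- backgroundCorrect(x_sim, method = "normexp", offset = 16)
y_sim  <- normalizeBetweenArrays(xb_sim, method = "quantile")

# All background-corrected intensities should be positive
range(xb_sim$E)

# After quantile normalization the per-array quantiles should agree
apply(y_sim$E, 2, quantile)
```

On the real data, the same two checks can be applied to xb and y.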
Then you can proceed with a DE analysis. The targets information is as follows:
> targets
             Donor Subset     Status
7677145074_A   106   Th2A Stimulated
7677145074_D   149   Th2A Stimulated
7677145074_F   106    Th1       Ctrl
7677145074_I   106    Th2       Ctrl
7677145074_J   106   Th2A       Ctrl
7677145074_L   149    Th1       Ctrl
7677145100_B    21    Th1 Stimulated
7677145100_D   149   Th17 Stimulated
7677145100_E   149    Th2 Stimulated
7677145100_G   149    Th1 Stimulated
7677145100_J    21   Th17 Stimulated
7677145100_K   149   Th2A       Ctrl
7677145100_L   149    Th2       Ctrl
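One possible way to set up the DE analysis between cell types is sketched below. It is only an illustration under stated assumptions: the additive model (Subset + Status + Donor), the choice of the Th2A vs Th1 contrast, and the simulated matrix y_sim (a stand-in so that the code runs without the GEO file) are mine, not part of the deposited data or the authors' pipeline. With the real data you would pass y from the normalization step to lmFit() instead of y_sim.

```r
library(limma)

# Targets table, copied from the listing above
targets <- data.frame(
  Donor  = c("106","149","106","106","106","149","21",
             "149","149","149","21","149","149"),
  Subset = c("Th2A","Th2A","Th1","Th2","Th2A","Th1","Th1",
             "Th17","Th2","Th1","Th17","Th2A","Th2"),
  Status = c("Stimulated","Stimulated","Ctrl","Ctrl","Ctrl","Ctrl","Stimulated",
             "Stimulated","Stimulated","Stimulated","Stimulated","Ctrl","Ctrl"),
  row.names = c("7677145074_A","7677145074_D","7677145074_F","7677145074_I",
                "7677145074_J","7677145074_L","7677145100_B","7677145100_D",
                "7677145100_E","7677145100_G","7677145100_J","7677145100_K",
                "7677145100_L")
)

# Assumed additive model: one coefficient per cell subset,
# adjusted for stimulation status and donor
Subset <- factor(targets$Subset)
Status <- factor(targets$Status)
Donor  <- factor(targets$Donor)
design <- model.matrix(~ 0 + Subset + Status + Donor)

# Placeholder expression matrix (random values, NOT real data);
# replace y_sim with y from normalizeBetweenArrays() in practice
set.seed(1)
y_sim <- matrix(rnorm(200 * nrow(targets), mean = 7), nrow = 200,
                dimnames = list(paste0("probe", 1:200), rownames(targets)))

fit  <- lmFit(y_sim, design)
cont <- makeContrasts(Th2AvsTh1 = SubsetTh2A - SubsetTh1, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))
tab  <- topTable(fit2, coef = "Th2AvsTh1", number = 10)
tab
```

Other comparisons between subsets can be added as further columns in makeContrasts(). Treating Donor as a random effect via duplicateCorrelation() is an alternative to the fixed donor term used here.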
What is the analysis you want to do?
I want to do a differential expression analysis between cell types, but I can't find the control profile to normalize the data. Thanks.