Hi Pekka,
I have now read about the custom CDFs and they sound very good, but I
still have a few questions. The first problem is installing the custom CDF
package. I downloaded the package from BrainArray and tried to install it
with the command:
install.packages("C:/Users/Julie/documents/R/win-library/3.0/CustomCDF_1.2.1.tar.gz", repos=NULL, type="source")
and I get the following error message:
Installing package into 'C:/Users/Julie/Documents/R/win-library/3.0'
(as 'lib' is unspecified)
* installing *source* package 'CustomCDF' ...
** libs
*** arch - i386
ERROR: compilation failed for package 'CustomCDF'
* removing 'C:/Users/Julie/Documents/R/win-library/3.0/CustomCDF'
Warning messages:
1: running command '"C:/PROGRA~1/R/R-30~1.2/bin/x64/R" CMD INSTALL -l "C:\Users\Julie\Documents\R\win-library\3.0" "C:/Users/Julie/documents/R/win-library/3.0/CustomCDF_1.2.1.tar.gz"' had status 1
2: In install.packages("C:/Users/Julie/documents/R/win-library/3.0/CustomCDF_1.2.1.tar.gz", :
installation of package 'C:/Users/Julie/documents/R/win-library/3.0/CustomCDF_1.2.1.tar.gz' had non-zero exit status
So I tried to download the chip information directly. I took the CDF file,
version 18, for the Affymetrix Mouse Genome 430 2.0 Array ([1]Mouse4302).
As I mentioned, I want to use Entrez IDs, so I took ENTREZG
(mouse4302mmentrezgcdf). Installing that package works fine, but I am
puzzled to see that there are only 17607 genes/affyids:
data<-ReadAffy(verbose=TRUE,filenames=cels,cdfname="mouse4302mmentrezgcdf")
> data
AffyBatch object
size of arrays=1002x1002 features (47 kb)
cdf=mouse4302mmentrezgcdf (17607 affyids)
number of samples=96
number of genes=17607
annotation=mouse4302mmentrezgcdf
notes=
When I do not specify a CDF, I get more affyids:
data2<-ReadAffy(verbose=TRUE,filenames=cels)
> data2
AffyBatch object
size of arrays=1002x1002 features (47 kb)
cdf=Mouse430_2 (45101 affyids)
number of samples=96
number of genes=45101
annotation=mouse4302
notes=
When I use the new CDF file, isn't there a loss of information?
2. I have a question about the median. The median of what?
Until now I have done this:
Example (these are replicates of the same group):

        control1  control2  control3  diet1  diet2  diet3
Bglap   2,5       3,2       3,1       3,9    4,8    3,1
Bglap   1         0,7       0,9       1,2    0,7    1
Bglap   4,9       3,3       4,1       4,8    5,5    5,2

Mean value:

        Con1  Con2  Con3  diet1  diet2  diet3
Bglap   2,8   2,4   2,7   3,3    3,66   3,1
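The averaging in the example above can be reproduced in base R as a quick check. This is only a sketch: the object names (`probes`, `gene_mean`, `gene_median`) are made up for illustration, and the decimal commas from the table are written as decimal points. The per-sample median is computed alongside the mean, since a median over the duplicate rows of one gene is the other obvious way to summarise them.

```r
# The three Bglap rows from the example, one column per sample.
probes <- matrix(
  c(2.5, 3.2, 3.1, 3.9, 4.8, 3.1,
    1.0, 0.7, 0.9, 1.2, 0.7, 1.0,
    4.9, 3.3, 4.1, 4.8, 5.5, 5.2),
  nrow = 3, byrow = TRUE,
  dimnames = list(rep("Bglap", 3),
                  c("control1", "control2", "control3",
                    "diet1", "diet2", "diet3"))
)

# One value per sample: mean and median across the three rows.
gene_mean   <- colMeans(probes)
gene_median <- apply(probes, 2, median)

print(round(gene_mean, 2))
print(gene_median)
```

With three rows per gene the median is simply the middle value in each column, so it is less affected by a single low row (here the second one) than the mean.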
For these values I calculated the p-value with a Wilcoxon test, and then I
want to compare the results of the different experiments with RankProd. So I
put all the values into one big Excel table and loaded it into R. The table
looks like this:
       Experiment 1                           Experiment 2
       con1  con2  con3  diet1  diet2  diet3  con1  con2  con3  con4  con5  diet1  diet2  diet3  diet4  diet5
Bglap  2,8   2,4   2,7   3,3    3,66   3,1    5,1   6,6   6,2   6,6   6,3   5,9    6,5    6,4    5,7    6,9
Copd   5,4   7,2   5,8   4,3    5      4,9    3     2,7   4     3,5   4,2   4,3    3,5    3,9    2,5    3,1
Sirt1  7     6,5   7,2   7,3    7,1    6,7    4,5   3,7   4,2   4,6   4,1   4,2    4,5    4,8    4,5    3,9
...
cl <- c(1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
origin <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
Do I need to supply different values to RankProd?
Kind regards
Stefanie
Sent: Friday, 02 May 2014, 14:26
From: "Pekka Kohonen" <pkpekka at gmail.com>
To: "Stefanie Busch" <stefanie.busch2 at web.de>
Cc: Bioconductor <bioconductor at r-project.org>
Subject: Re: [BioC] 1. comparing chip Information in meta analysis / Rankprod and 2. two color normalization
Hi Stefanie,
You could map the Affymetrix identifiers to a single Entrez/Ensembl
identifier using the "custom CDFs" from "BrainArray". You can do the
normalization using, for instance, the "simpleaffy" package. If the
Agilent/Illumina chips have duplicate probes for some genes, you can
just take the median of the fold-change values and use those in the
RankProd package. It is best to have just one identifier/gene per
array, although having more than one is not strictly forbidden.
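Taking the median over duplicate probes could look roughly like this in base R. It is a minimal sketch with invented numbers: the matrix `lfc`, the labels in `gene` and the column names are all hypothetical stand-ins for a real table of log fold-changes with one row per probe.

```r
# Hypothetical log fold-changes: one row per probe, one column per comparison.
lfc <- rbind(c( 1.0,  0.8),
             c( 0.4,  0.6),
             c(-0.5, -0.3))
colnames(lfc) <- c("exp1", "exp2")

# Hypothetical gene label for each probe row; GeneA has two probes.
gene <- c("GeneA", "GeneA", "GeneB")

# Group the row indices by gene, take the column-wise median within each
# group, and stack the results into a matrix with one row per gene.
collapsed <- do.call(
  rbind,
  lapply(split(seq_along(gene), gene),
         function(i) apply(lfc[i, , drop = FALSE], 2, median))
)

print(collapsed)
```

The result has one row per gene, so it can be used wherever a one-identifier-per-gene table is expected.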
Custom CDF manuscript:
[2] http://www.ncbi.nlm.nih.gov/pubmed/?term=16284200
Another package you might use is RankAggreg, although I have not used it
myself:
[3] http://www.biomedcentral.com/1471-2105/10/62
Generally using rank-based analysis can lead to significant results
that have very small effect sizes (fold-change). So you should use
fold change to filter the results to some extent as well.
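Such a combined filter might be sketched as follows in base R. Everything here is illustrative: the data frame `res`, its column names (`logFC` for the log2 fold-change, `pfp` for the rank-based significance estimate) and both cut-offs are assumptions chosen for the example, not recommended values.

```r
# Hypothetical per-gene results from a rank-based analysis.
res <- data.frame(
  gene  = c("GeneA", "GeneB", "GeneC"),
  logFC = c(1.8, 0.1, -1.2),
  pfp   = c(0.01, 0.001, 0.20),
  stringsAsFactors = FALSE
)

# Assumed cut-offs, for illustration only.
pfp_cutoff   <- 0.05
logfc_cutoff <- 1

# Keep genes that pass the significance cut-off AND show a large enough change.
keep     <- res$pfp < pfp_cutoff & abs(res$logFC) >= logfc_cutoff
selected <- res[keep, ]

print(selected)
```

In this toy table GeneB is highly significant but barely changes, so it is dropped, which is the situation the fold-change filter is meant to catch.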
Best, Pekka
2014-04-30 11:36 GMT+02:00 Stefanie Busch <stefanie.busch2 at web.de>:
>
> Hello,
>
> I have two questions and I hope you can help me.
>
> I want to compare several studies with similar designs but different arrays.
> The first step was to quantile normalize all data, which works well except
> for the two-color experiment with an Agilent chip. I read the limma User
> Guide and found out that I must preprocess with the function
> normalizeBetweenArrays. So I get M- and A-values, and my question is which
> one represents the expression values for this experiment?
>
> For comparing the results of the different studies I want to use the
> package RankProd. For a better comparison between the studies I used the
> Entrez IDs, and I downloaded the latest chip information directly from
> Affymetrix and Illumina. This revealed a new problem. For example, on the
> Affymetrix Mouse Genome 430 2.0 Array the ID 1449880_s_at stands for three
> gene names and Entrez IDs: Bglap /// Bglap2 /// Bglap3 - 12095 /// 12096 ///
> 12097. On the Illumina chip each gene has a single array ID:
> Bglap-rs1 - ILMN_1233122 - 12095
> Bglap1 - ILMN_2610166 - 12096
> Bglap2 - ILMN_2944508 - 12097
>
> So I don't know what I should do to compare the results of these two
> experiments. When I paste the expression values of 1449880_s_at three times
> with the three different Entrez IDs, the ranking calculated by the
> RankProd package changes.
> Example:
> Chip ID Entrez-Id Control1 control 2 etc.
> 1449880_s_at - 12095 - 3,855 - 4,211 ...
> 1449880_s_at - 12096 - 3,855 - 4,211 ...
> 1449880_s_at - 12097 - 3,855 - 4,211 ...
>
> The other possibility is to combine the three expression values of the
> Illumina chip into one value. But I don't know if this is the right way.
> Which is the better way?
>
> Kind regards
> Stefanie Busch
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> [4] https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> [5] http://news.gmane.org/gmane.science.biology.informatics.conductor
References
1. http://www.affymetrix.com/support/technical/byproduct.affx?product=moe430-20
2. http://www.ncbi.nlm.nih.gov/pubmed/?term=16284200
3. http://www.biomedcentral.com/1471-2105/10/62
4. https://stat.ethz.ch/mailman/listinfo/bioconductor
5. http://news.gmane.org/gmane.science.biology.informatics.conductor