Dear BioC-ers,
I would like to run the function 'barcode' on a set of CEL files
preprocessed with a custom CDF.
I am wondering if there is a quick way to generate the needed vectors
(mu and tau for the unexpressed distribution) in the same way as the
package frmaTools allows for the fRMA necessary vectors.
I hope I am not posting about an issue already treated in this mailing
list, but searching it produced no obvious hints.
thanks a lot for your help and suggestions.
cheers
dario
-- output of sessionInfo():
sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] hgu133plus2barcodevecs_1.0.5 hgu133plus2frmavecs_1.1.12
[3] hgu133abarcodevecs_1.0.5 hthgu133acdf_2.11.0
[5] AnnotationDbi_1.20.7 affy_1.36.1
[7] frma_1.10.0 Biobase_2.18.0
[9] BiocGenerics_0.4.0 BiocInstaller_1.8.3
loaded via a namespace (and not attached):
[1] affxparser_1.30.2 affyio_1.26.0 Biostrings_2.26.3
[4] bit_1.1-10 codetools_0.2-8 DBI_0.2-5
[7] ff_2.2-11 foreach_1.4.0 GenomicRanges_1.10.7
[10] IRanges_1.16.6 iterators_1.0.6 MASS_7.3-23
[13] oligo_1.22.0 oligoClasses_1.20.0 parallel_2.15.3
[16] preprocessCore_1.20.0 RSQLite_0.11.2 splines_2.15.3
[19] stats4_2.15.3 tools_2.15.3 zlibbioc_1.4.0
--
Sent via the guest posting facility at bioconductor.org.
Dario,
Generating the barcode vectors (estimating the null distribution for
each probeset) typically isn't something one can run on a laptop. It
takes about 1-2 days running in parallel on about 20 nodes of a
computing cluster. If you have access to such resources, I'm happy to
help you create your own implementation. Is the custom CDF you're
using one of the Brain Array CDFs or something of your own design?
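For context, here is a minimal sketch of how per-probeset mu and tau (the mean and standard deviation of the unexpressed distribution) might be used once they exist. It assumes a normal null and calls a gene expressed when -log10 of its upper-tail p-value exceeds a cutoff. The function name call_expressed, the toy values, and the cutoff are illustrative assumptions, not the internals of the frma package.

```r
# Hypothetical per-gene null parameters on the log2 scale.
mu  <- c(geneA = 3.0, geneB = 4.0, geneC = 3.5)  # mean of the unexpressed distribution
tau <- c(geneA = 0.5, geneB = 0.4, geneC = 0.6)  # sd of the unexpressed distribution

# Call a gene expressed when its value lies far in the upper tail of its null.
call_expressed <- function(x, mu, tau, cutoff = 6.5) {
  # log.p = TRUE avoids numerical underflow for values far above the null mean
  log_p <- pnorm(x, mean = mu, sd = tau, lower.tail = FALSE, log.p = TRUE)
  lod <- -log_p / log(10)   # equals -log10(p-value)
  lod > cutoff              # TRUE = called expressed
}

# One sample's expression values for the same three genes.
x <- c(geneA = 3.2, geneB = 9.0, geneC = 3.4)
calls <- call_expressed(x, mu, tau)
print(calls)
```

The expensive part discussed here is estimating mu and tau from thousands of arrays; applying them to a new sample, as above, is cheap.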
Best,
Matt
On Tue, Mar 26, 2013 at 7:03 AM, Dario Greco [guest]
<guest at bioconductor.org> wrote:
--
Matthew N McCall, PhD
112 Arvine Heights
Rochester, NY 14611
Cell: 202-222-5880
Dear Matt,
Thanks a lot for the quick reply!
I'm working on data from 8 Homo sapiens Affymetrix platforms
re-annotated with BrainArray CDFs (Ensembl gene).
I have access to relatively large computer clusters, so that is
not worrying me.
The most obvious question is probably what volume of data
from chipsets other than HG-U133A and HG-U133 Plus 2 I would need in
order to generate meaningful estimates.
thanks
d
On Mar 26, 2013, at 2:43 PM, Matthew McCall <mccallm at gmail.com>
wrote:
Dario,
For the barcode implementations in BioC, I used > 10,000 arrays from
each platform. I doubt this amount of data is available for all 8 Affy
platforms you're using. If you don't mind giving me a brief overview
of your research goals for this project (not cc'ing the BioC mailing
list if you're more comfortable with that), I might be able to provide
some alternatives to a full barcode implementation.
Best,
Matt
On Tue, Mar 26, 2013 at 10:47 AM, Dario Greco <dario.greco at ki.se>
wrote:
Hi Matt,
Sorry to interfere with this specific discussion, but I would also be
interested in your suggestions on potential alternative approaches.
I am interested because ideally I would like to apply a
(your) barcoding approach to platforms that are less widely used than
the HGU133 or MOE430 platforms, such as the HuGene and MoGene ST v1.x
arrays.
Regards,
Guido
-----Original Message-----
From: bioconductor-bounces@r-project.org [mailto:bioconductor-bounces@r-project.org] On Behalf Of Matthew McCall
Sent: Tuesday, March 26, 2013 15:59
To: Dario Greco
Cc: Bioconductor at r-project.org
Subject: Re: [BioC] barcode with custom CDF
_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
Guido,
No worries. It depends on what you plan to do with the data. One
option is to go back to the method described in the original barcode
paper (Zilliox and Irizarry, Nat Methods 2007), which discards any
genes that don't show a bimodal distribution. Another option is to
define the null distribution based on your specific data set -- e.g.,
you estimate the null distribution for each gene using, say, 50
untreated samples and then use that distribution to "barcode" treated
samples (this is similar to the POE algorithm --
http://astor.som.jhmi.edu/poe/). There are other options as well. The
reason the barcode implementations I make require so many arrays is
that we are trying to perform well regardless of what the researcher
is interested in -- we give a bunch of examples of how to use the
barcode algorithm for various tasks in the NAR 2011 paper.
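A minimal sketch of the second option, the data-set-specific null, might look like the following. It assumes log2-scale expression in a genes-by-samples matrix, a roughly normal null per gene, and a z > 5 cutoff. The simulated data, object names, and threshold are all illustrative assumptions and are not taken from the barcode packages.

```r
set.seed(1)

n_genes <- 200
n_ctrl  <- 50   # untreated samples used to estimate the null
n_trt   <- 5    # treated samples to be "barcoded"

# Simulated log2 expression, genes x samples (a stand-in for real data).
ctrl <- matrix(rnorm(n_genes * n_ctrl, mean = 4, sd = 0.5), nrow = n_genes)
trt  <- matrix(rnorm(n_genes * n_trt,  mean = 4, sd = 0.5), nrow = n_genes)
trt[1:10, ] <- trt[1:10, ] + 6   # the first 10 genes are switched on by treatment

# Robust per-gene null estimates from the untreated samples only.
mu  <- apply(ctrl, 1, median)
tau <- apply(ctrl, 1, mad)

# Standardise each treated value against its own gene's null
# (mu and tau recycle down the rows, so row i uses mu[i] and tau[i]).
z <- (trt - mu) / tau

# 1 = called expressed relative to the untreated null, 0 = not.
barcode_calls <- (z > 5) * 1L

rowSums(barcode_calls)[1:12]
```

Median and MAD are used here rather than mean and SD so that a few untreated samples in which a gene happens to be on do not inflate the null; that is a design choice for this sketch, not a recommendation from the papers cited above.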
As for a HuGene and MoGene ST barcode implementation -- I'm working on
this and hope to have something by the fall BioC release.
Best,
Matt
On Tue, Mar 26, 2013 at 11:15 AM, Hooiveld, Guido <guido.hooiveld at wur.nl> wrote:
Hi Dario and Guido,
The UPC function in our SCAN.UPC package addresses this need. We use a
"single-sample" approach to estimating barcodes. Essentially, this means
that we use the probe values within a given microarray sample to estimate
a background distribution and then use that information to estimate
whether each gene is "active" or "inactive" in that array. This is similar
in concept to the barcode function (frma package), except that it does not
require a large collection of reference samples, so it can easily be
applied to Affy arrays from any platform. We have performed a comparison
using the Affy Latin Square data, and our approach compares favorably to
the barcode function (manuscript in revision; we can send more details
offline if you're interested).
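To illustrate the general single-sample idea only, here is a small sketch that fits a two-component normal mixture to the values of one simulated array and reports, for each gene, the posterior probability of belonging to the higher ("active") component. This is a simplified stand-in and should not be read as the UPC model; the function fit_two_normals, the simulated values, and the starting values are all assumptions made for the example.

```r
set.seed(42)

# One simulated array: 700 "inactive" genes near background, 300 "active".
x <- c(rnorm(700, mean = 4, sd = 0.6), rnorm(300, mean = 9, sd = 1.2))

# Fit a two-component normal mixture by EM, using only this one sample.
fit_two_normals <- function(x, n_iter = 200) {
  m <- unname(quantile(x, c(0.25, 0.75)))  # crude starting means
  s <- rep(sd(x) / 2, 2)                   # crude starting sds
  w <- 0.5                                 # weight of the "active" component
  for (i in seq_len(n_iter)) {
    # E-step: posterior probability that each value is from the active component
    d0 <- (1 - w) * dnorm(x, m[1], s[1])
    d1 <- w * dnorm(x, m[2], s[2])
    p  <- d1 / (d0 + d1)
    # M-step: update weight, means and sds using those probabilities
    w <- mean(p)
    m <- c(sum((1 - p) * x) / sum(1 - p), sum(p * x) / sum(p))
    s <- c(sqrt(sum((1 - p) * (x - m[1])^2) / sum(1 - p)),
           sqrt(sum(p * (x - m[2])^2) / sum(p)))
  }
  list(mean = m, sd = s, weight = w, prob_active = p)
}

fit <- fit_two_normals(x)
round(fit$mean, 1)
head(round(fit$prob_active, 3))
```

No reference collection enters the calculation: every quantity is estimated from the single vector x, which is what makes this kind of approach portable across platforms.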
It's also straightforward to use alternative CDFs, such as those from
BrainArray. This functionality is described in the package's
documentation.
One caveat: the UPC function is currently available only in the
"development" version of Bioconductor (it will be released to the main
version in a couple of weeks). So if you want to try it out, you'll need
to install the development version of R and then the development version
of Bioconductor.
Please let us know if you have any questions!
Regards,
-Steve