Question

applicability of tilingArray package


Michael Palumbo ▴ 50

@michael-palumbo-2170


Hello,

I have general questions regarding the applicability of the tilingArray package to my problem/data. I've used Bioconductor in the past, but by no means am I an expert.

I have data from Affy yeast tiling arrays: 3 mut and 3 wild type. I've run Affy's TAS program on the CEL files as a two-sample analysis, i.e., comparing wt to mut, and viewed the results in IGB. My initial goal is to segment the results as was done in David et al., PNAS 2006.

It seems to me there are fundamental differences between my data and the data of David et al. For example, the normalization step described in the tilingArray documentation uses DNA hybridized to the chips as a reference. I don't have that, although I do have the wt data. A colleague thought I might be able to use the wt data in the normalization step, but that doesn't seem quite right to me. The documentation also describes normalization by MM probes, so maybe I can normalize the mut chip data with MM probes and completely ignore the wt data? I realize that if I did that, the result would no longer be a comparison of mut and wt, and what I would 'see' would differ from what I currently see in IGB for the two-sample TAS analysis. This also seems like it's not the best approach.

On the other hand, all I really want to do is segment the two-sample analysis that I've done. Is there anything wrong with using the results of TAS's analysis? TAS does a normalization and has bandwidth averaging; as a non-expert, these are convenient and seem good to me.

Thanks in advance for any and all responses/thoughts,
Mike Palumbo

Michael Palumbo, Wadsworth Center, New York State Dept of Health

Normalization Yeast affy tilingArray • 1.3k views

ADD COMMENT • link updated 16.1 years ago by Wolfgang Huber ★ 13k • written 16.1 years ago by Michael Palumbo ▴ 50

score 0 · Answer 1 · 2008-11-04


Wolfgang Huber ★ 13k

@wolfgang-huber-3550


EMBL European Molecular Biology Laborat…

Hi Michael,

there are two separate issues:

(i) finding the transcribed regions, separately in each of the samples (wt, mut);
(ii) finding the differentially transcribed regions.

For (i), you could use an approach similar to that in the David et al. and Huber et al. papers. Since you don't have the DNA reference hybridizations, you could use the MM probes. This is described in Section 4.2 of the vignette

http://www.bioconductor.org/packages/2.3/bioc/vignettes/tilingArray/inst/doc/assessNorm.pdf

and, as the benchmarks in Section 5 show, it is not quite as good, but still pretty good.

Don't think of this in terms of "normalising" the mutant against the wild type; that doesn't make much sense.

For (ii), if you want to segment e.g. a probe-wise (moderated) t-statistic, the piecewise constant model used in the tilingArray package is not useful. A running window approach (like in TAS) makes sense; the hard part is of course tuning its parameters.

AFAIK, there are methods for (i) and (ii) separately, and the approaches to join or align them are ad hoc. It would be nice if there were a clean method that does (i) and (ii) jointly. Maybe someone else has insights into this?

Best wishes
Wolfgang

Wolfgang Huber, EBI/EMBL, Cambridge UK, http://www.ebi.ac.uk/huber
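To make the running-window idea for (ii) a bit more concrete, here is a minimal sketch in Python. It is not TAS's algorithm and not part of tilingArray: the function names `probe_t_statistic` and `running_window_mean`, the plain (unmoderated) pooled-variance t-statistic, and the simple mean inside the window are all assumptions chosen for illustration. The `bandwidth` argument is the kind of tuning parameter whose choice is the hard part.

```python
from math import sqrt
from statistics import mean, variance


def probe_t_statistic(wt, mut):
    """Ordinary two-sample t-statistic (pooled variance) for one probe.

    `wt` and `mut` are the replicate intensities of that probe, e.g. 3 values
    each. This is NOT a moderated t-statistic; it fails with a division by
    zero if all replicates are identical within both groups.
    """
    n_wt, n_mut = len(wt), len(mut)
    pooled_var = ((n_wt - 1) * variance(wt) + (n_mut - 1) * variance(mut)) / (
        n_wt + n_mut - 2
    )
    return (mean(mut) - mean(wt)) / sqrt(pooled_var * (1 / n_wt + 1 / n_mut))


def running_window_mean(positions, values, bandwidth):
    """Smooth `values` along the genome.

    For each probe, average the values of all probes whose genomic position
    lies within `bandwidth` bases of it (the probe itself included).
    Quadratic in the number of probes; fine for a sketch, slow for a genome.
    """
    smoothed = []
    for centre in positions:
        window = [v for p, v in zip(positions, values) if abs(p - centre) <= bandwidth]
        smoothed.append(mean(window))
    return smoothed


# Hypothetical toy data: five probes, 10 bases apart, one isolated spike.
positions = [0, 10, 20, 30, 40]
statistic = [0.0, 0.0, 3.0, 0.0, 0.0]
smoothed = running_window_mean(positions, statistic, bandwidth=10)
```

A larger `bandwidth` suppresses isolated spikes like the one in the toy data but also blurs the boundaries of truly changed regions, which is the trade-off behind the tuning problem.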

ADD COMMENT • link 16.1 years ago Wolfgang Huber ★ 13k


Wolfgang,

Thanks for your thoughts. I have to say I'm afraid I only sort of follow what you've said. In an effort to clarify: it sounds like you've said the methods in the tilingArray package probably aren't a good approach to the segmentation given the data I have. You've said TAS's approach to the segmentation might be good, but finding the best parameters might be difficult.

You've also said that for (i) I could use the methods of David et al. and Huber et al. with MM probes. If I do that, I'll be left with two separate collections (wt and mut) of normalized data, from which I'll then need to find (ii), i.e., the differentially transcribed regions, and then segment those results.

The confusing part for me is connecting what you've said about TAS to the MM normalization methods. I don't see how I could use the MM normalization methods, get two data sets of expression levels, and then use TAS to find the differentially transcribed regions and segments. Maybe you're suggesting one or the other, i.e., stick with TAS to do it all, or use the Huber et al. MM normalization and then find some other method for the differentially transcribed regions and the segmentation?

Thanks,
Mike

ADD REPLY • link 16.1 years ago Michael Palumbo ▴ 50


Dear Michael,

> Thanks for your thoughts. I have to say I'm afraid I only sort of follow what you've said. In an effort to clarify: it sounds like you've said the methods in the tilingArray package probably aren't a good approach to the segmentation given the data I have.

I think that, for the data you have, there are *two* different segmentation tasks:

(i) segmentation of what is transcribed (at all) in each of the conditions;
(ii) identification of what is *differentially* transcribed between the conditions.

The method in the tilingArray package was designed for (i). Perhaps it would be helpful if you could clarify which one you are after. Personally, I think that solving (ii) without at least giving some shot at (i) will leave you with biological interpretation problems, and underuses the data.

> You've said TAS's approach to the segmentation might be good, but finding the best parameters might be difficult. You've also said that for (i) I could use the methods of David et al. and Huber et al. with MM probes. If I do that, I'll be left with two separate collections (wt and mut) of normalized data, from which I'll then need to find (ii), i.e., the differentially transcribed regions, and then segment those results.

Yes. And, more precisely, you could get two separate collections of expressed segments (one for wt and one for mutant); then you need to find some sort of consensus segmentation that is a compromise and superset of them both, and then you can ask:

- which segments have different expression levels between the two conditions
- which segments change size (transcription start and stop sites) between the conditions

But for this there is no ready-made software that I am aware of.

> The confusing part for me is connecting what you've said about TAS to the MM normalization methods. I don't see how I could use the MM normalization methods, get two data sets of expression levels, and then use TAS to find the differentially transcribed regions and segments.

Me neither.

> Maybe you're suggesting one or the other, i.e., stick with TAS to do it all,

Yes, that is an option.

> or use the Huber et al. MM normalization and then find some other method for the differentially transcribed regions and the segmentation?

Yes, that is another option. See above. It seems that the second option might turn out to be more flexible in adapting to your biological questions, and possibly more sensitive, but it's also more work for you.

Best wishes
Wolfgang

Wolfgang Huber, EMBL-EBI, http://www.ebi.ac.uk/huber
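The consensus step described above has no ready-made implementation, so the following is only a hedged sketch of one simple way to start. It assumes each segmentation is given as a sorted list of segment start positions, takes the plain union of the two breakpoint sets as the "superset" segmentation, and then computes per-segment means for each condition so that levels can be compared. The names `consensus_breakpoints` and `segment_means` are hypothetical, and nothing here merges breakpoints that differ by only a few bases, which a real analysis would probably need.

```python
from statistics import mean


def consensus_breakpoints(breaks_wt, breaks_mut):
    """Union of two breakpoint lists: the finest segmentation containing both."""
    return sorted(set(breaks_wt) | set(breaks_mut))


def segment_means(positions, values, breakpoints, end):
    """Mean value per segment.

    Segment k covers positions in [breakpoints[k], breakpoints[k + 1]); the
    last segment runs up to (but excluding) `end`. Segments that contain no
    probes yield None.
    """
    edges = list(breakpoints) + [end]
    means = []
    for left, right in zip(edges, edges[1:]):
        inside = [v for p, v in zip(positions, values) if left <= p < right]
        means.append(mean(inside) if inside else None)
    return means


# Hypothetical toy example: the mutant transcript ends earlier than the wt one.
positions = [10, 50, 70, 120]
wt_signal = [4.0, 4.0, 4.0, 0.0]
mut_signal = [4.0, 4.0, 0.0, 0.0]

breaks = consensus_breakpoints([0, 100], [0, 60])
wt_means = segment_means(positions, wt_signal, breaks, end=150)
mut_means = segment_means(positions, mut_signal, breaks, end=150)

# Per-segment difference in level (mut minus wt); None where no probes fall.
differences = [
    None if w is None or m is None else m - w for w, m in zip(wt_means, mut_means)
]
```

In the toy data, the middle consensus segment is the only one whose level differs, which corresponds to the "segments change size" case: a boundary present in one condition but not the other shows up as a differential segment.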

ADD REPLY • link 16.1 years ago Wolfgang Huber ★ 13k


Wolfgang,

Thanks again for the thoughtful response. I think I have a much better understanding of what you've said. More info:

> I think that, for the data you have, there are *two* different segmentation tasks:
> (i) segmentation of what is transcribed (at all) in each of the conditions;
> (ii) identification of what is *differentially* transcribed between the conditions.
>
> The method in the tilingArray package was designed for (i). Perhaps it would be helpful if you could clarify which one you are after. Personally, I think that solving (ii) without at least giving some shot at (i) will leave you with biological interpretation problems, and underuses the data.

I am trying to determine (ii). More precisely, I want to segment what is differentially transcribed. To say that a different way: I'd like to identify the locations along the plot of differential transcription where change points occur. I think this wording works better when thinking of a two-sample TAS analysis, where the signal result *is* a measure of the differential transcription. What I've said doesn't make as much sense in relation to your suggestion of using the tilingArray package and getting two sets of transcription levels (although what you've said makes perfect sense):

> [...snip...] And, more precisely, you could get two separate collections of expressed segments (one for wt and one for mutant); then you need to find some sort of consensus segmentation that is a compromise and superset of them both, and then you can ask:
> - which segments have different expression levels between the two conditions
> - which segments change size (transcription start and stop sites) between the conditions
> But for this there is no ready-made software that I am aware of.

What you've described above seems thorough and a very good approach.

Re: TAS "doing it all": when we've mentioned TAS doing the segmentation of the differentially expressed regions, I've been thinking of the 'interval analysis' function in TAS. It just occurred to me that the interval analysis is really just a thresholding sort of thing: all it does is identify regions above/below some user-defined value. This is very different from the type of segmentation used in Huber et al., which is more like change point identification. Is there some other function in TAS that I'm not aware of that performs this more complex segmentation? (I suppose this is the wrong place to ask that question.)

TIA,
Mike Palumbo
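The contrast drawn above between thresholding and change-point identification can be illustrated with a small sketch. Neither function below is taken from TAS or from tilingArray: `threshold_intervals` is a hypothetical stand-in for a threshold-style interval call, and `piecewise_constant_fit` is a textbook least-squares fit of a piecewise constant function by dynamic programming, in the spirit of the piecewise constant model discussed in this thread. The number of segments is assumed to be given.

```python
def threshold_intervals(values, cutoff):
    """Return (start, end) index pairs, end exclusive, of runs above `cutoff`."""
    intervals, start = [], None
    for i, v in enumerate(values):
        if v > cutoff and start is None:
            start = i
        elif v <= cutoff and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(values)))
    return intervals


def piecewise_constant_fit(values, n_segments):
    """Least-squares change points for a fixed number of segments.

    Returns the start indices of segments 2..n_segments. Runs in
    O(n_segments * n^2) time, so it is only suitable for short toy inputs.
    """
    n = len(values)
    # Prefix sums give the residual sum of squares of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of splitting values[:j] into k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i

    # Walk the back-pointers from the last segment to the second one.
    change_points, j = [], n
    for k in range(n_segments, 1, -1):
        j = back[k][j]
        change_points.append(j)
    return sorted(change_points)


# Hypothetical toy signal with three levels: 0, 5 and 1.
signal = [0, 0, 0, 5, 5, 5, 1, 1, 1]
above = threshold_intervals(signal, cutoff=2)
changes = piecewise_constant_fit(signal, n_segments=3)
```

With a cutoff of 2, thresholding reports only the high region and cannot tell the level-1 stretch apart from the level-0 stretch, whereas the change-point fit places a boundary at every shift in level regardless of its absolute height.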

ADD REPLY • link 16.0 years ago Michael Palumbo ▴ 50