SCAN.UPC for microarray and RNAseq data
shirley zhang
@shirley-zhang-2038
Dear Steve,

I have a very large Affy exon array dataset with >1,000 samples, and I would like to compare it with RNAseq data.

1. According to SCAN..vignette.pdf, UPC_RNASeq can take a read.counts matrix in which each row is a gene and each column is a sample. Is it possible for UPC to take RPKM values as input?

2. My exon array data have been preprocessed with RMA and adjusted for many technical variables, so I have a data matrix of gene-level RMA log2 values for each gene across all 1,000 samples. Can I UPC-normalize my data by using this matrix directly as the input?

Many thanks,
Shirley
RNASeq affy
@stephen-piccolo-6761
Hi Shirley,

1. We recommend that you use raw RNA-Seq counts as input to UPC_RNASeq. However, you could try using RPKM values and see how it works. My guess is that the results will be comparable, but I have not tested this extensively. In this case, it may not be necessary to correct for gene length, but you may still want to correct for GC content.

2. Yes, in the latest version we added functions called UPC_Generic and UPC_Generic_ExpressionSet that are designed to UPC normalize any type of data, even if it has been pre-normalized. Give that a try and let me know if you have any questions.

-Steve

From: shirley zhang <shirley0818@gmail.com>
Date: Friday, June 27, 2014 at 6:08 AM
To: Stephen Piccolo <stephen.piccolo@hsc.utah.edu>
Cc: "bioconductor@r-project.org" <bioconductor@r-project.org>
Subject: SCAN.UPC for microarray and RNAseq data
Hi Steve,

Many thanks for your quick response.

For the RNAseq data, I have to compare my own RNAseq data with a public RNAseq dataset that contains >3,000 samples. For this comparison:

1. I am planning to try UPC_RNASeq with read.counts as input for my own data.
2. However, the large public RNAseq dataset uses RPKM values.

Do you think it is reasonable to compare UPCs generated from different types of input values?

Many thanks,
Shirley
If possible, I'd suggest converting your values to RPKM for consistency. UPC should be robust to these differences. However, I have not evaluated this extensively.
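For readers who want to see what the count-to-RPKM conversion suggested here involves, the following is a minimal sketch in plain Python. It is not part of SCAN.UPC; the function name `rpkm_from_counts` and its arguments are hypothetical. It assumes the same layout described in the question (one row per gene, one column per sample), that gene lengths in base pairs come from your own annotation, and that each sample's library size can be approximated by its column sum, which may differ from the total mapped reads reported by an aligner.

```python
def rpkm_from_counts(counts, gene_lengths_bp):
    """Convert a gene-by-sample read-count matrix to RPKM.

    counts: list of rows (one per gene), each a list of per-sample counts.
    gene_lengths_bp: list of gene lengths in base pairs, one per row.
    Assumption: library size is approximated by the column sum of counts.
    """
    if len(counts) != len(gene_lengths_bp):
        raise ValueError("counts and gene_lengths_bp must have one entry per gene")
    if not counts:
        return []

    n_samples = len(counts[0])
    # Per-sample library size: total reads summed over all genes.
    library_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]

    rpkm = []
    for row, length in zip(counts, gene_lengths_bp):
        if length <= 0:
            raise ValueError("gene lengths must be positive")
        # RPKM = reads * 1e9 / (gene length in bp * library size)
        rpkm.append([
            (c * 1e9) / (length * lib) if lib > 0 else 0.0
            for c, lib in zip(row, library_sizes)
        ])
    return rpkm


# Toy example: two genes (rows) and two samples (columns).
example_counts = [[10, 0], [30, 20]]
example_lengths = [1000, 2000]
example_rpkm = rpkm_from_counts(example_counts, example_lengths)
```

Because RPKM already divides by gene length, values produced this way would not need a further gene-length correction, which matches the remark earlier in the thread that only a GC-content correction might still be worth considering.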
If I need to convert my values to RPKM so that they are consistent with the public RNAseq RPKM values, then I could compare the two sets of RPKM values directly and not bother with UPC values at all. What do you think?

Many thanks,
Shirley
Yes, unless you want to compare directly against the exon array data.

Having said that, if it would be a lot of work to convert your data to RPKM, it might be worth trying UPC anyway. Even among RNA-Seq data sets there is variability in how data are obtained and processed, and UPCs can be a way to overcome such differences.

Regards,
-Steve
Thanks Steve. I will give your suggestion a try.

Shirley