import data for limma analysis
@tineke-casneuf-655
This question may have been asked before, but I can't find the answer in the mail archive. I am trying to identify differentially expressed genes, and I was recommended to use limma. I read in the manual that I can use data analysed with other packages (like affy and others) or import data from a software program such as ArrayVision, ImaGene, GenePix and others.

But I have Affymetrix data that are already normalised (with Affy software); I have also calculated averages (for the replicates) and ratios (between experiments and controls). All my data are in CSV files.

I was wondering whether I can import the data from my own CSV files and how to do this (read.table() for the data and scan() for the gene names). Or should I import the data from the experiments and controls separately and then go on using the commands in the tutorial (model.matrix, c, lmFit, makeContrasts, contrasts.fit, eBayes)? Do I need to do more with my data, or maybe omit steps?

Thank you very much in advance!

--
==================================================================
Tineke Casneuf                          Tel: 32 (0)9 3313692
DEPARTMENT OF PLANT SYSTEMS BIOLOGY     Fax: 32 (0)9 3313809
GHENT UNIVERSITY/VIB, Technology Park 927, B-9052 Gent, Belgium
Vlaams Interuniversitair Instituut voor Biotechnologie VIB
e-mail: ticas@psb.ugent.be    http://www.psb.ugent.be/bioinformatics/
@gordon-smyth
WEHI, Melbourne, Australia
At 11:20 PM 1/03/2004, Tineke Casneuf wrote:
>But I have Affymetrix data that are already normalised (with Affy
>software);

Just import your data into R using read.table() etc. All you need is a matrix of log-expression values.

>I have also calculated averages (for the replicates) and
>ratios (between experiments and controls).

No one can help you very much if you've already summarized your data in this way. You've effectively already done your own analysis. If you want help from limma or other analysis packages, you need to go back to the original normalized data as it came out of the MAS program.

Gordon
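As an illustration of the import step described above, here is a minimal base-R sketch. The file contents, probe IDs and column names are invented for the example, and it assumes the signals are on the raw (unlogged) scale with one column per array rather than per-group averages; if your values are already log-transformed, skip the log2() call.

```r
# Hypothetical CSV: one row per probe set, one column per array (NOT averaged).
csv_text <- "ProbeID,ctrl_1,ctrl_2,ctrl_3,trt_1,trt_2,trt_3
1001_at,120.5,131.2,118.9,250.1,262.7,240.3
1002_at,980.0,1010.4,955.3,990.2,1001.8,970.6
1003_at,45.2,50.1,47.8,20.3,22.9,19.7"
csv_file <- tempfile(fileext = ".csv")
writeLines(csv_text, csv_file)

# row.names = 1 takes the first column as row names,
# so a separate scan() for the gene names is not needed.
signals <- read.csv(csv_file, row.names = 1, check.names = FALSE)

# Convert the data frame to a numeric matrix of log2 expression values.
expr <- log2(as.matrix(signals))

dim(expr)
head(expr)
```

The resulting matrix has genes in rows and arrays in columns, which is the orientation lmFit() expects.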
@tineke-casneuf-655
The problem is that I have a very large dataset, with more than 50 experiments (so 300 datasets, since replicates were used). I have written some scripts, so calculating the averages and ratios is easy and fast. If it would be better for my research, however, to analyse the data with Bioconductor (by that I mean the functions model.matrix, makeContrasts, contrasts.fit), I can still do that (it would just take more time). I don't really understand what the 'advantage' of using the linear fit is. I know what a linear fit is, but I don't understand how it is done on microarray data, or what happens to the data.

If it's not needed, I guess I can do this (please correct me if I'm wrong):

- Import the data (genes horizontal and experiments vertical), with the names of the genes as row names and the log ratios in the table:

    > scan("list_of_genes", what = "list") -> genes
    > read.table(file = "signals_in_table", row.names = genes) -> data

  instead of using the expressions in the limma tutorial:

    > data <- ReadAffy()
    > eset <- rma(data)

- Since I have already calculated the averages of the replicates and the ratios between experiments and controls, I guess I can also skip these commands:

    > design <- model.matrix(~ -1+factor(c(1,1,1,2,2,3,3,3)))
    > colnames(design) <- c("group1", "group2", "group3")
    > contrast.matrix <- makeContrasts(group2-group1, group3-group2, group3-group1, levels=design)
    > fit <- lmFit(data, design)
    > fit2 <- contrasts.fit(fit, contrast.matrix)

- To find the differentially expressed genes, can I just perform this function on my data?

    > fit2 <- eBayes(data)

Tineke
At 08:23 PM 3/03/2004, Tineke Casneuf wrote:
>- To find the differential expressed genes, can I just perform this function
>on my data?
> > fit2 <- eBayes(data)

Err ... No. As I said in my reply to you on this list a few days ago, given your summarized data you can't do any statistical analysis. Just stick with the averages and log-ratios you've computed yourself.

Gordon
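For contrast, the sketch below shows what the tutorial commands look like when run on replicate-level data. eBayes() takes the fitted model object returned by lmFit() (or contrasts.fit()), not a data matrix, and the moderated statistics rely on per-gene residual variances that can only be estimated when the individual replicates are present. The expression matrix here is simulated purely for illustration; the gene names, array names and effect size are assumptions, not data from this thread.

```r
library(limma)

set.seed(1)
# Simulated stand-in for a replicate-level log2 expression matrix:
# 100 genes, 8 arrays in three groups (3, 2 and 3 replicates).
expr <- matrix(rnorm(100 * 8, mean = 8, sd = 0.3), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("array", 1:8)))
group <- factor(c(1, 1, 1, 2, 2, 3, 3, 3))

# Make genes 1-5 artificially higher in group 2 so there is something to find.
expr[1:5, group == 2] <- expr[1:5, group == 2] + 2

# One coefficient per group, then the pairwise comparisons of interest.
design <- model.matrix(~ 0 + group)
colnames(design) <- c("group1", "group2", "group3")
contrast.matrix <- makeContrasts(group2 - group1, group3 - group2,
                                 group3 - group1, levels = design)

fit <- lmFit(expr, design)                   # per-gene linear model on replicates
fit2 <- contrasts.fit(fit, contrast.matrix)  # re-express the fit as contrasts
fit2 <- eBayes(fit2)                         # moderated t-statistics from the fit

# Top-ranked genes for the first contrast (group2 - group1).
topTable(fit2, coef = 1, number = 5, adjust.method = "BH")
```

With 8 arrays and 3 groups each gene has 5 residual degrees of freedom. If each group were collapsed to a single average column, that number would drop to zero and there would be nothing left from which to estimate variability.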