Problem using the readDGE function
@humberto_munoz-10903
Last seen 8.4 years ago

I plan to use the readDGE function with two CSV files containing gene counts from different samples. One has 2814 genes and the other 2809 genes. The files are on my Desktop and this is the error that I get: 

> files <- dir(pattern="*\\.csv$")


> RG <- readDGE(files)
Error in `[.data.frame`(d[[i]], , columns[2]) : 
  undefined columns selected

How can I fix the error?

readDGE edgeR • 3.3k views

How many columns does each of the two files have? What are the column headings?


Each file has two columns: the first contains the gene IDs and the second the gene read counts. Does readDGE create a DGEList that includes every gene with at least one count in one of the samples?


Aaron, I followed your comment and got the results below. How can I see all the rows in data2, or find its size? I also want to do the TMM normalization, but I got the error message below.

> readDGE(data2, sep=",")
An object of class "DGEList"
$samples
                            files group lib.size norm.factors
Dark Aerobic     Dark Aerobic.csv     1   481909            1
Dark Anaerobic Dark Anaerobic.csv     1  1033135            1

$counts
           Samples
Tags        Dark Aerobic Dark Anaerobic
  641610012           17             28
  641610013           55             36
  641610014          331           1551
  641610015         1005           2292
  641610016           96            136
2816 more rows ...

> y<-calcNormFactors(data2, method=c("TMM","RLE","upperquartile","none"),
+                    refColumn=NULL, logratioTrim=.3, sumTrim=0.05, doWeighting=TRUE,
+                    Acutoff=-1e10, p=0.75)
Error in colSums(x) : 'x' must be numeric

Thanks for your helpful comments.


Humberto, please use the "ADD COMMENT" button to add your replies, rather than posting them as Answers. I have been moving your answers back here so that they appear as comments on the original question.

Aaron Lun ★ 28k
@alun
Last seen 52 minutes ago
The city by the bay

readDGE expects that each file is tab-separated and contains at least two columns (one of gene names/IDs and another of gene counts). It seems that your files do not follow this format, i.e., fewer columns than expected. This is probably because a different separator is involved - for CSV files, you should set sep="," in the readDGE call, as is mentioned in the documentation for the function. Also see the columns argument in ?readDGE if there are more than two columns and the first two do not correspond to the IDs and counts.
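As a rough sketch of the call described above, the example below writes two small made-up CSV files to a temporary directory and reads them with sep=",". The file names, gene IDs and counts are hypothetical stand-ins, not the poster's data. As far as I understand from ?readDGE, the result holds the union of the genes found across files, with a count of zero where a gene is absent from a file.

```r
library(edgeR)

# Hypothetical stand-ins for the two CSV files (gene ID, read count).
tmp <- tempfile("counts_")
dir.create(tmp)
writeLines(c("GeneID,Count", "641610012,17", "641610013,55", "641610014,331"),
           file.path(tmp, "SampleA.csv"))
writeLines(c("GeneID,Count", "641610012,28", "641610013,36", "641610015,2292"),
           file.path(tmp, "SampleB.csv"))

files <- dir(tmp, pattern = "\\.csv$")

# sep = "," is passed through to read.delim(); columns = c(1, 2) is the default
# and only needs changing if the IDs and counts sit in other columns.
y <- readDGE(files, path = tmp, columns = c(1, 2), sep = ",")

dim(y)      # number of genes and number of samples
y$counts    # the full count matrix, not just the first few rows
```

Assigning the result to an object such as y, rather than only printing it, is what makes the later steps (dim, normalization) possible.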

@gordon-smyth
Last seen 47 minutes ago
WEHI, Melbourne, Australia

There are a few problems here:

First, as already noted, you need to specify sep="," because you have a comma-separated file.

Second, there is a problem with your files. Somewhere in one of your data files you have a character entered where you should have a number. Have a look especially at the last row of your files, as that is often the culprit. Check that your files don't contain any unnecessary spaces, because a space will be read as a character.

Third, you only have two samples in total, meaning the sample size is n=1 in each group. In other words you have no replication. So there isn't much analysis that edgeR will be able to do for you, because edgeR is designed to work with biological replicates.
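One way to look for the kind of stray non-numeric entry mentioned in the second point is to read the file as text and flag any count that is not a plain non-negative integer. This is only a sketch in base R; the file contents below are invented for illustration, and the column positions are assumed to be ID first and count second.

```r
# Hypothetical file with two bad entries near the end.
f <- tempfile(fileext = ".csv")
writeLines(c("GeneID,Count", "g1,17", "g2,55", "g3,33 1", "Total,n/a"), f)

# Read every column as character so nothing is silently converted.
d <- read.csv(f, colClasses = "character", strip.white = FALSE)

# TRUE wherever the count column is not made up purely of digits.
bad <- !grepl("^[0-9]+$", d[[2]])

d[bad, ]      # the offending rows
which(bad)    # their positions in the data (add 1 for the header line in the file)
```

If nothing is flagged, the count column is clean and the cause lies elsewhere.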


Actually, I have samples from an experiment with 11 different conditions and no biological replicates. My intention is to apply TMM normalization using the first sample (Dark Aerobic) as the reference. To start, I'm trying it with the first two samples (Dark Aerobic and Dark Anaerobic). According to your last comment, TMM normalization is not applicable to these data sets.


You are misinterpreting my answer. Your difficulties and my answer have nothing to do with the TMM method.
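For illustration only, here is a sketch of TMM normalization run on a DGEList with the first sample as the reference. The counts are simulated and the object names are hypothetical. It also reflects one possible reading of the transcript above, which is my assumption rather than something stated in the thread: calcNormFactors was given data2 (apparently the file names) instead of the object returned by readDGE, so there were no numeric counts to sum.

```r
library(edgeR)

set.seed(1)
# Hypothetical count matrix: 200 genes, 2 samples.
counts <- matrix(rpois(400, lambda = 50), ncol = 2,
                 dimnames = list(paste0("gene", 1:200), c("A", "B")))
y <- DGEList(counts = counts)

# Pass the DGEList itself, choose a single method, and use sample 1 as reference.
y <- calcNormFactors(y, method = "TMM", refColumn = 1)
nf <- y$samples$norm.factors
nf

# Passing file names instead of counts fails; capture the message to inspect it.
res <- tryCatch(calcNormFactors(c("A.csv", "B.csv")),
                error = function(e) conditionMessage(e))
res
```

With real data, y would be the object assigned from readDGE(files, sep=",") rather than a simulated matrix.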
