edgeR: non-unique values when setting 'row.names' - but the row names are unique!!!
ccheung • 0
@ccheung-7248
Last seen 9.2 years ago
European Union

Hi,

I'm having problems with the DGEList function in edgeR. Here are the commands I entered:

library(edgeR)
raw.data <- read.table(file = "Documents/.../myfile.csv", header=TRUE, sep=",")
Data <- raw.data[, 2:45]
rownames( Data ) <- raw.data[ , 1 ]
colnames(Data) <- paste (c("ML1,ML32,ML4,ML29,etc"), sep="")
groups <- c(rep("1",11), rep("2",33))
DGE1 <- DGEList(counts = Data , group = groups )

At this point, it keeps on giving me this error message:

Error in `row.names<-.data.frame`(`*tmp*`, value = c("ML1,ML32,ML4,ML29,etc",  :
  duplicate 'row.names' are not allowed
non-unique values when setting 'row.names': 

But I know for sure that my row names are unique! Any advice would be appreciated. Thanks.

carol

edgeR row.names DGEList • 21k views
@james-w-macdonald-5106
Last seen 1 hour ago
United States

The hint here is that the row.names in the error are actually the column names for your data matrix! One of the things that happens when you run DGEList() is that a 'samples' data frame is constructed, and the row.names of that samples data.frame are the column names of your data.

If you have duplicate column names (and you do), then this will result in an error. You shouldn't have duplicate column names anyway (you are calling two samples by the same name), so fix that and the error will go away.
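
A base-R sketch of the mechanism described above, without edgeR itself: it imitates (rather than reproduces) the step where the column names of the counts become the row names of a per-sample data frame. The matrix, the sample names, and the layout of `samples` are made up for illustration; the real table is built inside DGEList().

```r
# Hypothetical count matrix: 4 genes x 3 samples, with "ML1" used twice.
counts <- matrix(1:12, ncol = 3,
                 dimnames = list(paste0("gene", 1:4), c("ML1", "ML2", "ML1")))

# Stand-in for the per-sample table (assumed layout, for illustration only).
samples <- data.frame(group = factor(c(1, 1, 2)),
                      lib.size = unname(colSums(counts)))

# Using the column names as row names fails because "ML1" appears twice.
msg <- tryCatch(
  suppressWarnings({
    rownames(samples) <- colnames(counts)
    "no error"
  }),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message should be the same duplicate 'row.names' complaint as in the question, even though every row name of `counts` (the gene names) is unique.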

ccheung • 0
@ccheung-7248

Hi,

Thanks, James, for your answer! However, I double-checked and I am pretty sure that both the column and the row names are unique. Just to be sure, I even ran the command

rownames(df) = make.names(nams, unique=TRUE)

but to no avail.

Any other ideas? Thanks.

 

carol

 


I am not sure how you can be 'pretty sure' that the column names are unique. Either they are or they are not. Something like

any(duplicated(colnames(Data)))

will tell you for sure. And note that I am talking about the column names, not row names, so ensuring that the row names are unique is not helpful.
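
To go one step further than a yes/no answer, a small sketch that also reports which names repeat and which columns carry them. The `Data` object here is a made-up stand-in with four columns, not the poster's actual data.

```r
# Hypothetical stand-in for the poster's 'Data' object.
Data <- data.frame(matrix(0, nrow = 2, ncol = 4))
colnames(Data) <- c("ML1", "ML32", "ML4", "ML1")

nms <- colnames(Data)
has_dups  <- anyDuplicated(nms) > 0          # TRUE if any name repeats
dup_names <- unique(nms[duplicated(nms)])    # the names that repeat
dup_cols  <- which(nms %in% dup_names)       # every column carrying such a name

print(has_dups)
print(dup_names)
print(dup_cols)
```

Note that duplicated() treats repeated NA values as duplicates too, so missing column names will also show up in this check.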

But I am still sure that you DO have duplicated column names, and I can replicate exactly the error you get by trying to create a DGEList with duplicated column names:

> mat <- matrix(rnorm(1e5), ncol = 10)
> colnames(mat) <- paste0("ML", c(1:9,1))
> colnames(mat)
 [1] "ML1" "ML2" "ML3" "ML4" "ML5" "ML6" "ML7" "ML8" "ML9" "ML1"
> dglst <- DGEList(mat)
Error in `row.names<-.data.frame`(`*tmp*`, value = c("ML1", "ML2", "ML3",  :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique value when setting 'row.names': ‘ML1’

Also, I am assuming that the code you show

colnames(Data) <- paste (c("ML1,ML32,ML4,ML29,etc"), sep="")

isn't really what you have done, because that won't work unless you have just a single column. In other words,

> paste (c("ML1,ML32,ML4,ML29,etc"), sep="")
[1] "ML1,ML32,ML4,ML29,etc"

is a character vector of length one, so you cannot set the column names for a 44-column matrix using that command.
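
One possible explanation, assuming Data is a data frame (it comes from read.table() in the original post) rather than a matrix: in base R, assigning a length-one character vector as the column names of a data frame does not fail. The remaining names are filled with NA, and those NAs count as duplicates. That would be consistent with the error message in the question, which lists the single long string as the first value. The sketch below uses a made-up 4-column data frame and made-up sample names.

```r
# Hypothetical 4-column data frame standing in for the 44-column 'Data'.
Data <- data.frame(matrix(0, nrow = 2, ncol = 4))

# One long string, not four names: the other three names become NA.
colnames(Data) <- paste(c("ML1,ML32,ML4,ML29"), sep = "")
bad_names <- colnames(Data)
n_dups <- sum(duplicated(bad_names))   # counts the repeated NAs

# Intended form: one string per column.
good_names <- c("ML1", "ML32", "ML4", "ML29")
# Alternative: split the single string on commas.
split_names <- strsplit("ML1,ML32,ML4,ML29", ",", fixed = TRUE)[[1]]

colnames(Data) <- good_names
print(anyDuplicated(colnames(Data)))   # 0 means no duplicates remain
```

Either `good_names` or `split_names` gives one name per column, after which the duplicate check comes back clean.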

ccheung • 0
@ccheung-7248

Hi,

Haha, OK, I'm absolutely positive that the column names are not duplicated. As for your last comment, that is in fact exactly what I entered, so perhaps that is the problem. I will use a different command, following the edgeR vignette, and see if that fixes it. Thanks!

carol
