How to create a matrix from a set of files containing counts for a set of attributes?
jdanowski • 0
@jdanowski-23686

I have a set of files, one for each African country, that contain online news themes. https://drive.google.com/drive/folders/1nJvLwVDAdaMhRqTDMP6bP8sU8lgKLT1Y?usp=sharing

I would like to create a matrix where each row is a country and the columns are the aggregate themes.

The package "edgeR" looks like it will do this. When I install it and try to run the function "readDGE" to produce the matrix, I get errors. Here is what I ran:

readDGE(files, path=NULL, columns=c(1,2), group=NULL, labels=NULL, ...)

Error: '...' used in an incorrect context

I tried it without the ellipsis and got this: Error in is.data.frame(files) : object 'files' not found

How can I get this working? Or, is there a better way to create the matrix in another package?

edgeR • 1.4k views
@gordon-smyth
WEHI, Melbourne, Australia

You have to create a vector of file names.

myfilenames <- c("myfile1", "myfile2")
y <- readDGE(files = myfilenames)

Note, of course, that you have to adapt my example to your own files rather than just copying my code.
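A minimal sketch of adapting this to a whole folder, rather than typing each name by hand. The folder, file names, and counts below are made up for illustration, and the files are assumed to be tab-delimited with a header line, theme names in column 1 and counts in column 2 (what readDGE reads by default). Note that the line quoted in the question is the usage signature from the help page, not a call that can be run as is: `files` has to be an object you create first.

```r
# Hypothetical demo data: two small tab-delimited files (theme, count),
# written to a temporary folder so that the sketch is self-contained.
demo_dir <- file.path(tempdir(), "themes_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(theme = c("ECON", "HEALTH"), count = c(5, 2)),
            file.path(demo_dir, "Kenya.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(data.frame(theme = c("ECON", "SPORT"), count = c(1, 7)),
            file.path(demo_dir, "Ghana.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)

# Collect the file names from the folder instead of listing them manually.
myfilenames <- list.files(demo_dir, pattern = "\\.txt$")

# readDGE() needs edgeR; the guard only keeps this sketch runnable without it.
if (requireNamespace("edgeR", quietly = TRUE)) {
  y <- edgeR::readDGE(files = myfilenames, path = demo_dir, columns = c(1, 2))
  print(dim(y$counts))  # themes in rows, one column per file
}
```

For real data, replace demo_dir with the folder that holds your own files and skip the part that writes the demo files.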

Thank you for your help, Gordon!

To output the matrix, I wrote: write.csv(y)
But, I get a warning:
Warning message: In (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : row names were found from a short variable and have been discarded

Update: I entered all of my files (n=38). When I run write.csv(y), I get this error:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 38, 1442

You should get into the habit of understanding the structure and class of your objects. For example, try this:

class(y)
str(y)

It should be a DGEList object, not a data matrix or data frame as you expect, so write.csv() will not know what to do with it. Gordon has already solved the initial issue. To solve any other related issues, I really encourage you to first go through the edgeR workflow, vignette, and User's Guide so that you better understand how to use the package.

PS: the data that you probably want are stored in the counts component, accessible via y$counts; note that these are raw counts and will not be normalised for size factors.
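A minimal sketch of writing out only the counts. So that it runs without edgeR, `y` here is a plain list standing in for the DGEList, with made-up values. The write.csv(y) error above presumably arises because the object also holds a per-file table (38 rows) next to the per-theme counts (1442 rows), and the two cannot be combined into a single data frame.

```r
# Stand-in for the DGEList: a plain list with a 'counts' matrix shaped like
# y$counts (themes in rows, one column per file). The values are made up.
y <- list(counts = matrix(c(1, 0, 7, 5, 2, 0), nrow = 3,
                          dimnames = list(c("ECON", "HEALTH", "SPORT"),
                                          c("Ghana", "Kenya"))))

# Transpose so that each row is a country and each column a theme.
country_by_theme <- t(y$counts)

# Write the plain matrix, not the whole object.
out_file <- file.path(tempdir(), "themes_matrix.csv")
write.csv(country_by_theme, out_file, row.names = TRUE)
```

With the real object from readDGE, the same two steps, t(y$counts) followed by write.csv() on the result, should apply unchanged.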

Thank you for your advice!

Another attempt:

write.csv(y,"C:/Users/james/Desktop/Documents/Downloads/txtfiles/ICTThemesMatrix.csv", row.names = T)

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 38, 1442
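On the original question of whether the matrix can be built another way: a base R sketch that does not use edgeR at all. The helper name build_theme_matrix and the demo files are hypothetical, and it assumes tab-delimited files with a header line, theme names in column 1, counts in column 2, and each theme listed at most once per file.

```r
# Hypothetical helper: read two-column (theme, count) files and combine them
# into a country-by-theme matrix, filling themes absent from a file with 0.
build_theme_matrix <- function(files,
                               labels = tools::file_path_sans_ext(basename(files))) {
  tabs <- lapply(files, read.delim, header = TRUE, stringsAsFactors = FALSE)
  # Union of all theme names seen in any file becomes the set of columns.
  themes <- sort(unique(unlist(lapply(tabs, function(tab) as.character(tab[[1]])))))
  m <- matrix(0, nrow = length(files), ncol = length(themes),
              dimnames = list(labels, themes))
  for (i in seq_along(tabs)) {
    m[i, as.character(tabs[[i]][[1]])] <- tabs[[i]][[2]]
  }
  m
}

# Made-up demo files in a temporary folder.
demo_dir <- file.path(tempdir(), "themes_base_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("theme\tcount", "ECON\t5", "HEALTH\t2"), file.path(demo_dir, "Kenya.txt"))
writeLines(c("theme\tcount", "ECON\t1", "SPORT\t7"), file.path(demo_dir, "Ghana.txt"))

files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
m <- build_theme_matrix(files)
print(m)
```

The result is an ordinary numeric matrix with one row per file, so write.csv(m, "some_file.csv") works on it directly.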
