Question

Issue with DESeqDataSetFromMatrix


matthew.sinton • 0

@matthewsinton-18766

Last seen 6.3 years ago

Hi All,

I'm very new to using R, and I'm trying to use it to perform an analysis of RNA-seq data, to look at differential expression. I've done a few online tutorials to try and build up some background, but I'm hitting a roadblock.

Basically, when I run my script, I get the following error:

Error in DESeqDataSetFromMatrix(countData = countData, colData = colData, : ncol(countData) == nrow(colData) is not TRUE

I realise that this error means that my colData and countData don't quite match up. I performed the following, which showed that I have 7 columns in my countData and 6 rows in my colData file:

dim(colData)
dim(countData)

However, I cannot see how to alter the colData file to make it match up, or how to get my countData file to skip the first column containing gene IDs, so that it only counts 6 columns.

My countData.csv file looks like this:

GeneID             Control1    Control2    Control3    LPO1    LPO2    LPO3
ENSG01254216542    1.1         1.3         1.14        7.0     7.5     7.2

The colData.csv file looks like this:

            Condition    type
Control1    untreated    paired-read
Control2    untreated    paired-read
Control3    untreated    paired-read
LPO1        treated      paired-read
LPO2        treated      paired-read
LPO3        treated      paired-read

The script that I'm using is:

library(DESeq2)

data <-read.csv("//csce.datastore.ed.ac.uk//Control1.output2.csv") 

se <- data

countData <- as.matrix(se,row.names="Geneid", header = TRUE, sep = '\t', row.names = 1)
colData <- read.csv("//csce.datastore.ed.ac.uk//csce//biology//users//s0348375//Win7//Desktop//colData2.csv", row.names=1)

colData <- colData[,c("condition","type")]
colnames(countData) <- NULL
dds <- DESeqDataSetFromMatrix(countData = countData,
                              colData = colData,
                              design = ~ condition)
dds <- dds[ rowSums(counts(dds)) > 1, ]
dds<-DESeq(dds)

I was wondering if anyone may be able to help with a solution? I'm sure that it's something very obvious, but any help would be very much appreciated, as I don't have access to bioinformatic support.

Thanks,

Matthew

deseq2 deseqdatasetfrommatrix rnaseq affy • 3.5k views

ADD COMMENT • link updated 6.3 years ago by Michael Love 43k • written 6.3 years ago by matthew.sinton • 0

score 0 · Answer 1 · 2018-12-10


Michael Love 43k

@mikelove

Last seen 3 days ago

United States

Is this Affymetrix gene expression data? If so, you should not use an RNA-seq method. You should use limma's methods for microarray data.

ADD COMMENT • link 6.3 years ago Michael Love 43k


Hi,

No, this is from an RNA-seq analysis. I aligned my reads using STAR, and then used featureCounts to generate my counts.

Thanks

ADD REPLY • link 6.3 years ago matthew.sinton • 0


OK, I was thrown off by the non-integer counts and the "affy" tag. There are some mistakes in the as.matrix line: arguments such as header and sep belong to read.csv, and they don't work with as.matrix. If you want to split off the first column of data, you should instead do something like:

cts <- as.matrix(data[,-1])
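
As a rough sketch of how that suggestion fits together with the dimension check from the error message, the example below builds a small stand-in for the two CSV files and verifies that the count columns line up with the sample rows. The gene names, count values and the lowercase `condition` column name are made up for illustration and are not taken from the original files; the counts are integer-valued because DESeq2 expects raw integer counts.

```r
# Hypothetical stand-in for the count table read with read.csv():
# the first column holds gene IDs, the other six hold per-sample counts.
data <- data.frame(
  GeneID   = c("geneA", "geneB", "geneC"),
  Control1 = c(10, 0, 25),
  Control2 = c(12, 1, 30),
  Control3 = c(9, 0, 27),
  LPO1     = c(70, 2, 5),
  LPO2     = c(75, 0, 4),
  LPO3     = c(72, 1, 6)
)

# Hypothetical sample table: one row per count column, in the same order.
colData <- data.frame(
  condition = rep(c("untreated", "treated"), each = 3),
  type      = "paired-read",
  row.names = c("Control1", "Control2", "Control3", "LPO1", "LPO2", "LPO3")
)

# Drop the gene ID column and keep the IDs as row names instead.
cts <- as.matrix(data[, -1])
rownames(cts) <- data$GeneID

# These are the conditions DESeqDataSetFromMatrix relies on:
# same number of samples, and the same sample order in both objects.
stopifnot(ncol(cts) == nrow(colData))
stopifnot(all(colnames(cts) == rownames(colData)))

# Only attempt the DESeq2 step if the package is installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = cts,
                                        colData = colData,
                                        design = ~ condition)
}
```

With the gene ID column removed, `dim(cts)` reports 6 columns, matching the 6 rows of `colData`. Keeping the sample names as column names of the count matrix also makes it possible to check the ordering against `rownames(colData)`.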

ADD REPLY • link 6.3 years ago Michael Love 43k