Number of rows in DESeq2 output (.csv) is not the same as number of rows in the results(dds) dataframe
Jamie • 0

Hi all,

I am new to DESeq2, but I ran this analysis together with a colleague who has done some transcriptomics analyses before. However, neither of us could figure out why results() reports "DataFrame with 14235 rows and 6 columns" while our .csv file (imported into Excel) shows 31415 rows and 6+1 columns (the extra column is obviously the gene names, which are now written out as a column).

Can anyone tell us why we have so many more rows suddenly? The code we used is below.

# read in counts table with gene names as rownames
counts <- read.table("mt_mapped_paired_readcounts.tsv.txt", sep = "\t", header = FALSE, row.names = 1)
# filter out rows with only zeros
counts.nozero <- counts[rowSums(counts) != 0, ]
dim(counts.nozero)
# removing the last row, which contains NA
counts.nozero.nona <- counts.nozero[1:14235, ]

# file to explain which column is which
columndata <- read.table("columndata", sep = ",", header = TRUE)

library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = counts.nozero.nona,
                              colData = columndata,
                              design = ~ cables)
dds <- DESeq(dds)

res <- results(dds, name = "cables_yes_cables_vs_no_cables")
res

#res output
log2 fold change (MLE): cables yes cables vs no cables 
Wald test p-value: cables yes cables vs no cables 
DataFrame with 14235 rows and 6 columns

write.csv(as.data.frame(res), file="deseq2results.csv")

Are you sure you checked the right file? There is no reason for as.data.frame(res) to add rows. Please check by reading the file back into R: library(readr); deseq2results <- read_csv("deseq2results.csv"); dim(deseq2results)
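
A minimal sketch of that round-trip check, using only base R in case readr is not installed. The small data frame here is a made-up stand-in for as.data.frame(res), and the awkward gene name is invented for illustration; with real data you would compare against nrow(res) instead.

```r
# Toy stand-in for as.data.frame(res); the values are made up.
res_df <- data.frame(
  baseMean = c(10.5, 3.2, 88.1),
  padj     = c(0.01, 0.50, NA),
  row.names = c("geneA", "gene B, 5' end", "geneC")
)

# Write to a temporary file the same way the question does.
out_file <- tempfile(fileext = ".csv")
write.csv(res_df, file = out_file)

# Read the file back; row.names = 1 turns the first column into row names again.
check <- read.csv(out_file, row.names = 1, check.names = FALSE)

# If these agree, the extra rows are being introduced outside of R
# (for example by the spreadsheet import), not by write.csv().
print(dim(res_df))
print(dim(check))
rows_match  <- identical(dim(res_df), dim(check))
names_match <- identical(rownames(res_df), rownames(check))
```

If the dimensions agree in R but not in the spreadsheet, the import settings (delimiter, text qualifier) are the next thing to look at.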

Jamie • 0

I think I found the problem: DESeq2 or R could not deal with 5' and 3' or with commas in the gene names. After converting these characters to '_' or letters, the number of rows in our .csv file matches the number of rows from results(dds).
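
A hedged sketch of one way to guard against this. It assumes the mechanism is quote handling: read.table() treats both " and ' as quote characters by default, so an apostrophe in a gene name such as 5' can be read as the start of a quoted string. That is an assumption and was not confirmed in this thread. The file contents and gene names below are invented for illustration.

```r
# Invented example file: tab-separated counts with awkward gene names.
tsv <- tempfile(fileext = ".tsv")
writeLines(c(
  "gene1\t5\t0",
  "gene2 5' region\t3\t7",
  "gene3, partial\t1\t2",
  "gene4 3' region\t8\t4",
  "gene5\t6\t9"
), tsv)

# Default settings: quote = "\"'" and comment.char = "#".
# Wrapped in tryCatch because the result depends on how the quotes pair up.
default_rows <- tryCatch(
  nrow(read.table(tsv, sep = "\t", header = FALSE, row.names = 1)),
  error = function(e) NA_integer_
)

# Quoting and comment handling switched off: every physical line is one row.
safe <- read.table(tsv, sep = "\t", header = FALSE, row.names = 1,
                   quote = "", comment.char = "")

print(default_rows)  # may differ from the number of lines in the file
print(nrow(safe))

# Replace apostrophes and commas in the gene names, as described above.
clean_names <- gsub("[',]", "_", rownames(safe))
rownames(safe) <- clean_names
```

Comparing nrow(safe) with the line count of the input file (for example length(readLines(tsv))) is a quick way to confirm that no rows were merged or dropped on import.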

Sorry for bothering people with such a basic problem! I will leave the post so others can find how to fix it if they made the same mistake.
