Question

What causes this DESeq2 error at "fitting model and testing"?

0

Entering edit mode

ap3637 • 0

@ap3637-13814

Last seen 6.0 years ago

I am using DESeq2 on a pretty large dataset. I have mock and infection conditions for several mouse lines at 3 time points. My goal is to contrast each line at each timepoint with it's corresponding mock at the same timepoint. Previously I split the data into smaller subsets by line, but now I'm trying to use the entire data set at once to get the benefits of more data.

I have the following DESeq object:

class: DESeqDataSet
dim: 50686 116
metadata(1): version
assays(1): counts
rownames(50686): ENSMUSG00000099985 ENSMUSG00000030105 ... ENSMUSG00000053774 ENSMUSG00000030058
rowData names(0):
colnames(116): RIX007 RIX008 ... RIX233 RIX234
colData names(20): ID Tissue ... Replicate sizeFactor

My design is:

~ Infection_Line_Time

My plan was to make a composite field, infection_line_time, and then relevel and rerun DESeq(dds) for each condition. For example, to get the results for mock_day1_lineA, I do the following.

dds$Infection_Line_Time <- relevel(dds$Infection_Line_Time, "mock_day1_16012X15119")
dds <- DESeq(dds)

Also the Infection_Line_Time looks like this:
> dds$Infection_Line_Time

  [1] mock_day1_16012X15119  mock_day1_16012X15119  mock_day3_16012X15119  mock_day3_16012X15119  mock_day5_16012X15119  mock_day5_16012X15119 

			  [7] ebola_day1_16012X15119 ebola_day1_16012X15119 ebola_day3_16012X15119 ebola_day3_16012X15119 ebola_day5_16012X15119 ebola_day5_16012X15119

			 [13] mock_day1_16513X16188  mock_day1_16513X16188  mock_day3_16513X16188  mock_day3_16513X16188  mock_day5_16513X16188  mock_day5_16513X16188 

			 [19] mock_day5_16513X16188  ebola_day1_16513X16188 ebola_day3_16513X16188 ebola_day5_16513X16188 ebola_day3_16513X16188 ebola_day5_16513X16188

			 [25] ebola_day5_16513X16188 ebola_day5_16513X16188 mock_day1_8034X8048    mock_day1_8034X8048    mock_day1_8034X8048    mock_day3_8034X8048   

			 [31] mock_day3_8034X8048    mock_day3_8034X8048    mock_day5_8034X8048    mock_day5_8034X8048    mock_day5_8034X8048    ebola_day1_8034X8048  

			 [37] ebola_day1_8034X8048   ebola_day1_8034X8048   ebola_day3_8034X8048   ebola_day3_8034X8048   ebola_day3_8034X8048   ebola_day5_8034X8048  

			 [43] ebola_day5_8034X8048   ebola_day5_8034X8048   ebola_day1_16072X3393  ebola_day1_16072X3393  ebola_day1_16072X3393  ebola_day3_16072X3393 

			 [49] ebola_day3_16072X3393  ebola_day3_16072X3393  ebola_day5_16072X3393  ebola_day5_16072X3393  ebola_day5_16072X3393  ebola_day5_16072X3393 

			 [55] mock_day1_16072X3393   mock_day1_16072X3393   mock_day1_16072X3393   mock_day3_16072X3393   mock_day3_16072X3393   mock_day3_16072X3393  

			 [61] mock_day5_16072X3393   mock_day5_16072X3393   mock_day1_CC041        mock_day3_CC041        mock_day3_CC041        mock_day5_CC041       

			 [67] mock_day5_CC041        mock_day5_CC041        mock_day5_CC041        ebola_day1_CC041       ebola_day1_CC041       ebola_day1_CC041      

			 [73] ebola_day3_CC041       ebola_day3_CC041       ebola_day3_CC041       mock_day1_CC041        mock_day1_CC041        mock_day1_CC041       

			 [79] mock_day3_CC041        mock_day3_CC041        mock_day3_CC041        mock_day5_CC041        mock_day5_CC041        mock_day5_CC041       

			 [85] ebola_day1_CC041       ebola_day1_CC041       ebola_day1_CC041       ebola_day1_CC041       ebola_day1_CC041       ebola_day3_CC041      

			 [91] ebola_day3_CC041       ebola_day3_CC041       ebola_day3_CC041       ebola_day3_CC041       ebola_day5_CC041       ebola_day5_CC041      

			 [97] ebola_day5_CC041       ebola_day5_CC041       ebola_day5_CC041       mock_day1_CC011        mock_day3_CC011        mock_day3_CC011       

			[103] mock_day5_CC011        mock_day5_CC011        ebola_day1_CC011       ebola_day1_CC011       ebola_day3_CC011       ebola_day3_CC011      

			[109] ebola_day5_CC011       ebola_day5_CC011       ebola_day1_CC011       ebola_day3_CC011       ebola_day5_CC011       ebola_day1_CC011      

			[115] ebola_day3_CC011       ebola_day5_CC011      

			36 Levels: mock_day1_16012X15119 ebola_day1_16012X15119 ebola_day1_16072X3393 ebola_day1_16513X16188 ebola_day1_8034X8048 ... mock_day5_CC041

When I attempt to run the command,

dds <- DESeq(dds)

it looks normal until the "fitting model and testing" phase, then throws the following error:

error: matrix multiplication: incompatible matrix dimensions: 116x36 and 0x1

			Error in fitBeta(ySEXP = ySEXP, xSEXP = xSEXP, nfSEXP = nfSEXP, alpha_hatSEXP = alpha_hatSEXP,  : 

			  matrix multiplication: incompatible matrix dimensions: 116x36 and 0x1

I've never seen this before and I can't find anyone else who has encountered this problem online. I'm actively working on resolving it, but could use some help if anyone has any input or advice.

My theories are as of now that it's a memory issue, or that when I'm comparing different lines to mock_day1_lineA, that it isn't happy for some reason. Like, for instance, ebola_day3_lineC vs mock_day1_lineA will be one possibility in the resultsNames(dds). It's a comparison that makes no sense and I won't use, but it's still being calculated for every permutation. I'm still not sure how to handle it. Is there a better design or a way to do this without resorting to cutting my data into separate data sets?

Thanks in advance for any insight you can provide!

deseq2 differential expression error • 4.4k views

ADD COMMENT • link updated 22 months ago by snow34 • 0 • written 7.6 years ago by ap3637 • 0

score 0 · Answer 1 · 2017-09-29

0

Entering edit mode

Michael Love 43k

@mikelove

Last seen 2 days ago

United States

Are you using the latest version (1.16)?

ADD COMMENT • link 7.6 years ago Michael Love 43k

0

Entering edit mode

Yes, using version 1.16.

ADD REPLY • link 7.6 years ago ap3637 • 0

0

Entering edit mode

I can't reproduce this error so I'm not sure what to say, e.g. 36 levels, 3 replicates each works fine with simulated data, this takes 30 seconds to fit per 500 genes:

dds <- makeExampleDESeqDataSet(n=500,m=108)
dds$condition <- factor(rep(1:36,each=3))
dds <- DESeq(dds)

Can you print your sessionInfo() so I can see if maybe there are some package incompatibilities? You can also run biocValid() from the BiocInstaller package to see if you are up to date on all packages.

By the way, you shouldn't be releveling and rerunning DESeq() for comparing groups, you only need to use results() with contrast specifying the two groups to compare.

e.g. for the above example, you only run DESeq() once, then:

results(dds, contrast=c("condition","36","35"))

ADD REPLY • link 7.6 years ago Michael Love 43k

0

Entering edit mode

Thanks for your reply. The output you requested is too many characters to post here directly, so here is a link to that information.

https://justpaste.it/1c00t

Thanks very much for your help, the point about the releveling is also very useful. I'm going to try to update some packages that are older (though they seem unrelated at first glance). If it doesn't work after that I'll try running it on a high memory machine. Currently the rld(dds) step is taking several hours because of the data size. Thanks again!

ADD REPLY • link 7.6 years ago ap3637 • 0

0

Entering edit mode

if that's rlog(), I highly recommend using vst() instead. It's much faster for large datasets, and this is what I tend to use now for speed.

ADD REPLY • link 7.6 years ago Michael Love 43k

0

Entering edit mode

Update:

It seems like it's getting past this phase using VST and updated R packages. At "estimating dispersions" now, so hopefully everything will go through. I did receive the following warning, but maybe it's okay.

27 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
-- replacing outliers and refitting for 23973 genes
-- DESeq argument 'minReplicatesForReplace' = 7 
-- original counts are preserved in counts(dds)

ADD REPLY • link 7.6 years ago ap3637 • 0

0

Entering edit mode

This is ok, it's just that the increases to the likelihood are accumulating slowly. Typically the beta isn't changing much but still producing small changes to likelihood.

ADD REPLY • link 7.6 years ago Michael Love 43k

0

Entering edit mode

Ok great, that's good to know. So assuming the "estimating dispersions" goes through, for future reference for anyone who encounters this problem, try VST instead of rlog, and update your R packages.

My guess is that the rlog might have failed somehow in the larger data set, if I had to guess what the fix was I would say varianceStabilizingTransformation(dds).

Thanks for your help Michael!

ADD REPLY • link 7.6 years ago ap3637 • 0

0

Entering edit mode

Hi, Michael I would like to request how to use vst() functions before estimateDispersions step. When the vst() function or varianceStabilizingTransformation() was used, it return a DESeqTransform. I mention it that DESeqTransform can not been used to difference expression. Thanks very much!

ADD REPLY • link 5.7 years ago liushang • 0

0

Entering edit mode

I don't see how this is related to the thread you are adding to.

A few points:

vst() requires dispersion estimation. Either it has already been run and will be used by the function (if blind=FALSE), or it will run dispersion estimation internally. You can run vst() on any DESeqDataSet. Please see the examples in the documentation/vignette/workflow.

ADD REPLY • link 5.7 years ago Michael Love 43k

0

Entering edit mode

I also meet the same error:

error: matrix multiplication: incompatible matrix dimensions: 1176x590 and 0x1 Error in fitBeta(ySEXP = ySEXP, xSEXP = xSEXP, nfSEXP = nfSEXP, alphahatSEXP = alphahatSEXP, : matrix multiplication: incompatible matrix dimensions: 1176x590 and 0x1

I mention it that vst() can only be used for visualization and not used to DE, So I would like to know how to use vst() for DE

ADD REPLY • link 5.7 years ago liushang • 0

0

Entering edit mode

You shouldn't use vst() for DE. Why don't you try following our workflow?

https://www.bioconductor.org/packages/release/workflows/html/rnaseqGene.html

And if you encounter problems, please make a new post with your entire code and the error that occurred if you encounter an error.

ADD REPLY • link 5.7 years ago Michael Love 43k

score 0 · Answer 2 · 2023-06-13

0

Entering edit mode

snow34 • 0

@2b2fe8ea

Last seen 21 months ago

China

well well! almost 4 years later I meet the same question. After a series test, the possible reason seems like I used too many samples....If I use 441 samples, DESeq2 gives me this 'error in fitbeta(ysexp = ysexp, xsexp = xsexp, nfsexp = nfsexp, alpha_hatsexp = alpha_hatsexp, : matrix multiplication: incompatible matrix dimensions: 441x2 and 0x1' If I cut it to 10 samples, I get the diff gene list smoothly!

ADD COMMENT • link 22 months ago snow34 • 0

0

Entering edit mode

Do you want to make a new post with details about your analysis, e.g. what is design(dds), colData(dds), etc?

ADD REPLY • link 22 months ago Michael Love 43k

0

Entering edit mode

The package version is DESeq2_1.40.1

The colData(dds) is like:

colData(dds) DataFrame with 441 rows and 1 column condition <factor> tcga.ee.a2gi 1 tcga.ee.a3je 1 tcga.ee.a29l 1 tcga.d3.a5gn 1 tcga.d3.a3cb 1 ... ... tcga.bf.a5ep 0 tcga.eb.a6qz 0 tcga.eb.a3xf 0 tcga.d9.a3z3 0 tcga.bf.a3dl 0

The dds object is like:

dds class: DESeqDataSet dim: 579 441 metadata(1): version assays(1): counts rownames(579): hsa-let-7a-1 hsa-let-7a-2 ... hsa-mir-99a hsa-mir-99b rowData names(0): colnames(441): tcga.ee.a2gi tcga.ee.a3je ... tcga.d9.a3z3 tcga.bf.a3dl colData names(1): condition

The error info is like:

DESeq(dds) estimating size factors estimating dispersions gene-wise dispersion estimates mean-dispersion relationship -- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time. final dispersion estimates fitting model and testing

warning: solve(): system is singular; attempting approx solution Error in fitBeta(ySEXP = ySEXP, xSEXP = xSEXP, nfSEXP = nfSEXP, alpha_hatSEXP = alpha_hatSEXP, : matrix multiplication: incompatible matrix dimensions: 441x2 and 0x1

Hope those information helped !

ADD REPLY • link 22 months ago snow34 • 0

0

Entering edit mode

Sorry I can't tell from the information you've posted anything about the experimental design.

ADD REPLY • link 22 months ago Michael Love 43k

0

Entering edit mode

Hope the following info helped/

I use this command get dds :

dds <- DESeqDataSetFromMatrix(mycounts, colData, design= ~ condition)

The object of conditon is :

[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [149] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [186] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [223] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [260] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [297] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 [334] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [371] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [408] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Levels: 0 1

And also condition is the only column of colData :

colData$condition

[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [149] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [186] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [223] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [260] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [297] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 [334] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [371] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [408] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Levels: 0 1

The row names of colData is the sample names :

row.names(colData)

[1] "tcga.ee.a2gi" "tcga.ee.a3je" "tcga.ee.a29l" "tcga.d3.a5gn" [5] "tcga.d3.a3cb" "tcga.er.a2nb" "tcga.ee.a3jd" "tcga.fs.a1zq" [9] "tcga.ee.a2gn" "tcga.ee.a2mm" "tcga.d3.a5gs" "tcga.er.a2nc" [13] "tcga.w3.a828" "tcga.d3.a3c6" "tcga.gn.a268" "tcga.ee.a20c" [17] "tcga.er.a19j" "tcga.d3.a2jk" "tcga.er.a2nf_1" "tcga.er.a2nf_2" [21] "tcga.er.a2nf_3" "tcga.er.a2nf_4" "tcga.yd.a9ta" "tcga.da.a1i7" [25] "tcga.ee.a3j5" "tcga.da.a3f5" "tcga.qb.a6fs" "tcga.ee.a2a5" [29] "tcga.da.a1ib" "tcga.we.a8k1" "tcga.ee.a29g" "tcga.ee.a2gj" [33] "tcga.ee.a29b" "tcga.gn.a4u3" "tcga.ee.a2gh" "tcga.d3.a2jd"

...

[421] "tcga.eb.a51b" "tcga.eb.a97m" "tcga.ee.a3ad" "tcga.ee.a3ab" [425] "tcga.er.a2ne" "tcga.eb.a4iq" "tcga.eb.a24c" "tcga.fs.a1yw" [429] "tcga.bf.a1q0" "tcga.bf.a1pz" "tcga.bf.a5eq" "tcga.d9.a3z1" [433] "tcga.eb.a85i" "tcga.d3.a1q3" "tcga.bf.aap1" "tcga.eb.a6qy" [437] "tcga.bf.a5ep" "tcga.eb.a6qz" "tcga.eb.a3xf" "tcga.d9.a3z3" [441] "tcga.bf.a3dl"

ADD REPLY • link 22 months ago snow34 • 0