Question

Help with DESeq2 design


sxp585 • 0

@sxp585-22100


Hi,

I would like to ask for help with the design component of my DESeq2 RNA-seq analysis. My data set consists of 6 samples: 3 treated samples (gene overexpression) and 3 untreated samples (control_vector). The samples are paired: each biological replicate is a single plant leaf cut in half, with the two halves exposed to the two different treatments.

Another confounding factor in this sequencing experiment seems to be a batch effect. Bio rep #1 comes from a first round of experiments and was sequenced at a depth of ~12.5 million reads/sample (this is 3' seq). Bio reps #2 and #3 come from an experiment conducted on a separate day, with the sequencing also done later, at a depth of ~6 million reads/sample.

When I look at the PCA of my normalized reads, my data is first separated by leaf. That is, the bio rep #1 samples (treated and untreated) pair together, the bio rep #2 samples pair together, and the bio rep #3 samples pair together. The data separates on PC2 according to batch: bio rep #1 is separated from bio reps #2 and #3 along this axis. I'm unsure what causes the separation on PC3, and I don't see the effect of my treatment (gene overexpression) until PC4.

Below are my coldata and the current design I'm using for my DE analysis. When analyzing all the data together, I get only 4 genes that are significantly DE, including my OE gene. If I analyze just batch two, I get ~40 DE genes, but many genes, including my overexpressed gene, now have p-adj values of NA. Any advice is appreciated.

Coldata:

SampleName  LeafNo  Treatment  Batch
treated_1   one     OE         one
treated_2   two     OE         two
treated_3   three   OE         two
control_1   one     control    one
control_2   two     control    two
control_3   three   control    two

Current design: design = ~LeafNo + Treatment

deseq2 design multifactor • 1.1k views

ADD COMMENT • link updated 5.2 years ago by Michael Love 43k • written 5.2 years ago by sxp585 • 0

score 0 · Answer 1 · 2020-01-24


Michael Love 43k

@mikelove


There's not much to do here other than use the samples you generated and the design at the bottom of your post regardless of the DE results.

ADD COMMENT • link 5.2 years ago Michael Love 43k


Hi Michael,

Thanks for taking the time to look at my question. I was just reading back through it and realized I didn't explicitly ask this (though maybe you got the gist): my formula currently controls for the LeafNo effect only. Can the design be written so that both 'LeafNo' and 'Batch' are controlled for, to reduce the background noise and see a greater effect of the treatment? Maybe the answer is no!?!

Thank you again for your time.

ADD REPLY • link 5.2 years ago sxp585 • 0


Leaf number controls for batch if the leaves are nested within batch.
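The nesting point can be checked with a small linear-algebra sketch. The Python below is not DESeq2 code and is only an illustration: it hand-builds the indicator columns that a design like ~LeafNo + Treatment would correspond to for the six samples in the question, then appends a Batch indicator and compares matrix ranks. The helper names (design_row, matrix_rank) and the choice of "one" and "control" as reference levels are assumptions made for the example.

```python
from fractions import Fraction

# Sample table from the question: (LeafNo, Treatment, Batch)
samples = [
    ("one", "OE", "one"),
    ("two", "OE", "two"),
    ("three", "OE", "two"),
    ("one", "control", "one"),
    ("two", "control", "two"),
    ("three", "control", "two"),
]

def design_row(leaf, treatment, batch, with_batch):
    """Indicator coding: intercept, leaf two, leaf three, treatment OE, [batch two]."""
    row = [1, int(leaf == "two"), int(leaf == "three"), int(treatment == "OE")]
    if with_batch:
        row.append(int(batch == "two"))
    return row

def matrix_rank(rows):
    """Rank via exact Gauss-Jordan elimination on fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row.
        for i in range(len(m)):
            if i != rank and m[i][col] != 0:
                factor = m[i][col] / m[rank][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

leaf_design = [design_row(*s, with_batch=False) for s in samples]
leaf_batch_design = [design_row(*s, with_batch=True) for s in samples]

rank_leaf = matrix_rank(leaf_design)
rank_leaf_batch = matrix_rank(leaf_batch_design)

print("LeafNo + Treatment:         rank", rank_leaf, "of", len(leaf_design[0]), "columns")
print("LeafNo + Batch + Treatment: rank", rank_leaf_batch, "of", len(leaf_batch_design[0]), "columns")
```

Under this coding, the Batch column is exactly the sum of the "leaf two" and "leaf three" columns, so adding it gives a fifth column without raising the rank above 4. In other words, the leaf terms already absorb any batch difference, and a design containing both would not be full rank.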

ADD REPLY • link 5.2 years ago Michael Love 43k


Question: how good a job can you do modeling LeafNo with only 2 samples of each leaf? Is it really fruitful to try to fit two factors with only 6 total samples?

ADD REPLY • link 5.2 years ago swbarnes2 ★ 1.4k