Question

processing samples with significantly different counts

Anastasiia • 0

@2cc8145c

Germany

Hi all,

I am working with RNA-seq data from mouse cecum samples colonized with an artificial community, but I am interested in only one species, whose abundance fluctuates randomly between samples. Not surprisingly, after Salmon I got very different pseudo-counts across libraries. I want to perform DGE, but I am not sure whether DESeq2 can handle such variation. I cannot check the intermediate results with boxplots or similar, because DESeq2 fits its internal model on raw counts; visualizations with VST look OK but don't answer my question.

I've read that DESeq2 internally accounts for library size. Is that enough in my case, or do I need to additionally "normalize" my samples somehow?

Thank you


# Per-sample median of the estimated counts in txi$counts
df1 <- as.data.frame(txi$counts)
apply(df1, 2, median)

lane1014MZI000248  lane107MZI000244  lane101MZI000243  lane108MZI000247 lane1013MZI000245  lane102MZI000246  lane106MZI000235 
              117               177               414               399               202               323               806 
lane1010MZI000213 lane1011MZI000189 lane1019MZI000237  lane103MZI000204 lane1016MZI000211  lane104MZI000212 lane1012MZI000236 
              497               469              1075               161               205               376              1305 
 lane105MZI000188 lane1017MZI000214 lane1020MZI000238 lane1015MZI000210  lane109MZI000209 lane1018MZI000234 
               95               350               967               713               931               787 

# Per-sample maximum of the estimated counts
apply(df1, 2, max)

lane1014MZI000248  lane107MZI000244  lane101MZI000243  lane108MZI000247 lane1013MZI000245  lane102MZI000246  lane106MZI000235 
           201168            565021           1195385            689544            432225            829544           1863546 
lane1010MZI000213 lane1011MZI000189 lane1019MZI000237  lane103MZI000204 lane1016MZI000211  lane104MZI000212 lane1012MZI000236 
          2438458           1090979           1967173           1678215            805961           1302926           1875980 
 lane105MZI000188 lane1017MZI000214 lane1020MZI000238 lane1015MZI000210  lane109MZI000209 lane1018MZI000234 
           588991           2805928           2939780           2019798           2668822           3088835

DESeq2 Normalization • 665 views

updated 2.1 years ago by ATpoint ★ 4.8k • written 2.1 years ago by Anastasiia • 0

score 0 · Answer 1 · 2023-03-16

ATpoint ★ 4.8k

@atpoint-13662

Germany

That's not what I'd call "significantly" different. Do the usual QC such as PCA (see vignette) and color the plot by sequencing depth. Check if depth is a major driver of variation in the early PCs. If that is not the case you're fine.
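
The check described above could look roughly like the following. This is only a sketch under stated assumptions: the dataset is simulated with `makeExampleDESeqDataSet()` (with deliberately unequal size factors) as a stand-in for a real `dds` built from the Salmon/tximport output, "sequencing depth" is taken to be the column sum of the raw counts, and the variable names (`sf`, `depth`, `log10_depth`, `depth_cor`) are made up for illustration.

```r
library(DESeq2)
library(ggplot2)

set.seed(1)

# ASSUMPTION: simulated stand-in data; replace `dds` with your own DESeqDataSet.
# 12 samples whose depth differs by up to ~10-fold.
sf  <- rep(c(0.3, 1, 3), times = 4)
dds <- makeExampleDESeqDataSet(n = 3000, m = 12, sizeFactors = sf)

# Sequencing depth per sample, here defined as the total raw count
depth <- colSums(counts(dds))

# Variance-stabilized values for QC; blind = TRUE ignores the design
vsd <- vst(dds, blind = TRUE)

# Get the PCA coordinates instead of the default plot
pca <- plotPCA(vsd, intgroup = "condition", returnData = TRUE)
pca$log10_depth <- log10(depth[as.character(pca$name)])

# How strongly do the first two PCs track depth?
depth_cor <- c(PC1 = cor(pca$PC1, pca$log10_depth),
               PC2 = cor(pca$PC2, pca$log10_depth))
print(depth_cor)

# PCA colored by depth; a smooth color gradient along PC1 or PC2
# would suggest depth is driving the variation
p <- ggplot(pca, aes(PC1, PC2, color = log10_depth, shape = condition)) +
  geom_point(size = 3)
print(p)
```

If the samples do not sort by color along PC1 or PC2 and the correlations are weak, depth is probably not a major driver and the default size-factor normalization should be adequate. There is no fixed cutoff for "weak" here; it is a judgment call made alongside the plot.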