Why some statistically differently expressed genes are not DEGs?
1
0
Entering edit mode
Wuschel ▴ 10
@wuschel-15944
Last seen 21 months ago
HUJI

In my transcriptomics analysis, there are some genes that I am interested are significantly different, but not differentially expressed. I am in confusion if I really understand the concept behind what is exactly DEGs means now In my DEG pipeline parameters were;

pval_adj_method <- 'BH'
pval_cut <- 0.01
l2fc_cut <- 1

I have 3 time course - 0/12/24 hrs 4 genotypes - WT, 3 mutants (aox1/2/3) I compared each mutants to it's WT values at given time point e.g.

aox1_0h-WT_0h
aox2_0h-WT_0h

aox1_12h-WT_12h

etc.

From this output, only a few combinations showed they are DEGs

**Gene1_** aox2_0h.WT_0h    down-regulated
**Gene2-** aox3_0h.WT_0h    up-regulated
**Gene6_** aox2_12h.WT_12h  down-regulated

However, once I plotted original values, I see other mutants also statistically different compared to their WT

What does it mean?

Does DEG (pval_cut = 0.01 l2fc_cut = 1) mean at a given time point comparisions'sor does it compare across all the time course?

For example, inf Gene 2, if aox3 considered as DEG why not the other 2 mutants? This is what confusing to me Would be grateful if someone could explain this kindly.

NB: these count data (dput out put) are not normalized. This file was written before Step 1: Merge sequencing replicates of the Data analysis pipeline.

Data

structure(list(AGI = c("Gene_1", "Gene_2", "Gene_3", "Gene_4", 
"Gene_5", "Gene_6"), aox1_0h_1.tsv = c(12.19671398, 215.2944799, 
16.44462648, 494.4740158, 231.5254774, 72.57597906), aox1_0h_4.tsv = c(20.483339, 
309.7008485, 24.97567468, 692.9926789, 363.0050461, 123.5888977
), aox1_0h_5.tsv = c(12.02356715, 99.9827504, 11.36592332, 276.0989805, 
142.2411187, 58.85580698), aox1_12h_4.tsv = c(15.40599103, 72.92196194, 
19.36412892, 376.7248056, 269.677545, 99.21985896), aox1_12h_5.tsv = c(14.1934073, 
79.71249677, 32.45167316, 356.2905953, 196.2274429, 89.37924571
), aox1_12h_6.tsv = c(24.80343931, 95.02515421, 34.05358939, 
440.896732, 346.8153177, 127.4466926), aox1_24h_4.tsv = c(6.184604587, 
196.0341138, 25.18641024, 258.9801928, 223.9545066, 33.84484312
), aox1_24h_6.tsv = c(6.883819431, 98.05429052, 2.021393634, 
112.1958862, 139.6212486, 28.75295204), aox2_0h_3.tsv = c(25.53690512, 
175.1041763, 28.87826692, 459.5438353, 433.4289707, 137.3774675
), aox2_0h_4.tsv = c(10.18876629, 280.2141447, 32.48869835, 624.7432893, 
323.5662966, 87.89508861), aox2_0h_5.tsv = c(12.34945421, 312.5272061, 
28.36719904, 848.7432271, 469.8874526, 158.9474865), aox2_12h_1.tsv = c(19.08287299, 
137.9990764, 29.84418565, 377.4153307, 314.9721922, 95.42395784
), aox2_12h_2.tsv = c(7.890962701, 74.2399241, 15.29254776, 236.4643073, 
135.7776899, 26.49812298), aox2_12h_3.tsv = c(3.678221008, 62.60983209, 
11.67625132, 200.3733542, 100.7079901, 31.6012183), aox2_24h_1.tsv = c(11.2162246, 
149.1030715, 30.70364462, 321.8824458, 234.0832479, 67.24515005
), aox2_24h_4.tsv = c(13.99519719, 902.6355571, 41.48997933, 
501.303002, 772.8283795, 350.2515435), aox2_24h_5.tsv = c(7.867480669, 
361.1331651, 14.63979526, 218.9941277, 330.2108122, 147.5707818
), aox3_0h_1.tsv = c(14.15880435, 381.2255777, 40.53415811, 456.0047815, 
317.8303883, 159.6022776), aox3_0h_2.tsv = c(16.7311523, 291.2105347, 
27.30607326, 339.5042008, 179.0730805, 61.18757328), aox3_0h_3.tsv = c(19.09795927, 
293.2638492, 25.89886385, 398.8993785, 185.8289419, 70.22345994
), aox3_12h_1.tsv = c(81.20410372, 541.3370048, 85.56280843, 
974.0674911, 836.3820935, 353.6488805), aox3_12h_2.tsv = c(29.20942766, 
248.4582068, 51.52438237, 486.5817019, 390.8069314, 151.3860219
), aox3_12h_3.tsv = c(46.13590166, 273.2940355, 63.7297075, 503.9319503, 
395.8325311, 157.8115329), aox3_24h_1.tsv = c(34.29811675, 386.5204438, 
72.86154515, 693.3497618, 570.2158898, 194.3375745), aox3_24h_2.tsv = c(6.882795111, 
92.6003284, 26.22967721, 228.5226723, 120.5788092, 49.8811201
), aox3_24h_3.tsv = c(18.1408733, 196.4513931, 40.71817893, 375.4835311, 
334.5830825, 97.88584005), WT_0h_1.tsv = c(18.68220076, 114.5750048, 
9.829645971, 220.5843393, 174.4842611, 94.2738265), WT_0h_2.tsv = c(13.60937828, 
73.97033656, 17.038227, 214.8653415, 155.5073735, 100.5135226
), WT_0h_3.tsv = c(16.96217952, 76.57929551, 16.07674964, 141.6508369, 
145.1300591, 73.15117509), WT_12h_2.tsv = c(5.05511416, 11.33088644, 
4.189101647, 44.84100697, 44.55982506, 18.2843175), WT_12h_3.tsv = c(0.990461611, 
0.949660867, 1.699219964, 15.91825777, 15.35309945, 9.38446604
), WT_12h_5.tsv = c(5.933614952, 46.86761146, 10.29591467, 122.7757985, 
69.69893277, 56.94217399), WT_24h_1.tsv = c(4.029020182, 31.21885208, 
10.11987433, 100.5907921, 76.64674385, 37.26256017), WT_24h_4.tsv = c(3.054100241, 
102.1107272, 18.4440106, 121.8360378, 154.6688737, 44.27986257
), WT_24h_5.tsv = c(8.133354312, 97.5821868, 10.70295122, 176.7748047, 
164.02099, 24.53833578)), class = c("spec_tbl_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -6L), spec = structure(list(
    cols = list(AGI = structure(list(), class = c("collector_character", 
    "collector")), aox1_0h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox1_0h_4.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox1_0h_5.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox1_12h_4.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox1_12h_5.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox1_12h_6.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox1_24h_4.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox1_24h_6.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_0h_3.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_0h_4.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_0h_5.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_12h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_12h_2.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_12h_3.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_24h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_24h_4.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox2_24h_5.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_0h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_0h_2.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_0h_3.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_12h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_12h_2.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_12h_3.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_24h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_24h_2.tsv = structure(list(), class = c("collector_double", 
    "collector")), aox3_24h_3.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_0h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_0h_2.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_0h_3.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_12h_2.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_12h_3.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_12h_5.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_24h_1.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_24h_4.tsv = structure(list(), class = c("collector_double", 
    "collector")), WT_24h_5.tsv = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))

This experiment has 3 time points - 0 h/12h /24 hrs 4 genotypes - WT, 3 mutants (aox1/ aox2/ aox3)

##############
# Data Analysis Pipeline

#Follwed below pipeline : Link

#https://github.com/wyguo/ThreeDRNAseq/blob/master/vignettes/user_manuals/3D_RNA-seq_command_line_user_manual.md

#Specific steps 

#run-tximport

genes_counts <- txi_genes$counts


##----->> parameters for 3D analysis
pval_adj_method <- 'BH'
pval_cut <- 0.05
l2fc_cut <- 1
deltaPS_cut <- 0.1
DAS_pval_method <- 'F-test'

##----->> pair-wise contrast groups
contrast <- c('aox1_0h-WT_0h', 
               'aox2_0h-WT_0h',
               'aox3_0h-WT_0h',
               'aox1_12h-WT_12h', 
               'aox2_12h-WT_12h',
               'aox3_12h-WT_12h',
               'aox1_24h-WT_24h', 
               'aox2_24h-WT_24h',
               'aox3_24h-WT_24h')



# Step 2: DE genes

# batch.effect <- NULL ## if has no batch effects
design <- condition2design(condition = metadata$treat,
                           batch.effect = NULL) # batch.effect <- NULL ## if has no batch effects
if(DE_pipeline == 'glmQL'){
  ##----->> edgeR glmQL pipeline
  genes_3D_stat <- edgeR.pipeline(dge = genes_dge,
                                  design = design,
                                  deltaPS = NULL,
                                  contrast = contrast,
                                  diffAS = F,
                                  method = 'glmQL',
                                  adjust.method = pval_adj_method)
}



## save results
DDD.data$genes_3D_stat <- genes_3D_stat


################################################################################
##----->> Summary DE genes
DE_genes <- summaryDEtarget(stat = genes_3D_stat$DE.stat,
                            cutoff = c(adj.pval=pval_cut,
                                       log2FC=l2fc_cut))
DDD.data$DE_genes <- DE_genes
<h6>#</h6>
Bar plots
library(tidyverse)

df <- read_csv("data/Selected_Count_BR2.csv")

longdata <- df %>% 
  gather(key, value, -AGI) %>% 
  separate(key, c("Genotype", "Time", "Replicate"))

library(plotrix)

longdata2 <- longdata %>% 
  group_by(AGI, Genotype, Time) %>% 
  summarise_each(funs(mean,sd,std.error)) %>% 
  ungroup() %>% 
  mutate(Time = factor(Time), Genotype = factor(Genotype))

colnames(longdata2)



longdata2$Genotype <- factor(longdata2$Genotype,levels = c("WT", "aox1", "aox2", "aox3"))

# Plotting

longdata2 %>% 
  ggplot(aes(x = Time, y = value_mean, fill = Genotype)) + 
  geom_bar(position = position_dodge(), stat = "identity") + 
  geom_errorbar(aes(ymin = value_mean - value_std.error, ymax = value_mean + value_std.error), 
                width = 0.1, 
                position = position_dodge(0.9)) + 
  scale_fill_manual(values=c("#008000","#B8860B","#4169E1","#DC143C"))+
  facet_wrap( ~ AGI,scales = "free_y")

Picture1

RNASeqData DEGreport edgeR • 1.1k views
ADD COMMENT
3
Entering edit mode
@gordon-smyth
Last seen just now
WEHI, Melbourne, Australia

I understand why you might want to use a Shiny App tool likeThreeDRNAseq but using a wrapper package instead of using the underlying Bioconductor packages directly limits our ability to help. If you don't understand the output from ThreeDRNAseq then you really need to send questions to the authors of that package via their github site.

I am the author of the edgeR package but there isn't any way for me help you. I don't see any edgeR functions amongst any of the code shown in your question. I can't tell what analysis has been done, what output you are looking at or even what your question is about.

If you're interesting in trying edgeR directly, see the workflow example here:

https://www.bioconductor.org/packages/release/workflows/vignettes/RnaSeqGeneEdgeRQL/inst/doc/edgeRQL.html

ADD COMMENT
0
Entering edit mode

Thank you very much Prof.

ADD REPLY

Login before adding your answer.

Traffic: 581 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6