In my transcriptomics analysis, there are some genes that I am interested are significantly different, but not differentially expressed. I am in confusion if I really understand the concept behind what is exactly DEGs means now In my DEG pipeline parameters were;
pval_adj_method <- 'BH'
pval_cut <- 0.01
l2fc_cut <- 1
I have 3 time course - 0/12/24 hrs 4 genotypes - WT, 3 mutants (aox1/2/3) I compared each mutants to it's WT values at given time point e.g.
aox1_0h-WT_0h
aox2_0h-WT_0h
aox1_12h-WT_12h
etc.
From this output, only a few combinations showed they are DEGs
**Gene1_** aox2_0h.WT_0h down-regulated
**Gene2-** aox3_0h.WT_0h up-regulated
**Gene6_** aox2_12h.WT_12h down-regulated
However, once I plotted original values, I see other mutants also statistically different compared to their WT
What does it mean?
Does DEG (pval_cut = 0.01 l2fc_cut = 1) mean at a given time point comparisions'sor does it compare across all the time course?
For example, inf Gene 2, if aox3 considered as DEG why not the other 2 mutants? This is what confusing to me Would be grateful if someone could explain this kindly.
NB: these count data (dput out put) are not normalized. This file was written before Step 1: Merge sequencing replicates of the Data analysis pipeline.
Data
structure(list(AGI = c("Gene_1", "Gene_2", "Gene_3", "Gene_4",
"Gene_5", "Gene_6"), aox1_0h_1.tsv = c(12.19671398, 215.2944799,
16.44462648, 494.4740158, 231.5254774, 72.57597906), aox1_0h_4.tsv = c(20.483339,
309.7008485, 24.97567468, 692.9926789, 363.0050461, 123.5888977
), aox1_0h_5.tsv = c(12.02356715, 99.9827504, 11.36592332, 276.0989805,
142.2411187, 58.85580698), aox1_12h_4.tsv = c(15.40599103, 72.92196194,
19.36412892, 376.7248056, 269.677545, 99.21985896), aox1_12h_5.tsv = c(14.1934073,
79.71249677, 32.45167316, 356.2905953, 196.2274429, 89.37924571
), aox1_12h_6.tsv = c(24.80343931, 95.02515421, 34.05358939,
440.896732, 346.8153177, 127.4466926), aox1_24h_4.tsv = c(6.184604587,
196.0341138, 25.18641024, 258.9801928, 223.9545066, 33.84484312
), aox1_24h_6.tsv = c(6.883819431, 98.05429052, 2.021393634,
112.1958862, 139.6212486, 28.75295204), aox2_0h_3.tsv = c(25.53690512,
175.1041763, 28.87826692, 459.5438353, 433.4289707, 137.3774675
), aox2_0h_4.tsv = c(10.18876629, 280.2141447, 32.48869835, 624.7432893,
323.5662966, 87.89508861), aox2_0h_5.tsv = c(12.34945421, 312.5272061,
28.36719904, 848.7432271, 469.8874526, 158.9474865), aox2_12h_1.tsv = c(19.08287299,
137.9990764, 29.84418565, 377.4153307, 314.9721922, 95.42395784
), aox2_12h_2.tsv = c(7.890962701, 74.2399241, 15.29254776, 236.4643073,
135.7776899, 26.49812298), aox2_12h_3.tsv = c(3.678221008, 62.60983209,
11.67625132, 200.3733542, 100.7079901, 31.6012183), aox2_24h_1.tsv = c(11.2162246,
149.1030715, 30.70364462, 321.8824458, 234.0832479, 67.24515005
), aox2_24h_4.tsv = c(13.99519719, 902.6355571, 41.48997933,
501.303002, 772.8283795, 350.2515435), aox2_24h_5.tsv = c(7.867480669,
361.1331651, 14.63979526, 218.9941277, 330.2108122, 147.5707818
), aox3_0h_1.tsv = c(14.15880435, 381.2255777, 40.53415811, 456.0047815,
317.8303883, 159.6022776), aox3_0h_2.tsv = c(16.7311523, 291.2105347,
27.30607326, 339.5042008, 179.0730805, 61.18757328), aox3_0h_3.tsv = c(19.09795927,
293.2638492, 25.89886385, 398.8993785, 185.8289419, 70.22345994
), aox3_12h_1.tsv = c(81.20410372, 541.3370048, 85.56280843,
974.0674911, 836.3820935, 353.6488805), aox3_12h_2.tsv = c(29.20942766,
248.4582068, 51.52438237, 486.5817019, 390.8069314, 151.3860219
), aox3_12h_3.tsv = c(46.13590166, 273.2940355, 63.7297075, 503.9319503,
395.8325311, 157.8115329), aox3_24h_1.tsv = c(34.29811675, 386.5204438,
72.86154515, 693.3497618, 570.2158898, 194.3375745), aox3_24h_2.tsv = c(6.882795111,
92.6003284, 26.22967721, 228.5226723, 120.5788092, 49.8811201
), aox3_24h_3.tsv = c(18.1408733, 196.4513931, 40.71817893, 375.4835311,
334.5830825, 97.88584005), WT_0h_1.tsv = c(18.68220076, 114.5750048,
9.829645971, 220.5843393, 174.4842611, 94.2738265), WT_0h_2.tsv = c(13.60937828,
73.97033656, 17.038227, 214.8653415, 155.5073735, 100.5135226
), WT_0h_3.tsv = c(16.96217952, 76.57929551, 16.07674964, 141.6508369,
145.1300591, 73.15117509), WT_12h_2.tsv = c(5.05511416, 11.33088644,
4.189101647, 44.84100697, 44.55982506, 18.2843175), WT_12h_3.tsv = c(0.990461611,
0.949660867, 1.699219964, 15.91825777, 15.35309945, 9.38446604
), WT_12h_5.tsv = c(5.933614952, 46.86761146, 10.29591467, 122.7757985,
69.69893277, 56.94217399), WT_24h_1.tsv = c(4.029020182, 31.21885208,
10.11987433, 100.5907921, 76.64674385, 37.26256017), WT_24h_4.tsv = c(3.054100241,
102.1107272, 18.4440106, 121.8360378, 154.6688737, 44.27986257
), WT_24h_5.tsv = c(8.133354312, 97.5821868, 10.70295122, 176.7748047,
164.02099, 24.53833578)), class = c("spec_tbl_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L), spec = structure(list(
cols = list(AGI = structure(list(), class = c("collector_character",
"collector")), aox1_0h_1.tsv = structure(list(), class = c("collector_double",
"collector")), aox1_0h_4.tsv = structure(list(), class = c("collector_double",
"collector")), aox1_0h_5.tsv = structure(list(), class = c("collector_double",
"collector")), aox1_12h_4.tsv = structure(list(), class = c("collector_double",
"collector")), aox1_12h_5.tsv = structure(list(), class = c("collector_double",
"collector")), aox1_12h_6.tsv = structure(list(), class = c("collector_double",
"collector")), aox1_24h_4.tsv = structure(list(), class = c("collector_double",
"collector")), aox1_24h_6.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_0h_3.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_0h_4.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_0h_5.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_12h_1.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_12h_2.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_12h_3.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_24h_1.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_24h_4.tsv = structure(list(), class = c("collector_double",
"collector")), aox2_24h_5.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_0h_1.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_0h_2.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_0h_3.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_12h_1.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_12h_2.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_12h_3.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_24h_1.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_24h_2.tsv = structure(list(), class = c("collector_double",
"collector")), aox3_24h_3.tsv = structure(list(), class = c("collector_double",
"collector")), WT_0h_1.tsv = structure(list(), class = c("collector_double",
"collector")), WT_0h_2.tsv = structure(list(), class = c("collector_double",
"collector")), WT_0h_3.tsv = structure(list(), class = c("collector_double",
"collector")), WT_12h_2.tsv = structure(list(), class = c("collector_double",
"collector")), WT_12h_3.tsv = structure(list(), class = c("collector_double",
"collector")), WT_12h_5.tsv = structure(list(), class = c("collector_double",
"collector")), WT_24h_1.tsv = structure(list(), class = c("collector_double",
"collector")), WT_24h_4.tsv = structure(list(), class = c("collector_double",
"collector")), WT_24h_5.tsv = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
This experiment has 3 time points - 0 h/12h /24 hrs 4 genotypes - WT, 3 mutants (aox1/ aox2/ aox3)
##############
# Data Analysis Pipeline
#Follwed below pipeline : Link
#https://github.com/wyguo/ThreeDRNAseq/blob/master/vignettes/user_manuals/3D_RNA-seq_command_line_user_manual.md
#Specific steps
#run-tximport
genes_counts <- txi_genes$counts
##----->> parameters for 3D analysis
pval_adj_method <- 'BH'
pval_cut <- 0.05
l2fc_cut <- 1
deltaPS_cut <- 0.1
DAS_pval_method <- 'F-test'
##----->> pair-wise contrast groups
contrast <- c('aox1_0h-WT_0h',
'aox2_0h-WT_0h',
'aox3_0h-WT_0h',
'aox1_12h-WT_12h',
'aox2_12h-WT_12h',
'aox3_12h-WT_12h',
'aox1_24h-WT_24h',
'aox2_24h-WT_24h',
'aox3_24h-WT_24h')
# Step 2: DE genes
# batch.effect <- NULL ## if has no batch effects
design <- condition2design(condition = metadata$treat,
batch.effect = NULL) # batch.effect <- NULL ## if has no batch effects
if(DE_pipeline == 'glmQL'){
##----->> edgeR glmQL pipeline
genes_3D_stat <- edgeR.pipeline(dge = genes_dge,
design = design,
deltaPS = NULL,
contrast = contrast,
diffAS = F,
method = 'glmQL',
adjust.method = pval_adj_method)
}
## save results
DDD.data$genes_3D_stat <- genes_3D_stat
################################################################################
##----->> Summary DE genes
DE_genes <- summaryDEtarget(stat = genes_3D_stat$DE.stat,
cutoff = c(adj.pval=pval_cut,
log2FC=l2fc_cut))
DDD.data$DE_genes <- DE_genes
<h6>#</h6>
Bar plots
library(tidyverse)
df <- read_csv("data/Selected_Count_BR2.csv")
longdata <- df %>%
gather(key, value, -AGI) %>%
separate(key, c("Genotype", "Time", "Replicate"))
library(plotrix)
longdata2 <- longdata %>%
group_by(AGI, Genotype, Time) %>%
summarise_each(funs(mean,sd,std.error)) %>%
ungroup() %>%
mutate(Time = factor(Time), Genotype = factor(Genotype))
colnames(longdata2)
longdata2$Genotype <- factor(longdata2$Genotype,levels = c("WT", "aox1", "aox2", "aox3"))
# Plotting
longdata2 %>%
ggplot(aes(x = Time, y = value_mean, fill = Genotype)) +
geom_bar(position = position_dodge(), stat = "identity") +
geom_errorbar(aes(ymin = value_mean - value_std.error, ymax = value_mean + value_std.error),
width = 0.1,
position = position_dodge(0.9)) +
scale_fill_manual(values=c("#008000","#B8860B","#4169E1","#DC143C"))+
facet_wrap( ~ AGI,scales = "free_y")
Thank you very much Prof.