Batch effect adjust in model matrix
1
0
Entering edit mode
Giuseppe • 0
@giuseppe-25187
Last seen 12 months ago
Salerno

Hello everyone, I'm working with an RNAseq (bulk) dataset and I need to correct the batch effect induced by days of experiment (showed also by MDS plot)

I have two genotypes (APOE and C57) and each has been processed in different days, so:

  • APOE_Day1
  • APOE_Day2
  • C57_Day3
  • C57_Day4

I need some suggestion on how to model this batch effect in the design matrix in order to find difference among C57 and APOE group

below my code

x <- read.delim("RNAdataset_APOE4KI.txt",  stringsAsFactors=FALSE, header = TRUE)
DG <- readDGE(x ,header=FALSE)
matrix= DG$counts
DG <- DGEList(matrix, group= as.character(x$Genotype))
DG$samples$Experiment3= x$Experiment3

Design matrix to filter out low expressed genes taking into account the group

design= model.matrix(~0+ DG$samples$group)
colnames(design)= c("APOE","C57")

Filtering

keep <- filterByExpr(DG, design=design)
DG <- DG[keep,]
DG$samples$lib.size <- colSums(DG$counts) 
DG <- calcNormFactors(DG, method = "TMM") #samples are normalized using TMM method
logCPM <- cpm(DG, log=TRUE)

Design matrix for differential analsysis

design= model.matrix(??????)
sessionInfo()

R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Big Sur ... 10.16

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Users/IOE23854723/miniconda3/envs/rstudioosx64/lib/libopenblasp-r0.3.23.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Rome
tzcode source: system (macOS)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] orca_1.1-1          plotly_4.10.2       tinytex_0.46        DT_0.28             htmlwidgets_1.6.2   DGEobj.utils_1.0.6  sva_3.48.0          BiocParallel_1.34.2
 [9] genefilter_1.82.1   mgcv_1.9-0          nlme_3.1-163        gplots_3.1.3        RColorBrewer_1.1-3  ggrepel_0.9.3       magrittr_2.0.3      dplyr_1.1.2        
[17] ggplot2_3.4.3       edgeR_3.42.4        limma_3.56.2        knitr_1.43         

loaded via a namespace (and not attached):
 [1] DBI_1.1.3               bitops_1.0-7            rlang_1.1.1             matrixStats_1.0.0       compiler_4.3.1          RSQLite_2.3.1           png_0.1-8              
 [8] vctrs_0.6.3             stringr_1.5.0           pkgconfig_2.0.3         crayon_1.5.2            fastmap_1.1.1           XVector_0.40.0          ellipsis_0.3.2         
[15] labeling_0.4.3          caTools_1.18.2          utf8_1.2.3              purrr_1.0.2             bit_4.0.5               xfun_0.40               zlibbioc_1.46.0        
[22] cachem_1.0.8            GenomeInfoDb_1.36.1     jsonlite_1.8.7          blob_1.2.4              parallel_4.3.1          R6_2.5.1                bslib_0.5.1            
[29] stringi_1.7.12          jquerylib_0.1.4         Rcpp_1.0.11             assertthat_0.2.1        IRanges_2.34.1          Matrix_1.6-1            splines_4.3.1          
[36] tidyselect_1.2.0        rstudioapi_0.15.0       yaml_2.3.7              codetools_0.2-19        lattice_0.21-8          tibble_3.2.1            Biobase_2.60.0         
[43] withr_2.5.0             KEGGREST_1.40.0         survival_3.5-7          Biostrings_2.68.1       pillar_1.9.0            MatrixGenerics_1.12.2   KernSmooth_2.23-22     
[50] DGEobj_1.1.2            stats4_4.3.1            generics_0.1.3          RCurl_1.98-1.12         S4Vectors_0.38.1        munsell_0.5.0           scales_1.2.1           
[57] gtools_3.9.4            xtable_1.8-4            glue_1.6.2              lazyeval_0.2.2          tools_4.3.1             data.table_1.14.8       annotate_1.78.0        
[64] locfit_1.5-9.8          XML_3.99-0.14           grid_4.3.1              tidyr_1.3.0             crosstalk_1.2.0         AnnotationDbi_1.62.2    colorspace_2.1-0       
[71] GenomeInfoDbData_1.2.10 cli_3.6.1               fansi_1.0.4             viridisLite_0.4.2       gtable_0.3.4            sass_0.4.7              digest_0.6.33          
[78] BiocGenerics_0.46.0     farver_2.1.1            memoise_2.0.1           htmltools_0.5.6         lifecycle_1.0.3         httr_1.4.7              bit64_4.0.5

Thanks a lot in advance Giuseppe

BatchEffect edgeR • 659 views
ADD COMMENT
0
Entering edit mode
ATpoint ★ 4.6k
@atpoint-13662
Last seen 7 hours ago
Germany

Each sample is a different day, that is called "nested", so there is nothing you can do here. Next time do a better experimental design, here that would for example be one ApoE and one C57 per day. If that was the case you could do ~batch+group.

All you can do here is ~group, hoping that the biological differences are much stronger than the batch effect that you cannot correct for. It's a perfect example why you need to think about analysis before doing the experiment.

ADD COMMENT

Login before adding your answer.

Traffic: 542 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6