Question

Tidyverse masking Genefilter after BioConductor update

0

Entering edit mode

jonellevillar • 0

@jonellevillar-15405

Last seen 9 months ago

Bergen

Dear Friends,

Recently I updated to BioConductor v3.7 and the library for IlluminaHumanMethylationEPICanno.ilm10b4.hg19. I was running R 3.4.4 at the time and started to have problems with several packages after accessing the library for EPIC anno b4. I have now updated R to 3.5.1 and am experiencing a variety of problems. Tidyverse has different conflicts each day. Today's conflicts are greater than usual, so it must be Monday.

One issue that concerns me is that after sorting out all these R package updates, my results are very different from when I ran EPICanno version b2. My results now show adjusted p-values that are all 1, despite the p values being different for each probe.

I am running Windows 10 but had a similar problem with Tidyverse conflicts on a Mac with OS High Sierra.

Thank you for any suggestions.

Jonelle

> library(tidyverse)
-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
v ggplot2 3.0.0     v purrr   0.2.5
v tibble  1.4.2     v dplyr   0.7.6
v tidyr   0.8.1     v stringr 1.3.1
v ggplot2 3.0.0     v forcats 0.3.0
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x purrr::accumulate() masks foreach::accumulate()
x dplyr::collapse()   masks Biostrings::collapse(), IRanges::collapse(), nlme::collapse()
x dplyr::combine()    masks minfi::combine(), Biobase::combine(), BiocGenerics::combine()
x purrr::compact()    masks XVector::compact()
x dplyr::count()      masks matrixStats::count()
x dplyr::desc()       masks IRanges::desc()
x tidyr::expand()     masks S4Vectors::expand()
x dplyr::filter()     masks stats::filter()
x dplyr::first()      masks S4Vectors::first()
x dplyr::lag()        masks stats::lag()
x ggplot2::Position() masks BiocGenerics::Position(), base::Position()
x purrr::reduce()     masks GenomicRanges::reduce(), IRanges::reduce()
x dplyr::rename()     masks S4Vectors::rename()
x purrr::simplify()   masks DelayedArray::simplify()
x dplyr::slice()      masks XVector::slice(), IRanges::slice()
x readr::spec()       masks genefilter::spec()
x purrr::when()       masks foreach::when()
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.3.0                                       stringr_1.3.1                                      
 [3] dplyr_0.7.6                                         purrr_0.2.5                                        
 [5] tidyr_0.8.1                                         tibble_1.4.2                                       
 [7] ggplot2_3.0.0                                       tidyverse_1.2.1                                    
 [9] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0 minfi_1.26.2                                       
[11] bumphunter_1.22.0                                   locfit_1.5-9.1                                     
[13] iterators_1.0.10                                    foreach_1.4.4                                      
[15] Biostrings_2.48.0                                   XVector_0.20.0                                     
[17] SummarizedExperiment_1.10.1                         DelayedArray_0.6.3                                 
[19] matrixStats_0.54.0                                  Biobase_2.40.0                                     
[21] GenomicRanges_1.32.6                                GenomeInfoDb_1.16.0                                
[23] IRanges_2.14.10                                     S4Vectors_0.18.3                                   
[25] BiocGenerics_0.26.0                                 readr_1.1.1                                        
[27] sva_3.28.0                                          BiocParallel_1.14.2                                
[29] genefilter_1.62.0                                   mgcv_1.8-24                                        
[31] nlme_3.1-137                                        limma_3.36.2                                       

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2         siggenes_1.54.0          mclust_5.4.1             base64_2.0              
 [5] rstudioapi_0.7           bit64_0.9-7              AnnotationDbi_1.42.1     lubridate_1.7.4         
 [9] xml2_1.2.0               codetools_0.2-15         splines_3.5.1            jsonlite_1.5            
[13] Rsamtools_1.32.2         broom_0.5.0              annotate_1.58.0          HDF5Array_1.8.1         
[17] compiler_3.5.1           httr_1.3.1               backports_1.1.2          assertthat_0.2.0        
[21] Matrix_1.2-14            lazyeval_0.2.1           cli_1.0.0                prettyunits_1.0.2       
[25] tools_3.5.1              bindrcpp_0.2.2           gtable_0.2.0             glue_1.3.0              
[29] GenomeInfoDbData_1.1.0   doRNG_1.7.1              Rcpp_0.12.18             cellranger_1.1.0        
[33] multtest_2.36.0          preprocessCore_1.42.0    rtracklayer_1.40.3       DelayedMatrixStats_1.2.0
[37] rvest_0.3.2              rngtools_1.3.1           XML_3.98-1.12            beanplot_1.2            
[41] zlibbioc_1.26.0          MASS_7.3-50              scales_0.5.0             hms_0.4.2               
[45] rhdf5_2.24.0             GEOquery_2.48.0          RColorBrewer_1.1-2       yaml_2.2.0              
[49] memoise_1.1.0            pkgmaker_0.27            biomaRt_2.36.1           reshape_0.8.7           
[53] stringi_1.1.7            RSQLite_2.1.1            GenomicFeatures_1.32.0   bibtex_0.4.2            
[57] rlang_0.2.1              pkgconfig_2.0.1          bitops_1.0-6             nor1mix_1.2-3           
[61] lattice_0.20-35          Rhdf5lib_1.2.1           bindr_0.1.1              GenomicAlignments_1.16.0
[65] bit_1.1-14               tidyselect_0.2.4         plyr_1.8.4               magrittr_1.5            
[69] R6_2.2.2                 DBI_1.0.0                pillar_1.3.0             haven_1.1.2             
[73] withr_2.1.2              survival_2.42-6          RCurl_1.95-4.11          modelr_0.1.2            
[77] crayon_1.3.4             progress_1.2.0           readxl_1.1.0             grid_3.5.1              
[81] data.table_1.11.4        blob_1.1.1               digest_0.6.15            xtable_1.8-2            
[85] illuminaio_0.22.0        openssl_1.0.2            munsell_0.5.0            registry_0.5            
[89] quadprog_1.5-5

>

bioconductor R tidyverse IlluminaHumanMethylationEPICanno.ilm10b4.hg19. • 1.5k views

ADD COMMENT • link updated 6.7 years ago by James W. MacDonald 68k • written 6.7 years ago by jonellevillar • 0

score 1 · Answer 1 · 2018-08-06

1

Entering edit mode

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 16 hours ago

United States

Well, you have the exact same logFC and average expression for the first probe in both comparisons, but the t-statistic is almost cut in half. The code you show indicates that you used treat for at least one of these examples. But you couldn't have used treat for the first example since the default logFC is log2(1.2), which is 0.263. Since you have probes with a logFC smaller than that in your first example, you can't have used treat to get those results. So you made changes to your code between those examples, which probably explains the differences you see.

ADD COMMENT • link 6.7 years ago James W. MacDonald 68k

0

Entering edit mode

Thank you James for your comments. I did use "treat" in both analyses. I ran them again on my colleague's computer and the results were in line with earlier runs with version b2 (IlluminaHumanMethylationEPICanno.ilm10b2.hg19). I only called dplyr from the Tidyverse but perhaps a package is still being masked when I run limma.

ADD REPLY • link 6.7 years ago jonellevillar • 0

score 0 · Answer 2 · 2018-08-06

0

Entering edit mode

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 16 hours ago

United States

There are two issues here. First, you have conflicts between function names. The easy way around that is to always use the fully qualified function name (e.g., Package::function), so for example if you need to use the collapse function from Biostrings, you call Biostrings::collapse, which will ensure that you get that one rather than the collapse function that is highest on the call stack. You might also consider not loading the tidyverse, but rather loading individual packages that are part of the tidyverse that you might need.

The second issue is that you say your results have changed, without giving any way for people to try to diagnose. Just saying that you get different results, without giving any code, or ideally a self-contained example that shows the differences is not nearly enough information for anybody to even make an attempt to try to help.

ADD COMMENT • link 6.7 years ago James W. MacDonald 68k

0

Entering edit mode

<caption>EPICanno version b2</caption>

	logFC	AveExpr	t	P.Value	adj.P.Val	gene
cg05491587	-0.62266	2.15877	-5.05809	1.20E-06	0.292371	KCNG2
cg04695077	-0.21331	-1.88039	-4.94459	1.99E-06	0.292371
cg25289484	0.369608	3.032281	4.896182	2.46E-06	0.292371	SPEN
cg23296325	0.182325	-4.16335	4.882616	2.61E-06	0.292371	NAA30
cg05646885	-0.16456	0.127818	-4.76654	4.33E-06	0.292371
cg14476293	0.255106	2.994545	4.76376	4.38E-06	0.292371	MIR548F3
cg15986164	-0.18748	3.796379	-4.75744	4.50E-06	0.292371	GAK
cg23504411	0.19675	-1.18717	4.747701	4.70E-06	0.292371	HIST1H3J
cg16688303	0.248158	0.650324	4.746597	4.72E-06	0.292371	LRRTM2;CTNNA1;CTNNA1;

After updating to the b4.hg19 package. Here is an example of the top 10 hits.

<caption>EPICanno version b4</caption>

	logFC	AveExpr	t	P.Value	adj.P.Val	gene
cg05491587	-0.62266	2.15877	-2.92137	0.002007	1	KCNG2
cg24664470	1.103853	2.67871	2.596937	0.005183	1
cg04798314	-1.14038	1.45482	-2.46264	0.007513	1	SMYD3
cg01733572	-0.59554	-2.63138	-2.28935	0.011714	1	DNAAF1
cg07988609	-0.56729	2.479462	-2.13306	0.017259	1
cg26999053	-0.71406	0.4881	-2.03263	0.021921	1	MYH10;MYH10;MYH10
cg18232235	-0.91379	4.605203	-2.02702	0.022369	1
cg14614789	-0.78839	-1.60431	-1.89377	0.030181	1
cg13353717	0.529046	-2.34306	1.801462	0.036801	1	HLA-DQB1

Thank you for your consideration.

Jonelle

ADD REPLY • link 6.7 years ago jonellevillar • 0

0

Entering edit mode

Thank you James for your suggestion to just load the packages of Tidyverse that I need. What I consider as a problem with my results was the p-values and the adjusted p-values. Please see the two tables below.

Using the limma package. Here is a bit of my script.

mod = model.matrix(~ 0 + AP1 + factor(Gender) + factor(smoker), data = target)

colnames(mod) <-gsub("AP1","",colnames(mod))
colnames(mod) <- make.names(colnames(mod))
colnames(mod)

contrast.matrix <- makeContrasts(Olanzapine = Ola-(Ari+Que)/2, levels = mod)

## fit model####
fit = lmFit(data, mod)
fitContrasts <- contrasts.fit(fit,contrasts=contrast.matrix)
eb = treat(fitContrasts)
results = topTreat(eb,adjust="fdr",number=length(eb$coefficients))
head(results)

ADD REPLY • link 6.7 years ago jonellevillar • 0