# consider the following:
LibList=c('this', 'that', 'theother', ... , 'libraryN - 1', 'library N')
LibsToInstall<-LibList[ !(LibList %in% as.data.frame(installed.packages())[["Package"]]) ]
BiocManager::install(LibsToInstall, update=TRUE, ask=FALSE)
lapply(LibList, library, character.only = TRUE)
Irrespective of the version of R/Bioconductor, as LibList grows beyond 10, the probability of an installation failure can only increase; by the time N=20, in my experience, the failure rate approaches 1 (and more often you will be looking at 2 to 3 failures).
By contrast, on the command line, irrespective of OS (e.g. Mac, various flavors of Linux):
cd RlibPathDir
wget https://bioconductor.org/packages/release/bioc/src/contrib/this_1.40.2.tar.gz
wget https://bioconductor.org/packages/release/bioc/src/contrib/that_1.40.2.tar.gz
wget https://bioconductor.org/packages/release/bioc/src/contrib/the_other_1.40.2.tar.gz
...
wget https://bioconductor.org/packages/release/bioc/src/contrib/LibraryN.tar.gz
R CMD INSTALL ./this_1.40.2.tar.gz
almost never fails (all I can really say here is that I have not experienced a failure to date, but I have worked with many versions of R, RStudio, Bioconductor, etc.).
What is BiocManager::install() doing that creates such a high failure rate? Is there any prebuilt functionality that will take the above approach instead of whatever the default behavior of BiocManager::install() is?
I was shown an example of this recently where more than 100 packages were being installed on a Docker image. One or two packages would fail to download, with the error message being something along the lines of 'unable to find host'. This suggests to me that there was a temporary internet connectivity issue on the wireless network, and in general an unreliable internet connection seems like a likely candidate for intermittent failed downloads. What is your internet connection to bioconductor.org like?
The way the failure is handled is not useful -- the installation continues to download all packages, then fails to install the failed package and its reverse dependencies. The user then tries again, but the failed package and reverse dependencies need to be downloaded again (though not the packages that were successfully installed).
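One way to avoid the repeated downloads described above might be to keep source tarballs in a persistent directory and fetch only what is missing from it. The following is a minimal sketch, not an existing BiocManager feature: uncached_pkgs(), fetch_to_cache() and the directory ~/bioc-tarballs are hypothetical names chosen for illustration. It relies on utils::download.packages() and its destdir argument, and on BiocManager::repositories() for the repository URLs. It matches tarballs by package name only (not version) and does not resolve dependencies.

```r
# Pure helper: given the requested package names and the file names already in
# the cache directory, return the packages that have no cached source tarball.
# Tarballs are assumed to follow the usual <name>_<version>.tar.gz pattern.
uncached_pkgs <- function(pkgs, cached_files) {
  cached_names <- sub("_.*\\.tar\\.gz$", "", basename(cached_files))
  setdiff(pkgs, cached_names)
}

# Hypothetical wrapper: download only the tarballs the cache lacks.
# `cache_dir` is an illustrative location, not a BiocManager convention.
fetch_to_cache <- function(pkgs, cache_dir = "~/bioc-tarballs",
                           repos = BiocManager::repositories()) {
  cache_dir <- path.expand(cache_dir)
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  cached <- list.files(cache_dir, pattern = "\\.tar\\.gz$")
  todo <- uncached_pkgs(pkgs, cached)
  if (length(todo) > 0L) {
    # `repos` is only evaluated here, so BiocManager is only needed
    # when something actually has to be downloaded.
    utils::download.packages(todo, destdir = cache_dir,
                             repos = repos, type = "source")
  }
  # Return the full paths of every tarball now in the cache.
  invisible(list.files(cache_dir, pattern = "\\.tar\\.gz$", full.names = TRUE))
}
```

The cached files could then be installed with R CMD INSTALL as in the question, or with install.packages(files, repos = NULL, type = "source"); in both cases the installation order with respect to dependencies is left to the user.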
BiocManager::install() delegates the download and installation to R's install.packages(), so the underlying issue / solution might be there. Several ideas are to investigate alternative download methods, and to cache downloaded tarballs. I believe that install.packages() uses download.packages() and in turn download.file(). download.packages() has a destdir argument that might be exploited to avoid re-downloading; download.file() has a method argument that could be explored.

For method, the help page ?download.file says that 'libcurl' is used by default but on some OS (including the poster's?) there is wget. The help page suggests one way of setting this as options(download.file.method = "wget").

I'm not sure how destdir could be exploited easily by the 'user' -- I don't think there's a way to tell install.packages() to only download packages that have not yet been downloaded.

I think both method= and destdir= could be used as arguments to BiocManager::install() and would be passed to install.packages().

Update -- yes, adding method = "wget" to BiocManager::install() changes the download method. Also true with destdir =, but still not sure how that might help...

...and after a little more digging R also has options(download.file.extra=...), which passes command-line arguments to the download method. For wget, passing -c might mean that partial downloads are continued, and existing downloads are not re-downloaded; there is also -N and options for retries, but I am not a wget expert. Also, since the download file location is constant per-session, it seems reasonable if a bit hacky to set these options and repeat BiocManager::install(pkgs) as necessary -- successfully downloaded packages won't be re-downloaded; successfully installed packages won't be re-installed.

Can you give some examples of the type of error(s) you're encountering when using BiocManager::install(), e.g. is it missing R package dependencies, missing system libraries, corrupted tarball downloads, problems with existing 00LOCK-pkg folders in the library directory, multiple parallel installations that encounter race conditions, etc.?

There are lots of ways package installation can fail, but I don't feel the behaviour you describe is my typical experience. Certainly not a 50% failure rate if I try to install N=2 packages. Personally I think I'd find collecting all the dependencies to install manually using R CMD INSTALL considerably more work and error prone.

For example, I regularly install hundreds of packages using BiocManager, e.g. MSMB-Quarto, and it generally works fine.

There are other packages that try to manage package installation (amongst other things), for example pak or renv.
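The suggestion made earlier in the thread (switch the download method to wget, pass it extra flags, and repeat BiocManager::install() until nothing is missing) could be sketched roughly as follows. This is an illustration under assumptions, not tested advice: missing_pkgs() and install_with_retry() are hypothetical helpers, wget is assumed to be on the PATH, and the flag string "-c --tries=5" is only one plausible choice. How -c interacts with the way download.file() invokes wget has not been verified here.

```r
# Pure helper: which of the requested packages are not yet installed?
# `installed` defaults to the packages visible in the current library paths.
missing_pkgs <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical wrapper (not part of BiocManager): retry only what is missing.
# `installer` is injectable so the loop can be exercised without a network.
install_with_retry <- function(pkgs, max_tries = 3,
                               installer = function(p) {
                                 BiocManager::install(p, update = FALSE,
                                                      ask = FALSE)
                               }) {
  # Assumption: wget is available. -c asks wget to continue partial
  # downloads; --tries sets wget's own retry count per file.
  old <- options(download.file.method = "wget",
                 download.file.extra = "-c --tries=5")
  on.exit(options(old), add = TRUE)  # restore the previous settings

  todo <- missing_pkgs(pkgs)
  for (i in seq_len(max_tries)) {
    if (length(todo) == 0L) break
    # Keep going even if one attempt errors; the next pass re-checks.
    try(installer(todo), silent = TRUE)
    todo <- missing_pkgs(pkgs)
  }
  invisible(todo)  # packages still missing after the last attempt
}
```

Calling install_with_retry(LibList) would return, invisibly, the names of any packages that are still not installed, which gives a concrete list to report when asking for help.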
Sorry about the N=2. That was meant to read 20 (following the 10 in the previous sentence), my apologies there.
The provided link to MSMB-Quarto returns a 404 for me.
Re: R CMD INSTALL - I wish that were true, I would sure as hell prefer to be doing this through Bioconductor.
Re: the types of errors, please see reply below.
Thanks for taking time out to reply.
Hi, I don't see information on types of errors. Feel free to attach a report with a reproducible event and full information on BiocManager::version(), BiocManager::valid(), and sessionInfo(). There are many ways for R package installation to trigger adverse events. BiocManager::install was created to help reduce the risk of adverse events, and it introduces certain constraints to help users avoid inconsistencies. Your report is important to us, but without sufficient information we cannot provide more assistance.
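For anyone preparing such a report, the three pieces of information requested above could be collected into one text file along these lines. This is only a sketch: capture_section(), write_install_report() and the file name biocmanager-report.txt are hypothetical, while BiocManager::version(), BiocManager::valid() and sessionInfo() are the calls named in the reply.

```r
# Capture the printed form of a value under a heading, as plain text lines.
capture_section <- function(title, value) {
  c(paste("##", title), utils::capture.output(print(value)), "")
}

# Hypothetical helper: write the requested diagnostics to a file.
# BiocManager::valid() may need network access and can emit a warning when
# packages are out of date, so warnings are suppressed while capturing.
write_install_report <- function(file = "biocmanager-report.txt") {
  report <- c(
    capture_section("BiocManager::version()", BiocManager::version()),
    capture_section("BiocManager::valid()",
                    suppressWarnings(BiocManager::valid())),
    capture_section("sessionInfo()", utils::sessionInfo())
  )
  writeLines(report, file)
  invisible(file)
}
```

The resulting file, together with the exact console output of the failing BiocManager::install() call, is the kind of material that can be attached to a follow-up post.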
Sorry, I forgot that MSMB-Quarto was a private repo. It's the workflow that builds www.huber.embl.de/msmb and currently installs 120 named packages + dependencies. Here's the current list: