I am trying to apply batch correction with the batchelor package to two bulk RNA-seq datasets, but I run into errors.
> batchelor::fastMNN(as.matrix(dc),as.matrix(de))
more singular values/vectors requested than available
'k' capped at the number of observations
> batchelor::mnnCorrect(as.matrix(dc),as.matrix(de))
'k' capped at the number of observations
> batchelor::rescaleBatches(as.matrix(dc),as.matrix(de))
Error in (function (..., log.base = 2, pseudo.count = 1, subset.row = NULL, : matrix should be double
- Is it ok to use these correction methods for bulk RNA-seq data?
- Is there a minimum number of samples required?
Here are the dimensions of the datasets.
> dim(dc)
[1] 140213 28
> dim(de)
[1] 140213 4
And here is the head of each dataset.
> head(dc)
T1 T2 T3 T4 A1_L A2_L A3_L A4_L A1_S A2_S
chr2L_2_170 0 0 0 0 0 0 0 0 0 0
chr2L_1368_1544 0 0 0 0 0 0 0 0 0 0
chr2L_172691_172724 1 0 0 0 0 0 0 0 0 0
chr2L_1573892_1573953 0 0 0 0 0 1 0 0 0 0
chr2L_14712715_14712750 0 0 0 0 0 0 0 0 0 0
chr2L_14713015_14713036 0 0 0 0 0 0 0 0 0 0
A3_S A4_S B1_L B2_L B3_L B4_L B1_S B2_S B3_S
chr2L_2_170 0 0 0 0 0 0 0 0 0
chr2L_1368_1544 0 0 0 0 0 0 0 0 0
chr2L_172691_172724 0 0 0 0 0 0 0 0 0
chr2L_1573892_1573953 0 0 0 0 0 0 0 0 0
chr2L_14712715_14712750 0 0 0 0 0 0 0 0 0
chr2L_14713015_14713036 0 0 0 0 0 0 0 0 0
B4_S C1_L C2_L C3_L C4_L C1_S C2_S C3_S C4_S
chr2L_2_170 0 0 0 0 0 0 0 0 0
chr2L_1368_1544 0 0 0 0 0 0 0 0 0
chr2L_172691_172724 0 0 0 0 0 0 0 0 0
chr2L_1573892_1573953 0 0 0 0 2 0 0 0 1
chr2L_14712715_14712750 0 0 0 0 0 0 0 0 0
chr2L_14713015_14713036 0 0 0 0 0 0 0 0 0
> head(de)
GFP_T1 GFP_T2 RRP6_T1 RRP6_T2
chr2L_2_170 4 3 2 6
chr2L_1368_1544 5 0 4 5
chr2L_172691_172724 1 0 0 0
chr2L_1573892_1573953 0 0 1 1
chr2L_14712715_14712750 0 0 2 0
chr2L_14713015_14713036 1 0 1 3
These two datasets run fine with limma::removeBatchEffect(). Are these errors fixable, or is my data not good enough?