I am doing a metatranscriptomic analysis, and I am now analyzing the sequences corresponding to a parasite that lives inside a host. I have millions of reads from the host but far fewer reads from the parasite.
My parasite library size is around 193,000 reads per replicate. I have seen that "As a rule of thumb, we require that a gene have a count of at least 10–15". Given my low sequencing depth, can I still filter by cpm > 1, or should I use a higher cpm value?
Yes, you will need to use a higher cpm cutoff if a sizeable group of your samples have low sequencing depth. You can use the edgeR function filterByExpr, which will choose an appropriate cpm cutoff for you automatically.
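
To make the arithmetic concrete, here is a minimal sketch in R. It converts the rule-of-thumb count of 10 into a cpm value at the library size quoted in the question, and then shows one way filterByExpr might be called. The toy count matrix, the group labels, and all variable names are hypothetical and stand in for your own data. The edgeR part only runs if edgeR is installed.

```r
# Library size from the question (parasite reads per replicate).
lib_size <- 193000

# cpm equivalent of a raw count of 10 at this depth:
# cpm = count / library size * 1e6
min_count <- 10
cpm_cutoff <- min_count / lib_size * 1e6
print(round(cpm_cutoff, 1))   # roughly 51.8

# Conversely, cpm = 1 at this depth corresponds to well under one read.
count_at_cpm1 <- 1 * lib_size / 1e6
print(count_at_cpm1)          # 0.193

# Hypothetical toy data, only to illustrate the call; replace with real counts.
if (requireNamespace("edgeR", quietly = TRUE)) {
  set.seed(1)
  counts <- matrix(rnbinom(6000, mu = 5, size = 2), nrow = 1000,
                   dimnames = list(paste0("gene", 1:1000), paste0("s", 1:6)))
  group <- factor(rep(c("A", "B"), each = 3))

  y <- edgeR::DGEList(counts = counts, group = group)
  keep <- edgeR::filterByExpr(y)          # logical vector, one entry per gene
  y <- y[keep, , keep.lib.sizes = FALSE]  # drop filtered genes, recompute library sizes
}
```

At 193,000 reads, cpm > 1 would keep genes supported by less than a single read, whereas a count of 10 corresponds to a cutoff of roughly 52 cpm. As far as I understand its defaults, filterByExpr derives its cpm cutoff in the same way, from a minimum count of 10 and the median library size, so it should land near this value for your data.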