Hello,
I have downloaded TCGA datasets (HTSeq count files) for several cancer types. I noticed that each dataset has a large number of tumor samples but far fewer normal samples: for example, only 60 normal samples versus ~500 or more tumor samples. Will this imbalance cause any problems if I use DESeq2 to find differentially expressed genes? Thank you very much.
I don't believe there will be any major problem due to imbalance; I'd be more worried about lack of matched tumour:normal samples (seems unlikely that they've taken 9 tumour samples from each patient providing a normal), but that's the nature of public clinical data.
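If you want to check how many patients actually have a matched tumour:normal pair, here is a rough sketch in Python. It relies on the standard TCGA barcode layout, where the first three dash-separated fields identify the patient and the first two digits of the fourth field give the sample type (01-09 tumour, 10-19 normal). The function names and the example barcodes are made up for illustration; swap in the barcodes from your own download.

```python
from collections import defaultdict


def split_barcode(barcode):
    """Return (patient_id, sample_type_code) parsed from a TCGA barcode."""
    parts = barcode.split("-")
    patient_id = "-".join(parts[:3])  # e.g. "TCGA-AA-0001"
    sample_type = int(parts[3][:2])   # e.g. 1 from "01A"
    return patient_id, sample_type


def matched_patients(barcodes):
    """List patients that have at least one tumour and one normal sample."""
    kinds_by_patient = defaultdict(set)
    for barcode in barcodes:
        patient, code = split_barcode(barcode)
        if 1 <= code <= 9:        # tumour sample types
            kinds_by_patient[patient].add("tumour")
        elif 10 <= code <= 19:    # normal sample types
            kinds_by_patient[patient].add("normal")
    return sorted(
        patient
        for patient, kinds in kinds_by_patient.items()
        if {"tumour", "normal"} <= kinds
    )


# Hypothetical barcodes, for illustration only.
example = [
    "TCGA-AA-0001-01A-11R-0000-07",  # patient 0001, tumour
    "TCGA-AA-0001-11A-11R-0000-07",  # patient 0001, normal
    "TCGA-AA-0002-01A-11R-0000-07",  # patient 0002, tumour only
]

print(matched_patients(example))  # ['TCGA-AA-0001']
```

If enough patients turn out to be paired, one option is to restrict the analysis to those pairs and include the patient as a term in the DESeq2 design (something like `~ patient + condition`, with column names of your choosing).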