Hello everybody,
I am using TCGA LUAD RNA-seq data for differential expression (DEG) analysis. There are data from 522 patients, but only 59 of them have paired data (both a normal and a tumor sample), and there are no replicates. Can I compare all tumor samples (522) against all normal samples (59)? Is that statistically valid? Or is it more sensible to use only the paired samples? If unpaired data can be used, how should I set up the DESeq2 design?
I loaded the data as a SummarizedExperiment and used a design for paired samples, as below:
> library("DESeq2")
> ddsSE <- DESeqDataSet(data, design = ~ patient + shortLetterCode)
> ddsSE$shortLetterCode <- relevel(ddsSE$shortLetterCode, ref = "NT")
> DE <- DESeq(ddsSE)
> DEresults <- results(DE)
The shortLetterCode column has two sample types, "NT" (Solid Tissue Normal) and "TP" (Primary Solid Tumor), and I used relevel to make "NT" the reference level.
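To make the paired setup concrete, here is a minimal base-R sketch of how I understand the paired subset and its design matrix. The `samples` table is a made-up stand-in for the sample annotation of `data` (the patient IDs are hypothetical); only the column names `patient` and `shortLetterCode` come from my real data.

```r
# Toy sample table standing in for the real sample annotation (hypothetical IDs).
samples <- data.frame(
  patient = c("P1", "P1", "P2", "P3", "P3", "P4"),
  shortLetterCode = c("NT", "TP", "TP", "NT", "TP", "TP"),
  stringsAsFactors = FALSE
)

# A patient counts as "paired" if they have both a normal (NT) and a tumor (TP) sample.
has_nt <- unique(samples$patient[samples$shortLetterCode == "NT"])
has_tp <- unique(samples$patient[samples$shortLetterCode == "TP"])
paired_patients <- intersect(has_nt, has_tp)

# Keep only the samples that belong to paired patients.
paired <- samples[samples$patient %in% paired_patients, ]

# Rebuild the factors so no empty patient levels remain, with NT as reference.
paired$patient <- factor(paired$patient)
paired$shortLetterCode <- relevel(factor(paired$shortLetterCode), ref = "NT")

# Design matrix corresponding to ~ patient + shortLetterCode.
mm <- model.matrix(~ patient + shortLetterCode, data = paired)
print(colnames(mm))
```

With the real object, I assume the same selection could be applied to the columns, for example `data[, data$patient %in% paired_patients]`, before building the DESeqDataSet.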
Should I remove "patient" from the design for unpaired samples?
Can we test for the tumor vs. normal effect while controlling for the patient effect when using unpaired samples?
Thank you very much for your advice.