Regarding the PCA bi-plot, I see no major issue, assuming that you have generated this PCA bi-plot in an unbiased ('unsupervised') way using all genes. Can you share the code that you used? Your 2 groups (Control + Sample) are almost exclusively segregated along PC1. The sample at the bottom-right is behaving differently, but it is still not grouping with Control.
Then, in your second figure generated with pheatmap(), it seems that —yes— your groups are segregated perfectly via hierarchical clustering, and the heatmap colour shade also indicates this.
Regarding the gene naming issue, which species is this? Can you confirm how the read count quantification was performed and with which reference GTF? Generally, to help, please explain your broader analysis pipeline so that we can begin to try to solve this.
Thanks, you are evidently not following the typical DESeq2 analysis pipeline - you are missing the lfcShrink() stage. Please take a look at the Quick start.
Are you showing all of the output of geneinfo? There seem to be at least 2 columns missing.
PCA code:
vsd <- vst(dds, blind = TRUE) # Variance-stabilizing transformation
plotPCA(vsd, intgroup = "C.S")
2- The organism is Plasmodium falciparum.
design(dds) <- ~ C.S
dds <- DESeq(dds)
res <- results(dds)
summary(res)
resSort <- res[order(res$padj),]
library("org.Pf.plasmo.db")
geneinfo <- select(org.Pf.plasmo.db, keys=rownames(resSort)[1:20], columns=c("SYMBOL","GENENAME","GO"), keytype="SYMBOL")
geneinfo
geneinfo returns some repeated genes and some with decimals:
Thank you,
Could you kindly tell me where I should use the lfcShrink() stage?
The geneinfo output is OK; it is just cut to show gene_id.
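On the repeated genes in geneinfo: one likely cause (an assumption, since the full output is not shown) is that select() returns one row per key/annotation pair, so a gene with several GO terms appears several times. The sketch below uses a made-up data frame with hypothetical gene symbols in place of the real select() result and collapses it to one row per gene with base R. It does not address the entries with decimals.

```r
# Toy stand-in for a select() result; the symbols and names are hypothetical,
# and the GO IDs are only illustrative.
geneinfo <- data.frame(
  SYMBOL   = c("geneA", "geneA", "geneB"),
  GENENAME = c("hypothetical protein A", "hypothetical protein A",
               "hypothetical protein B"),
  GO       = c("GO:0005515", "GO:0016020", "GO:0005737"),
  stringsAsFactors = FALSE
)

# One row per gene: paste all distinct GO terms of a gene into one string.
collapsed <- aggregate(
  GO ~ SYMBOL + GENENAME,
  data = geneinfo,
  FUN  = function(x) paste(unique(x), collapse = ";")
)

print(collapsed)
```

Dropping "GO" from the columns argument is another way to avoid the repeats if the GO terms are not needed at this stage.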
Hi, regarding lfcShrink, the information is in the Quick start (please see my other comment)
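For orientation, here is a minimal sketch of where lfcShrink() usually sits: after DESeq(), alongside results(). It runs on simulated data from makeExampleDESeqDataSet(), not on your counts, so the factor is called condition rather than C.S, and the coefficient name is taken from resultsNames(). The fallback to type = "normal" when the apeglm package is not installed is my own convenience, not part of the Quick start.

```r
library(DESeq2)

# Simulated stand-in data (NOT the poster's counts): 1000 genes, 6 samples,
# with a two-level factor 'condition' and design ~ condition.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
dds <- DESeq(dds)

# List the coefficients that can be shrunk; the second one is the
# group comparison (the first is the intercept).
coef_name <- resultsNames(dds)[2]

# Unshrunken results, as in the code posted above.
res <- results(dds, name = coef_name)

# Shrink the log2 fold changes. "apeglm" needs the apeglm package,
# so fall back to the built-in "normal" estimator if it is missing.
shrink_type <- if (requireNamespace("apeglm", quietly = TRUE)) "apeglm" else "normal"
resLFC <- lfcShrink(dds, coef = coef_name, type = shrink_type)

# Sort by adjusted p-value, as before, but using the shrunken estimates.
resSort <- resLFC[order(resLFC$padj), ]
head(resSort)
```

With your own object, run resultsNames(dds) after DESeq(dds) and pass the name of the C.S coefficient as coef.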