Question

Results Goana after edgeR pipeline

0

Barista • 0

@e51ccd8d

Netherlands

Dear all, I am conducting a human QuantSeq analysis in which I compare one condition to another.

I just did a Gene Ontology analysis with the goana() function, but the results puzzle me. For the upregulated genes in condition A vs B, I can place the results in context, but for the downregulated genes in condition A vs B, I am wondering whether another effect is in play.

As you can see below, most of the GO terms enriched among the downregulated genes relate to RNA, rRNA, or ribosome-related processes. This is not something I can match with the pathophysiological nature of the disease involved. I am also wondering whether technical noise is involved. Could anyone shed some light on this?

Furthermore, I am wondering which GO terms to consider. For the downregulated genes the p-values are 'only' on the order of 10^-9, but for the upregulated genes there are many terms with a p-value < 10^-33. Is there a certain cut-off that I can use?

go <- goana(glf, species = "Hs", geneid = glf$genes$ENTREZID)

topGO(go, sort = "down")

                                                                          Term Ont    N Up Down      P.Up       P.Down
GO:0003723                                                         RNA binding  MF 1468 41   84 0.9999905 3.643114e-09
GO:0016072                                              rRNA metabolic process  BP  229  2   26 0.9998741 6.144943e-09
GO:0034470                                                    ncRNA processing  BP  373  2   34 0.9999999 8.824000e-09
GO:0006364                                                     rRNA processing  BP  221  2   25 0.9998171 1.306049e-08
GO:0042254                                                 ribosome biogenesis  BP  294  5   29 0.9989316 2.022792e-08
GO:1990904                                           ribonucleoprotein complex  CC  633 13   45 0.9999462 7.300391e-08
GO:0006614         SRP-dependent cotranslational protein targeting to membrane  BP   94  0   15 1.0000000 1.272881e-07
GO:0006396                                                      RNA processing  BP  878 12   55 1.0000000 1.679039e-07
GO:0022626                                                  cytosolic ribosome  CC   98  2   15 0.9554742 2.241477e-07
GO:0019083                                                 viral transcription  BP  172  0   20 1.0000000 2.424913e-07
GO:0006613                       cotranslational protein targeting to membrane  BP   99  0   15 1.0000000 2.570085e-07
GO:0034660                                             ncRNA metabolic process  BP  449  2   35 1.0000000 2.647635e-07
GO:0000184 nuclear-transcribed mRNA catabolic process, nonsense-mediated decay  BP  116  1   16 0.9970315 3.860681e-07
GO:0022613                                ribonucleoprotein complex biogenesis  BP  421  8   33 0.9996272 5.210472e-07
GO:0045047                                             protein targeting to ER  BP  109  1   15 0.9957761 9.206612e-07
GO:0019080                                               viral gene expression  BP  189  0   20 1.0000000 1.110706e-06
GO:0072599      establishment of protein localization to endoplasmic reticulum  BP  112  2   15 0.9753514 1.310605e-06
GO:0022625                                   cytosolic large ribosomal subunit  CC   52  0   10 1.0000000 2.853507e-06
GO:0030684                                                         preribosome  CC   77  0   12 1.0000000 3.021034e-06
GO:0000956                          nuclear-transcribed mRNA catabolic process  BP  197  2   19 0.9994442 8.124291e-06

EdgeR Bioinformatics Goana • 1.2k views

ADD COMMENT • link 3.3 years ago Barista • 0

score 1 · Answer 1 · 2021-08-26

1

Gordon Smyth 52k

@gordon-smyth

WEHI, Melbourne, Australia

It isn't possible to give you reliable advice on the specifics of your particular data. However, when I see a pattern like this, dominated by ribosome-related processes, I usually conclude that the cells in one of the groups (B in your case) appear to be under stress and are breaking up or dying. So it is a technical issue, but one to do with cell preparation rather than with the software or statistical analysis.

For goana analyses, I ignore all results with P > 1e-6. Sorry, this is pretty ad hoc, but GO analyses don't lend themselves to rigorous FDR control.
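As a minimal sketch of that cutoff, the snippet below applies it to four rows copied from the topGO() output in the question, using base R only. The object names go_down, cutoff and kept are illustrative and do not come from the original analysis. With the real goana result, the full table could presumably be obtained first with something like topGO(go, sort = "down", number = Inf).

```r
# Four rows copied from the topGO() output in the question
# (only the Term, N, Down and P.Down columns are kept here).
go_down <- data.frame(
  Term   = c("RNA binding", "rRNA metabolic process",
             "protein targeting to ER", "preribosome"),
  N      = c(1468, 229, 109, 77),
  Down   = c(84, 26, 15, 12),
  P.Down = c(3.643114e-09, 6.144943e-09, 9.206612e-07, 3.021034e-06),
  row.names = c("GO:0003723", "GO:0016072", "GO:0045047", "GO:0030684")
)

# Ad hoc cutoff from the answer: ignore terms with P > 1e-6.
cutoff <- 1e-6
kept <- go_down[go_down$P.Down <= cutoff, ]

print(kept)
```

On these four rows the filter keeps the first three terms and drops "preribosome", whose P.Down of about 3e-06 is above the cutoff. The same subsetting would work on P.Up for the upregulated genes.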

ADD COMMENT • link 3.3 years ago Gordon Smyth 52k

0

Hmmm, that would puzzle me, as the samples were randomized before sequencing: 96 samples were sequenced per plate, with conditions A and B mixed on each plate to reduce batch effects between the groups. Or is this not what you mean?

Furthermore, do you suspect that the results for the upregulated genes are also influenced by this supposed 'technical issue'? I presume this would be the case, as the comparison is always a relative one: if something is wrong with condition B, the comparison of A versus B will be influenced.

ADD REPLY • link 3.3 years ago Barista • 0