martin.busch
Hi everyone,
I am sorry to ask another question, but there is a message that keeps puzzling me. When I pass a list of human Entrez IDs to ReactomePA for GSEA using
result <- gsePathway(anaData, nPerm=10000, pvalueCutoff=0.2, pAdjustMethod="BH", verbose=FALSE)
RStudio stays busy and the computation never finishes. When I stop it manually, I get this warning:
Warning message: In fgsea(pathways = geneSets, stats = geneList, nperm = nPerm, minSize = minGSSize, : There are duplicate gene names, fgsea may produce unexpected results
How can I pass parameters such as maxSize=500, and which parameter can I use to avoid duplicate gene names, even though the Entrez IDs are unique? It seems like the mapping yields duplicate gene names?!
Thank you so much in advance for your help,
Martin
P.S.: The input looks like this:
> head(anaData,10)
    1301     3371     4069    57537    11081     5764   114899     2331     1303     7060
6.198340 4.505550 3.962765 3.753962 3.461323 3.148910 3.075820 3.034261 3.010098 2.880258
> length(anaData)
[1] 11317
Could you also paste the result of any(duplicated(names(anaData)))? This is what fgsea checks.
Thank you so much for your comment. In fact, I was pretty surprised to see that the result was TRUE, something that should not have happened. There was some mapping involved, and it seems that multiple Ensembl IDs can map to one Entrez ID. I thought I had this sorted out. Anyway, now it works fine! Thanks a lot!
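For anyone landing here with the same warning, below is a minimal base R sketch of one way to collapse duplicated Entrez IDs before calling gsePathway. The vector `stats` is a made-up stand-in for anaData, and keeping the entry with the largest absolute statistic per ID is only an assumed policy (averaging the duplicates would be another reasonable choice); it is not necessarily what was done in this thread.

```r
# Toy named vector standing in for anaData; the values and the duplicated
# IDs are invented for illustration.
stats <- c("1301" = 6.2, "3371" = 4.5, "1301" = -1.0, "4069" = 3.9, "3371" = -5.1)

# The same check that was suggested above for the real input.
has_dups_before <- any(duplicated(names(stats)))

# Assumed policy: per Entrez ID, keep the entry with the largest absolute value.
# Order by |statistic| first, then drop every later occurrence of a name.
stats_by_abs <- stats[order(abs(stats), decreasing = TRUE)]
dedup <- stats_by_abs[!duplicated(names(stats_by_abs))]

# Re-sort in decreasing order, since the ranked list is expected to be sorted.
dedup <- sort(dedup, decreasing = TRUE)

has_dups_after <- any(duplicated(names(dedup)))
```

With the real data, the deduplicated vector would take the place of anaData in the gsePathway call from the question.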