p-values, q-values, duplicated entries in output - ballgown

mkhasin ▴ 20

Hi all,

For reference, I'm a relative newcomer to R and followed this tutorial to analyze my dataset of 2 conditions x 2 replicates in ballgown. I noticed that for many p-values <<<0.5, the q-values appeared disproportionately high (I did see that the OP of that tutorial posted the same question on BC several months ago, heh). When I inspected the output manually, ranked by q-value, I found that every gene with an existing name (e.g. GENE1) had a duplicate entry labeled only with its accession (e.g. ChrN_00000). Normally that would just be annoying; however, some fold changes differ ever so slightly between the two entries (in the hundredths or thousandths place), which changes the p-value, which in turn changes the q-value...
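To illustrate why q-values can sit well above the corresponding p-values, here is a minimal sketch in base R. It assumes the q-values are Benjamini-Hochberg ("fdr") adjusted p-values, which I believe is what stattest reports but have not confirmed here. The p-values below are made up for illustration and are not from my dataset.

```r
# Toy p-values (made up): a few small ones among many unremarkable ones
pvals <- c(0.001, 0.01, 0.02, 0.03, 0.04, rep(0.5, 95))

# Benjamini-Hochberg adjustment with base R; "fdr" is an alias for "BH"
qvals <- p.adjust(pvals, method = "fdr")

# Side-by-side view of the five smallest p-values and their q-values
comparison <- data.frame(pval = pvals[1:5], qval = qvals[1:5])
print(comparison)
```

Under this adjustment a q-value is never smaller than its p-value, and it grows with the number of features tested, so with thousands of genes and only 2 replicates per condition, large q-values for small p-values would not be surprising on their own.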

I'm not sure what I did to get this strange output, and I wonder if that might be impacting the q scores. Here's what happened, from creating the bg object:

bg = ballgown(samples=as.vector(sample_full_path), pData=pheno_data)
bg_filt = subset(bg, "rowVars(texpr(bg)) >1", genomesubset=TRUE)
results_genes = stattest(bg_filt, feature="gene", covariate="condition", getFC=TRUE, meas="FPKM")
results_genes2 = merge(results_genes, bg_gene_names, by.x=c("id"), by.y=c("gene_id"))

followed by export as a tab-delimited text file. I was wondering if anybody has any insight? Thank you for your help!
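One way to check whether the extra rows come from the merge step is to count repeated ids in each table and compare row counts before and after merging. The sketch below uses toy stand-ins for results_genes and bg_gene_names. All ids and values are hypothetical, as is the assumption that the lookup table holds more than one row for the same gene_id (here one named row and one placeholder "." row).

```r
# Toy stand-in for the stattest output; all values are made up
results_genes <- data.frame(
  id   = c("ChrN_00001", "ChrN_00002"),
  fc   = c(1.50, 0.80),
  pval = c(0.010, 0.200),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table with two rows for the same gene_id
bg_gene_names <- data.frame(
  gene_id   = c("ChrN_00001", "ChrN_00001", "ChrN_00002"),
  gene_name = c("GENE1", ".", "GENE2"),
  stringsAsFactors = FALSE
)

# Ids that occur more than once in each table
dup_in_results <- unique(results_genes$id[duplicated(results_genes$id)])
dup_in_names   <- unique(bg_gene_names$gene_id[duplicated(bg_gene_names$gene_id)])

# Same merge call as in the post
merged <- merge(results_genes, bg_gene_names, by.x = "id", by.y = "gene_id")

# A larger row count after the merge means the lookup table multiplied rows
n_before <- nrow(results_genes)
n_after  <- nrow(merged)
print(c(before = n_before, after = n_after))
```

Rows multiplied by a merge carry identical fold changes and p-values, so entries whose statistics differ slightly would more likely be separate rows already present in results_genes before the merge.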

ballgown rna-seq q value r

7.6 years ago mkhasin ▴ 20