Hi all,
For reference, I'm a relative newcomer to R and followed this tutorial to analyze my dataset of 2 conditions x 2 replicates in ballgown. I noticed that for many p values far below 0.5, the q values appeared disproportionately high (I did see that the OP of that tutorial posted the same question on BC several months ago, heh). When I inspected the output manually and ranked it by q value, I found that every gene with an existing name (GENE1) had a duplicate entry labeled only by its accession (e.g. ChrN_00000). Normally that would just be annoying; however, some fold changes are ever so slightly different between the duplicates (in the hundredths or thousandths place), which then affects the p value, which in turn affects the q value...
I'm not sure what I did to get this strange output, and I wonder whether it might be affecting the q values. Here's what I ran, starting from creating the bg object:
bg = ballgown(samples = as.vector(sample_full_path), pData = pheno_data)
bg_filt = subset(bg, "rowVars(texpr(bg)) > 1", genomesubset = TRUE)
results_genes = stattest(bg_filt, feature = "gene", covariate = "condition", getFC = TRUE, meas = "FPKM")
results_genes2 = merge(results_genes, bg_gene_names, by.x = c("id"), by.y = c("gene_id"))
followed by export as a tab-delimited text file. I was wondering if anybody has any insight? Thank you for your help!
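In case it helps with diagnosing this, here is a minimal sketch of how duplicated ids in a merged table can be listed. It uses made-up toy data frames rather than the real ballgown output: all values, the second label for ChrN_00001, and the names merged and dup_ids are assumptions for illustration only. It just shows that merge() returns one row per matching pair, so a lookup table with a repeated gene_id produces repeated rows.

```r
# Toy stand-in for the stattest() output (values are made up).
results_genes <- data.frame(
  id   = c("ChrN_00001", "ChrN_00002"),
  fc   = c(1.50, 0.80),
  pval = c(0.01, 0.20),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table in which one gene_id carries two labels:
# a real gene name and the bare accession.
bg_gene_names <- data.frame(
  gene_id   = c("ChrN_00001", "ChrN_00001", "ChrN_00002"),
  gene_name = c("GENE1", "ChrN_00001", "GENE2"),
  stringsAsFactors = FALSE
)

# Same style of merge as in the post: one output row per matching pair.
merged <- merge(results_genes, bg_gene_names, by.x = "id", by.y = "gene_id")

# ids that appear more than once after the merge
dup_ids <- unique(merged$id[duplicated(merged$id)])
print(dup_ids)

# If the merge added rows, the lookup table had repeated gene_ids.
cat(nrow(results_genes), "rows before merge,", nrow(merged), "rows after\n")
```

Running the same two checks on results_genes and results_genes2 would show whether the duplicates are already present before the merge or only appear after it.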