Hi Sam,
First, please always give us the results of sessionInfo(). This is
especially critical in the case of ReportingTools, which has been
fundamentally altered between the previous and current versions of
BioC.
On 6/19/2013 11:12 AM, Sam McInturf wrote:
> Bioconductors,
> I am working on a RNA seq analysis project and am having trouble
> publishing an HTML report for it. I am unsure of how to make my DE genes
> have the same ID as what publish() will accept when passing an argument to
> 'annotation'.
> I mapped the reads using tophat and passed the TAIR 10 gtf file to the
> -G option. When i counted my reads I used the summarizeOverlaps function
> from GenomicRanges and again used this same file. I called differential
> expression in edgeR using the GLM methods. So the rownames of my DE table
> are the AGI identifiers (AT#G#####). I loaded the org.At.tair.db
> annotations and passed it to HTMLReport in:
>
> publish(DGELists[["Roots"]], myHTML, countTable = cpmMat, conditions =
> group, annotation = "org.At.tair.db", pvaueCutoff = 0.01, lfc =2, n = 1000,
> name = "RootsLRT")
> Error: More than half of your IDs could not be mapped.
> In addition: Warning message:
> In .DGELRT.to.data.frame(object, ...) : NAs introduced by coercion
>
> which makes sense, because publish() is looking for Entrez IDs (right?)
>
> How do I proceed?
Here I assume you are running R-3.0.x and the current release of BioC.
When you run publish() on anything but a data.frame, the first step is
to coerce to a data.frame using a set of assumptions that might not hold
in your case (or there may be defaults that you don't like). Because of
this, I tend to just coerce to a data.frame myself and then publish()
that directly. This also allows you to pass your own function in via the
.modifyDF argument, which is kind of sweet.
In the case of a DGELRT or DGEExact object, there is a 'genes' slot that
will be used to annotate the output of topTags(). Ideally you would just
add the annotation you want to that slot first. So you could do
something like
annot <- select(org.At.tair.db,
    DGELists[["Roots"]]$genes[, <tair column goes here>],
    c("SYMBOL", "GENENAME", "OTHERSTUFF"))
and then put that in your DGE objects. Now you can do something like
outlst <- lapply(DGELists, topTags, otherargsgohere)
htmlst <- lapply(seq_len(length(DGELists)), function(x)
    HTMLReport(namevector[x], titlevector[x], otherargs))
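A side note on the select() step above: select() can return more than one row per key (a 1:many mapping), while the annotation you put in 'genes' should have one row per gene, in the same order as the rows of the DE object. Something along these lines, in base R only, is one way to line the two up. The IDs, symbols and the names ids, annot and genes are made-up placeholders for illustration, not taken from your data.

```r
# Toy stand-ins (hypothetical values): 'ids' plays the role of the row names
# of the DE object, 'annot' the data.frame that select() would return.
ids <- c("AT1G01010", "AT1G01020", "AT1G01030")
annot <- data.frame(
  TAIR   = c("AT1G01020", "AT1G01010", "AT1G01010"),
  SYMBOL = c("geneB", "geneA", "geneA_alias"),
  stringsAsFactors = FALSE
)

# match() returns the first hit for each ID, so every gene gets exactly one
# row, in the order of 'ids'; IDs with no annotation end up as NA.
genes <- annot[match(ids, annot$TAIR), , drop = FALSE]
genes$TAIR <- ids          # restore the ID for unannotated genes
rownames(genes) <- ids

# With the real objects this would then be assigned along the lines of
# DGELists[["Roots"]]$genes <- genes   (not run here)
```

Keeping only the first match is a simplification; if the extra symbols matter to you, collapsing them into one string per gene is another option.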
and you can define a function similar to this function I use for Entrez
Gene IDs:
entrezLinks <- function (df, ...){
    df$ENTREZID <- hwriter::hwrite(as.character(df$ENTREZID),
        link = paste0("http://www.ncbi.nlm.nih.gov/gene/",
            as.character(df$ENTREZID)),
        table = FALSE)
    return(df)
}
but modified for the TAIR equivalent and then
lapply(seq_len(length(htmlst)), function(x)
    publish(outlst[[x]], htmlst[[x]], .modifyDF = samsTairLinkFun))
lapply(htmlst, finish)
et voila!
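For the TAIR version of the link function, a minimal sketch could look like the following. It assumes the data.frame has a column called "TAIR" holding the AGI IDs (adjust to whatever your column is called), and the TAIR locus URL pattern is my assumption, so check that it still resolves. To keep it runnable without extra packages it builds the anchor tag with paste0(), which should give roughly what hwriter::hwrite(ids, link = url, table = FALSE) produces.

```r
# Hypothetical TAIR counterpart of entrezLinks(). Assumes a column named
# "TAIR" with AGI locus IDs such as AT1G01010.
samsTairLinkFun <- function(df, ...) {
  ids <- as.character(df$TAIR)
  # Assumed TAIR locus URL pattern; verify before relying on it.
  url <- paste0(
    "http://www.arabidopsis.org/servlets/TairObject?type=locus&name=", ids)
  # Replace the plain ID with an HTML link to the locus page.
  df$TAIR <- paste0("<a href=\"", url, "\">", ids, "</a>")
  df
}

# Small check on a one-row toy data.frame (values are made up).
demo <- samsTairLinkFun(
  data.frame(TAIR = "AT1G01010", logFC = 2.1, stringsAsFactors = FALSE))
```

The function only touches the ID column, so the other columns from topTags() go through to the report unchanged.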
You can also then use htmlst to make a bunch of links in an index.html
page.
indx <- HTMLReport("index", "A bunch of links for this expt",
    reportDirectory = ".", baseUrl = "")
publish(hwriter::hwrite("Here are links", page(indx), header = 2,
    br = TRUE), indx)
publish(Link(htmlst, report = indx), indx)
finish(indx)
Best,
Jim
>
> Thanks in advance!
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099