Ruppert Valentino
Hello,
I am trying to write a script that takes a miRNA as input and returns
the predicted target genes for that miRNA. I am trying several tools
for this, one of which is TargetScan. The problem is that I don't know
how to parse the HTML output table so that I get only the target
genes.
For example, I search for target genes of the miRNA mmu-miR-1 as
follows:
http://www.targetscan.org/cgi-bin/targetscan/vert_50/targetscan.cgi?species=Human&gid=&mir_sc=&mir_c=&mir_nc=&mirg=mmu-miR-1
This generates a table.
The script is:
URL <- "http://www.targetscan.org/cgi-bin/targetscan/vert_50/targetscan.cgi?species=Human&gid=&mir_sc=&mir_c=&mir_nc=&mirg=mmu-miR-1"
dat <- readLines(URL)
But I don't know how to parse the table into columns so that I can
take the column entitled "Human ortholog of target gene", which
contains the target genes.
In the example above the first gene COL4A3 starts at HTML code:
COL4A3
Is there any way to format such a table into columns, then transpose
the column entitled "Human ortholog of target gene" and assign it to a
variable?
Many thanks,
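
One possible approach, sketched here under assumptions rather than tested against the live TargetScan page: parse the HTML with the XML package (assumed to be installed from CRAN) and let readHTMLTable() split each table into columns. The inline HTML below is a hypothetical stand-in for the real page, and "GENE2" is a made-up placeholder; the actual layout and column names may differ, so inspect names() of each returned table first.

```r
library(XML)  # assumption: the CRAN package XML is installed

# Hypothetical stand-in for the TargetScan results page. With the real
# page, one could instead use: html <- paste(readLines(URL), collapse = "\n")
html <- '<html><body><table>
<tr><th>Human ortholog of target gene</th><th>Gene name</th></tr>
<tr><td>COL4A3</td><td>collagen, type IV, alpha 3</td></tr>
<tr><td>GENE2</td><td>hypothetical second gene</td></tr>
</table></body></html>'

# Parse the HTML text and convert every <table> into a data frame
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Find the first table with a column whose name mentions "ortholog";
# matching on a keyword tolerates small differences in the header text
has_col <- vapply(
  tables,
  function(tb) any(grepl("ortholog", names(tb), ignore.case = TRUE)),
  logical(1)
)
tb <- tables[[which(has_col)[1]]]

# Extract that column as a character vector of target genes
col_idx <- grep("ortholog", names(tb), ignore.case = TRUE)[1]
target_genes <- as.character(tb[[col_idx]])
```

The result in target_genes is a plain character vector, so it can be passed directly to other functions or compared with the output of other prediction tools.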