I have a use case that I am not sure the best way to proceed.
I have a non-model organism for which we would like to know the
sequence
for a set of genes. It's a bird species, and I can for example align
to the
zebra finch genome, then run bam_tally over a region to get variants
where
my species differs from zebra finch, and then infer the sequence based
on
the reference and the variants.
But that seems harder than it should be. Is there some function that I
am
missing that I can feed a GRanges and get back the consensus sequence
for
that region, based on say a simple vote of the reads that overlap it?
Thanks,
Jim
[[alternative HTML version deleted]]
Hi Jim,
On 09/08/2014 07:32 PM, James W. MacDonald wrote:
> I have a use case that I am not sure the best way to proceed.
>
> I have a non-model organism for which we would like to know the
sequence
> for a set of genes. It's a bird species, and I can for example align
to the
> zebra finch genome, then run bam_tally over a region to get
variants where
> my species differs from zebra finch, and then infer the sequence
based on
> the reference and the variants.
>
> But that seems harder than it should be. Is there some function that
I am
> missing that I can feed a GRanges and get back the consensus
sequence for
> that region, based on say a simple vote of the reads that overlap
it?
So basically you're going to use the same approach as if you had reads
from a zebra finch individual. The real gene sequences for that
individual would probably also differ from the reference sequence,
maybe less than for your bird that is not zebra finch, but that does
not really change the overall game right?
I agree you should try to avoid the intermediate step of variant
calling. Sounds a little bit simpler to aim directly for the
consensus sequence of your regions of interest. Here is one way
to do this:
library(GenomicAlignments)
## Uses pileLettersAt() and alpahabetFrequency() internally.
## Arguments in ... are passed to alpahabetFrequency().
## The 'param' arg must be like in stackStringsFromBam() (see
## ?stackStringsFromBam for the details).
alphabetFrequencyFromBam <- function(file, index=file, param, ...)
{
param <- GenomicAlignments:::.normarg_param(param)
region <- bamWhich(param)
bamWhat(param) <- "seq"
gal <- readGAlignmentsFromBam(file, index=index, param=param)
seqlevels(gal) <- names(region)
qseq <- mcols(gal)$seq
at <- GRanges(names(region),
IRanges(start(region[[1L]]):end(region[[1L]]),
width=1L))
piles <- pileLettersAt(qseq, seqnames(gal), start(gal),
cigar(gal),
at)
alphabetFrequency(piles, ...)
}
## A naive consensus string extraction: at each position, select
## the letter with highest frequency. Insert letter "." if all
## frequencies are 0 at a given position. Insert letter "*" is
## there is a tie.
consensusStringFromFrequencyMatrix <- function(fm)
{
selection <- apply(fm, 2,
function(x) {
i <- which.max(x)
if (x[i] == 0L)
return(5L)
if (sum(x == x[i]) != 1L)
return(6L)
i
})
paste0(c(DNA_BASES, ".", "*")[selection], collapse="")
}
Then:
## Get some genes.
library(GenomicFeatures)
txdb <- makeTranscriptDbFromUCSC(genome="taeGut2", table="refGene")
my_genes <- sample(genes(txdb), 20)
## Extract the corresponding consensus sequences from your BAM
file.
my_gene_seqs <- sapply(seq_along(my_genes),
function(i) {
af <- alphabetFrequencyFromBam(bamfile, param=my_genes[i],
baseOnly=TRUE)
fm <- t(af[ , DNA_BASES])
consensusStringFromFrequencyMatrix(fm)
})
IMPORTANT: The strand in 'my_genes' is ignored, that is, all sequences
are extracted from the plus strand (and going from 5' to 3' along that
strand), even for genes that belong to the minus strand.
It's not going to be very efficient (approx. 1 s per gene).
Cheers,
H.
>
> Thanks,
>
> Jim
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
I should add that the code I sent previously makes sense if
the genes in your individual contains only a "small" number of
substitutions with respect to the reference genes. If they
contain indels and/or a high number of substitutions, it would
not be appropriated (it would still "work" but the consensus
sequences could not be trusted). The problem with indels is that
the alphabetFrequencyFromBam() function doesn't know how to
handle them (because it only knows how to report things that
align with nucleotides in the reference genome, but insertions
don't). Also having too many substitutions would make the
alignment process less reliable (would make it less likely
that reads coming from gene X align to the region
corresponding to gene X in the reference).
H.
On 09/10/2014 03:44 PM, Hervé Pagès wrote:
> Hi Jim,
>
> On 09/08/2014 07:32 PM, James W. MacDonald wrote:
>> I have a use case that I am not sure the best way to proceed.
>>
>> I have a non-model organism for which we would like to know the
sequence
>> for a set of genes. It's a bird species, and I can for example
align
>> to the
>> zebra finch genome, then run bam_tally over a region to get
variants
>> where
>> my species differs from zebra finch, and then infer the sequence
based on
>> the reference and the variants.
>>
>> But that seems harder than it should be. Is there some function
that I am
>> missing that I can feed a GRanges and get back the consensus
sequence for
>> that region, based on say a simple vote of the reads that overlap
it?
>
> So basically you're going to use the same approach as if you had
reads
> from a zebra finch individual. The real gene sequences for that
> individual would probably also differ from the reference sequence,
> maybe less than for your bird that is not zebra finch, but that does
> not really change the overall game right?
>
> I agree you should try to avoid the intermediate step of variant
> calling. Sounds a little bit simpler to aim directly for the
> consensus sequence of your regions of interest. Here is one way
> to do this:
>
> library(GenomicAlignments)
>
> ## Uses pileLettersAt() and alpahabetFrequency() internally.
> ## Arguments in ... are passed to alpahabetFrequency().
> ## The 'param' arg must be like in stackStringsFromBam() (see
> ## ?stackStringsFromBam for the details).
> alphabetFrequencyFromBam <- function(file, index=file, param,
...)
> {
> param <- GenomicAlignments:::.normarg_param(param)
> region <- bamWhich(param)
> bamWhat(param) <- "seq"
> gal <- readGAlignmentsFromBam(file, index=index, param=param)
> seqlevels(gal) <- names(region)
> qseq <- mcols(gal)$seq
> at <- GRanges(names(region),
> IRanges(start(region[[1L]]):end(region[[1L]]),
> width=1L))
> piles <- pileLettersAt(qseq, seqnames(gal), start(gal),
cigar(gal),
> at)
> alphabetFrequency(piles, ...)
> }
>
> ## A naive consensus string extraction: at each position, select
> ## the letter with highest frequency. Insert letter "." if all
> ## frequencies are 0 at a given position. Insert letter "*" is
> ## there is a tie.
> consensusStringFromFrequencyMatrix <- function(fm)
> {
> selection <- apply(fm, 2,
> function(x) {
> i <- which.max(x)
> if (x[i] == 0L)
> return(5L)
> if (sum(x == x[i]) != 1L)
> return(6L)
> i
> })
> paste0(c(DNA_BASES, ".", "*")[selection], collapse="")
> }
>
> Then:
>
> ## Get some genes.
> library(GenomicFeatures)
> txdb <- makeTranscriptDbFromUCSC(genome="taeGut2",
table="refGene")
> my_genes <- sample(genes(txdb), 20)
>
> ## Extract the corresponding consensus sequences from your BAM
file.
> my_gene_seqs <- sapply(seq_along(my_genes),
> function(i) {
> af <- alphabetFrequencyFromBam(bamfile, param=my_genes[i],
> baseOnly=TRUE)
> fm <- t(af[ , DNA_BASES])
> consensusStringFromFrequencyMatrix(fm)
> })
>
> IMPORTANT: The strand in 'my_genes' is ignored, that is, all
sequences
> are extracted from the plus strand (and going from 5' to 3' along
that
> strand), even for genes that belong to the minus strand.
>
> It's not going to be very efficient (approx. 1 s per gene).
>
> Cheers,
> H.
>
>>
>> Thanks,
>>
>> Jim
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
Hi Herve,
Thanks! That works better than the kludgetastic function I set up.
But note I had to add one line:
alphabetFrequencyFromBam <- function(file, index=file, param, ...){
seqlevels(param) <- seqlevelsInUse(param)
## added this
param <- GenomicAlignments:::.normarg_param(param)
region <- bamWhich(param)
bamWhat(param) <- "seq"
gal <- readGAlignmentsFromBam(file, index=index, param=param)
seqlevels(gal) <- names(region)
qseq <- mcols(gal)$seq
at <- GRanges(names(region),
IRanges(start(region[[1L]]):end(region[[1L]]),
width=1L))
piles <- pileLettersAt(qseq, seqnames(gal), start(gal),
cigar(gal),
at)
alphabetFrequency(piles, ...)
}
because of how normarg_param() treats GRanges with multiple seqlevels:
> bamWhich(GenomicAlignments:::.normarg_param("chr5:1307857-1308626"))
IRangesList of length 1
$chr5
IRanges of length 1
start end width
[1] 1307857 1308626 770
> bdnf <- GRanges("chr5", IRanges(1307857,1308626))
> bamWhich(GenomicAlignments:::.normarg_param(bdnf))
IRangesList of length 1
$chr5
IRanges of length 1
start end width
[1] 1307857 1308626 770
> gns <- genes(txdb)
> bdnf2 <- gns["751584"]
> bdnf2
GRanges with 1 range and 1 metadata column:
seqnames ranges strand | gene_id
<rle> <iranges> <rle> | <characterlist>
751584 chr5 [1307857, 1308626] - | 751584
---
seqlengths:
chr1 ... chrZ_KB442450_random
118548696 ... 2283
> bamWhich(GenomicAlignments:::.normarg_param(bdnf2))
IRangesList of length 37096
$chr1
IRanges of length 0
$chr2
IRanges of length 0
$chr3
IRanges of length 0
...
<37093 more elements>
But this param instance still technically fulfills the criteria for
stackStringsFromBam:
> unlist(bamWhich(GenomicAlignments:::.normarg_param(bdnf2)))
IRanges of length 1
start end width names
[1] 1307857 1308626 770 chr5.751584
Best,
Jim
On Wed, Sep 10, 2014 at 7:33 PM, Hervé Pagès <hpages at="" fhcrc.org="">
wrote:
> I should add that the code I sent previously makes sense if
> the genes in your individual contains only a "small" number of
> substitutions with respect to the reference genes. If they
> contain indels and/or a high number of substitutions, it would
> not be appropriated (it would still "work" but the consensus
> sequences could not be trusted). The problem with indels is that
> the alphabetFrequencyFromBam() function doesn't know how to
> handle them (because it only knows how to report things that
> align with nucleotides in the reference genome, but insertions
> don't). Also having too many substitutions would make the
> alignment process less reliable (would make it less likely
> that reads coming from gene X align to the region
> corresponding to gene X in the reference).
>
> H.
>
>
>
> On 09/10/2014 03:44 PM, Hervé Pagès wrote:
>
>> Hi Jim,
>>
>> On 09/08/2014 07:32 PM, James W. MacDonald wrote:
>>
>>> I have a use case that I am not sure the best way to proceed.
>>>
>>> I have a non-model organism for which we would like to know the
sequence
>>> for a set of genes. It's a bird species, and I can for example
align
>>> to the
>>> zebra finch genome, then run bam_tally over a region to get
variants
>>> where
>>> my species differs from zebra finch, and then infer the sequence
based on
>>> the reference and the variants.
>>>
>>> But that seems harder than it should be. Is there some function
that I am
>>> missing that I can feed a GRanges and get back the consensus
sequence for
>>> that region, based on say a simple vote of the reads that overlap
it?
>>>
>>
>> So basically you're going to use the same approach as if you had
reads
>> from a zebra finch individual. The real gene sequences for that
>> individual would probably also differ from the reference sequence,
>> maybe less than for your bird that is not zebra finch, but that
does
>> not really change the overall game right?
>>
>> I agree you should try to avoid the intermediate step of variant
>> calling. Sounds a little bit simpler to aim directly for the
>> consensus sequence of your regions of interest. Here is one way
>> to do this:
>>
>> library(GenomicAlignments)
>>
>> ## Uses pileLettersAt() and alpahabetFrequency() internally.
>> ## Arguments in ... are passed to alpahabetFrequency().
>> ## The 'param' arg must be like in stackStringsFromBam() (see
>> ## ?stackStringsFromBam for the details).
>> alphabetFrequencyFromBam <- function(file, index=file, param,
...)
>> {
>> param <- GenomicAlignments:::.normarg_param(param)
>> region <- bamWhich(param)
>> bamWhat(param) <- "seq"
>> gal <- readGAlignmentsFromBam(file, index=index, param=param)
>> seqlevels(gal) <- names(region)
>> qseq <- mcols(gal)$seq
>> at <- GRanges(names(region),
>> IRanges(start(region[[1L]]):end(region[[1L]]),
>> width=1L))
>> piles <- pileLettersAt(qseq, seqnames(gal), start(gal),
cigar(gal),
>> at)
>> alphabetFrequency(piles, ...)
>> }
>>
>> ## A naive consensus string extraction: at each position, select
>> ## the letter with highest frequency. Insert letter "." if all
>> ## frequencies are 0 at a given position. Insert letter "*" is
>> ## there is a tie.
>> consensusStringFromFrequencyMatrix <- function(fm)
>> {
>> selection <- apply(fm, 2,
>> function(x) {
>> i <- which.max(x)
>> if (x[i] == 0L)
>> return(5L)
>> if (sum(x == x[i]) != 1L)
>> return(6L)
>> i
>> })
>> paste0(c(DNA_BASES, ".", "*")[selection], collapse="")
>> }
>>
>> Then:
>>
>> ## Get some genes.
>> library(GenomicFeatures)
>> txdb <- makeTranscriptDbFromUCSC(genome="taeGut2",
table="refGene")
>> my_genes <- sample(genes(txdb), 20)
>>
>> ## Extract the corresponding consensus sequences from your BAM
file.
>> my_gene_seqs <- sapply(seq_along(my_genes),
>> function(i) {
>> af <- alphabetFrequencyFromBam(bamfile, param=my_genes[i],
>> baseOnly=TRUE)
>> fm <- t(af[ , DNA_BASES])
>> consensusStringFromFrequencyMatrix(fm)
>> })
>>
>> IMPORTANT: The strand in 'my_genes' is ignored, that is, all
sequences
>> are extracted from the plus strand (and going from 5' to 3' along
that
>> strand), even for genes that belong to the minus strand.
>>
>> It's not going to be very efficient (approx. 1 s per gene).
>>
>> Cheers,
>> H.
>>
>>
>>> Thanks,
>>>
>>> Jim
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
[[alternative HTML version deleted]]
Hi Jim,
Thanks for the feedback. Yesterday I added alphabetFrequencyFromBam()
to GenomicAlignments (in devel). Just applied your fix to it now.
Cheers,
H.
On 09/12/2014 08:06 AM, James W. MacDonald wrote:
> Hi Herve,
>
> Thanks! That works better than the kludgetastic function I set up.
>
> But note I had to add one line:
>
> alphabetFrequencyFromBam <- function(file, index=file, param, ...){
> seqlevels(param) <- seqlevelsInUse(param)
> ## added this
> param <- GenomicAlignments:::.normarg_param(param)
> region <- bamWhich(param)
> bamWhat(param) <- "seq"
> gal <- readGAlignmentsFromBam(file, index=index, param=param)
> seqlevels(gal) <- names(region)
> qseq <- mcols(gal)$seq
> at <- GRanges(names(region),
> IRanges(start(region[[1L]]):end(region[[1L]]),
> width=1L))
> piles <- pileLettersAt(qseq, seqnames(gal), start(gal),
cigar(gal),
> at)
> alphabetFrequency(piles, ...)
> }
>
> because of how normarg_param() treats GRanges with multiple
seqlevels:
>
> >
bamWhich(GenomicAlignments:::.normarg_param("chr5:1307857-1308626"))
> IRangesList of length 1
> $chr5
> IRanges of length 1
> start end width
> [1] 1307857 1308626 770
>
> > bdnf <- GRanges("chr5", IRanges(1307857,1308626))
> > bamWhich(GenomicAlignments:::.normarg_param(bdnf))
> IRangesList of length 1
> $chr5
> IRanges of length 1
> start end width
> [1] 1307857 1308626 770
>
> > gns <- genes(txdb)
> > bdnf2 <- gns["751584"]
> > bdnf2
> GRanges with 1 range and 1 metadata column:
> seqnames ranges strand | gene_id
> <rle> <iranges> <rle> | <characterlist>
> 751584 chr5 [1307857, 1308626] - | 751584
> ---
> seqlengths:
> chr1 ... chrZ_KB442450_random
> 118548696 ... 2283
> > bamWhich(GenomicAlignments:::.normarg_param(bdnf2))
> IRangesList of length 37096
> $chr1
> IRanges of length 0
>
> $chr2
> IRanges of length 0
>
> $chr3
> IRanges of length 0
>
> ...
> <37093 more elements>
>
> But this param instance still technically fulfills the criteria for
> stackStringsFromBam:
>
> > unlist(bamWhich(GenomicAlignments:::.normarg_param(bdnf2)))
> IRanges of length 1
> start end width names
> [1] 1307857 1308626 770 chr5.751584
>
> Best,
>
> Jim
>
>
>
>
> On Wed, Sep 10, 2014 at 7:33 PM, Hervé Pagès <hpages at="" fhcrc.org=""> <mailto:hpages at="" fhcrc.org="">> wrote:
>
> I should add that the code I sent previously makes sense if
> the genes in your individual contains only a "small" number of
> substitutions with respect to the reference genes. If they
> contain indels and/or a high number of substitutions, it would
> not be appropriated (it would still "work" but the consensus
> sequences could not be trusted). The problem with indels is that
> the alphabetFrequencyFromBam() function doesn't know how to
> handle them (because it only knows how to report things that
> align with nucleotides in the reference genome, but insertions
> don't). Also having too many substitutions would make the
> alignment process less reliable (would make it less likely
> that reads coming from gene X align to the region
> corresponding to gene X in the reference).
>
> H.
>
>
>
> On 09/10/2014 03:44 PM, Hervé Pagès wrote:
>
> Hi Jim,
>
> On 09/08/2014 07:32 PM, James W. MacDonald wrote:
>
> I have a use case that I am not sure the best way to
proceed.
>
> I have a non-model organism for which we would like to
know
> the sequence
> for a set of genes. It's a bird species, and I can for
> example align
> to the
> zebra finch genome, then run bam_tally over a region to
get
> variants
> where
> my species differs from zebra finch, and then infer the
> sequence based on
> the reference and the variants.
>
> But that seems harder than it should be. Is there some
> function that I am
> missing that I can feed a GRanges and get back the
consensus
> sequence for
> that region, based on say a simple vote of the reads
that
> overlap it?
>
>
> So basically you're going to use the same approach as if you
had
> reads
> from a zebra finch individual. The real gene sequences for
that
> individual would probably also differ from the reference
sequence,
> maybe less than for your bird that is not zebra finch, but
that does
> not really change the overall game right?
>
> I agree you should try to avoid the intermediate step of
variant
> calling. Sounds a little bit simpler to aim directly for the
> consensus sequence of your regions of interest. Here is one
way
> to do this:
>
> library(GenomicAlignments)
>
> ## Uses pileLettersAt() and alpahabetFrequency()
internally.
> ## Arguments in ... are passed to alpahabetFrequency().
> ## The 'param' arg must be like in stackStringsFromBam()
(see
> ## ?stackStringsFromBam for the details).
> alphabetFrequencyFromBam <- function(file, index=file,
> param, ...)
> {
> param <- GenomicAlignments:::.normarg___param(param)
> region <- bamWhich(param)
> bamWhat(param) <- "seq"
> gal <- readGAlignmentsFromBam(file, index=index,
param=param)
> seqlevels(gal) <- names(region)
> qseq <- mcols(gal)$seq
> at <- GRanges(names(region),
>
IRanges(start(region[[1L]]):__end(region[[1L]]),
> width=1L))
> piles <- pileLettersAt(qseq, seqnames(gal),
start(gal),
> cigar(gal),
> at)
> alphabetFrequency(piles, ...)
> }
>
> ## A naive consensus string extraction: at each
position, select
> ## the letter with highest frequency. Insert letter "."
if all
> ## frequencies are 0 at a given position. Insert letter
"*" is
> ## there is a tie.
> consensusStringFromFrequencyMa__trix <- function(fm)
> {
> selection <- apply(fm, 2,
> function(x) {
> i <- which.max(x)
> if (x[i] == 0L)
> return(5L)
> if (sum(x == x[i]) != 1L)
> return(6L)
> i
> })
> paste0(c(DNA_BASES, ".", "*")[selection], collapse="")
> }
>
> Then:
>
> ## Get some genes.
> library(GenomicFeatures)
> txdb <- makeTranscriptDbFromUCSC(__genome="taeGut2",
> table="refGene")
> my_genes <- sample(genes(txdb), 20)
>
> ## Extract the corresponding consensus sequences from
your
> BAM file.
> my_gene_seqs <- sapply(seq_along(my_genes),
> function(i) {
> af <- alphabetFrequencyFromBam(__bamfile,
param=my_genes[i],
> baseOnly=TRUE)
> fm <- t(af[ , DNA_BASES])
> consensusStringFromFrequencyMa__trix(fm)
> })
>
> IMPORTANT: The strand in 'my_genes' is ignored, that is, all
> sequences
> are extracted from the plus strand (and going from 5' to 3'
> along that
> strand), even for genes that belong to the minus strand.
>
> It's not going to be very efficient (approx. 1 s per gene).
>
> Cheers,
> H.
>
>
> Thanks,
>
> Jim
>
> [[alternative HTML version deleted]]
>
> _________________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org="">
> https://stat.ethz.ch/mailman/__listinfo/bioconductor
> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">
> Search the archives:
>
http://news.gmane.org/gmane.__science.biology.informatics.__conductor
>
<http: news.gmane.org="" gmane.science.biology.informatics.conductor="">
>
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org <mailto:hpages at="" fhcrc.org="">
> Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
> Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>
>
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319