Converting annotate lists to a matrix
@michael-watson-iah-c-378
Last seen 10.3 years ago
Hi

This is kind of an R problem, but on Bioconductor data. For example, I have the hu6800PATH environment from the hu6800 annotation package. The example in the help is this:

    xx <- as.list(hu6800PATH)
    xx <- xx[!is.na(xx)]

What I actually want is a matrix with two columns, the first being probe id and the second being pathway id. I'm going to do some relational joins with this data using merge().

I've got as far as:

    as.matrix(unlist(xx))

But that doesn't give me exactly what I want. The rownames of the resulting matrix are set to the probe ids, but where there are duplicate probe ids (where probes are in >1 pathway) R appends a number to the end.

Can anyone help me convert the list format from an annotation package to a matrix as I describe above?

Thanks
Mick
Annotation hu6800 convert
John Zhang ★ 2.9k
@john-zhang-6
Last seen 10.3 years ago
> What I actually want is a matrix with two columns, the first being probe
> id and the second being pathway id - I'm going to do some relational
> joins with this data using merge().

You may try:

    xx <- as.list(hu6800PATH)
    xx <- unlist(xx, use.names = TRUE)
    xx <- cbind(names(xx), xx)

The first column of xx will be probe ids, with an integer appended to the end if a probe has multiple mappings. Use pattern matching to remove the trailing integers from the first column and then you are done.

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084
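To see what the suffixes look like in practice, here is a minimal sketch on a toy list. The list `xx` below is only a hypothetical stand-in for `as.list(hu6800PATH)`, and the probe-to-pathway values are illustrative, not taken from the package.

```r
# Toy stand-in for as.list(hu6800PATH); values are illustrative only.
xx <- list(
  Z22536_at = c("04010", "04060", "04350"),
  X60221_at = c("00190", "00193"),
  U00001_at = "04110"
)

# Flatten the list, keeping the (modified) names, as suggested above.
flat  <- unlist(xx, use.names = TRUE)
pairs <- cbind(names(flat), flat)

# unlist() numbers the names of multi-element entries only;
# a single-element entry keeps its name unchanged.
print(names(flat))
print(pairs)
```

Note that only probes with more than one mapping get a suffix, so any pattern that strips trailing digits has to assume that no genuine probe id ends in a digit.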
@wolfgang-huber-3550
Last seen 3 months ago
EMBL European Molecular Biology Laborat…
Hi Michael,

try this:

    res = do.call("rbind", args=lapply(seq(along=xx),
                  function(i) cbind(names(xx)[i], xx[[i]])))

    > res[1:5,]
         [,1]        [,2]
    [1,] "Z22536_at" "04010"
    [2,] "Z22536_at" "04060"
    [3,] "Z22536_at" "04350"
    [4,] "X60221_at" "00190"
    [5,] "X60221_at" "00193"

--
Best regards
Wolfgang

-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax: +44 1223 494486
Http: www.ebi.ac.uk/huber
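The same two-column result can probably also be built without the per-element `cbind`, by repeating each probe id once per pathway. The sketch below is not part of the reply above; it assumes a toy list `xx` standing in for `as.list(hu6800PATH)`, uses `lengths()` (available in R 3.2.0 and later; `sapply(xx, length)` would do on older versions), and the column names `probe_id` and `path_id` are made up for illustration.

```r
# Toy stand-in for as.list(hu6800PATH); the third entry is illustrative only.
xx <- list(
  Z22536_at = c("04010", "04060", "04350"),
  X60221_at = c("00190", "00193"),
  U00001_at = "04110"
)

# Construction from the reply above: one small matrix per probe, then rbind.
res <- do.call("rbind", lapply(seq(along = xx),
                               function(i) cbind(names(xx)[i], xx[[i]])))

# Vectorised alternative: repeat each probe id once per pathway,
# and flatten the pathway ids without letting unlist() touch the names.
res2 <- cbind(probe_id = rep(names(xx), lengths(xx)),
              path_id  = unlist(xx, use.names = FALSE))

print(res2)
```

Because the probe ids come from `rep()` rather than from the names that `unlist()` generates, no numeric suffix is ever added and nothing needs to be stripped afterwards.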
@michael-watson-iah-c-378
Last seen 10.3 years ago
Hi

Thanks for that :-)

It was actually an easy and quick way to do the latter that I was looking for. I can't just indiscriminately get rid of all integers that appear at the end of an id, in case there are ids that have integers at the end and are perfectly valid. So I am left faced with writing some kind of loop, which is what I wanted to avoid in the first place.

I don't want to annoy anyone, but am I the only person who finds the lists from Bioconductor annotation packages a little unhelpful and hard to work with? In every example in the help, the first thing they do is unlist() the list; so why is it a list in the first place?

Thanks
Mick

-----Original Message-----
From: John Zhang
Sent: 10 February 2005 13:44
To: michael watson (IAH-C)
Subject: Re: [BioC] Converting annotate lists to a matrix
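For the merge() use case, one loop-free option that avoids the suffix problem altogether is base R's `stack()`, which turns a named list into a two-column data frame. This is only a sketch under assumptions: `xx` is a toy stand-in for `as.list(hu6800PATH)`, `probe7` is a made-up id that legitimately ends in a digit, and `path_labels` is a hypothetical lookup table with placeholder labels.

```r
# Toy stand-in for as.list(hu6800PATH); "probe7" is a made-up id ending in a digit.
xx <- list(
  Z22536_at = c("04010", "04060", "04350"),
  X60221_at = c("00190", "00193"),
  probe7    = "04010"
)

# Stripping trailing digits would damage a valid id such as "probe7".
stripped <- sub("[0-9]+$", "", "probe7")

# stack() gives one row per (value, list name) pair; no suffixes are generated.
long  <- stack(xx)
pairs <- data.frame(probe_id = as.character(long$ind),
                    path_id  = as.character(long$values),
                    stringsAsFactors = FALSE)

# Hypothetical lookup table to join against; the labels are placeholders.
path_labels <- data.frame(path_id    = c("04010", "00190"),
                          path_label = c("label_A", "label_B"),
                          stringsAsFactors = FALSE)

# Relational (inner) join on the pathway id.
joined <- merge(pairs, path_labels, by = "path_id")
print(joined)
```

The explicit `as.character()` calls are there so the sketch behaves the same whether or not a given R version turns strings into factors by default.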