Question

Need assistance with the file preparation for ChIPpeakAnno. (Jennifer Yang from University of California, Santa Barbara)

Julie Zhu ★ 4.3k

Dear Jennifer,

Thank you very much for the positive feedback! Regarding the annotation file you downloaded from fRNAdb, did you try saving the first 6 columns as a comma-separated (CSV) file? If you still encounter problems, please post the error message to the Bioconductor list to seek input from those with more expertise in reading in data.

Best regards,

Julie

On 1/17/11 2:41 PM, "Chu-Ya (Jennifer) Yang" <chu-ya.yang at lifesci.ucsb.edu> wrote:

> Dear Prof. Zhu,
>
> I have been using ChIPpeakAnno to annotate my ChIP-seq data and have found it very helpful. I have also annotated my ChIP-seq data against custom annotation files in BED format, and those worked as well.
>
> However, I recently encountered a problem when trying to annotate my ChIP-seq data against a custom annotation file in BED format downloaded from fRNAdb. I think there is something wrong with the data frame of this file. I tried different approaches and tricks, but nothing worked with this file when using read.table. I also tried exporting it to Excel, adding a header, keeping the first 6 columns (chromosome, start, end, name, score, and strand), saving it as Text (tab delimited), and reading it into R again, but it still did not work.
>
> Would you please help with this question? The size of the original BED file is around 7 MB. May I email you the file?
>
> Thank you,
>
> Jennifer
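The thread does not say what was actually wrong with the fRNAdb file, so the following is only a sketch of one way to keep the first six BED columns with read.table in base R. It assumes a tab-delimited file that may start with a `track` or `browser` line; the `bed_text` lines are hypothetical stand-ins for the real download.

```r
# Hypothetical BED excerpt standing in for the fRNAdb download.
bed_text <- c(
  "track name=example",
  "chr1\t100\t200\tfeatA\t0\t+\t100\t200\t0",
  "chr2\t300\t400\tfeatB\t0\t-\t300\t400\t0"
)

# Drop track/browser header lines, which have a different number of fields.
body <- bed_text[!grepl("^(track|browser)", bed_text)]

# Read as tab-delimited text; disable quote and comment handling so that
# characters such as ' or # inside feature names do not split or truncate rows.
bed <- read.table(text = body, sep = "\t", header = FALSE,
                  quote = "", comment.char = "", stringsAsFactors = FALSE)

# Keep the six standard BED columns and name them.
bed6 <- bed[, 1:6]
colnames(bed6) <- c("chrom", "chromStart", "chromEnd", "name", "score", "strand")
```

For a file on disk, the same arguments can be passed to `read.table(file = ...)` after checking the first few lines with `readLines(path, n = 5)`.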

Annotation annotate ChIPpeakAnno • 1.4k views

13.9 years ago Julie Zhu ★ 4.3k

score 0 · Answer 1 · 2011-01-19

Jennifer,

Please use the BED2RangedData function. Thanks! Please also take a look at the following post:
http://permalink.gmane.org/gmane.science.biology.informatics.conductor/28497

Best regards,

Julie

On 1/19/11 12:48 PM, "Chu-Ya (Jennifer) Yang" <chu-ya.yang at lifesci.ucsb.edu> wrote:

> Dear Prof. Zhu,
>
> From the example on the 3rd page of "The ChIPpeakAnno user's guide", it seems that the user can pass custom annotation data into the function annotatePeakInBatch. I tried to pass a custom annotation data file with the following commands:
>
>> mm9=read.table("/home/Jennifer/mm9_bed/mm9.bed",header=FALSE)
>> Reference=RangedData(IRanges(start=[ ,2], end=[ ,3], names=[ ,4], space=[ ,1], strand=[ ,6])
> Error: unexpected '[' in "Reference=RangedData(IRanges(start=["
>
> I am wondering how I should define the vector so that the BED format can be converted into RangedData with this approach.
>
> Thank you,
>
> Jennifer
>
> Zhu, Lihua (Julie) said the following on 1/18/2011 9:19 AM:
>> Dear Jennifer,
>>
>> You may also want to contact the data provider.
>>
>> Best regards,
>>
>> Julie
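The "unexpected '['" error in the quoted commands arises because `[ ,2]` has no object in front of it: the bracket needs the data frame name, as in `mm9[, 2]`. A minimal base R sketch, assuming a toy data frame (hypothetical values) in place of the real mm9.bed:

```r
# Toy stand-in for the data frame returned by read.table() (hypothetical values).
mm9 <- data.frame(V1 = c("chr1", "chr2"),
                  V2 = c(100L, 300L),
                  V3 = c(200L, 400L),
                  V4 = c("featA", "featB"),
                  V5 = c(0L, 0L),
                  V6 = c("+", "-"),
                  stringsAsFactors = FALSE)

# Name the data frame before the bracket to extract a column as a vector.
starts <- mm9[, 2]
ends   <- mm9[, 3]

# Build a BED-style data frame with named columns. Calling data.frame()
# directly, without cbind(), keeps each column's own type.
test.bed <- data.frame(chrom      = mm9[, 1],
                       chromStart = mm9[, 2],
                       chromEnd   = mm9[, 3],
                       name       = mm9[, 4],
                       strand     = mm9[, 6],
                       stringsAsFactors = FALSE)
```

Note that `cbind()` coerces its arguments to a single matrix type, so wrapping mixed columns in `cbind()` before `data.frame()` can turn numbers into character strings or factors into their integer codes.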

score 0 · Answer 2 · 2011-01-19

Jennifer,

The name column, name=mm9[ ,4], contains duplicates; the names need to be unique.

Best regards,

Julie

On 1/19/11 4:47 PM, "Chu-Ya (Jennifer) Yang" <chu-ya.yang at lifesci.ucsb.edu> wrote:

> Dear Prof. Zhu,
>
> Thanks for the prompt reply and assistance. I checked the link and tried the following commands, and here is the error message I obtained. Would you please help me check it?
>
> Thank you,
>
> Jennifer
>
>> mm9 <- read.table(file="/home/Jennifer/mm9_bed/mm9.bed",header=FALSE)
>> test.bed=data.frame(cbind(chrom=mm9[ ,1], chromStart=mm9[ ,2], chromEnd=mm9[ ,3], name=mm9[ ,4], strand=mm9[ ,6]))
>> test.rangedData=BED2RangedData(test.bed)
> Error in `rownames<-`(`*tmp*`, value = c("82082", "82083", "82084", "82084", :
>   duplicate rownames not allowed
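Before converting the table, the name column can be checked for repeats in base R. The sketch below uses hypothetical identifiers; `duplicated()` and `make.unique()` are standard base R functions, but whether suffixed names suit the downstream analysis is a judgment for the user.

```r
# Hypothetical feature names with one repeated identifier.
feature_names <- c("FR000001", "FR000002", "FR000002", "FR000003")

# Count the names that repeat an earlier entry.
n_dup <- sum(duplicated(feature_names))

# List each name that occurs more than once.
dup_names <- unique(feature_names[duplicated(feature_names)])

# make.unique() leaves the first occurrence unchanged and appends
# .1, .2, ... to later repeats, so every name becomes distinct.
unique_names <- make.unique(feature_names)
```

If `n_dup` is zero, the names are already unique and can be used as row names without change.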

score 0 · Answer 3 · 2011-01-20

Dear Jennifer,

I am wondering what the rationale is for having multiple coordinates for the same feature name. We could append a serial number to feature names with multiple coordinates in a future release.

Best regards,

Julie

On 1/20/11 8:35 PM, "Chu-Ya (Jennifer) Yang" <chu-ya.yang at lifesci.ucsb.edu> wrote:

> Dear Prof. Zhu,
>
> Thank you for the assistance.
>
> We gave the features unique names, converted the table to ranged data format, and then converted the names back to the original feature names before annotating the data with ChIPpeakAnno. It finally worked, but it was not easy for us to do.
>
> Since ChIPpeakAnno is a very powerful annotation tool and is very helpful to our data analysis, I am wondering whether you would consider modifying BED2RangedData to accommodate more flexible BED files, for example by allowing multiple rows with an identical name identifier.
>
> Thank you,
>
> Jennifer
>
> Zhu, Lihua (Julie) said the following on 1/19/2011 2:19 PM:
>> Jennifer,
>>
>> The name column name=mm9[ ,4] has duplicates (needs to be unique).
>>
>> Best regards,
>>
>> Julie
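The round trip described in the quoted message (unique names in, original names back out) can be sketched in base R with a lookup table. This is an illustration under stated assumptions, not the code Jennifer used or the behavior of any ChIPpeakAnno release: the table, the `_1`, `_2` serial-number scheme, and `reported_ids` are all hypothetical, and the conversion and annotation steps are left out.

```r
# Hypothetical annotation table in which one feature name has two coordinates.
bed <- data.frame(chrom      = c("chr1", "chr1", "chr2"),
                  chromStart = c(100L, 500L, 300L),
                  chromEnd   = c(200L, 600L, 400L),
                  name       = c("featA", "featA", "featB"),
                  strand     = c("+", "+", "-"),
                  stringsAsFactors = FALSE)

# Serial number within each feature name: 1, 2, ... per group.
serial <- ave(seq_along(bed$name), bed$name, FUN = seq_along)

# Lookup from the temporary unique identifier to the original name.
lookup <- data.frame(unique_id = paste(bed$name, serial, sep = "_"),
                     original  = bed$name,
                     stringsAsFactors = FALSE)
bed$name <- lookup$unique_id

# ... convert `bed` and run the annotation here (not shown) ...

# Stand-in for identifiers that an annotation step might report.
reported_ids <- c("featA_2", "featB_1")

# Translate the reported identifiers back to the original feature names.
original_names <- lookup$original[match(reported_ids, lookup$unique_id)]
```

Keeping the lookup as a separate table avoids parsing the suffix back out of the identifier, which would be fragile if the original names already contain an underscore.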