Question

How to apply ChIPpeakAnno to your Granger object to annotate the peaks?

0

Entering edit mode

Chris ▴ 20

@3fdb6f97

Last seen 3 months ago

United States

Hi all,

I try to follow this vignettes to annotate the peak, but I have difficulty applying the code to my data. Would you have a suggestion on what I should run next after getting the table below? Thank you so much!

https://www.bioconductor.org/packages/devel/bioc/vignettes/ChIPpeakAnno/inst/doc/ChIPpeakAnno.html#Quick_start

enter image description here

Could you visit this site? I can't load it.

https://rdrr.io/bioc/ChIPpeakAnno/f/vignettes/quickStart.Rmd

ChIPpeakAnno • 2.2k views

ADD COMMENT • link 19 months ago Chris ▴ 20

2

Entering edit mode

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 18 minutes ago

United States

The vignette you point to is meant to help a beginner learn how to use the software. The best suggestion is for you to read the vignette as many times as necessary for you to understand how to annotate your peaks. It's a step-by-step instruction, so that's what you should be reading.

ADD COMMENT • link 19 months ago James W. MacDonald 67k

0

Entering edit mode

Got it. Thank you for your suggestion.

ADD REPLY • link 19 months ago Chris ▴ 20

0

Entering edit mode

After reading, for system.file() function, would you tell me the input I need to modify on parameters?

bed <- system.file("extdata", "MACS_output.bed", package="ChIPpeakAnno")

ADD REPLY • link 19 months ago Chris ▴ 20

1

Entering edit mode

You should almost never need to use system.file. It's meant to allow developers to put example data in their package that can then be used for vignettes or examples. What that line of code does is read an example bed file that the package author then uses for examples.

Your data will not be in the R package library directory (don't put anything in there), but instead should be in your working directory. You have a GRanges object that presumably you read in from your working directory. The idea behind a vignette is to provide an example that you can then emulate, so the expectation is that you will do something similar to what the vignette author did, but using your own data instead.

ADD REPLY • link 19 months ago James W. MacDonald 67k

0

Entering edit mode

Thank you so much for the quick reply!

ol <- findOverlapsOfPeaks(gr1, gr2)

I have only one Grange object, which is combined from 2 conditions. So could I use the function above? (a beginner question)

ADD REPLY • link 19 months ago Chris ▴ 20

1

Entering edit mode

If you have only one GRanges object, what are gr1 and gr2?

Also, you seem to be doing things backwards. Usually people have a goal they are trying to accomplish and then they try to figure out how to accomplish it. I have no idea what you are trying to accomplish, or why. So if you want to know what peaks overlap in gr1 and gr2, then what you ask is a fine way to do that. But why do you want to know that?

ADD REPLY • link 19 months ago James W. MacDonald 67k

0

Entering edit mode

Thank you for your reply. I use dba.report() from DiffBind to get the Grange object then I try to follow the vignette to annotate the result from DiffBind. Is that correct?

ADD REPLY • link 19 months ago Chris ▴ 20

1

Entering edit mode

Again, you are either being mysterious about what you are trying to do, or you have no idea, and are just trying things. Neither is good. You cannot expect people to help you if you are unwilling to say what you are trying to do, and if you have no idea what you are trying to do, then this isn't the place either. Nobody can do a knowledge transfer over the intertubes, and it will be up to you to read things and try to understand as best you can. The nice thing about Open Source software is that you can just download and use it. But that's not to say it's free! It will cost you the time and effort required to get yourself up to speed and fully understanding what you are doing. That's not a trivial process, and nobody can help you. It's on you to read and understand. If you cannot do that, you need to find someone locally who does and have them either do the work for you or tutor you through it.

From what you have said, you have used DiffBind to find differentially bound peaks from a ChIP-Seq experiment (or so I assume). And now you want to annotate those peaks. ChIPpeakAnno is designed to annotate peaks from a ChIP-Seq experiment and has like three different vignettes that show how to do so, depending on what you are trying to do. If you are asking if you can use that package to do what it says it does, then yes! That's what it is designed to do!

You have a GRanges object that is not dissimilar to what they use in the vignettes (they start with a GRanges object as well). And they show, step by step, how to use a GRanges object to annotate. You will have to follow along, using your data, and the correct TxDb object. If you get jammed up and have a specific question about a specific problem, then please post a question.

ADD REPLY • link 19 months ago James W. MacDonald 67k

0

Entering edit mode

Yes, I have ATAC seq data from 2 samples (diseased and healthy). I ran Nextflow ATAC pipeline and got the output files (including bams and broad peaks files). I used these output files as input for DiffBind. DiffBind then created the Grange object that I show here. The vignette has gf1 and gf2, and used bed and gff file as input which I haven't understood to follow yet because I don't have these files. Do I need to convert my data into gff and bed files? Thank you!

ADD REPLY • link 19 months ago Chris ▴ 20

score 2 · Accepted Answer · 2023-04-14

2

Entering edit mode

Ou, Jianhong ★ 1.3k

@ou-jianhong-4539

Last seen 2 hours ago

United States

Hi Chris, Thank you for trying ChIPpeakAnno to annotate your data. If you are trying Nextflow nf-core/atacseq pipeline, you may want to try the code at https://github.com/jianhong/atacseq/blob/master/bin/diffbind.r which is similar to the nf-core pipeline. Jianhong.

ADD COMMENT • link 19 months ago Ou, Jianhong ★ 1.3k

0

Entering edit mode

Hi Jianhong, thank you so much for sharing!

ADD REPLY • link 19 months ago Chris ▴ 20

0

Entering edit mode

Hi Jianhong,

Do you think this looks like the correct output for annotatePeakInBatch() Would you please suggest a way to convert Ensembl ID to gene ID in this case?

atac.anno$symbol <- mapIds(org.Hs.eg.db, keys = rownames(atac.anno), keytype = "ENSEMBL", column = "SYMBOL", multiVals = "first")

The row name has a prefix so it is not Ensembl ID to apply the code above.

Thank you so much.

enter image description here

ADD REPLY • link 19 months ago Chris ▴ 20

1

Entering edit mode

The "prefix" in the row name supposes to the your peak id. You annotate each peak to its associated features, that is why they are displayed in pair separated by a period. You can add a Gene symbol column to your annotated peak with ChIPpeakAnno::addGeneIDs, try ?addGeneIDs to see examples.

ADD REPLY • link 19 months ago hukai916 ▴ 10

0

Entering edit mode

Thank you so much for your suggestion! It worked. Would you tell me how to interpret the fold in this case?

ADD REPLY • link 19 months ago Chris ▴ 20

1

Entering edit mode

The "Fold" column belongs to your peak meta columns, it has nothing to do with ChIPpeakAnno, instead, it was generated by DiffBind, according to its manual:

The Fold column shows the log fold changes (LFCs) between the two groups, as calculated by the DESeq2 analysis. DiffBind

ADD REPLY • link 19 months ago Kai Hu ▴ 70

0

Entering edit mode

Thank you so much! The Conc and Conc_diseased columns made me unsure about the Fold column. Hope the number of genes will not be too many when I filter out NA and keep only unique gene names.

ADD REPLY • link 19 months ago Chris ▴ 20