Question

Saving the CSAW results to file

gthm ▴ 30

I am trying to use csaw, but I have not spent enough time reading the documentation because I am in a bit of a hurry. I would like to know how I should save the csaw results to a file. I can see the window number in etable, but I can't work out which window it actually is (chromosome, position, etc.). I would also like to know whether I am doing a proper paired analysis and testing treated vs. untreated conditions. Here is my code:

bam.files <- c("1_high.bam","1_low.bam","2_high.bam","2_low.bam","3_high.bam","3_low.bam","4_high.bam","4_low.bam","5_high.bam","5_low.bam","6_high.bam","6_low.bam","7_high.bam","7_low.bam","8_high.bam","8_low.bam")
require(csaw)
require(edgeR)

dedup.param <- readParam(minq=10, dedup=TRUE)
data <- windowCounts(bam.files, ext=300, width=200, param=dedup.param)

treat <- c("treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated")
subjects <- factor(c(rep( (1:8), each=2)))
design <- model.matrix(~subjects+treat)

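# Average log-CPM of each window, used for abundance filtering
abundances <- aveLogCPM(asDGEList(data))
keep <- abundances > aveLogCPM(5, lib.size=mean(data$totals))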
data <- data[keep,]
binned <- windowCounts(bam.files, bin=TRUE, width=10000, param=dedup.param)
normfacs <- normOffsets(binned)

y <- asDGEList(data, norm.factors=normfacs)
y <- estimateGLMCommonDisp(y, design, verbose=TRUE)
y <- estimateGLMTrendedDisp(y, design)
y <- estimateGLMTagwiseDisp(y, design)
fit <- glmFit(y, design)
lrt <- glmLRT(fit)

etable <- topTags(lrt, n=nrow(y))$table
etable <- etable[order(etable$FDR), ]

#Write the results to a file
write.table(etable,file="abaundance_DE_edgeR.csv")

edger csaw • 1.5k views

updated 8.9 years ago by Aaron Lun ★ 28k • written 8.9 years ago by gthm ▴ 30

There's a saying I learned from my father: "I'm taking my time because I'm in a rush."

You're fooling yourself into thinking that you're actually saving time here. Do yourself a favor and invest the time now into reading and understanding the resources made available to you. You will move a lot quicker, make fewer mistakes, and (importantly) present with confidence the results you're generating, which you are likely asking others to invest the time into reading about / listening to.

8.9 years ago by Steve Lianoglou ★ 13k

score 5 · Answer 1 · 2015-12-18

I would suggest you read the documentation, especially Chapter 6 of the user's guide which describes how to summarize windows into regions. It rarely makes sense to report results for individual windows. Some of your other settings are also cause for concern, especially given that you haven't read the user's guide:

You've set dedup=TRUE. This is only recommended in a differential binding (DB) analysis if your data are of substantially poor quality. Check out Section 2.2.2 in the user's guide.
I would suggest switching to the quasi-likelihood (QL) framework (i.e., glmQLFit and glmQLFTest) with estimateDisp, rather than using the LRT. Check out Chapter 5.
You should filter more aggressively, based on a fold-increase over background regions rather than just using an average abundance threshold corresponding to a count of 5. Check out Chapter 3.
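
As a rough illustration of the QL suggestion, here is a minimal sketch of a paired treated vs. untreated test with estimateDisp, glmQLFit and glmQLFTest. The counts are simulated and only stand in for filtered window counts; names such as n.windows and n.subjects are made up for the example. The treatment factor is given explicit levels so that "untreated" is the reference, which is an assumption about the comparison the poster wants (with a plain character vector, R would pick "treated" as the reference and the log-fold changes would be flipped).

```r
library(edgeR)

set.seed(1)
n.windows <- 500   # hypothetical number of windows kept after filtering
n.subjects <- 8

# Simulated counts stand in for real window counts (an assumption;
# in a real analysis these would come from windowCounts()).
counts <- matrix(rnbinom(n.windows * n.subjects * 2, mu = 50, size = 10),
                 nrow = n.windows)

# Columns alternate treated/untreated within each subject, as in the question.
subjects <- factor(rep(seq_len(n.subjects), each = 2))
treat <- factor(rep(c("treated", "untreated"), n.subjects),
                levels = c("untreated", "treated"))

# Additive design: subject as a blocking factor, treatment as the effect of interest.
design <- model.matrix(~ subjects + treat)

y <- DGEList(counts)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design, robust = TRUE)

# The last coefficient ("treattreated") is the treated vs. untreated effect.
res <- glmQLFTest(fit, coef = ncol(design))

# sort.by = "none" keeps the rows in the original window order.
tab <- topTags(res, n = Inf, sort.by = "none")$table
```

Keeping the table in the original order means row i of tab still corresponds to window i of the filtered count object, which makes it straightforward to match results back to genomic coordinates.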

There's really no excuse for not reading the documentation before posting a question; especially during the upcoming holidays, when you should have plenty of free time. If you don't want to take the time to read the docs, why should we take the time to help?
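
On the original question of which window each row of etable refers to: the row names of a sorted topTags table are the indices of the windows in the filtered count object, so they can be used to look up coordinates before writing the file. The sketch below uses small made-up data frames in place of the real objects. In a csaw analysis the coordinates would presumably come from something like as.data.frame(rowRanges(data)), and this only maps individual windows; it does not replace the region-level summary described in Chapter 6.

```r
# Hypothetical stand-in for the window coordinates, one row per window,
# in the same order as the rows of the count matrix.
coords <- data.frame(
    seqnames = c("chr1", "chr1", "chr2", "chr2"),
    start    = c(1, 51, 1, 51),
    end      = c(200, 250, 200, 250)
)

# Hypothetical stand-in for topTags(...)$table: rows are sorted by
# significance and the row names hold the original window indices.
etable <- data.frame(
    logFC  = c(2.1, -1.7, 0.3, 0.1),
    PValue = c(0.001, 0.004, 0.40, 0.85),
    row.names = c("3", "1", "4", "2")
)
etable$FDR <- p.adjust(etable$PValue, method = "BH")

# Use the row names to look up the coordinates of each reported window.
idx <- as.integer(rownames(etable))
out <- cbind(coords[idx, ], etable)

# Write a tab-delimited file without quotes or row names.
out.file <- file.path(tempdir(), "window_results.tsv")
write.table(out, file = out.file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Writing with sep = "\t" and row.names = FALSE gives a file that opens cleanly in a spreadsheet and can be read back with read.delim().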