Saving the CSAW results to file
1
0
Entering edit mode
gthm ▴ 30
@gthm-8377
Last seen 5.6 years ago
spain

I am trying to use csaw, but I have not spent enough time on reading documentation as I am in little hurry. I would like to know how should I save the csaw results to a file. I could see the window number in etable but I can't make out which window it is actually (chrom, pos etc). Here is my code: I also would like to know if I am making proper paired analysis and looking for treated vs untreated conditions.

 

bam.files <- c("1_high.bam","1_low.bam","2_high.bam","2_low.bam","3_high.bam","3_low.bam","4_high.bam","4_low.bam","5_high.bam","5_low.bam","6_high.bam","6_low.bam","7_high.bam","7_low.bam","8_high.bam","8_low.bam")
require(csaw)
require(edgeR)

dedup.param <- readParam(minq=10, dedup=TRUE)
data <- windowCounts(bam.files, ext=300, width=200, param=dedup.param)

treat <- c("treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated","treated","untreated")
subjects <- factor(c(rep( (1:8), each=2)))
design <- model.matrix(~subjects+treat)

keep <- abundances > aveLogCPM(5, lib.size=mean(data$totals))
data <- data[keep,]
binned <- windowCounts(bam.files, bin=TRUE, width=10000, param=dedup.param)
normfacs <- normOffsets(binned)

y <- asDGEList(data, norm.factors=normfacs)
y <- estimateGLMCommonDisp(y, design, verbose=TRUE)
y <- estimateGLMTrendedDisp(y, design)
y <- estimateGLMTagwiseDisp(y, design)
fit <- glmFit(y, design)
lrt <- glmLRT(fit)

etable <- topTags(lrt, n=nrow(y))$table
etable <- etable[order(etable$FDR), ]

#Write the results to a file
write.table(etable,file="abaundance_DE_edgeR.csv")
edger csaw • 1.5k views
ADD COMMENT
1
Entering edit mode

There's a saying I learned from my father: "I'm taking my time because I'm in a rush"

You're fooling yourself into thinking that you're actually saving time here. Do yourself a favor and invest the time now into reading and understanding the resources made available to you. You will move a lot quicker, make fewer mistakes, and (importantly) present with confidence the results you're generating, which you are likely asking others to invest the time into reading about / listening to.

ADD REPLY
5
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 7 hours ago
The city by the bay

I would suggest you read the documentation, especially Chapter 6 of the user's guide which describes how to summarize windows into regions. It rarely makes sense to report results for individual windows. Some of your other settings are also cause for concern, especially given that you haven't read the user's guide:

  • You've set dedup=TRUE. This is only recommended in a DB analysis if your data is of substantially poor quality. Check out Section 2.2.2 in the user's guide.
  • I would suggest switching to the QL framework (i.e., glmQLFit and glmQLFTest) with estimateDisp, rather than using the LRT. Check out Chapter 5.
  • You should filter more aggressively, based on a fold-increase over background regions rather than just using an average abundance threshold corresponding to a count of 5. Check out Chapter 3.

There's really no excuse for not reading the documentation before posting a question; especially during the upcoming holidays, when you should have plenty of free time. If you don't want to take the time to read the docs, why should we take the time to help?

ADD COMMENT

Login before adding your answer.

Traffic: 792 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6