GC content track on GVIZ
1
0
Entering edit mode
@15877814
Last seen 8 months ago
Italy

Hi everyone, I'm trying to obtain a GC content track for the M. tuberculosis genome to plot on IGV. Below there is the code I have used to do this. However, when calculating the GC content, R blocks without giving an output. How can I solve the problem?

library(Biostrings)
library(Gviz)
#load FASTA file
genome_seq<-readDNAStringSet("M.tuberculosis_NC000962.3.fasta")
# Calculate genome length 
genome_length <- sum(width(genome_seq))

# GC content for a certain window size 
window_size <- 10000  
gc_content <- numeric(genome_length)
for (i in 1:(genome_length - window_size + 1)) {
  window_seq <- subseq(genome_seq, start = i, end = i + window_size - 1)
  gc_content[i] <- sum(unlist(strsplit(as.character(window_seq), "")) %in% c("G", "C")) / window_size * 100
}

Thank everyone for help

Gviz • 409 views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 16 hours ago
United States

This has nothing to do with Gviz. You shouldn't pick an unrelated package for questions because you are A.) directly asking that maintainer to answer an unrelated question and B.) not asking the relevant package's maintainer for help. But anyway.

> library(Biostrings)
## I downloaded the relevant FASTQ from NCBI
> z <- readDNAStringSet("c:/Users/jmacdon/Downloads/sequence.fasta")
## get windows
> starts <- seq(1, width(z), 1000)
> ends <- starts + 1000
## last one is too long, so fix
> ends[length(ends)] <- width(z)
## get the alphabet frequency for each 1000 bp window and rbind into a matrix
> zz <- do.call(rbind, lapply(seq(along = starts), function(x) alphabetFrequency(subseq(z, starts[x], ends[x]))))
## obvs
> zzz <- rowSums(zz[,2:3])/rowSums(zz)
> head(zzz)
[1] 0.5884116 0.5764236 0.6073926 0.5844156 0.6473526 0.6213786

Login before adding your answer.

Traffic: 519 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6