I have found the overlapping regions between the genomic regions I am interested in and genes with 'intersectBed'- external of R.
intersectBed -a regions_of_interest.bed -b genes.bed > overlap_genes.bed
So now the file looks like this
>head overlap_genes.bed
chr1 180110925 180113591 Region_665 0 0 chr1 180110925 180113591 Region_665 0 0 chr1 180110925 180113591 Region_665 0 0 chr11 43113461 43117525 Region_837 0 0
This however is no good to me as I also need to include columns in this bed file which specify the gene that is being overlapped and where it starts and ends....
head genes.bed
chr1 109408122 109422169 + Serpinb2
chr1 174405505 174408706 + Slamf9
chr1 133724589 133743546 + Slc41a1
chr1 173847813 173848810 + Slamf6
i.e.. what I would like to see is:
head overlap_genes.bed
chr1 180110925 180113591 Region_665 0 0 gene_name ..... chr1 180110925 180113591 Region_665 0 0 another_gene chr1 180110925 180113591 Region_665 0 0 one_more_gene
So how do I bind the according gene_names and their start, end and orientation to the overlap_genes object? My first thought was to iterate through each row and check if there were overlaps between the i-th range intervals and if so- bind the concerned gene-name and extra info to that i-th overlap_genes row.
genes=read.table('genes.bed', sep='\t', header=F, stringsAsFactors=F)
overlap_genes=read.table('overlap_genes.bed', sep='\t', header=F, stringsAsFactors=F)
informed=data.frame()
for(i in nrow(overlap_genes)){
apply(genes,1, function(x){tryCatch(
if(intersect(seq(x[2], x[3]), seq(overlap_genes[i,1], overlap_genes[i,2])))
{rbind(informed,cbind(overlap_genes[i,],x))}
, error=function(e) e)})
}
This however has not worked for me ... The informed data frame comes up empty.... which is obviously wrong as the 'overlap_genes' is the product of being overlapped with genes so all of its rows should overlap with 'genes'... I have tried using Iranges but it seems to posses the same functionality as intersectBed but cannot retain extra columns from external data.....
Does anyone know of a way around this?? Thanks
hi thank you- i should have looked through the intersectbed documentation and found out about the -wao option