Homer getDiffExpression.pl issue
1
0
Entering edit mode
@niushengyong-12910
Last seen 7.5 years ago

Hi there,

I'm now using the function getDiffExpression.pl from Homer with the count table generated from annotatePeaks.pl.  

Here is my codes:

​getDiffExpression.pl  countTable.peaks.txt   ND T2D T2D T2D T2D ND  > diffOutput.txt

Here is the errors:

Argument "-319,filename1..." isn't numeric in addition (+) at /share/pkg/homer/4.9/install/bin/getDiffExpression.pl line 339, <IN> line 3.
Argument "-281,filename2..." isn't numeric in addition (+) at /share/pkg/homer/4.9/install/bin/getDiffExpression.pl line 339, <IN> line 3.

...

!!! Warnining - something likely failed during R execution.
!!! R script that was used:
        ####### basic script for running DESeq (generated by HOMER) ########
        library(DESeq2)
        #Read Data in
        countData <- read.delim("0.564077262607501.edgeR.in.data.txt")
        colData <- read.delim("0.564077262607501.edgeR.groups.data.txt")

        dds <- DESeqDataSetFromMatrix(countData, colData,design=~Treatment,tidy=TRUE)
        dds <- DESeq(dds)
        res <- results(dds,tidy=TRUE)
        write.table(res,file="0.564077262607501.edgeR.out.data.txt",sep="\t",row.names=FALSE)


!!! R execution details:
        Loading required package: S4Vectors
        Loading required package: stats4
        Loading required package: BiocGenerics
        Loading required package: parallel

        Attaching package: 'BiocGenerics'

        The following objects are masked from 'package:parallel':

            clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
            clusterExport, clusterMap, parApply, parCapply, parLapply,
            parLapplyLB, parRapply, parSapply, parSapplyLB

        The following objects are masked from 'package:stats':

            IQR, mad, xtabs

        The following objects are masked from 'package:base':

            Filter, Find, Map, Position, Reduce, anyDuplicated, append,
            as.data.frame, as.vector, cbind, colnames, do.call, duplicated,
            eval, evalq, get, grep, grepl, intersect, is.unsorted, lapply,
            lengths, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
            pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
            tapply, union, unique, unlist, unsplit

        Loading required package: IRanges
        Loading required package: GenomicRanges
        Loading required package: GenomeInfoDb
        Loading required package: SummarizedExperiment
        Loading required package: Biobase
        Welcome to Bioconductor

            Vignettes contain introductory material; view with
            'browseVignettes()'. To cite Bioconductor, see
            'citation("Biobase")', and for packages 'citation("pkgname")'.

        Loading required package: Rcpp
        Loading required package: RcppArmadillo
        Error in DESeqDataSet(se, design = design, ignoreRank) :
          some values in assay are negative
        Calls: DESeqDataSetFromMatrix -> DESeqDataSet
        Execution halted
rm: cannot remove `0.564077262607501.edgeR.out.data.txt': No such file or directory

It's a part of my count table. 

PeakID (cmd=annotatePeaks.pl tss hg38 -raw -p 30.bed 32.bed 33.bed 36.bed 37.bed 38.bed) Chr     Start   End     Strand  Not Used        Focus Ratio/Region Size Annotation      Detailed Annotation     Distance to TSS Nearest PromoterID      Entrez ID       Nearest Unigene Nearest Refseq  Nearest Ensembl Gene Name       Gene Alias      Gene Description        Gene Type       30.bed Distance to nearest Peak, Peak ID  32.bed Distance to nearest Peak, Peak ID  33.bed Distance to nearest Peak, Peak ID  36.bed Distance to nearest Peak, Peak ID 37.bed Distance to nearest Peak, Peak ID 38.bed Distance to nearest Peak, Peak ID
NM_001281971    chr19_KI270921v1_alt    34995   38995   +       0       NA      promoter-TSS (NM_001281971)     promoter-TSS (NM_001281971)     0       NM_001281971    3809    Hs.654608
       NM_012314       ENSG00000221957 KIR2DS4 CD158I|KIR-2DS4|KIR1D|KIR412|KKA3|NKAT-8|NKAT8  killer cell immunoglobulin like receptor, two Ig domains and short cytoplasmic tail 4   protein-coding

NM_021107       chr19   38928954        38932954        +       0       NA      promoter-TSS (NM_001145901)     promoter-TSS (NM_001145901)     0       NM_021107       6183    Hs.411125
       NM_021107       ENSG00000128626 MRPS12  MPR-S12|MT-RPS12|RPMS12|RPS12|RPSM12    mitochondrial ribosomal protein S12     protein-coding  -319,filename1_peak_13298       -281,filename2_peak_12994       -361,filename3_peak_17889
       -182,filename4_peak_16289      -295,filename5_peak_17638      -289,filename6_peak_18334

The command I use to generate count table:

annotatePeaks.pl    tss hg38 -raw -p  peak1 peak2 peak3 peak4 peak5 peak6   > countTable.peaks.txt

 

Does any one know how to solve the errors? Also, is the count table in the right format? Thanks!

 

Sincerely,

Simon

homer atac-seq • 2.2k views
ADD COMMENT
0
Entering edit mode
@maciej-jonczyk-6968
Last seen 7.2 years ago
Poland

HI,

for me it seems that `getDiffExpression` complains about wrong data type. And indeed, you have negative raw read counts, moreover with strangely added text string eg  "-295,filename5_peak_17638 "

I think you should check your bam files, it must strictly meet Homer expectations to be parsed properly.

ADD COMMENT

Login before adding your answer.

Traffic: 694 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6