Question

countToFPKM and featureLength

1

Morris ▴ 10

@morris-13226

Last seen 17 months ago

Italy

Hello everyone, I am struggling to get an FPKM matrix using the countToFPKM library. Everything seems to run until the last step, where I get an error. Here is my script:

    library("devtools")
    library("biomaRt")
    library("dplyr")
    library(countToFPKM)

    file.readcounts <- as.matrix(read.csv("Matrix.csv", header = TRUE, row.names = 1))
    nrow(file.readcounts)
    ens_build <- "sep2015"
    dataset <- "hsapiens_gene_ensembl"
    mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL", dataset = dataset, version = 80)

    gene.annotations <- biomaRt::getBM(mart = mart, attributes = c("ensembl_gene_id", "external_gene_name",
                                                                   "start_position", "end_position"))
    gene.annotations <- dplyr::transmute(gene.annotations, external_gene_name, ensembl_gene_id, length = end_position - start_position)
    # Note: reorder the columns in gene.annotations and make the first column the row names
    # Filter and re-order gene.annotations to match the order in feature counts matrix
    gene.annotations <- gene.annotations %>% dplyr::filter(gene.annotations$ensembl_gene_id %in% row.names(file.readcounts))
    gene.annotations <- gene.annotations[order(match(gene.annotations$ensembl_gene_id, rownames(file.readcounts))), ]
    # Assign feature lengths into a numeric vector.
    featureLength <- gene.annotations$length

featureLength looks fine (it is an integer vector), but the final call fails with this error:

    fpkm_matrix <- fpkm(file.readcounts, featureLength = featureLength, meanFragmentLength = NULL)

    Error in fpkm(file.readcounts, featureLength = featureLength, meanFragmentLength = NULL) : 
      length(featureLength) == nrow(counts) is not TRUE

I can't understand the error; the match function seems to be working fine.

Any help is appreciated, thank you.

countToFPKM • 2.7k views

ADD COMMENT • link updated 2.1 years ago by swbarnes2 ★ 1.4k • written 2.1 years ago by Morris ▴ 10
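
The `length(featureLength) == nrow(counts)` check fails whenever the annotation table does not end up with exactly one row per row of the count matrix. One possible cause, which cannot be confirmed from the post, is that some IDs in the count matrix are missing from the annotation. A minimal base R sketch of how to check for that, using made-up gene IDs and lengths (the toy `counts` and `annot` objects are hypothetical stand-ins for `file.readcounts` and `gene.annotations`):

```r
# Toy count matrix: three genes, two samples (IDs and values are invented).
counts <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

# Toy annotation: geneB is absent, geneD is not in the count matrix.
annot <- data.frame(gene_id = c("geneC", "geneA", "geneD"),
                    length  = c(3000L, 1000L, 500L),
                    stringsAsFactors = FALSE)

# Filter-then-order, as in the script above: genes missing from the
# annotation silently drop out, so the row counts no longer agree.
kept <- annot[annot$gene_id %in% rownames(counts), ]
kept <- kept[order(match(kept$gene_id, rownames(counts))), ]
nrow(kept) == nrow(counts)   # FALSE: 2 annotation rows vs 3 count rows

# Looking genes up from the count-matrix side gives exactly one entry
# per row, with NA where the annotation has no match.
idx <- match(rownames(counts), annot$gene_id)
missing_ids <- rownames(counts)[is.na(idx)]

# Keep only genes with a known length so both objects stay aligned.
has_len <- !is.na(idx)
counts_ok <- counts[has_len, , drop = FALSE]
feature_length <- annot$length[idx[has_len]]
stopifnot(length(feature_length) == nrow(counts_ok))
```

Because `match()` returns only the first hit, duplicated IDs in the annotation would not add extra rows with this approach either. Inspecting `missing_ids` shows which genes would need a length from another source or would have to be dropped.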

score 1 · Answer 1 · 2022-12-02

1

swbarnes2 ★ 1.4k

@swbarnes2-14086

Last seen 5 hours ago

San Diego

You are pulling gene length out of biomart? Why would you do that? Why do you think intron length matters?

ADD COMMENT • link 2.1 years ago swbarnes2 ★ 1.4k

0

Hi swbarnes2, I was following the script suggested by the author of the countToFPKM library to do that.

Here it is:

https://github.com/AAlhendi1707/countToFPKM/issues/2

He uses BioMart to extract the gene length information.

Thanks

ADD REPLY • link 2.1 years ago Morris ▴ 10

0

Just because someone posts code doesn't mean it makes sense. Are you really sure that you want gene lengths, and not transcript lengths?

ADD REPLY • link 2.1 years ago swbarnes2 ★ 1.4k

0

You're right. Transcript length still has introns though, so maybe it would be better to use exon lengths.

Thanks

ADD REPLY • link 2.1 years ago Morris ▴ 10

0

Are you sure that just adding up every single exon of a gene is correct?

ADD REPLY • link 2.1 years ago swbarnes2 ★ 1.4k
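
To make the point about summing exons concrete: exons from different transcripts of the same gene often overlap, so adding up every exon record counts shared bases more than once. A minimal base R sketch with invented coordinates (the `exons` table and the `union_length` helper are hypothetical and for illustration only):

```r
# Invented exon coordinates (1-based, inclusive) for two transcripts of one gene.
exons <- data.frame(
  transcript = c("tx1", "tx1", "tx2", "tx2"),
  start      = c(100, 500, 100, 450),
  end        = c(199, 599, 199, 579)
)
width <- exons$end - exons$start + 1

# Genomic span as computed in the question (end - start), introns included.
gene_span <- max(exons$end) - min(exons$start)

# Naive sum over every exon record: shared exons are counted twice.
naive_sum <- sum(width)

# Length of the union of intervals: merge overlaps, then add the merged widths.
union_length <- function(start, end) {
  o <- order(start)
  start <- start[o]
  end <- end[o]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end + 1) {
      cur_end <- max(cur_end, end[i])        # overlapping or adjacent: extend
    } else {
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}
union_len <- union_length(exons$start, exons$end)

# Processed length of each transcript separately.
tx_len <- tapply(width, exons$transcript, sum)
```

With these numbers the genomic span, the naive exon sum, the exon union, and the two per-transcript lengths are all different. The sketch only shows that the candidates disagree; which one is appropriate depends on the analysis.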

0

What else should I take into consideration? As far as I know, expression analysis generally considers the processed transcript.

ADD REPLY • link 2.1 years ago Morris ▴ 10

0

Genes in eukaryotes do not have one single processed transcript per gene. They have many, and they can be different lengths.

What I'm trying to get across to you is that you cannot generate FPKM from gene counts alone.

ADD REPLY • link 2.1 years ago swbarnes2 ★ 1.4k
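
A small worked example of why the choice of length matters. It uses the plain textbook definition of FPKM (fragments per kilobase of feature per million mapped fragments), not necessarily the exact calculation inside countToFPKM, whose `fpkm()` also takes a `meanFragmentLength` argument. The function name `fpkm_simple` and all numbers are invented for illustration:

```r
# Textbook FPKM: count * 1e9 / (feature length in bp * total mapped fragments).
fpkm_simple <- function(count, length_bp, total_fragments) {
  count * 1e9 / (length_bp * total_fragments)
}

gene_count      <- 500    # fragments assigned to the gene (invented)
total_fragments <- 2e7    # mapped fragments in the library (invented)

# Two hypothetical isoforms of the same gene with different processed lengths.
isoform_lengths <- c(short = 1000, long = 4000)

# Same gene-level count, different assumed length, different FPKM.
fpkm_by_length <- fpkm_simple(gene_count, isoform_lengths, total_fragments)
```

Here the same 500 fragments give 25 FPKM if the short isoform is assumed and 6.25 FPKM if the long one is. A gene-level count does not say which isoforms the fragments came from, so the value depends on a length that the counts themselves cannot supply.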

0

What tools do you suggest for taking all these variables into account and getting a better analysis? I'm using what the "market" offers.

ADD REPLY • link 2.1 years ago Morris ▴ 10

0

Since no one knows what your end analysis goal is, no one can help you. All I can tell you is you can't correct for transcript lengths using gene counts.

ADD REPLY • link 2.1 years ago swbarnes2 ★ 1.4k