Question

DESeq2 result files (.csv): Is there a convention for the IDs (e.g. gene names if produced for genes)?

miriam.mueller • 0

@miriammueller-21440

Last seen 5.7 years ago

Hello,

I am working with DESeq2 result files (.csv) in a bioinformatics context.

I wonder whether there is a convention for how the "features" are named in DESeq2 CSV outputs.

For example, in one of my sample gene datasets, the genes are named "geneName.gene". Would a CDS then be named "cdsName.cds"? Or is the ".gene" suffix an error, and the name column would normally contain only the feature name?

Thanks in advance,

Miriam

deseq2 gene format • 1.7k views

ADD COMMENT • link 5.7 years ago miriam.mueller • 0

miriam.mueller • 0

@miriammueller-21440

Last seen 5.7 years ago

The name/ID in the DESeq2 results depends on the input; DESeq2 does not have a naming convention.

ADD COMMENT • link 5.7 years ago miriam.mueller • 0

score 2 · Accepted Answer · 2019-07-24

Michael Love 43k

@mikelove

Last seen 9 days ago

United States

DESeq2 has no preference on the row names, and I'm not aware of any convention. Most of the datasets I work with or receive from others use GENCODE genes and transcripts, which use Ensembl identifiers (ENSG... or ENST...).

ADD COMMENT • link 5.7 years ago Michael Love 43k
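
To check which naming scheme a given results file follows, one can test its row names against the Ensembl prefixes mentioned in the answer. The snippet below is only a minimal sketch: the regular expressions cover human-style IDs with an optional version suffix, and `classify_id` and the example row names are hypothetical and purely illustrative.

```python
import re

# Assumption: human Ensembl gene IDs start with ENSG and transcript IDs with
# ENST, optionally followed by a version such as ".15" (as used by GENCODE).
# Other species carry a species infix (e.g. ENSMUSG) and are not matched here.
ENSEMBL_GENE = re.compile(r"^ENSG\d+(\.\d+)?$")
ENSEMBL_TRANSCRIPT = re.compile(r"^ENST\d+(\.\d+)?$")


def classify_id(feature_id: str) -> str:
    """Return a rough label for the naming scheme of one row name."""
    if ENSEMBL_GENE.match(feature_id):
        return "ensembl_gene"
    if ENSEMBL_TRANSCRIPT.match(feature_id):
        return "ensembl_transcript"
    return "other"


# Illustrative row names; the last one mimics the "name.gene" style
# described in the question.
row_names = ["ENSG00000000003.15", "ENST00000373020", "abc1.gene"]
labels = [classify_id(name) for name in row_names]
print(labels)
```

Anything labelled "other" simply reflects whatever identifiers the count matrix was built with.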

I have also looked at the count files of that dataset, and what I found troubles me: the genes are named the same way there (name.gene).

Am I right that, when running DESeq2 on count data like these, one specifies how the data is read in, and DESeq2 then uses whatever names or identifiers the user supplied? (I haven't used DESeq2 myself, though I have read the documentation and examples.) Those identifiers could then be anything, biological sense aside.

ADD REPLY • link 5.7 years ago miriam.mueller • 0

What's the question exactly? DESeq2 takes count matrices as input (or various other input types described in the docs). It doesn't matter to DESeq2 what the rows are named. Is this a question about how to use DESeq2, or about which gene names are most common? For the latter, you can try posting to a general bioinformatics forum such as Biostars. Here you can post specific software questions.

ADD REPLY • link 5.7 years ago Michael Love 43k
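
Since the row names are carried over unchanged from the input, any clean-up of a suffix such as ".gene" has to happen outside DESeq2, before or after the analysis. A minimal sketch of doing this on an exported results CSV follows. It assumes the first column holds the row names under an empty header (as is typical for a table exported with `write.csv` in R) and that the suffix is a pipeline artifact that is safe to drop; `strip_suffix`, `read_results` and the sample values are hypothetical.

```python
import csv
import io

# Hypothetical excerpt of a results CSV; all values are made up.
# Assumption: the first column contains the row names and has an empty header.
RESULTS_CSV = '''"","baseMean","log2FoldChange","padj"
"abc1.gene",120.5,1.8,0.001
"xyz2.gene",15.2,-0.4,0.62
'''


def strip_suffix(feature_id: str, suffix: str = ".gene") -> str:
    """Remove a trailing suffix from an ID; IDs without it are returned as-is."""
    if suffix and feature_id.endswith(suffix):
        return feature_id[: -len(suffix)]
    return feature_id


def read_results(text: str, suffix: str = ".gene") -> dict:
    """Map cleaned feature IDs to the remaining columns of each row (as strings)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = {}
    for row in reader:
        rows[strip_suffix(row[0], suffix)] = dict(zip(header[1:], row[1:]))
    return rows


results = read_results(RESULTS_CSV)
print(sorted(results))
```

Whether stripping is appropriate depends on how the identifiers were generated upstream, so it is worth checking the pipeline that produced the count files first.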

This answered my question. There is no convention for row names; they depend on the input. I wasn't sure about that and wanted clarification. Thanks :)

ADD REPLY • link 5.7 years ago miriam.mueller • 0