Help needed to identify common genes among datasets
vavecilla • 0
@vavecilla-8735
United States

Good Morning,

I need some advice on some gene expression research. I have datasets which were downloaded from GEO and customized into MS excel. I need to identify the genes common to all of the datasets. I've been reading that I can use R/Bioconductor to simplify this process, but I am still unsure where to begin. Can anyone shed some light and guide me in the right direction? Thanks so much!

datasets genes gene expression • 1.4k views
@steve-lianoglou-2771
United States

It's not clear what you mean by "customized into MS excel," but I imagine it means that you have gene identifiers (symbols or Entrez IDs) in the first column and data in the rest?

In any case, you'll need to load your data into R as a data.frame. The readxl package (among others) can read Excel files for this.
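As a rough sketch of that loading step, assuming each dataset lives in its own .xlsx file with the gene identifiers in the first column: the file names and the helper `read_expression_table` below are hypothetical placeholders, while `read_excel` is the reader function provided by readxl.

```r
# Hypothetical helper: read one Excel sheet into a plain data.frame.
read_expression_table <- function(path, sheet = 1) {
  if (!requireNamespace("readxl", quietly = TRUE)) {
    stop("Install readxl first: install.packages('readxl')")
  }
  tbl <- readxl::read_excel(path, sheet = sheet)  # returns a tibble
  as.data.frame(tbl, stringsAsFactors = FALSE)
}

# Hypothetical file names; replace these with your own files.
files <- c("dataset1.xlsx", "dataset2.xlsx", "dataset3.xlsx")

# Only read the files that actually exist, so the sketch runs anywhere.
existing <- files[file.exists(files)]
datasets <- lapply(existing, read_expression_table)
names(datasets) <- existing
```

Keeping the tables in a named list makes it easier to loop over them later than holding one variable per file.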

Once you have data.frame(s) for your data, you can combine them using the merge function.
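A small illustration of the merge approach, using made-up toy tables in place of the real GEO data. The column name `symbol` and all values are assumptions for the example; with its default arguments, `merge` keeps only the rows whose key appears in both tables.

```r
# Toy tables standing in for GEO-derived datasets (values are made up).
ds1 <- data.frame(symbol = c("TP53", "BRCA1", "EGFR", "MYC"),
                  sample_a = c(5.1, 7.3, 2.2, 9.0),
                  stringsAsFactors = FALSE)
ds2 <- data.frame(symbol = c("EGFR", "TP53", "KRAS"),
                  sample_b = c(3.4, 6.8, 1.9),
                  stringsAsFactors = FALSE)
ds3 <- data.frame(symbol = c("TP53", "MYC", "EGFR"),
                  sample_c = c(4.4, 8.1, 2.9),
                  stringsAsFactors = FALSE)

# Default merge() is an inner join: only symbols found in both tables remain.
common <- merge(ds1, ds2, by = "symbol")

# For more than two tables, fold merge() over a list of data.frames.
merge_by_symbol <- function(x, y) merge(x, y, by = "symbol")
common_all <- Reduce(merge_by_symbol, list(ds1, ds2, ds3))

print(common_all)
```

The result has one row per gene present in every table, with the data columns from each dataset side by side. If the real tables share other column names (for example a description column), `merge` will add suffixes such as `.x` and `.y` to tell them apart.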

If you just want to iterate over the files and manually take the intersection of identifiers, you can use R's set operations (intersect, union, setdiff).
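If only the list of shared identifiers is needed, something along these lines may be enough. The gene symbols and dataset names are made up for the example; in practice each vector would come from the first column of one of your tables.

```r
# Made-up identifier vectors, one per dataset.
symbol_lists <- list(
  dataset1 = c("TP53", "BRCA1", "EGFR", "MYC"),
  dataset2 = c("EGFR", "TP53", "KRAS"),
  dataset3 = c("TP53", "MYC", "EGFR")
)

# Optional clean-up: strip stray whitespace and unify case before comparing.
symbol_lists <- lapply(symbol_lists, function(x) toupper(trimws(x)))

# Fold intersect() over the list to keep symbols present in every dataset.
common_symbols <- Reduce(intersect, symbol_lists)

# Symbols in the first dataset that are not shared by all the others.
only_in_first <- setdiff(symbol_lists$dataset1, common_symbols)

print(common_symbols)
print(only_in_first)
```

`Reduce(intersect, ...)` works for any number of datasets, so the same line applies whether there are two files or twenty.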

It sounds like you're new to R (and to programming in general?). I'd recommend going through some R tutorials to get a feel for the language and an overview of its basic capabilities.


Thank you for your response. Yes, I have the gene symbols and descriptions in my first two columns, followed by the downloaded raw data.
