Merging two files based on the identifier column (gene symbols)?
@mohammedtoufiq91-17679

Hi,

I have two *.csv files whose column headers differ except for one column. One file has gene symbols and expression data (samples); the other has gene symbols and phenotypic data/attributes. The gene symbol column is the same in both files. I would like to merge the two files by matching on the gene symbol column and save all the data in one file for further analysis. How could this be done?

Thank you,

Toufiq

merge files R packages gene annotations • 1.8k views

Please do not cross-post. https://www.biostars.org/p/397989/


You could read both files in, use match() on the two gene symbol columns, cbind() the result, and save it with write.csv(). That assumes the gene symbols are unique. I think there is also a merge() function. I would also highly recommend looking into the Bioconductor SummarizedExperiment class, which is designed to store data of this type. Perhaps others have more sophisticated ways of achieving this, or know of an existing function?
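A minimal sketch of the match() and cbind() route, in case it helps. The two data frames are made-up toy data standing in for the CSV files, and the column names (symbol, sample1, sample2, chromosome) and the output file name are assumptions for illustration only. With real files you would start from read.csv() instead.

```r
# Hypothetical toy data; in practice something like
#   expr  <- read.csv("expression.csv")   # file names are assumptions
#   annot <- read.csv("annotation.csv")
expr <- data.frame(symbol  = c("TP53", "BRCA1", "EGFR"),
                   sample1 = c(5.1, 3.2, 7.8),
                   sample2 = c(4.9, 3.0, 8.1),
                   stringsAsFactors = FALSE)
annot <- data.frame(symbol     = c("EGFR", "TP53", "MYC"),
                    chromosome = c("7", "17", "8"),
                    stringsAsFactors = FALSE)

# match() returns only the first hit, so this approach needs unique symbols
stopifnot(!anyDuplicated(expr$symbol), !anyDuplicated(annot$symbol))

# For each row of expr, find the row of annot with the same symbol
# (NA where the symbol has no annotation)
idx <- match(expr$symbol, annot$symbol)

# Bind the matched annotation columns (minus the duplicate key) onto expr
extra    <- annot[idx, setdiff(names(annot), "symbol"), drop = FALSE]
combined <- cbind(expr, extra)
rownames(combined) <- NULL

# Save the combined table; tempdir() is used here only to keep the sketch tidy
out_file <- file.path(tempdir(), "combined.csv")
write.csv(combined, out_file, row.names = FALSE)
```

Rows of expr with no match (BRCA1 in the toy data) get NA in the annotation columns, and annotation rows that are not in expr (MYC) are dropped.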


merge() is meant to make these sorts of operations easier; dplyr::left_join() is also very effective.
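For example, a rough sketch with base R merge(), again on made-up toy data (the column names symbol, sample1 and chromosome are assumptions, not taken from the original files):

```r
# Hypothetical stand-ins for the two tables read in with read.csv()
expr <- data.frame(symbol  = c("TP53", "BRCA1", "EGFR"),
                   sample1 = c(5.1, 3.2, 7.8),
                   stringsAsFactors = FALSE)
annot <- data.frame(symbol     = c("EGFR", "TP53", "MYC"),
                    chromosome = c("7", "17", "8"),
                    stringsAsFactors = FALSE)

# Inner join: keeps only symbols present in both tables
inner <- merge(expr, annot, by = "symbol")

# Left join: keeps every row of expr, filling NA where annot has no match
left <- merge(expr, annot, by = "symbol", all.x = TRUE)
```

If the key column is named differently in the two files, merge() takes by.x and by.y instead of by. Assuming dplyr is installed, dplyr::left_join(expr, annot, by = "symbol") should give the same rows as the left join above, but it keeps the row order of expr, whereas merge() sorts by the key by default.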


Thank you so much for the suggestions.
