HOW TO FIND THE GENES SHARED BETWEEN TWO FILES
weinong han ▴ 270
@weinong-han-1250
Last seen 10.2 years ago
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20050727/552241ba/attachment.pl
@ting-yuan-liu-fhcrc-1221
Last seen 10.2 years ago
If you can read these two files into R with read.table or read.csv, you can then find the intersection of the two GenBank accession number columns.

For example, suppose the two files are read into two objects called file.1 and file.2, and the first column of each holds the GenBank ID. Then try this:

    commonGeneID <- intersect(file.1[,1], file.2[,1])
    file.1.common <- file.1[sapply(commonGeneID, function(x) which(x==file.1[,1])),]
    file.2.common <- file.2[sapply(commonGeneID, function(x) which(x==file.2[,1])),]

file.1.common and file.2.common then hold the rows of file.1 and file.2 whose IDs appear in both files. Write them out.

Is this what you want?

HTH,
Ting-Yuan

On Wed, 27 Jul 2005, weinong han wrote:

> Dear All,
>
> I am confronted with another problem and need your help again.
> I have two .txt files, each containing GenBank accession numbers. I want to find the genes with the same GB accession numbers in the two .txt files. Excel cannot open the bigger file, so I cannot run the match using Excel.
>
> Does anybody have experience with this? Any suggestions and advice will be much appreciated.
>
> My OS is Windows XP.
>
> Thanks in advance.
>
> Best Regards
>
> Han Weinong
> hanweinong at yahoo.com
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
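For anyone who wants to try the intersect approach from this reply end to end, here is a minimal sketch. It is not Ting-Yuan's exact code: the sapply/which lookup is swapped for match(), which picks the first occurrence of each ID and so still works if an accession number is repeated within a file. The data frames, column names, accession values, and output file name are all made up for illustration.

```r
# Hypothetical stand-ins for the two files; in practice these would come
# from read.table() or read.csv(). All values here are invented.
file.1 <- data.frame(GenBankID = c("AB000001", "AB000002", "AB000003", "AB000004"),
                     value = c(1.2, 3.4, 5.6, 7.8),
                     stringsAsFactors = FALSE)
file.2 <- data.frame(GenBankID = c("AB000003", "AB000005", "AB000001"),
                     score = c(10, 20, 30),
                     stringsAsFactors = FALSE)

# Accession numbers present in both files
commonGeneID <- intersect(file.1[, 1], file.2[, 1])

# match() gives the row of the first occurrence of each common ID,
# so both subsets come out in the same row order.
file.1.common <- file.1[match(commonGeneID, file.1[, 1]), ]
file.2.common <- file.2[match(commonGeneID, file.2[, 1]), ]

# Write one result out as tab-delimited text (placeholder file in tempdir())
out.file <- file.path(tempdir(), "file1_common.txt")
write.table(file.1.common, out.file, sep = "\t", quote = FALSE, row.names = FALSE)
```

If every row for a repeated ID should be kept rather than only the first, subsetting with file.1[file.1[, 1] %in% commonGeneID, ] is one alternative.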
@adaikalavan-ramasamy-675
Last seen 10.2 years ago
help(merge) might be useful.
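As a rough illustration of what merge can do here, the sketch below joins two small data frames on a shared accession column. The data, the column name GenBankID, and the other column names are assumptions made up for the example; real files would first be read in with read.table or similar.

```r
# Hypothetical data standing in for the two files (values are invented)
file.1 <- data.frame(GenBankID = c("AB000001", "AB000002", "AB000003", "AB000004"),
                     value = c(1.2, 3.4, 5.6, 7.8),
                     stringsAsFactors = FALSE)
file.2 <- data.frame(GenBankID = c("AB000003", "AB000005", "AB000001"),
                     score = c(10, 20, 30),
                     stringsAsFactors = FALSE)

# With the default all = FALSE, merge() keeps only rows whose key
# appears in both data frames (an inner join on GenBankID).
merged <- merge(file.1, file.2, by = "GenBankID")

print(merged)
```

The result has one row per shared accession number, with the columns of both files side by side. If the key column is named differently in the two files, by.x and by.y can be used instead of by.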
@adaikalavan-ramasamy-675
Last seen 10.2 years ago
Actually I would mind, for two reasons:

1) R/Bioconductor is a volunteer-based mailing list.
2) You will not learn until you try it yourself.

However, here is a short document I wrote for a friend a while ago. It might help you get started. There are better documents out there if you look in the Documents section of www.R-project.org.

Regards,
Adai

On Wed, 2005-07-27 at 21:21 -0700, weinong han wrote:

> Hi, Dr. Adaikalavan Ramasamy,
> I am a green hand in R. Would you mind merging the data for me?
>
> I am looking forward to your reply.
>
> Thanks in advance.

A non-text attachment was scrubbed...
Name: merge_in_R.pdf
Type: application/pdf
Size: 43962 bytes
Desc: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20050728/df3ad9e3/merge_in_R.pdf
rgentleman ★ 5.5k
@rgentleman-7725
Last seen 9.6 years ago
United States
Hi Han,

If they are delimited files (csv, tab, or similar) you can use read.table or similar functions in R (the man pages describe alternatives). Make sure you set the as.is parameter (otherwise you get factors), and colClasses, which lets you skip over columns (probably what you want, if you only want to find which IDs are in common).

Then you can use match or %in%; again, the man pages will tell you which arguments you can set.

Best wishes,
Robert
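To make the read.table options mentioned in this reply concrete, here is a small sketch. It first writes two tiny tab-delimited files to a temporary location so that it runs on its own; the file contents, column layout, and variable names are invented, and real files will need their own sep, header, and colClasses settings.

```r
# Create two small tab-delimited files so the sketch is runnable.
# Contents and column layout are assumptions for illustration only.
f1 <- tempfile(fileext = ".txt")
f2 <- tempfile(fileext = ".txt")
writeLines(c("GenBankID\tvalue\tnote",
             "AB000001\t1.2\ta",
             "AB000002\t3.4\tb",
             "AB000003\t5.6\tc"), f1)
writeLines(c("GenBankID\tscore",
             "AB000003\t10",
             "AB000005\t20",
             "AB000001\t30"), f2)

# colClasses: "character" keeps the ID column as text, "NULL" skips a column,
# so only the accession numbers are read.
ids.1 <- read.table(f1, header = TRUE, sep = "\t",
                    colClasses = c("character", "NULL", "NULL"))[, 1]
ids.2 <- read.table(f2, header = TRUE, sep = "\t",
                    colClasses = c("character", "NULL"))[, 1]

# as.is = TRUE reads every column but leaves strings as character, not factor
full.1 <- read.table(f1, header = TRUE, sep = "\t", as.is = TRUE)

in.both <- ids.1 %in% ids.2        # TRUE FALSE TRUE
common  <- ids.1[in.both]          # IDs from file 1 that also occur in file 2
pos     <- match(common, ids.2)    # their row positions in file 2: 3 1

print(common)
print(full.1[in.both, ])           # the matching rows of file 1
```

Reading only the ID column keeps memory use down for a large file; %in% answers "is it there", while match also says where.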