Delete all rows which contain a GeneID from a list with several GeneIDs
ChIP-Tease • 0

Hello everybody,

Maybe someone has an idea how to solve my problem.

I have a data frame that contains all GeneIDs and a value for each gene, like this:

all_genes (around 23000 rows)

GeneID Value
Gene1 4
Gene2 2
Gene3 9
Gene4 0
Gene5 2

I also have a vector of GeneIDs: Genelist <- c("Gene2", "Gene4", "Gene5"). Eventually my Genelist will be a few thousand genes long. I would like to delete all rows that contain a gene from Genelist, to get this:

GeneID Value
Gene1 4
Gene3 9

I tried to use setdiff, but this only works if you specify a column.
Something like all_genes[-grep(Genelist, all_genes$GeneID), ] only works for one gene and not for several, because grep takes a single pattern.

This is the only way I can make it work, but it takes rather long.

for (i in 1:length(Genelist)){
  all_genes <- all_genes[all_genes$GeneID != Genelist[i], ]
}

OR
for (i in seq_along(Genelist)){
  # negate with !, not unary minus: -(logical) yields 0/-1 indices, not a row mask
  all_genes <- all_genes[!(all_genes$GeneID == Genelist[i]), ]
}

I'd be very happy for some advice, because I face this problem frequently in different setups.
Thanks a lot, Alex

 

Jeremy Ng ▴ 180
Does this work?

all_genes[!all_genes$GeneID %in% Genelist, ]

Jeremy
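
For anyone who wants to try the suggestion above, here is a minimal sketch on the toy data from the question. It assumes GeneID is a character column and that Genelist is a character vector; the name `filtered` is only used for illustration. `%in%` produces one TRUE/FALSE per row in a single vectorised pass, so no loop over Genelist is needed.

```r
# Toy data mirroring the example in the question
all_genes <- data.frame(
  GeneID = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5"),
  Value  = c(4, 2, 9, 0, 2),
  stringsAsFactors = FALSE
)
Genelist <- c("Gene2", "Gene4", "Gene5")

# TRUE for every row whose GeneID appears in Genelist
in_list <- all_genes$GeneID %in% Genelist

# Keep only the rows that are NOT in Genelist
filtered <- all_genes[!in_list, ]
print(filtered)
```

On this data only the Gene1 and Gene3 rows remain. Because the row index is a logical mask rather than a negative position, this should also behave sensibly when none of the genes in Genelist occur in the data frame (all rows are kept).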
ChIP-Tease • 0

Thanks Jeremy, works perfectly, exactly what I was looking for :D
