filter low expression tags
@vittoria-roncalli-5633
Last seen 10.1 years ago
Hi,

I would like to understand how the filter for low-expression tags works. If I run the commands

    keep <- rowSums(cpm(d) > 100) >= 2
    d <- d[keep,]
    dim(d)

as in the user's guide (page 32), I am using a cutoff of 100 CPM, but how are the 2 samples treated? Are they averaged before the low tags are removed, or is each sample considered separately and filtered by itself?

Thanks in advance for the help,
Vittoria Roncalli

--
Vittoria Roncalli
Graduate Research Assistant
Center Békésy Laboratory of Neurobiology
Pacific Biosciences Research Center
University of Hawaii at Manoa
1993 East-West Road
Honolulu, HI 96822 USA
Tel: 808-4695693
@steve-lianoglou-2771
Last seen 19 months ago
United States
Hi,

On Wed, Nov 28, 2012 at 10:14 PM, Vittoria Roncalli <roncalli at hawaii.edu> wrote:
> Hi,
>
> I would like to understand how the filter for low-expression tags works. If
> I run the commands
>
>     keep <- rowSums(cpm(d) > 100) >= 2
>     d <- d[keep,]
>     dim(d)
>
> as in the user's guide (page 32), I am using a cutoff of 100 CPM,
> but how are the 2 samples treated? Are they averaged before the low
> tags are removed?
> Is each sample considered separately and filtered by itself?
> Thanks in advance for the help

How many samples (columns) do you have?

You should first look at the output of `cpm(d) > 100` to see what you are getting -- this will be a logical (boolean) matrix that has the same dimensions as `cpm(d)`.

Calling rowSums() on a logical matrix returns a vector with one element per row of that matrix, and each value indicates how many columns are TRUE in that row.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
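To make the intermediate steps visible, here is a minimal sketch in base R. The matrix `cpm_mat` is a made-up stand-in for the output of `cpm(d)` (the gene and sample names and all values are invented for illustration), so it runs without edgeR installed.

```r
# Toy CPM matrix: 4 genes (rows) x 3 samples (columns).
# All values are invented; in a real analysis this would come from cpm(d).
cpm_mat <- matrix(
  c(150, 120,  90,
     50, 200,  30,
    300, 400, 500,
     10,  20,  30),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Step 1: element-wise comparison gives a logical matrix of the same shape.
above <- cpm_mat > 100

# Step 2: TRUE counts as 1, so rowSums() counts, per gene,
# the number of samples whose CPM exceeds the cutoff.
n_above <- rowSums(above)

# Step 3: keep genes that exceed the cutoff in at least 2 samples.
keep <- n_above >= 2

print(above)
print(n_above)
print(cpm_mat[keep, , drop = FALSE])
```

In this toy example gene1 and gene3 exceed 100 CPM in at least two samples and are kept, while gene2 (one sample) and gene4 (none) are dropped.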
You keep the genes where at least 2 samples have a cpm greater than 100.

rowSums(cpm(d) > 100) counts, for each gene (row), how many samples have a cpm > 100. So the samples are not averaged: each sample's cpm is compared to the cutoff on its own, and a gene is kept only if enough samples pass.

Kasper
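The difference from averaging can be illustrated with a small sketch, again in base R. Everything here is hypothetical: the counts are invented, the `rest` row only pads each library to one million reads, and the column scaling is a plain stand-in for edgeR's `cpm()` (which, when given a DGEList, also takes the normalization factors into account). The `keep_mean` rule is not from the user's guide; it is shown only for contrast.

```r
# Hypothetical raw counts: 3 rows x 2 samples. The "rest" row pads each
# library to 1e6 reads so that CPM values are easy to read off.
counts <- matrix(
  c(   300,      0,
       150,    150,
    999550, 999850),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "rest"), c("s1", "s2"))
)

# Base-R stand-in for cpm(): divide each column by its library size
# and scale to counts per million.
lib_size <- colSums(counts)
cpm_mat <- t(t(counts) / lib_size) * 1e6

# Filter from the thread: CPM above 100 in at least 2 samples.
keep_count <- rowSums(cpm_mat > 100) >= 2

# Hypothetical averaging rule, for contrast only.
keep_mean <- rowMeans(cpm_mat) > 100

print(data.frame(keep_count, keep_mean))
```

Here geneA has a mean CPM of about 150 but exceeds the cutoff in only one sample, so the count-based filter drops it even though an averaging rule would have kept it.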