Question

Adding SummarizedExperiment function

rbronste ▴ 60

@rbronste-12189

Last seen 5.5 years ago

Kind of a basic question: what's the easiest way to add one of these that represents a column in a custom GRanges object? Thanks.

summarizedexperiment • 1.3k views

ADD COMMENT • link 8.3 years ago rbronste ▴ 60

What does 'add one of these' mean in this context?

ADD REPLY • link 8.3 years ago James W. MacDonald 68k

I just mean a GRanges that has, for instance, columns like seqnames, start, and end, and can be queried with something like:

stuff <- stuff.DB[seqnames(stuff.DB) == 'chrY']

I'm trying to figure out how to do the same for other columns, like FDR, that are not in SummarizedExperiments.

ADD REPLY • link 8.3 years ago rbronste ▴ 60

score 2 · Answer 1 · 2017-01-25

You can add anything you want in the mcols of the GRanges object and query on that at will. As a test, let's use the example for SummarizedExperiment:

> library(SummarizedExperiment)
> example("SummarizedExperiment")
> rse
class: RangedSummarizedExperiment
dim: 200 6
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(6): A B ... E F
colData names(1): Treatment
> rowRanges(rse)
GRanges object with 200 ranges and 1 metadata column:
        seqnames           ranges strand |  feature_id
           <Rle>        <IRanges>  <Rle> | <character>
    [1]     chr1 [556101, 556200]      - |       ID001
    [2]     chr1 [792975, 793074]      - |       ID002
    [3]     chr1 [263755, 263854]      - |       ID003
    [4]     chr1 [714331, 714430]      + |       ID004
    [5]     chr1 [900677, 900776]      - |       ID005
    ...      ...              ...    ... .         ...
  [196]     chr2 [495890, 495989]      - |       ID196
  [197]     chr2 [222582, 222681]      - |       ID197
  [198]     chr2 [666857, 666956]      + |       ID198
  [199]     chr2 [404246, 404345]      - |       ID199
  [200]     chr2 [540493, 540592]      - |       ID200
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

> z <- rse[mcols(rse)$feature_id %in% paste0("ID", sprintf("%03d", 1:5)),]
> rowRanges(z)
GRanges object with 5 ranges and 1 metadata column:
      seqnames           ranges strand |  feature_id
         <Rle>        <IRanges>  <Rle> | <character>
  [1]     chr1 [556101, 556200]      - |       ID001
  [2]     chr1 [792975, 793074]      - |       ID002
  [3]     chr1 [263755, 263854]      - |       ID003
  [4]     chr1 [714331, 714430]      + |       ID004
  [5]     chr1 [900677, 900776]      - |       ID005
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
> assays(z)[[1]]
            A        B        C        D        E        F
[1,] 9.390704 9.088845 9.726846 9.569678 9.744423 9.664979
[2,] 9.823552 7.222012 5.752299 9.486667 9.746595 8.313257
[3,] 9.496478 7.672814 9.604351 8.800272 8.292126 9.857548
[4,] 8.580828 9.613288 9.681698 9.270826 8.690414 9.233475
[5,] 9.596227 8.729721 9.739728 8.628168 8.309004 6.797500
> colData(z)
DataFrame with 6 rows and 1 column
    Treatment
  <character>
A        ChIP
B       Input
C        ChIP
D       Input
E        ChIP
F       Input

You can have as many columns as you like in the mcols slot, and add them whenever you want:

> mcols(rse)$whatevs <- rnorm(nrow(rse))
> mcols(rse)$addonemore <- rnorm(nrow(rse))
> rowRanges(rse)
GRanges object with 200 ranges and 3 metadata columns:
        seqnames           ranges strand |  feature_id     whatevs  addonemore
           <Rle>        <IRanges>  <Rle> | <character>   <numeric>   <numeric>
    [1]     chr1 [556101, 556200]      - |       ID001 -0.05584487  -0.6773722
    [2]     chr1 [792975, 793074]      - |       ID002  1.01721394  -0.8628047
    [3]     chr1 [263755, 263854]      - |       ID003  0.67180836   0.4902122
    [4]     chr1 [714331, 714430]      + |       ID004  0.03497479  -2.5660873
    [5]     chr1 [900677, 900776]      - |       ID005 -1.58957034   1.3208983
    ...      ...              ...    ... .         ...         ...         ...
  [196]     chr2 [495890, 495989]      - |       ID196 -0.06389269 -2.75149592
  [197]     chr2 [222582, 222681]      - |       ID197 -1.55996247  1.27020433
  [198]     chr2 [666857, 666956]      + |       ID198  0.36173020  0.49610959
  [199]     chr2 [404246, 404345]      - |       ID199 -1.24144376 -0.31007126
  [200]     chr2 [540493, 540592]      - |       ID200 -0.60194563  0.02290882
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
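
To tie this back to the chrY-style query in the question, here is a minimal sketch of the same subsetting pattern applied to a metadata column. The toy GRanges and its FDR and Fold columns are made up for illustration and are not from any real DiffBind output; only the subsetting idiom matters.

```r
library(GenomicRanges)

# Toy GRanges; the FDR and Fold metadata columns are hypothetical stand-ins
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chrY", "chrY"),
  ranges   = IRanges(start = c(100, 500, 200, 900), width = 100),
  FDR      = c(0.001, 0.20, 0.03, 0.60),
  Fold     = c(2.5, -0.4, -1.8, 0.1)
)

# Same idea as filtering on seqnames, but using a metadata column
sig <- gr[mcols(gr)$FDR < 0.05]

# Conditions on seqnames and metadata columns can be combined with & and |
on_chrY  <- as.vector(seqnames(gr) == "chrY")
sig_chrY <- gr[on_chrY & mcols(gr)$FDR < 0.05]
```

Anything that evaluates to a logical vector with one element per range can go inside the brackets, so the metadata columns behave much like columns of a data.frame here.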

score 0 · Answer 2 · 2017-01-25

rbronste ▴ 60

I guess I am still a little confused. I am using a DiffBind output that has a number of columns according to the sampleSheet. The basic thing I want to do is to be able to filter and sort by any specific column, or by multiple columns simultaneously, such as FDR and fold change.

ADD COMMENT • link 8.3 years ago rbronste ▴ 60

If you want to make a comment, please use the ADD COMMENT button rather than the 'Add your answer' box, which is intended for answers, not comments.

Your questions are needlessly mysterious. If you have an example of what you are trying to do, then maybe we can give some pointers.

But so far you are asking generalized questions like 'I want to filter and sort' which are just basic R manipulations. If you are having problems with basic R stuff, you should read 'An Introduction to R', and note that a SummarizedExperiment is intended to act as if it were a data.frame, so anything you can do with a data.frame will work pretty much the same way.
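
As a rough illustration of the data.frame-like behaviour, the sketch below filters a small RangedSummarizedExperiment on two row-level columns at once and then sorts the result. The object, the counts, and the FDR and Fold column names are invented for the example; a real DiffBind report may name its columns differently.

```r
library(SummarizedExperiment)

set.seed(1)
# Toy count matrix: 6 features by 4 samples
counts <- matrix(rpois(24, lambda = 10), nrow = 6,
                 dimnames = list(NULL, c("A", "B", "C", "D")))

# Row ranges with hypothetical FDR and Fold metadata columns
rr <- GRanges(
  seqnames = rep(c("chr1", "chr2"), each = 3),
  ranges   = IRanges(start = seq(100, 600, by = 100), width = 50),
  FDR      = c(0.30, 0.01, 0.04, 0.50, 0.002, 0.20),
  Fold     = c(0.2, -2.1, 1.4, 0.1, 3.0, -0.5)
)
se <- SummarizedExperiment(assays = list(counts = counts), rowRanges = rr)

# Filter on two columns simultaneously, as with data.frame rows
keep <- rowData(se)$FDR < 0.05 & abs(rowData(se)$Fold) > 1
hits <- se[keep, ]

# Sort the surviving rows by increasing FDR
hits <- hits[order(rowData(hits)$FDR), ]
```

Because the subsetting is done on the whole object, the assay rows and the row ranges stay in sync after filtering and sorting.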

ADD REPLY • link 8.3 years ago James W. MacDonald 68k