I’m working on a problem where I need to look at simultaneity across drug regimens. I’ve coded a solution in SQL. It works, but it is woefully inefficient. I was hoping to adapt the IRanges package to solve the problem instead. I was wondering if you might be able to chime in on whether this is a reasonable thing to attempt, since the package was crafted with other things in mind.
By way of example, I need to count the maximum number of simultaneous drug regimens a patient was on over a period defined by the input file (for this use case, simultaneous means any therapies overlapping for >30 days). The input is reflected in the synthetic example below, where the subject was simultaneously on 5 drugs.
DRUGGROUP | IntervalStart | IntervalEnd | IntervalWidth |
ACE_ARB | 7/18/2011 | 9/30/2015 | 1535 |
Aldosterone_antagonists | 7/21/2011 | 9/30/2015 | 1532 |
Beta_Blockers | 8/11/2011 | 12/3/2011 | 114 |
Beta_Blockers | 10/19/2012 | 9/30/2015 | 1076 |
Dihydro_CCB | 7/23/2011 | 9/30/2015 | 1530 |
Diuretics | 7/21/2011 | 9/30/2015 | 1532 |
I truly appreciate any thoughts.
Hi,
It's not clear what you mean by "counting the maximum number of simultaneous drug regimens a patient was on". Let's say you've managed to store the above input in an IRanges object where the start and end are the IntervalStart and IntervalEnd counted in number of days since January 1st, 2011. The object would look something like this:
Note that you don't see it here but let's say that this object has a metadata column showing the drug group used on each time interval:
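One way such an object might be built from the table in the question is sketched below. The names regimens, origin, to_day and ir are illustrative assumptions, not taken from the original post, and the date format is assumed to be month/day/year.

```r
library(IRanges)

# Synthetic regimen table from the question (dates assumed to be month/day/year).
regimens <- data.frame(
  DRUGGROUP = c("ACE_ARB", "Aldosterone_antagonists", "Beta_Blockers",
                "Beta_Blockers", "Dihydro_CCB", "Diuretics"),
  IntervalStart = c("7/18/2011", "7/21/2011", "8/11/2011",
                    "10/19/2012", "7/23/2011", "7/21/2011"),
  IntervalEnd = c("9/30/2015", "9/30/2015", "12/3/2011",
                  "9/30/2015", "9/30/2015", "9/30/2015"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of days elapsed since January 1st, 2011.
origin <- as.Date("2011-01-01")
to_day <- function(x) as.integer(as.Date(x, format = "%m/%d/%Y") - origin)

# One range per regimen, with the drug group kept as a metadata column.
ir <- IRanges(start = to_day(regimens$IntervalStart),
              end = to_day(regimens$IntervalEnd))
mcols(ir) <- DataFrame(DRUGGROUP = regimens$DRUGGROUP)
ir
```

Note that IRanges widths count both endpoints, so they come out one day larger than the IntervalWidth column in the question.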
Can you clarify the output you would expect from "counting the maximum number of simultaneous drug regimens a patient was on"?
FWIW here are some basic operations you can do on this IRanges object. For example you can get the number of drug regimens the patient was on at any given time with coverage():
The run lengths of this Rle object are numbers of days. A more user-friendly representation of this is with the following data.frame:
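A minimal sketch of the coverage() step and of one possible data.frame layout follows. The start and end values are the question's dates converted by hand to days since January 1st, 2011; the column names and the cvg_df variable are assumptions made for illustration.

```r
library(IRanges)

# Intervals from the question, in days since 2011-01-01 (conversion assumed done).
ir <- IRanges(start = c(198, 201, 222, 657, 203, 201),
              end   = c(1733, 1733, 336, 1733, 1733, 1733))

# Number of regimens the patient was on at each day, as a run-length encoded vector.
cvg <- coverage(ir)

# Turn each run into one row: first day, last day, length and regimen count.
run_end <- cumsum(runLength(cvg))
run_start <- run_end - runLength(cvg) + 1L
origin <- as.Date("2011-01-01")
cvg_df <- data.frame(
  from = origin + run_start,
  to = origin + run_end,
  n_days = runLength(cvg),
  n_regimens = runValue(cvg)
)
cvg_df

# Largest number of regimens seen on any single day.
max(runValue(cvg))
```

The first row covers the days before the earliest regimen, where the count is 0.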
You can also use slice() on cvg to find the intervals of time when the patient was on a given number of drug regimens:
Note that you can use start() and end() on the output of slice() to get the interval starts and ends. Then add as.Date("2011/1/1") to them to turn them into dates again etc...
Cheers,
H.
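As a rough sketch of the slice() step described above, the following finds the stretches during which the patient was on 5 or more regimens and converts them back to dates. The last part is only one possible reading of the ">30 days" rule in the question (the largest count that is sustained for more than 30 consecutive days); the names on5, min_days, ks and max_simultaneous are illustrative assumptions.

```r
library(IRanges)

# Intervals from the question, in days since 2011-01-01 (conversion assumed done).
ir <- IRanges(start = c(198, 201, 222, 657, 203, 201),
              end   = c(1733, 1733, 336, 1733, 1733, 1733))
cvg <- coverage(ir)

# Stretches of time during which the patient was on 5 or more regimens.
on5 <- slice(cvg, lower = 5)

# Interval starts and ends turned back into dates.
origin <- as.Date("2011-01-01")
data.frame(from = origin + start(on5),
           to = origin + end(on5),
           n_days = width(on5))

# Assumed reading of the question: the largest k such that the patient was on
# at least k regimens for more than min_days consecutive days.
min_days <- 30
ks <- seq_len(max(runValue(cvg)))
qualifies <- vapply(ks,
                    function(k) any(width(slice(cvg, lower = k)) > min_days),
                    logical(1))
max_simultaneous <- max(ks[qualifies])
max_simultaneous
```

If no stretch exceeds min_days, ks[qualifies] is empty and max() warns, so real data would need a guard for that case.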