Identify recurrent isoform switches using IsoformSwitchAnalyzeR in multiple rectal tumor samples
asmita • 0
@asmita-22096
Last seen 5.3 years ago

Dear All,

I am using the IsoformSwitchAnalyzeR package to identify recurrent isoform switches in a set of 24 rectal tumor samples. The isoform and gene quantification was done with RSEM.

However, the package requires two different conditions in the $condition column of the design matrix so that it can make pairwise comparisons. Since all my samples are rectal tumor samples, I was interested in all possible pairwise comparisons between these samples to identify recurrently occurring switches. I created the $condition column with a single label, "CRC", and the program terminated with an error. This is the code I was using:

library(IsoformSwitchAnalyzeR)
library(factoextra)
library(RColorBrewer)   # provides brewer.pal(), used for the PCA colour gradient

## create isoform quantification data (step 1: importIsoformExpression())

myquant <- importIsoformExpression("RSEM_isoform_files/", normalizationMethod = 'TMM', showProgress = TRUE)
meta_data <- read.csv('transcriptome_meta_data_test.tsv', sep='\t')

# Sanity-check the normalization with a PCA
tpm_mat <- myquant$abundance
rownames(tpm_mat) <- tpm_mat[, 1]   # use column 1 (isoform identifiers) as row names
tpm_mat[, 1] <- NULL                # drop the identifier column so only numeric columns remain

pca <- prcomp(t(tpm_mat))
fviz_pca_ind(pca,col.ind = "contrib", pointsize ="contrib", 
             gradient.cols = brewer.pal(10, "Spectral"),
             repel = TRUE, labelsize = 4)

# create the design matrix for importRdata() (next step)
sampleID <- colnames(myquant$counts)[-1]
condition <- rep('CRC', each=24)
intron_level <- c(rep("high", each=6), 'low', rep('high', each=7), rep('low', each=10))
batch <- as.character(meta_data$BATCH)

design_matrix <- cbind(sampleID, condition, intron_level, batch)
design_matrix <- data.frame(design_matrix) 

# import the data from the previous step to create a switchAnalyzeRlist (step 2: importRdata())

myswitchlist <- importRdata(
    isoformCountMatrix = myquant$counts,
    isoformRepExpression = myquant$abundance,
    designMatrix = design_matrix,
    isoformExonAnnoation = "Homo_sapiens.GRCh38.94.gtf",
    showProgress = TRUE,
    ignoreAfterPeriod = TRUE
)

I got this error at the second step, importRdata():

Step 1 of 6: Checking data...
Error in importRdata(isoformCountMatrix = myquant$counts, isoformRepExpression = myquant$abundance,  : 
  The supplied 'designMatrix' only contains 1 condition

Output of traceback():

2: stop("The supplied 'designMatrix' only contains 1 condition")
1: importRdata(isoformCountMatrix = myquant$counts, isoformRepExpression = myquant$abundance, 
       designMatrix = pheno_data, isoformExonAnnoation = "Homo_sapiens.GRCh38.94.gtf", 
       showProgress = TRUE, ignoreAfterPeriod = TRUE)

Sample of my design matrix:

sampleID  condition  intronlevel  batch
RIT1      CRC
RIT2      CRC
RIT3      CRC
RIT4      CRC
RIT5      CRC
...       ...        ...          ...

I don't have multiple conditions across which I can do a comparison.

Is there a way to do all possible pairwise comparisons between samples? How can I relax the requirement on the condition column in the design matrix?

cancer Isoform limma IsoformSwitchAnalyzeR • 1.5k views

Could you elaborate on what you mean by "recurrently occurring switches"?


I have RNA-Seq data from 24 tumor samples. By "recurrent switches" I mean an isoform switch, in other words a preferential isoform usage, that occurs across many samples.

For example, if a gene has two isoforms A and B, I would like to identify the percentage of samples in which isoform A is used more than B, or vice versa, indicating a switch between the two forms.

The idea is this: if 10 out of 24 samples show a higher (positive) dIF score for isoform A than for B, I can then take a step back and check whether these 10 samples belong to a specific pathologic group (tumor grade, stage, etc.). This is the opposite direction to the usual approach of first classifying the samples into groups according to their conditions and then performing pairwise comparisons between those conditions.
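To make the per-sample idea concrete, here is a minimal base-R sketch of the "percentage of samples where A is used more than B" calculation. The TPM values, sample names, and object names are all invented for illustration; this is not an IsoformSwitchAnalyzeR workflow, just the arithmetic behind the question.

```r
# Toy TPM values for one gene with two isoforms (A, B) across six samples.
# All numbers and names here are hypothetical.
tpm <- rbind(
  isoA = c(S1 = 10, S2 = 2, S3 = 8, S4 = 1, S5 = 9, S6 = 3),
  isoB = c(S1 = 5,  S2 = 6, S3 = 2, S4 = 9, S5 = 1, S6 = 3)
)

# Isoform fraction (IF) = isoform expression / total gene expression, per sample
iso_frac <- sweep(tpm, 2, colSums(tpm), "/")

# Samples in which isoform A is used more than isoform B (ties do not count)
a_dominant <- iso_frac["isoA", ] > iso_frac["isoB", ]

# Percentage of samples with A dominant, and which samples they are
pct_a_dominant <- 100 * mean(a_dominant)
a_samples <- names(which(a_dominant))
```

The sample names in `a_samples` could then be looked up in the clinical metadata to see whether they share a grade or stage.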

The usual workflow in IsoformSwitchAnalyzeR compares conditions to identify switches. Here, I want to make all possible comparisons among the 24 samples themselves and then classify them into different conditions. I hope this makes sense.

@kvittingseerup-7956
Last seen 17 months ago
European Union

Hi Asmita

Thanks for reaching out. You describe an interesting, but very hard to implement, idea, since switches in two different genes could use different groupings. IsoformSwitchAnalyzeR unfortunately does not support such an analysis: it requires the groupings up front. Doing all possible divisions of 24 samples into two groups results in more than 16 million possible groupings, so brute-forcing it is not feasible either.
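As a rough check of that number, the count of two-group divisions can be worked out directly. The sketch below assumes every sample is assigned to one of two groups and both groups must be non-empty; the variable names are only illustrative.

```r
n <- 24

# Each sample goes to group 1 or group 2; exclude the two assignments
# that leave one group empty. The two groups are treated as labelled.
labelled_splits <- 2^n - 2

# If swapping the two group labels is considered the same division,
# the count is halved.
unordered_splits <- 2^(n - 1) - 1

format(c(labelled = labelled_splits, unordered = unordered_splits), big.mark = ",")
```

The labelled count (16,777,214) is consistent with "more than 16 million"; even the unordered count (8,388,607) is far too many full switch analyses to run.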

The only suggestion I have is to try unsupervised clustering (PCA, dendrogram, etc.) on the isoform fractions to see whether some grouping shows itself when you analyse the relative isoform usage.

To get the isoform fractions from the TPM matrix you can use the isoformToIsoformFraction() function from IsoformSwitchAnalyzeR.
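A minimal sketch of this suggestion, using simulated data and base R only: the isoform fractions are computed by hand here (isoform TPM divided by the summed TPM of its gene) so that the example runs on its own; in a real analysis isoformToIsoformFraction() would take the place of that manual step. The annotation table, matrix contents, and the choice of two clusters are assumptions for illustration.

```r
set.seed(1)
n_samples <- 24

# Hypothetical isoform-to-gene annotation: 3 genes with 2 isoforms each
iso_anno <- data.frame(
  isoform_id = paste0("iso", 1:6),
  gene_id    = rep(c("geneA", "geneB", "geneC"), each = 2)
)

# Simulated TPM matrix (isoforms x samples); values are random, not real data
tpm <- matrix(
  rexp(6 * n_samples, rate = 0.1), nrow = 6,
  dimnames = list(iso_anno$isoform_id, paste0("RIT", 1:n_samples))
)

# Gene-level expression = sum of the TPMs of the gene's isoforms
gene_tpm <- rowsum(tpm, group = iso_anno$gene_id)

# Isoform fraction = isoform TPM / parent gene TPM, per sample
iso_frac <- tpm / gene_tpm[iso_anno$gene_id, ]

# Drop isoforms with undefined fractions (gene not expressed in some sample)
keep <- rowSums(!is.finite(iso_frac)) == 0
iso_frac <- iso_frac[keep, , drop = FALSE]

# Unsupervised views of the samples based on relative isoform usage
pca <- prcomp(t(iso_frac))                 # samples as rows
hc <- hclust(dist(t(iso_frac)))            # dendrogram of samples
groups <- cutree(hc, k = 2)                # k = 2 is an arbitrary choice
```

If a stable grouping emerges, `groups` could be supplied as the condition column of the design matrix for a regular two-condition analysis, keeping in mind that the grouping was derived from the same data.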

Cheers,
Kristoffer


Thanks for the response! I will try it out.
