Ballgown on StringTie results reports transcripts that are not expressed in either condition as differentially expressed
alva.james • 0
@alvajames-6967
Last seen 6.3 years ago
Germany

Hello All,

I ran ballgown on StringTie output (STAR alignment -> StringTie) and used stattest to get differentially expressed transcripts. But when I take the median within each condition, I see that for several transcripts the median is 0 in both conditions and the transcript still has a p-value below 0.05.

So these were classified as DE transcripts. I would like to get differentially expressed transcripts from the StringTie results, and from the GitHub explanation I understood that stattest does this. I am wondering how it works: does it take fold change into account the way DESeq, edgeR etc. do?
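On the question of how the test works: as far as I understand the ballgown documentation, stattest fits two nested linear models to log-transformed expression and compares them with an F-test, so the p-value does not use fold change directly (the function also has a getFC argument for two-group comparisons). The base-R sketch below illustrates that idea on made-up FPKM values for a single transcript. It leaves out ballgown's library-size adjustment, and all numbers and variable names are assumptions for illustration only.

```r
# Toy FPKM values for one transcript across 8 samples (hypothetical numbers).
fpkm  <- c(0, 0, 0, 0, 5.2, 4.8, 6.1, 5.5)
group <- factor(c("ID", "ID", "ID", "ID", "REL", "REL", "REL", "REL"))

# Log-transform as log2(FPKM + 1) so that zeros are defined.
y <- log2(fpkm + 1)

# Null model: one overall mean. Alternative model: one mean per group.
fit0 <- lm(y ~ 1)
fit1 <- lm(y ~ group)

# F-test comparing the nested models; a small p-value means the group term
# explains the data much better than a single mean.
pval <- anova(fit0, fit1)[["Pr(>F)"]][2]

# Fold change is not part of the test; it can be reported alongside it.
fc <- mean(fpkm[group == "REL"] + 1) / mean(fpkm[group == "ID"] + 1)

print(pval)
print(fc)
```

Note that the ID group here has a median of 0 while the test is still clearly significant, which is the situation where one condition is silent and the other is expressed.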

ballgown stringtie • 2.0k views
Jeff Leek ▴ 650
@jeff-leek-5015
Last seen 3.8 years ago
United States

This seems a little unusual - but we have seen it happen when a transcript has zero expression in one group and moderate expression in another, resulting in a relatively strong differential expression signal but a median expression of zero. I wonder if you could look and see what the expression values were for that transcript across samples? 
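One way to do that check is sketched below on a toy matrix. With a real ballgown object the matrix would presumably come from texpr(bg, "FPKM"), whose column names carry an extra prefix; the transcript IDs, values and simplified column names here are made up for illustration.

```r
# Hypothetical FPKM matrix: rows are transcripts, columns are samples.
fpkm <- matrix(
  c(0, 0, 0, 0, 0, 0, 0, 0,      # transcript 6: zero everywhere
    0, 3, 0, 4, 0, 5, 0, 6,      # transcript 11: zero in ID, expressed in REL
    2, 2, 3, 3, 2, 2, 3, 3),     # transcript 17: expressed in both groups
  nrow = 3, byrow = TRUE,
  dimnames = list(c("6", "11", "17"),
                  c("AE02_ID", "AE02_REL", "AE04_ID", "AE04_REL",
                    "AE05_ID", "AE05_REL", "AE10_ID", "AE10_REL"))
)

# Derive the group ("ID" or "REL") from the part of the name after "_".
group <- sub(".*_", "", colnames(fpkm))

# Median per group for every transcript.
med_id  <- apply(fpkm[, group == "ID",  drop = FALSE], 1, median)
med_rel <- apply(fpkm[, group == "REL", drop = FALSE], 1, median)

# Flag transcripts with no signal in any sample.
all_zero <- rowSums(fpkm) == 0

summary_table <- data.frame(med_id, med_rel, all_zero)
print(summary_table)
```

A transcript like "11" has a zero median in one group but real signal in the other, whereas a transcript like "6" is zero everywhere and should not come out as significant.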

 

Jeff


"I wonder if you could look and see what the expression values were for that transcript across samples?" --> The expression values are also zero across all samples for the transcripts that are identified as significantly DE.

 


It seems very strange that they have entirely zero values but a small p-value. This is just a simple linear model in Ballgown's stattest. Can you please post your data/code so I can try to assist?

 

Jeff


The code is just what I followed from the GitHub instructions:

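library(ballgown)
library(stringr)  # str_split_fixed(), str_replace_all()

# Sample names look like "AE02_ID"; the part after "_" is the condition.
# sort() is only safe if sampleNames(bg) is already in alphabetical order.
pData(bg) = data.frame(id=sort(sampleNames(bg)), group=sort(sampleNames(bg)))
pData(bg) <- cbind(pData(bg), as.data.frame(str_split_fixed(pData(bg)$group, "_", 2)))
pData(bg)$group <- NULL   # drop the copy of the full sample name
pData(bg)$V1 <- NULL      # drop the part before "_"
colnames(pData(bg)) <- c("id", "group")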

head(pData(bg))

         id group
1   AE02_ID    ID
2  AE02_REL   REL
3   AE04_ID    ID
4  AE04_REL   REL
5   AE05_ID    ID
6  AE05_REL   REL
7   AE10_ID    ID
8  AE10_REL   REL
# here I just replaced ID and REL with 1 and 0, to make sure it matches the GitHub explanation

pData(bg)$group<- str_replace_all(pData(bg)$group, "ID", "1")
pData(bg)$group<- str_replace_all(pData(bg)$group, "REL", "0")

stat_results = stattest(bg, feature='transcript', meas='FPKM', covariate='group')
head(stat_results)
      feature id      pval      qval
6  transcript  6 0.3325078 0.8064427
11 transcript 11 0.8350343 0.9564246
17 transcript 17 0.2149321 0.8064427
19 transcript 19 0.3622309 0.8064427
20 transcript 20 0.8265769 0.9538413
21 transcript 21 0.1647989 0.8064427

 

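# and then I filtered and annotated the resulting data frame

# which() drops rows whose p-value is NA instead of returning all-NA rows
stat_results_filtered = stat_results[which(stat_results[["pval"]] <= 0.05), ]

# stat_results names its identifier column 'id', so join it to 't_id' explicitly
results_withFPKM = merge(stat_results_filtered, transcript_data_frame, by.x='id', by.y='t_id')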
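Because a mismatched join key is an easy way to end up looking at the wrong transcripts' FPKM values, here is a small self-contained sketch of the filter-and-merge step on toy data. The two data frames only mimic the shape of the stattest output shown above and of a per-transcript table keyed by t_id; every value, and the t_name and FPKM.s1/FPKM.s2 columns, are hypothetical.

```r
# Toy stand-in for the stattest() output (columns: feature, id, pval, qval).
stat_results <- data.frame(
  feature = "transcript",
  id      = c("6", "11", "17", "19"),
  pval    = c(NaN, 0.01, 0.21, 0.04),
  qval    = c(NaN, 0.08, 0.81, 0.12)
)

# Toy stand-in for a per-transcript expression table keyed by t_id.
transcript_data_frame <- data.frame(
  t_id    = c(6, 11, 17, 19),
  t_name  = c("tA", "tB", "tC", "tD"),
  FPKM.s1 = c(0, 0.0, 2.0, 1.0),
  FPKM.s2 = c(0, 4.5, 2.5, 9.0)
)

# which() drops NA/NaN p-values instead of returning all-NA rows.
keep <- which(stat_results$pval <= 0.05)
stat_results_filtered <- stat_results[keep, ]

# Convert both keys to character so "11" and 11 match by value,
# not by row position or factor level.
stat_results_filtered$id   <- as.character(stat_results_filtered$id)
transcript_data_frame$t_id <- as.character(transcript_data_frame$t_id)

# Join on the transcript identifier, which is named differently in each table.
results_withFPKM <- merge(stat_results_filtered, transcript_data_frame,
                          by.x = "id", by.y = "t_id")
print(results_withFPKM)
```

After the merge it is worth spot-checking a few rows against the original expression table to confirm that each p-value sits next to the FPKM values of the same transcript.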
