Dear All,
I would like to ask a specific question about merging the common rows of multiple data frames while keeping the statistics in their columns. After some preprocessing, I have two data frames whose rows are gene symbols and whose columns hold statistics such as the mean and standard deviation. To merge the two data frames, keeping only the common rows (symbols) and the corresponding statistics, I tried the following:
dim(a2)
[1] 18172     6
head(a2) # the first data frame
      Control.Mean_GSE8993 Control.SD_GSE8993 IR.Mean_GSE8993 IR.SD_GSE8993
CXCL3            11.040152           1.926972       12.853060     1.2865673
CXCL8            11.023752           2.053631       12.215751     1.3443649
CXCL2            12.955409           1.644389       14.533220     1.0109802
AREG             12.570493           1.892597       14.029891     0.6818683
NR4A2             8.366594           1.357853        9.642525     1.3426967
CXCL6            11.827066           1.882204       12.727931     0.9607712
      BY.Mean_GSE8993 BY.SD_GSE8993
CXCL3        8.421211     0.7262789
CXCL8        7.999791     0.7654306
CXCL2       10.828815     0.7256327
AREG        10.439334     0.6265784
NR4A2        7.282875     0.9353340
CXCL6        9.592875     0.5901343
and the second data frame:
dim(d2)
[1] 18173     6
head(d2)
         Control.Mean Control.SD   IR.Mean    IR.SD   BY.Mean    BY.SD
LENG8       10.953919   2.044573 10.850738 2.283272 10.445768 1.946263
FOSB         8.944820   2.113943  9.509101 1.956087  9.099309 2.023522
FOXE1        9.940223   1.966307 10.307348 1.968307  9.783286 1.594923
CACNA1E     11.123550   1.915697 11.471386 1.898161 10.892187 1.528324
CYB561D1    11.285938   2.024681 11.496708 2.184631 10.813287 1.473858
IL6         11.551701   1.631311 12.415631 2.638829 11.385419 1.902916
and then I tried:
m <- combine(a2, d2)
dim(m)
[1] 18178    12
My main concern is that, according to a Venn diagram, the two data frames above have 18165 gene symbols in common, while the merged data frame has 18178 rows (symbols). So what is wrong with this approach? Should I use another function or approach?
And finally, could this be applied to more than two data frames?
Any ideas or help are appreciated!
Konstantinos
I do not know of a way to merge more than two data sets in a single call; merge them sequentially, two at a time.
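As for the row count: m has more rows than either input (18178 versus 18172 and 18173), which suggests that combine() kept symbols found in only one of the data frames instead of restricting to the shared ones. A minimal sketch of one way to keep only the common symbols, and to repeat that sequentially over several data frames, is given below. It uses only base R; the toy data frames, the third frame e2 and the helper merge_common() are made up for illustration, and it assumes the column names differ between the frames (as they do for a2 and d2).

```r
# Toy stand-ins for a2 and d2 (values invented for illustration only)
a2 <- data.frame(Control.Mean_GSE8993 = c(11.0, 11.2, 12.9),
                 Control.SD_GSE8993   = c(1.9, 2.0, 1.6),
                 row.names = c("CXCL3", "CXCL8", "IL6"))
d2 <- data.frame(Control.Mean = c(10.9, 8.9, 11.5),
                 Control.SD   = c(2.0, 2.1, 1.6),
                 row.names = c("LENG8", "CXCL3", "IL6"))
# Hypothetical third data frame, to show the sequential case
e2 <- data.frame(Control.Mean_C = c(9.1, 10.3),
                 row.names = c("IL6", "CXCL3"))

# Two data frames: keep only the symbols present in both,
# index both frames by that vector so the rows line up, then bind columns
common <- intersect(rownames(a2), rownames(d2))
m <- cbind(a2[common, , drop = FALSE], d2[common, , drop = FALSE])

# Hypothetical helper doing the same for one pair of data frames
merge_common <- function(x, y) {
  common <- intersect(rownames(x), rownames(y))
  cbind(x[common, , drop = FALSE], y[common, , drop = FALSE])
}

# More than two data frames: Reduce() applies the helper sequentially,
# i.e. merge_common(merge_common(a2, d2), e2)
m_all <- Reduce(merge_common, list(a2, d2, e2))

dim(m)      # rows = number of shared symbols
dim(m_all)
```

With the real data, nrow(m) should then equal the 18165 shared symbols seen in the Venn diagram. Base R's merge(a2, d2, by = "row.names") is an alternative for a single pair; it also keeps only the matching rows by default, but returns the symbols in a Row.names column instead of as row names.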