What is the relationship between the sample_id and NCBI_accession variables in sampleMetadata? For most samples, there is one NCBI_accession listed per sample_id. For some samples, however, NCBI_accession contains a semicolon-delimited list of accessions. For example, here is what the sample_id and NCBI_accession fields look like for VogtmannE_2016:
> sampleMetadata %>% filter(study_name == "VogtmannE_2016") %>% select(sample_id, NCBI_accession) %>% head()
  sample_id             NCBI_accession
1 MMRS11288076ST-27-0-0 ERR1293500;ERR1293499;ERR1293498;ERR1293497;ERR1293059;ERR1293058;ERR1293057;ERR1293056
2 MMRS11664448ST-27-0-0 ERR1293861;ERR1293860;ERR1293859;ERR1293858;ERR1293420;ERR1293419;ERR1293418;ERR1293417
3 MMRS11932626ST-27-0-0 ERR1293881;ERR1293880;ERR1293879;ERR1293878;ERR1293440;ERR1293439;ERR1293438;ERR1293437
4 MMRS12272136ST-27-0-0 ERR1293877;ERR1293876;ERR1293875;ERR1293874;ERR1293436;ERR1293435;ERR1293434;ERR1293433
5 MMRS14379078ST-27-0-0 ERR1293548;ERR1293547;ERR1293546;ERR1293545;ERR1293107;ERR1293106;ERR1293105;ERR1293104
6 MMRS14602194ST-27-0-0 ERR1293813;ERR1293812;ERR1293811;ERR1293810;ERR1293372;ERR1293371;ERR1293370;ERR1293369
How should I interpret multiple NCBI accessions per sample? Are these multiple sequencing runs of the same library? For the bioinformatic analyses, were the reads from these accessions concatenated for each sample?
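In case it helps anyone look at the same thing, here is a minimal sketch of how the semicolon-delimited field could be split to count accessions per sample and get one row per accession. It uses base R only and a toy data frame (`meta`, a name made up for this example) copied from the first two rows above, rather than the real sampleMetadata. It only reshapes the metadata and says nothing about how the reads were actually processed.

```r
# Toy data copied from the first two rows of the output above;
# in practice this would be the filtered sampleMetadata.
meta <- data.frame(
  sample_id = c("MMRS11288076ST-27-0-0", "MMRS11664448ST-27-0-0"),
  NCBI_accession = c(
    "ERR1293500;ERR1293499;ERR1293498;ERR1293497;ERR1293059;ERR1293058;ERR1293057;ERR1293056",
    "ERR1293861;ERR1293860;ERR1293859;ERR1293858;ERR1293420;ERR1293419;ERR1293418;ERR1293417"
  ),
  stringsAsFactors = FALSE
)

# Split each semicolon-delimited string into a character vector of accessions.
accession_list <- strsplit(meta$NCBI_accession, ";", fixed = TRUE)

# Number of accessions listed for each sample_id.
n_accessions <- lengths(accession_list)

# Long format: one row per (sample_id, accession) pair.
meta_long <- data.frame(
  sample_id = rep(meta$sample_id, times = n_accessions),
  NCBI_accession = unlist(accession_list),
  stringsAsFactors = FALSE
)
```

With the long table, `table(meta_long$sample_id)` gives the number of accessions per sample, and `duplicated(meta_long$NCBI_accession)` can be used to check whether any accession is listed under more than one sample.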
This makes sense to me. Thanks both for your replies.