I am following this tutorial to build pseudobulk counts and run DESeq2: https://hbctraining.github.io/scRNA-seq/lessons/pseudobulk_DESeq2_scrnaseq.html
The tutorial performs the DE analysis on B cells only. Shouldn't it be possible to do this for multiple cell types at the same time? I was able to subset the metadata to three cell types (for example B, CD4 T, and CD8 T cells), and I was also able to subset the counts to only these cells.
Below is the original code for the B cell cluster only. For my three-cell-type analysis I am stuck at "cluster_counts <- data.frame(counts[, which(colnames(counts) %in% rownames(cluster_metadata))])" because the row names of cluster_metadata cannot contain duplicates. The row names of the metadata are the sample IDs, and the same samples appear in every cell type, so each sample ID would occur three times (once per cell type).
Is there another way to look at multiple subsets of cells at once? Or should I go back to the aggregation step and only aggregate the 3 cell types I want?
# Subset the metadata to only the B cells
cluster_metadata <- metadata[which(metadata$cluster_id == clusters[1]), ]
head(cluster_metadata)
# Assign the rownames of the metadata to be the sample IDs
rownames(cluster_metadata) <- cluster_metadata$sample_id
head(cluster_metadata)
# Subset the counts to only the B cells
counts <- pb[[clusters[1]]]
cluster_counts <- data.frame(counts[, which(colnames(counts) %in% rownames(cluster_metadata))])
# Check that all of the row names of the metadata are the same and in the same order as the column names of the counts in order to use as input to DESeq2
all(rownames(cluster_metadata) == colnames(cluster_counts))
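
To make the duplicate row name problem concrete, here is a minimal sketch of one workaround I am considering: using a composite "cluster_sample" key instead of the bare sample ID. The toy `pb` and `metadata` objects, the sample and cell type names, and the `make_key` helper are all made up for illustration; they only mimic the shape of the tutorial objects and assume every per-cluster matrix has the same genes as rows. Whether it is statistically appropriate to put several cell types into a single DESeq2 model is a separate question that this sketch does not answer.

```r
# Toy stand-ins for the tutorial objects (hypothetical data, not my real dataset)
set.seed(1)
samples    <- c("ctrl101", "ctrl107", "stim101", "stim107")
cell_types <- c("B cells", "CD4 T cells", "CD8 T cells")

# pb: named list with one gene x sample count matrix per cell type
pb <- lapply(cell_types, function(ct) {
  matrix(rpois(5 * length(samples), lambda = 20), nrow = 5,
         dimnames = list(paste0("gene", 1:5), samples))
})
names(pb) <- cell_types

# metadata: one row per sample x cell type combination
metadata <- expand.grid(sample_id = samples, cluster_id = cell_types,
                        stringsAsFactors = FALSE)
metadata$group_id <- ifelse(grepl("^ctrl", metadata$sample_id), "ctrl", "stim")

# Hypothetical helper: composite key that is unique per cell type and sample
make_key <- function(cluster, sample) paste(cluster, sample, sep = "_")

# Rename the columns of each matrix with the composite key, then bind them
combined_counts <- do.call(cbind, lapply(names(pb), function(ct) {
  m <- pb[[ct]]
  colnames(m) <- make_key(ct, colnames(m))
  m
}))

# Use the same composite key as metadata row names, so nothing is duplicated
combined_metadata <- metadata[metadata$cluster_id %in% names(pb), ]
rownames(combined_metadata) <- make_key(combined_metadata$cluster_id,
                                        combined_metadata$sample_id)

# Index by name so the metadata rows follow the order of the count columns
combined_metadata <- combined_metadata[colnames(combined_counts), ]

# Same check as in the tutorial, now across all three cell types
all(rownames(combined_metadata) == colnames(combined_counts))
```

With this layout the cell type is still available as the cluster_id column of the metadata. The alternative I can see is to keep the tutorial code as it is and loop over the three entries of clusters, running one DESeq2 analysis per cell type.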