Cell sample size vs sequencing depth: find your compromise.

With newer technologies enabling the screening of ever more cells at ever lower cost, the days of intensive labor on a small number of cells are long gone. scRNA-seq can now support a wide range of experimental scales, typically from 10^2 to 10^6 cells processed in parallel. However, in-depth sequencing of hundreds of thousands of individual cells would overload most sequencing platforms, incur huge overhead costs, and produce massive datasets to analyze.

 

A better variable to consider for your experiments is the number of reads per cell, which you can adapt to the biological purpose of your study. A smaller number of cells sequenced at high depth should provide more robust transcriptomic data, reducing the impact of technical noise and giving a more reliable snapshot of the transcriptional state of each cell. Conversely, a larger number of cells sequenced at low depth gives a better representation of a cell population, particularly in the presence of multiple subtypes or potential rare cell types.

 

This Nature Communications paper suggests that “given a fixed [sequencing] budget, sequencing as many cells as possible at approximately one read per cell per gene is optimal, both theoretically and experimentally.” With this in mind, always keep the aim of your research question in view. For example, if you are trying to identify novel cellular subtypes or quantify rare cells in a biological sample, plan towards the upper limit of cell numbers your scRNA-seq protocol can handle (10,000 to 50,000 reads per cell can suffice for this purpose [1][2]). If you are trying to characterize a cellular subtype through accurate gene expression estimation, plan for greater sequencing depth.
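To make the trade-off concrete, here is a minimal sketch of the arithmetic behind a fixed sequencing budget: the more reads you assign to each cell, the fewer cells you can afford. The total of 400 million reads is a purely hypothetical figure for illustration, not the specification of any particular platform, and the function name is ours.

```python
def cells_for_budget(total_reads: int, reads_per_cell: int) -> int:
    """Number of cells that a fixed read budget can cover at a given depth."""
    if total_reads <= 0 or reads_per_cell <= 0:
        raise ValueError("total_reads and reads_per_cell must be positive")
    return total_reads // reads_per_cell


# Hypothetical budget (assumption): 400 million reads for the whole run.
TOTAL_READS = 400_000_000

# Depths taken from the ranges quoted in this article.
for depth in (10_000, 50_000, 1_000_000):
    n_cells = cells_for_budget(TOTAL_READS, depth)
    print(f"{depth:>9,} reads/cell -> {n_cells:>6,} cells")
```

This ignores reads lost to empty droplets, doublets and low-quality cells, so treat the result as an upper bound on the usable cell count.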

 

These are general guidelines, from which you need to make practical decisions depending on your sample and the limitations of the cell capture methods you have chosen or have access to (see below). Here are some questions to consider:

  • Given the nature and the extraction method of your sample, how many biological and technical replicates can you afford in a single scRNA-seq run?
  • If you expect a heterogeneous tissue with numerous subtypes and rare cells, what is the minimum number of cells you need to process to have a high chance of identifying them? (Hint: you can play around with this tool for rough estimations).
  • How much RNA does your sample tissue typically yield? Organs such as the heart, the spleen, the liver or the kidneys usually yield plenty of RNA (in humans), whereas muscles, bone and adipose tissues provide up to ten times fewer RNA molecules.
  • Are you aiming for 3′ sequencing (low depth with ≈ 50,000 reads per cell) or for full-transcript sequencing (high depth with ≈ 1,000,000 reads per cell)?
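
For the rare-cell question above, a rough estimate can be sketched with a simple binomial model. This is only an illustration under strong assumptions: every captured cell is treated as an independent draw from the tissue, the rare type has a known frequency, and capture is unbiased across cell types. The function names are ours and are not related to the tool mentioned in the list.

```python
from math import exp, lgamma, log, log1p


def prob_at_least(n_cells: int, frequency: float, min_rare: int) -> float:
    """P(at least `min_rare` rare cells among `n_cells`), binomial model."""
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if n_cells < min_rare:
        return 0.0
    below = 0.0
    for i in range(min_rare):
        # Binomial probability of exactly i rare cells, computed in log
        # space so that large n_cells does not overflow.
        log_pmf = (
            lgamma(n_cells + 1) - lgamma(i + 1) - lgamma(n_cells - i + 1)
            + i * log(frequency) + (n_cells - i) * log1p(-frequency)
        )
        below += exp(log_pmf)
    return max(0.0, 1.0 - below)


def min_cells_for_rare_type(frequency: float, min_rare: int = 1,
                            confidence: float = 0.95) -> int:
    """Smallest cell count giving the requested chance of success."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    lo = max(min_rare, 1)
    hi = lo
    # Double the upper bound until it is large enough...
    while prob_at_least(hi, frequency, min_rare) < confidence:
        hi *= 2
    # ...then bisect (the probability increases with the number of cells).
    while lo < hi:
        mid = (lo + hi) // 2
        if prob_at_least(mid, frequency, min_rare) >= confidence:
            hi = mid
        else:
            lo = mid + 1
    return lo


# Example: a subtype making up 1% of the tissue, at least 10 cells wanted.
print(min_cells_for_rare_type(frequency=0.01, min_rare=10, confidence=0.95))
```

Real experiments lose cells to filtering and may capture some cell types less efficiently than others, so it is prudent to plan for more cells than this kind of estimate suggests.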