A swathe of 17 linked studies took over an entire issue of Nature last October. The feat, from the BRAIN Initiative Cell Census Network (BICCN), an international consortium led by the Allen Institute for Brain Science in Seattle, US, and funded by the NIH’s BRAIN Initiative, represents the largest-scale demonstration of the power of single-cell sequencing to date.
The aim was to produce a comprehensive census of all cell types in one brain region – the primary motor cortex – across mice, non-human primates, and humans. This effort is crucial to advancing neuroscience. “If we can all agree on the components the problems become different,” says neuroscientist Prof. Ed Lein, of the Allen Institute, who helped coordinate the project and led two of the studies. “We’re not still in the ‘defining the system’ phase, we’re in the ‘understanding how the system works – and how it can go awry’ phase.”
A Singular Focus
At the heart of the work is single-cell RNA sequencing. “What single-cell genomics gives you, is on the one hand remarkable throughput, so you have the scale to tackle very complex systems,” says Lein. On the other hand, it offers the resolution needed to characterize individual cells. “We need to understand how each cell type deploys its genome differently,” says neuroscientist Dr. Fenna Krienen of Harvard Medical School, who worked on the cross-species study. “That’s what single-cell resolution enables.”
Focusing on a single brain region allowed the project to coordinate efforts and standardize methods. “The motor cortex was a great starting point for aligning efforts across different groups because, of all the parts of the brain, it is anatomically relatively well-defined,” says neuroscientist Dr. Aparna Bhaduri at the University of California, Los Angeles, who led a study of human brain development. “Such that we could develop the tools to define that anatomy, then start to scale to the rest of the mouse brain as well as larger organisms.” The package also included studies that addressed human brain development and evolution, and others that examined a few additional brain regions.
A Wealth of Tools
Single-cell RNA sequencing generates gene activity profiles, which, together with epigenetic markers (which alter gene activity without changing the underlying genetic code), provide a truly mechanistic basis for classifying cells. “Genomic profiling gets to the root of the identity of cells,” says neuroscientist Dr. Hongkui Zeng, Executive Vice President and Director of the Allen Institute for Brain Science, who co-led a study that integrated data from seven different RNA sequencing tools and two epigenetic techniques to derive a comprehensive cell census of the mouse primary motor cortex. The tools included SMART-Seq, which sequences full-length transcripts, as well as platforms that can analyze many more cells for the same cost, albeit with shallower sequencing. “If you integrate those two datasets together they compensate for each other,” says Zeng. They also used single-nucleus sequencing, which allows frozen samples to be used, because the damage that freezing does to cellular integrity mainly affects mRNA in the cytoplasm rather than in the nucleus. “[This] opened it up to being able to use any species, any archive material, disease material, anything,” says Lein. The epigenetic methods assayed DNA methylation (chemical marks that typically down-regulate gene activity) and regions of open chromatin, which reveal where the genome is accessible for transcription. These provide information on cell-type-specific regulatory mechanisms.
The project marshaled a greater number of methods for probing the features of brain cells than ever before. In addition to transcriptomics and epigenomics, other tools probed cell shape, neuron firing properties, and connectivity. Some even assessed multiple features simultaneously. A technique called Patch-seq, for instance, allows a cell’s electrical properties, gene expression, and 3D shape to all be measured. Using transcriptomes as the “ground truth” for defining cell types allowed the researchers to link information across multiple properties. Spatial transcriptomics methods, which combine RNA sequencing with imaging, were used to determine the spatial distribution of cell types in the mouse brain. “Some of the technologies are getting us to the point where, instead of just having the data on a computer, you can actually see it within the architecture of the brain,” says Bhaduri. “That will integrate a lot of our understanding of what cell types exist, but also how they work because where they’re located in the brain is really important.” Work has begun on embedding data in a 3D coordinate space – the beginnings of a cell atlas. “That will be transformative,” says Bhaduri.
A Treasure Trove of Data
These techniques had all been used before, but never together in a coordinated effort of this scale. “We don’t claim to have the largest number of cells profiled with each modality,” says Zeng. “But collectively, this package is the largest single-cell genomics study of cells.” By integrating datasets, the consortium derived a hierarchical taxonomy of brain cell populations, with major branches reflecting groupings with shared developmental origins. A first branch separates neural from non-neural cells, and a second divides neuronal from non-neuronal (glial) types. Neurons then split into excitatory (glutamatergic) and inhibitory (GABAergic) types. These then divide into 24 major “subclasses” (including the non-neural and glial types). The subclasses are further divided to arrive at the final level, called “t-types” (for “transcriptional”). At the subclass level, types are highly conserved, but details start to diverge significantly at the level of t-types, reflecting species specializations. The number of t-types differs between species (116 in mice, 127 in humans, 94 in marmosets), but by integrating data across species the researchers found 45 conserved t-types.
The project datasets are centrally organized by the BRAIN Cell Data Center (BCDC) and have been made publicly available via the BICCN web portal for researchers to scrutinize and use. The data provide researchers with molecular cell type markers, which will enable them to genetically target individual cell types using, for example, genetically engineered mice or viruses. Future work using these tools will advance our understanding of, and ability to treat, disease.
A Common Language
The NIH has already funded the next stage of the project, called the BRAIN Initiative Cell Atlas Network (BICAN). That next leap in scale will come with new demands. “We’ll be dealing with a much larger dataset; how do we do computational analysis across multi-millions of cells?” asks Zeng. “A major need is computational power.” As the project progresses, the basic tools are likely to improve. “Full-length sequencing approaches are still very expensive, but we hope it will become more affordable and scalable,” says Zeng. One of the most important and fastest-developing areas is spatial transcriptomics. “It’s the future of cell genomics,” says Zeng. “Because you want to know, not only the identities of cells, but also where they are, how they’re distributed, what their neighbors are; you lose all that spatial information when you dissociate the tissue.”
Ultimately, the project aspires to provide a foundational reference for neuroscience. “Much like the human genome is a foundational reference for all of genetics, now we’re going to have a foundational cell type classification,” says Lein. None of this would have been possible without single-cell sequencing. “Single-cell genomics is really transforming this field, and many other fields of biology,” concludes Lein. “It has provided a common language for describing cellular diversity.”