
$$\mathrm{JSD}(P^{R}, P^{C}) = H\!\left(\frac{P^{R} + P^{C}}{2}\right) - \frac{H(P^{R}) + H(P^{C})}{2}$$

where $H(P) = -\sum_i p_i \log(p_i)$ represents the Shannon entropy of a probability distribution $P$. JSD values range between 0 and 1, where 0 means that the two distributions are identical and 1 means that they are maximally different. Finally, the regulon specificity score (RSS) is defined by converting JSD to a similarity score:

$$\mathrm{RSS}(R, C) = 1 - \sqrt{\mathrm{JSD}(P^{R}, P^{C})}$$

For each cell type, the essential regulators are predicted as those associated with the highest cell-type specific scores.
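The entropy, JSD, and RSS definitions above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation. It assumes base-2 logarithms (so that JSD stays within [0, 1]), takes RSS as one minus the square root of JSD, and encodes cell-type membership as a list of booleans. All function names are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def normalize(values):
    """Scale a non-negative vector so that it sums to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("vector must have a positive sum")
    return [x / total for x in values]

def jsd(p, q):
    """Jensen-Shannon divergence between two probability distributions."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mixture) - (entropy(p) + entropy(q)) / 2

def rss(activity, in_cell_type):
    """Regulon specificity score for one regulon and one cell type.

    activity: regulon activity score per cell (assumed non-negative).
    in_cell_type: True for each cell that belongs to the cell type.
    """
    p_r = normalize(activity)
    p_c = normalize([1.0 if flag else 0.0 for flag in in_cell_type])
    # Clamp tiny negative values caused by floating-point rounding.
    return 1 - math.sqrt(max(jsd(p_r, p_c), 0.0))

# Toy data: four cells, the first two belong to the cell type.
membership = [True, True, False, False]
specific = rss([1.0, 1.0, 0.0, 0.0], membership)    # activity only in the cell type
unrelated = rss([0.0, 0.0, 1.0, 1.0], membership)   # activity only elsewhere
```

In this toy case the regulon that is active only inside the cell type receives the maximal score, and the regulon that is active only outside it receives the minimal score.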

Functional validation

We applied two methods to validate whether the predicted regulons are functionally related to their associated cell types: (1) SEEK analysis (Zhu et al., 2015) and (2) CoCiter analysis (Qiao et al., 2013). First, SEEK (http://seek.princeton.edu/modSeek/mouse/) is a tool that provides a gene co-expression search function across approximately 2,000 mouse datasets from the Gene Expression Omnibus (GEO). We used the mouse version of SEEK to evaluate whether the genes in a regulon are co-expressed and, if so, whether the datasets supporting the co-expression are associated with a cell type of interest. If genes are significantly co-expressed in many datasets related to a certain cell type, it can be inferred that the function of the regulon is closely related to that cell type. Taking erythroblasts as an example, we submitted the gene list of the Lmo2 regulon to the SEEK web server and searched for erythroblast-related keywords (such as ‘erythroblast’ and ‘hematopoietic’) in the complete dataset list ranked by query co-expression score. We then applied a p < 0.01 cutoff to select significant datasets and used Fisher’s exact test to evaluate whether the selected datasets are significantly enriched in the top ranks.

Second, CoCiter (Qiao et al., 2013) is a text-mining approach that uses the up-to-date Medical Literature Analysis and Retrieval System Online (MEDLINE) literature database to evaluate the co-citation impact (CI, the log-transformed paper count) between a gene list and a term. To assess the significance of co-citation, a Monte Carlo approach is used to estimate the random expectation: 1,000 gene sets of the same size as the input gene list are selected at random, and a permutation p value is calculated as the number of times that $\mathrm{CI}_{\mathrm{random}} > \mathrm{CI}_{\mathrm{true}}$, divided by 1,000. Here we used the “gene-term” function in CoCiter (with default parameters, except that the organism was set to mouse; http://www.picb.ac.cn/hanlab/cociter) to check whether the genes in a regulon are significantly co-cited with a certain cell type in the literature.
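The Monte Carlo step can be illustrated with a generic permutation test. The sketch below does not call CoCiter. The scoring function `toy_ci`, the `paper_counts` table, and the gene names are hypothetical stand-ins for the co-citation impact and are used only to make the example runnable.

```python
import math
import random

def permutation_p_value(gene_set, background, score, n_perm=1000, seed=0):
    """Fraction of random gene sets whose score exceeds that of gene_set."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    true_score = score(gene_set)
    exceed = 0
    for _ in range(n_perm):
        # Draw a random gene set with the same size as the input gene list.
        random_set = rng.sample(background, len(gene_set))
        if score(random_set) > true_score:
            exceed += 1
    return exceed / n_perm

# Hypothetical paper counts per gene for one term (toy data, not CoCiter output).
paper_counts = {f"gene{i}": i for i in range(100)}

def toy_ci(genes):
    """Stand-in for the co-citation impact: log-transformed total paper count."""
    return math.log2(1 + sum(paper_counts[g] for g in genes))

background = sorted(paper_counts)
p_high = permutation_p_value(["gene99", "gene98", "gene97"], background, toy_ci)
p_low = permutation_p_value(["gene0", "gene1", "gene2"], background, toy_ci)
```

Here `p_high` is small because no random set can outscore the three most-cited genes, whereas `p_low` is close to 1 because almost every random set does.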

Regulon module analysis

Regulon modules were identified based on the Connection Specificity Index (CSI) ( Fuxman Bass et al., 2013 ), which is a context-dependent measure for identifying specific associating partners. The evaluation of CSI involves two steps. First, the Pearson correlation coefficient (PCC) of activity scores is evaluated for each pair of regulons. Next, for a fixed pair of regulons, A and B, the corresponding CSI is defined as the fraction of regulons whose PCC with A and B is lower than the PCC between A and B.
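The two steps can be sketched as follows. This is a minimal illustration under stated assumptions: the denominator is taken to be the total number of regulons, no correlation offset is applied, and both conditions (lower PCC with A and lower PCC with B) must hold for a regulon to be counted. The function names and the toy activity vectors are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def csi_matrix(activity):
    """Connection Specificity Index for every pair of regulons.

    activity maps a regulon name to its activity scores across cells.
    """
    names = sorted(activity)
    # Step 1: PCC of activity scores for each pair of regulons.
    pcc = {a: {b: pearson(activity[a], activity[b]) for b in names} for a in names}
    # Step 2: fraction of regulons whose PCC with both A and B is below PCC(A, B).
    csi = {a: {} for a in names}
    for a in names:
        for b in names:
            ref = pcc[a][b]
            lower = sum(1 for x in names if pcc[a][x] < ref and pcc[b][x] < ref)
            csi[a][b] = lower / len(names)
    return csi

# Toy activity scores for four regulons across four cells.
toy_activity = {
    "A": [1, 2, 3, 4],
    "B": [1, 2, 3, 5],
    "C": [4, 3, 2, 1],
    "D": [1, 3, 2, 1],
}
toy_csi = csi_matrix(toy_activity)
```

In the toy data, A and B are strongly correlated with each other and less so with C and D, so two of the four regulons fall below the A-B correlation and `toy_csi["A"]["B"]` is 0.5.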

Hierarchical clustering with Euclidean distance was performed on the CSI matrix to identify regulon modules. We also used CSI > 0.7 as a cutoff to build a regulon association network and investigate the relationships between regulons. The result was visualized using Cytoscape (Shannon et al., 2003). We used the same strategy to identify submodules within M7. For each regulon module, its activity score for a cell type is defined as the average of the activity scores of its member regulons across all cells of that cell type. The top-ranked cell types were then identified for each module.
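The network cutoff and the module activity score described above can be sketched as follows. Hierarchical clustering itself is omitted, so module membership is taken as given. The function names, the nested-dictionary layout of the CSI matrix, and the toy values are assumptions made for illustration.

```python
def association_edges(csi, cutoff=0.7):
    """Pairs of regulons whose CSI exceeds the cutoff (each pair listed once)."""
    names = sorted(csi)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if csi[a][b] > cutoff
    ]

def module_activity(activity, members, cell_types):
    """Average activity of a module's regulons over all cells of each cell type.

    activity: regulon name -> activity score per cell.
    members: regulon names that belong to the module.
    cell_types: cell-type label per cell, in the same order as the scores.
    """
    sums, counts = {}, {}
    for regulon in members:
        for score, label in zip(activity[regulon], cell_types):
            sums[label] = sums.get(label, 0.0) + score
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# Toy CSI matrix for three regulons (hypothetical values).
toy_csi = {
    "R1": {"R1": 1.0, "R2": 0.9, "R3": 0.2},
    "R2": {"R1": 0.9, "R2": 1.0, "R3": 0.4},
    "R3": {"R1": 0.2, "R2": 0.4, "R3": 1.0},
}
edges = association_edges(toy_csi)

# Toy activity scores across four cells: two T cells followed by two B cells.
toy_activity = {"R1": [1.0, 0.5, 0.0, 0.0], "R2": [0.5, 1.0, 0.0, 0.5]}
scores = module_activity(toy_activity, ["R1", "R2"], ["T", "T", "B", "B"])
ranked_cell_types = sorted(scores, key=scores.get, reverse=True)
```

Sorting the resulting dictionary by value, as in the last line, gives the top-ranked cell types for the module.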

Quantifying cell type relationship

Using the gene regulatory network analysis as a guide, we quantified the relationship between cell types based on the similarity of their overall regulon activities, measured by the Spearman correlation coefficient. The results were represented as a network in which a pair of cell types is connected if their Spearman correlation coefficient is greater than 0.8. Again, the result was visualized using Cytoscape. Groups of related cell types were identified using the Markov Clustering Algorithm (MCL) (van Dongen and Abreu-Goodger, 2012), as implemented in the ClusterMaker application in Cytoscape. We used the default settings, except that the inflation parameter was set to 2.
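The network construction can be sketched as follows: Spearman correlation is computed as the Pearson correlation of ranks (with tied values sharing their average rank), and an edge is kept when the coefficient exceeds 0.8. The MCL clustering step is not reproduced here. The function names and the toy activity profiles are hypothetical.

```python
import math
from itertools import combinations

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))

def cell_type_edges(profiles, cutoff=0.8):
    """Connect two cell types when their regulon-activity profiles correlate above the cutoff."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if spearman(profiles[a], profiles[b]) > cutoff
    ]

# Toy regulon-activity profiles (one value per regulon) for three cell types.
toy_profiles = {
    "erythroblast": [0.1, 0.2, 0.3, 0.9],
    "erythrocyte": [0.2, 0.3, 0.5, 1.5],
    "neuron": [0.9, 0.3, 0.2, 0.1],
}
toy_edges = cell_type_edges(toy_profiles)
```

Only the two cell types whose profiles rank the regulons in the same order end up connected in this toy example.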

Web service

We created an interactive, web-based portal to explore the network atlas in this study (URL: http://regulon.rc.fas.harvard.edu ). The website is built with several recent web technologies, including the JavaScript libraries jQuery 3.3, Bootstrap 4, and Leaflet 1.3. Together, these libraries provide efficient client-side search and zooming functions for the large cell type network. The site is hosted on an Apache web server running Apache Tomcat, which provides the necessary back-end support. Users can zoom in on part of the network, mouse over or click on a cell type in the network, and browse information about the associated regulons and the most similar cell types. The website also provides a complete, downloadable list of pairwise regulon-cell type associations.

QUANTIFICATION AND STATISTICAL ANALYSIS

Details of the statistical tests used in this study are described briefly in the main text and more in-depth in the subsections above. They are also summarized below:

  • (1) To evaluate the consistency of the regulons identified in three Avg20 replicates in each of the bladder, kidney, and bone marrow tissues, we counted the number of overlapping regulon TFs between replicates and applied a one-sided Fisher’s exact test to evaluate statistical significance.
  • (2) A t test was used to evaluate whether the Avg20 approach performs better (p < 0.05) than using all single cells in each of the bladder, kidney, and bone marrow tissues, based on the silhouette value.
  • (3) When applying SEEK to test whether the genes in a regulon are co-expressed in a certain cell type, we selected datasets with significant correlation (p < 0.01) and then used a one-sided Fisher’s exact test to evaluate whether datasets related to the cell type of interest are significantly enriched (p < 0.01) in the top ranks.
  • (4) In the CoCiter analysis, a permutation p value was used: 1,000 gene sets of the same size as the tested regulon were selected at random, and the p value was calculated as the number of times that the co-citation impact of a random set was larger than that of the true set, divided by 1,000.
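As an illustration of the one-sided Fisher's exact test used in items (1) and (3), the sketch below computes the upper-tail hypergeometric probability for a 2×2 table. The layout of the table and the choice of background (for example, all candidate TFs when comparing two replicates) are assumptions of this example, and the function name is hypothetical.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """Probability of an overlap of at least `a` given fixed margins.

    The 2x2 table is [[a, b], [c, d]]; for two replicates one might set
      a = TFs found in both replicates, b = TFs found only in the first,
      c = TFs found only in the second, d = TFs found in neither.
    """
    row1 = a + b          # size of the first set
    col1 = a + c          # size of the second set
    total = a + b + c + d
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for overlaps of a, a + 1, ..., upper.
    favourable = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(total, col1)

# Toy table: two sets of 4 items from a background of 8, overlapping in 3.
p_value = fisher_one_sided(3, 1, 1, 3)
```

For the toy table, the tail sums the probabilities of overlaps of 3 and 4, giving 17/70.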

ADDITIONAL RESOURCES

We created an interactive, web-based portal for the community to explore the network atlas in this study. URL: http://regulon.rc.fas.harvard.edu .

Highlights

  • Computational reconstruction of gene regulatory networks for major cell types in mouse
  • Cell-type-specific regulons are organized into eight combinatorial modules
  • Prediction of a small set of essential regulators for each cell type
  • An interactive web portal for navigating predicted mouse cell network atlas

ACKNOWLEDGMENTS

We thank Dr. Zhe Li from Brigham and Women’s Hospital and Dr. Ruben Dries from Dana-Farber Cancer Institute for helpful discussion about this study. This research was supported by a Claudia Barr Award and NIH (R01HL119099 to G.-C.Y.).

Footnotes

SUPPLEMENTAL INFORMATION

Supplemental Information includes four figures and four tables and can be found with this article online at https://doi.org/10.1016/j.celrep.2018.10.045 .

DECLARATION OF INTERESTS

The authors declare no competing interests.

REFERENCES

  • Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, Rambow F, Marine JC, Geurts P, Aerts J, et al. (2017). SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086.
  • Bjerregaard MD, Jurlander J, Klausen P, Borregaard N, and Cowland JB (2003). The in vivo profile of transcription factors during neutrophil differentiation in human bone marrow. Blood 101, 4322–4332.
  • Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, and Rinn JL (2011). Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927.
  • Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X, Lee C, Furlan SN, Steemers FJ, et al. (2017). Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667.
  • Chizaki R, Yao I, Katano T, Matsuda T, and Ito S (2012). Restricted expression of Ovol2/MOVO in XY body of mouse spermatocytes at the pachytene stage. J. Androl. 33, 277–286.
  • Davie K, Janssens J, Koldere D, De Waegeneer M, Pech U, Kreft Ł, Aibar S, Makhzami S, Christiaens V, Bravo González-Blas C, et al. (2018). A single-cell transcriptome atlas of the aging Drosophila brain. Cell 174, 982–998.e20.
  • de Hoon MJ, Imoto S, Nolan J, and Miyano S (2004). Open source clustering software. Bioinformatics 20, 1453–1454.
  • ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
  • Fiers MWEJ, Minnoye L, Aibar S, Bravo González-Blas C, Kalender Atak Z, and Aerts S (2018). Mapping gene regulatory networks from single-cell omics data. Brief. Funct. Genomics 17, 246–254.
  • Fiévez L, Desmet C, Henry E, Pajak B, Hegenbarth S, Garzé V, Bex F, Jaspar F, Boutet P, Gillet L, et al. (2007). STAT5 is an ambivalent regulator of neutrophil homeostasis. PLoS ONE 2, e727.
  • Fincher CT, Wurtzel O, de Hoog T, Kravarik KM, and Reddien PW (2018). Cell type transcriptome atlas for the planarian Schmidtea mediterranea. Science 360, Published online April 19, 2018. https://doi.org/10.1126/science.aaq1736.
  • Fuxman Bass JI, Diallo A, Nelson J, Soto JM, Myers CL, and Walhout AJ (2013). Using networks to measure similarity between genes: association index selection. Nat. Methods 10, 1169–1176.
  • Galván JA, Helbling M, Koelzer VH, Tschan MP, Berger MD, Hädrich M, Schnüriger B, Karamitopoulou E, Dawson H, Inderbitzin D, et al. (2015). TWIST1 and TWIST2 promoter methylation and protein expression in tumor stroma influence the epithelial-mesenchymal transition-like tumor budding phenotype in colorectal cancer. Oncotarget 6, 874–885.
  • Grabowska MM, Elliott AD, DeGraff DJ, Anderson PD, Anumanthan G, Yamashita H, Sun Q, Friedman DB, Hachey DL, Yu X, et al. (2014). NFI transcription factors interact with FOXA1 to regulate prostate-specific gene expression. Mol. Endocrinol. 28, 949–964.
  • Han DW, Tapia N, Hermann A, Hemmer K, Höing S, Araúzo-Bravo MJ, Zaehres H, Wu G, Frank S, Moritz S, et al. (2012). Direct reprogramming of fibroblasts into neural stem cells by defined factors. Cell Stem Cell 10, 465–472.
  • Han X, Wang R, Zhou Y, Fei L, Sun H, Lai S, Saadatpour A, Zhou Z, Chen H, Ye F, et al. (2018). Mapping the mouse cell atlas by microwell-seq. Cell 172, 1091–1107.
  • Ieda M, Fu JD, Delgado-Olguin P, Vedantham V, Hayashi Y, Bruneau BG, and Srivastava D (2010). Direct reprogramming of fibroblasts into functional cardiomyocytes by defined factors. Cell 142, 375–386.
  • Kaga K, Inoue KI, Kaga M, Ichikawa T, and Yamanishi T (2017). Expression profile of urothelial transcription factors in bladder biopsies with interstitial cystitis. Int. J. Urol. 24, 632–638.
  • Kiselak EA, Shen X, Song J, Gude DR, Wang J, Brody SL, Strauss JF 3rd, and Zhang Z (2010). Transcriptional regulation of an axonemal central apparatus gene, sperm-associated antigen 6, by a SRY-related high mobility group transcription factor, S-SOX5. J. Biol. Chem. 285, 30496–30505.
  • Kurachi M, Barnitz RA, Yosef N, Odorizzi PM, DiIorio MA, Lemieux ME, Yates K, Godec J, Klatt MG, Regev A, et al. (2014). The transcription factor BATF operates as an essential differentiation checkpoint in early effector CD8+ T cells. Nat. Immunol. 15, 373–383.
  • Lee RH, Seo MJ, Reger RL, Spees JL, Pulin AA, Olson SD, and Prockop DJ (2006). Multipotent stromal cells from human marrow home to and promote repair of pancreatic islets and renal glomeruli in diabetic NOD/scid mice. Proc. Natl. Acad. Sci. USA 103, 17438–17443.
  • Liu P, Keller JR, Ortiz M, Tessarollo L, Rachel RA, Nakamura T, Jenkins NA, and Copeland NG (2003). Bcl11a is essential for normal lymphoid development. Nat. Immunol. 4, 525–532.
  • Mason RJ (2006). Biology of alveolar type II cells. Respirology 11 (Suppl), S12–S15.
  • Murakami T, Saito A, Hino S, Kondo S, Kanemoto S, Chihara K, Sekiya H, Tsumagari K, Ochiai K, Yoshinaga K, et al. (2009). Signalling mediated by the endoplasmic reticulum stress transducer OASIS is involved in bone formation. Nat. Cell Biol. 11, 1205–1211.
  • Nechanitzky R, Akbas D, Scherer S, Györy I, Hoyler T, Ramamoorthy S, Diefenbach A, and Grosschedl R (2013). Transcription factor EBF1 is essential for the maintenance of B cell identity and prevention of alternative fates in committed cells. Nat. Immunol. 14, 867–875.
  • Nerlov C, McNagny KM, Döderlein G, Kowenz-Leutz E, and Graf T (1998). Distinct C/EBP functions are required for eosinophil lineage commitment and maturation. Genes Dev. 12, 2413–2423.
  • Okada A, Ohta Y, Brody SL, and Iguchi T (2004). Epithelial c-jun and c-fos are temporally and spatially regulated by estradiol during neonatal rat oviduct differentiation. J. Endocrinol. 182, 219–227.
  • Orkin SH, and Zon LI (2008). Hematopoiesis: an evolving paradigm for stem cell biology. Cell 132, 631–644.
  • Osswald CD, Xie L, Guan H, Herrmann F, Pick SM, Vogel MJ, Gehringer F, Chan FC, Steidl C, Wirth T, and Ushmorov A (2018). Fine-tuning of FOXO3A in cHL as a survival mechanism and a hallmark of abortive plasma cell differentiation. Blood 131, 1556–1567.
  • Plass M, Solana J, Wolf FA, Ayoub S, Misios A, Glazar P, Obermayer B, Theis FJ, Kocks C, and Rajewsky N (2018). Cell type atlas and lineage tree of a whole complex animal by single-cell transcriptomics. Science 360, Published online April 19, 2018. https://doi.org/10.1126/science.aaq1723.
  • Qiao N, Huang Y, Naveed H, Green CD, and Han JD (2013). CoCiter: an efficient tool to infer gene function by assessing the significance of literature co-citation. PLoS ONE 8, e74074.
  • Regev A, Teichmann SA, Lander ES, Amit I, Benoist C, Birney E, Bodenmiller B, Campbell P, Carninci P, Clatworthy M, et al.; Human Cell Atlas Meeting Participants (2017). The human cell atlas. eLife 6, e27041.
  • Riddell J, Gazit R, Garrison BS, Guo G, Saadatpour A, Mandal PK, Ebina W, Volchkov P, Yuan GC, Orkin SH, and Rossi DJ (2014). Reprogramming committed murine blood cells to induced hematopoietic stem cells with defined factors. Cell 157, 549–564.
  • Rosenberg AB, Roco CM, Muscat RA, Kuchina A, Sample P, Yao Z, Graybuck LT, Peeler DJ, Mukherjee S, Chen W, et al. (2018). Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360, 176–182.
  • Saldanha AJ (2004). Java Treeview–extensible visualization of microarray data. Bioinformatics 20, 3246–3248.
  • Saunders A, Macosko EZ, Wysoker A, Goldman M, Krienen FM, de Rivera H, Bien E, Baum M, Bortolin L, Wang S, et al. (2018). Molecular diversity and specializations among the cells of the adult mouse brain. Cell 174, 1015–1030.
  • Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, and Ideker T (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504.
  • Shi HB, Zhang CH, Zhao W, Luo J, and Loor JJ (2017). Peroxisome proliferator-activated receptor delta facilitates lipid secretion and catabolism of fatty acids in dairy goat mammary epithelial cells. J. Dairy Sci. 100, 797–806.
  • Stubbington MJT, Rozenblatt-Rosen O, Regev A, and Teichmann SA (2017). Single-cell transcriptomics to explore the immune system in health and disease. Science 358, 58–63.
  • Sun Z, Unutmaz D, Zou YR, Sunshine MJ, Pierani A, Brenner-Morton S, Mebius RE, and Littman DR (2000). Requirement for RORgamma in thymocyte survival and lymphoid organ development. Science 288, 2369–2373.
  • Svensson V, Vento-Tormo R, and Teichmann SA (2018). Exponential scaling of single-cell RNA-seq in the past decade. Nat. Protoc. 13, 599–604.
  • Takahashi K, and Yamanaka S (2006). Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676.
  • The Tabula Muris Consortium (2018). Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature 562, 367–372.
  • Valledor AF, Borràs FE, Cullell-Young M, and Celada A (1998). Transcription factors that regulate monocyte/macrophage differentiation. J. Leukoc. Biol. 63, 405–417.
  • van der Maaten LJP, and Hinton GE (2008). Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605.
  • van Dongen S, and Abreu-Goodger C (2012). Using MCL to extract clusters from networks. Methods Mol. Biol. 804, 281–295.
  • Wagner DE, Weinreb C, Collins ZM, Briggs JA, Megason SG, and Klein AM (2018). Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360, 981–987.
  • Wang N, Verna L, Hardy S, Zhu Y, Ma KS, Birrer MJ, and Stemerman MB (1999). c-Jun triggers apoptosis in human vascular endothelial cells. Circ. Res. 85, 387–393.
  • Welch JJ, Watts JA, Vakoc CR, Yao Y, Wang H, Hardison RC, Blobel GA, Chodosh LA, and Weiss MJ (2004). Global regulation of erythroid gene expression by transcription factor GATA-1. Blood 104, 3136–3147.
  • Wilson NK, Foster SD, Wang X, Knezevic K, Schütte J, Kaimakis P, Chilarska PM, Kinston S, Ouwehand WH, Dzierzak E, et al. (2010). Combinatorial transcriptional control in blood stem/progenitor cells: genome-wide analysis of ten major transcriptional regulators. Cell Stem Cell 7, 532–544.
  • Wu W, Morrissey CS, Keller CA, Mishra T, Pimkin M, Blobel GA, Weiss MJ, and Hardison RC (2014). Dynamic shifts in occupancy by TAL1 are guided by GATA factors and drive large-scale reprogramming of gene expression during hematopoiesis. Genome Res. 24, 1945–1962.
  • Yui MA, and Rothenberg EV (2014). Developmental gene networks: a triathlon on the course to T cell identity. Nat. Rev. Immunol. 14, 529–545.
  • Zeisel A, Hochgerner H, Lönnerberg P, Johnsson A, Memic F, van der Zwan J, Häring M, Braun E, Borm LE, La Manno G, et al. (2018). Molecular architecture of the mouse nervous system. Cell 174, 999–1014.e22.
  • Zhu Q, Wong AK, Krishnan A, Aure MR, Tadych A, Zhang R, Corney DC, Greene CS, Bongo LA, Kristensen VN, et al. (2015). Targeted exploration and analysis of large cross-platform human transcriptomic compendia. Nat. Methods 12, 211–214, 3, 214.
  • Zuchero JB, and Barres BA (2013). Intrinsic and extrinsic control of oligodendrocyte development. Curr. Opin. Neurobiol. 23, 914–920.