Mapping the Functional Landscape of T Cell Receptor Repertoire by Single T Cell Transcriptomics

Assignment of antigen specificity of TCRs of the 10x Genomics scRNA-Seq datasets

The single cell immune profiling datasets released by 10x Genomics consist of single cell 5’ gene expression libraries, TCR sequencing libraries, and antigen binding affinity measurements of CD8⁺ T cells from 4 healthy donors. The antigen binding affinity between the TCR of one T cell and pMHCs is determined by counting the short barcode sequences (unique molecular identifiers, ‘UMIs’) detected for each of the 44 pMHC dCODE™ Dextramers® under investigation. In the application note released by the company ( https://www.10xgenomics.com/resources/application-notes/a-new-way-of-exploring-immunity-linking-highly-multiplexed-antigen-recognition-to-immune-repertoire-and-phenotype/ ), their scientists validated the antigen specificities inferred from the UMIs by comparing the inferred pMHC-specific TCRs with those that have been experimentally confirmed in VDJdb ( https://vdjdb.cdr3.net/ ), and they found exactly matching and closely similar sequence pairs (their Fig. 5 and Table 1). In another report by 10x Genomics on the same technology ( https://www.immudex.com/media/118671/tf119302-sitc-2018-immudex-poster-in-collaboration-with-10x-genomics-dcode-dextramer-technology.pdf ), their research team found that flow cytometry and their feature barcoding technology identify similar dCODE Dextramer®-binding cell populations (their Fig. 4). Strikingly, that figure shows that the distributions of the flow cytometry intensities (top panel) and the UMI counts (bottom panel) closely resemble each other. That figure therefore indicates that the UMI counts are quantitative (at least as quantitative as conventional flow cytometry) rather than merely qualitative.

Our methodology to assign antigen specificity largely follows that of the original 10x report. For a T cell to be called specific for one pMHC, that pMHC’s UMI count has to be >=10 and has to be the largest across all 44 pMHCs. To give context for the cutoff of 10, the four 10x datasets also include pMHC UMI measurements of six irrelevant peptides as negative controls to assist the detection of specific binding events. Across all cells in the four datasets, 92–97% of the negative control UMI counts are zero, and the average negative control counts range from 0.05 to 0.16. Importantly, the UMI counts shown in Fig. 2e are log-scaled “clone-level” UMI counts. For clones with multiple cells sharing the same TCR, we calculated the median UMI count across all T cells in the clone as the clone-level UMI count. It is very common for a clone to have, say, 20 cells, of which only 10 can be assigned a specific pMHC according to the rule above. Because all T cells of one clone share the same TCR, they are unlikely to differ in antigen specificity. In such cases, the antigen specificity of the clone is assigned according to the antigen specificity of these 10 cells (which must reach >90% concordance). We did not take the maximum of the cell-level UMI counts for each clone, because bigger clones would have higher maximum UMI counts simply due to their larger sample size, which would bias the analysis.
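
The assignment rule described above can be illustrated with a minimal Python sketch. This is not the authors' code (their analyses were carried out in R); the function names, the dictionary-based input format, and the handling of ties for the largest count are assumptions made only for illustration.

```python
from collections import Counter
from statistics import median

MIN_UMI = 10            # cell-level cutoff stated in the text (UMI count >= 10)
MIN_CONCORDANCE = 0.9   # clone-level concordance must exceed this (">90%")


def assign_cell(umi_counts):
    """Return the pMHC a single cell is called specific for, or None.

    umi_counts: dict mapping pMHC name -> UMI count for this cell.
    """
    top_pmhc, top_count = max(umi_counts.items(), key=lambda kv: kv[1])
    if top_count < MIN_UMI:
        return None
    # Assumption: a tie for the largest count leaves the cell unassigned.
    if sum(1 for c in umi_counts.values() if c == top_count) > 1:
        return None
    return top_pmhc


def assign_clone(cells):
    """Assign one pMHC to a clone and compute its clone-level UMI count.

    cells: list of per-cell UMI dicts for all cells sharing one TCR.
    Returns (pMHC, median UMI over all cells of the clone) or (None, None).
    """
    calls = [c for c in map(assign_cell, cells) if c is not None]
    if not calls:
        return None, None
    pmhc, votes = Counter(calls).most_common(1)[0]
    # Concordance is evaluated among the assignable cells only.
    if votes / len(calls) <= MIN_CONCORDANCE:
        return None, None
    # Median over ALL cells of the clone, not only the assignable ones.
    return pmhc, median(cell[pmhc] for cell in cells)
```

In this sketch, a clone whose assignable cells all agree is labelled with that pMHC, and its clone-level count is the median over every cell in the clone, so one cell with a very high count does not dominate.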

Hierarchical clustering of TCRs based on the weighted embedding and correlation with antigen specificity for the Dash and Glanville datasets

We analyzed the two antigen-specificity datasets (Dash and Glanville)¹⁰,¹¹, which provided 276 and 207 TCR sequences with known antigen specificity. In these studies, single T cells from healthy donor PBMCs with known HLA types and infections of common viruses were incubated with engineered pMHCs and sorted with FACS before TCR sequences were obtained from a series of nested PCRs. Unlike scRNA-seq, these T cells do not have matched expression data. Therefore, for these two datasets, we performed hierarchical clustering based on the scaled TCR embeddings with weights learned from the single cell sequencing datasets (the b used for scaling is the average of the b values from all the single T cell sequencing datasets in Extended Data Fig. 3). The clustering also resulted in TCR networks that are similar to the TCR networks detected by tessa. Different tree height cutoffs were employed to test the stability of the results. We randomized the cluster labels and performed the same calculation 10,000 times to examine whether the clustering purity was achieved by chance. P values were calculated as the number of trials that achieved a higher purity than the true hierarchical clustering results, divided by 10,000.
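
The permutation procedure can be sketched as follows in Python. The purity definition used here (the fraction of TCRs that carry the majority antigen of their cluster) is an assumption for illustration, since the purity calculation is defined elsewhere in the paper; the function names and the fixed random seed are likewise hypothetical.

```python
import random
from collections import Counter, defaultdict


def purity(cluster_labels, antigens):
    """Fraction of TCRs that share the majority antigen of their cluster.

    Assumed definition, for illustration only.
    """
    groups = defaultdict(list)
    for cluster, antigen in zip(cluster_labels, antigens):
        groups[cluster].append(antigen)
    majority = sum(Counter(m).most_common(1)[0][1] for m in groups.values())
    return majority / len(antigens)


def permutation_p_value(cluster_labels, antigens, n_trials=10_000, seed=0):
    """Shuffle cluster labels and count trials with purity above the observed one."""
    rng = random.Random(seed)
    observed = purity(cluster_labels, antigens)
    shuffled = list(cluster_labels)
    higher = 0
    for _ in range(n_trials):
        rng.shuffle(shuffled)  # randomize which TCR carries which cluster label
        if purity(shuffled, antigens) > observed:
            higher += 1
    # P value as described in the text: number of higher-purity trials / n_trials.
    return observed, higher / n_trials
```

Shuffling the labels keeps the cluster sizes fixed, so the null distribution reflects only the loss of association between cluster membership and antigen.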

Identifying the T cell neighbours based on tessa-weighted TCR similarity

We identified the ‘neighbours’ for each of the TCR clones in the post-1 and the post-2 subgroups in Fig. 3c,d. For each clone, we calculated the tessa-weighted TCR distances between that clone and all the other clones with different TCRs from the same patient, and we selected the clone with the smallest TCR distance as the ‘neighbour’ of that clone. We counted the number of neighbours that belong to each subgroup (pre-1, pre-2, post-1, and post-2), and divided those numbers by the total number of clones in that subgroup to obtain percentages.
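
A minimal Python sketch of this neighbour search is given below. The weighted Euclidean form of the distance, the record layout (keys `tcr`, `patient`, `subgroup`, `embedding`), and the function names are assumptions for illustration; the exact tessa-weighted distance is defined in the tessa model description.

```python
import math
from collections import Counter


def weighted_distance(u, v, b):
    """Assumed form: Euclidean distance with per-dimension weights b."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(u, v, b)))


def neighbour_percentages(clones, b, query_subgroup):
    """Percentage of each subgroup's clones that are nearest neighbours of query clones.

    clones: list of dicts with keys 'tcr', 'patient', 'subgroup', 'embedding'
    (a hypothetical record layout).
    """
    totals = Counter(c["subgroup"] for c in clones)
    hits = Counter()
    for q in clones:
        if q["subgroup"] != query_subgroup:
            continue
        # Candidates: other TCRs from the same patient.
        candidates = [c for c in clones
                      if c["patient"] == q["patient"] and c["tcr"] != q["tcr"]]
        if not candidates:
            continue
        nearest = min(candidates,
                      key=lambda c: weighted_distance(q["embedding"], c["embedding"], b))
        hits[nearest["subgroup"]] += 1
    # Neighbour counts divided by the total number of clones in each subgroup.
    return {g: 100.0 * hits[g] / totals[g] for g in totals}
```

Calling the function once with `query_subgroup="post-1"` and once with `"post-2"` would give the two sets of percentages described above.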

Construction of gene modules and calculation of gene pathway activity scores

In Fig. 3e-g and Extended Data Fig. 6a,b, we first selected 11 previously established individual marker genes representing 5 key T cell function pathways, including naive T cell markers (IL7R), memory T cell markers (CXCR3, GZMK), activated T cell markers (IFNG, TNF, FOS, JUN), and exhausted T cell markers (ITGAE, ENTPD1, GZMB, LAG3) defined by Yost et al.³¹. We also examined the differentially expressed genes between post-1 cells and post-2 cells from responders. We identified TGFB1 as the most highly expressed gene in the post-1 cells that is related to immune pathways³⁵. To increase the stability of our analyses, we expanded these individual genes to pathways by including, for each individual marker gene, the 13 genes that show the highest positive correlations with it.
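
The expansion of a marker gene into a pathway can be sketched in Python as below. The input layout (a dictionary from gene name to a list of per-cell expression values), the function names, and the choice to keep the marker gene itself in the resulting pathway are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def expand_marker(expr, marker, k=13):
    """Return the marker plus its k most positively correlated genes.

    expr: dict gene -> list of expression values across cells (hypothetical layout).
    """
    scores = {g: pearson(expr[marker], v) for g, v in expr.items() if g != marker}
    ranked = sorted(scores, key=scores.get, reverse=True)
    positive = [g for g in ranked if scores[g] > 0]  # positive correlations only
    return [marker] + positive[:k]
```

The same ranking, with `k=10` and CD8A or CD4 as the marker, would correspond to the way the CD8 and CD4 gene signatures are described in the Statistical analyses section.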

In Fig. 4e, the IL-2 signaling pathway #1 included 13 genes from Conley et al.³⁸ and Cho et al.³⁹. The other four pathways, including the IL-2 pathway #2 (GSE39110_UNTREATED_VS_IL2_TREATED_CD8_TCELL_DAY3_POST_IMMUNIZATION_DN), the IFN-α/β pathway (GSE15930_STIM_VS_STIM_AND_IFNAB_24H_CD8_T_CELL_DN), the IL-12 pathway #1 (GSE22443_NAIVE_VS_ACT_AND_IL12_TREATED_CD8_TCELL_DN) and #2 (GSE13173_UNTREATED_VS_IL12_TREATED_ACT_CD8_TCELL_DN), were selected from the c7 immunologic signatures in version 7.0 of the Molecular Signatures Database (MSigDB) ( http://www.broadinstitute.org/gsea/msigdb/index.jsp ). The two negative control pathways were generated with 200 randomly selected genes from all unique genes included in the c7 immunologic signatures. The genes selected for each pathway are listed in Supplementary Table 2. To determine the pathway activity scores, we normalized the raw RNA expression counts by dividing the raw count of each gene in each cell by the sum of the raw counts of that cell. The normalized expression values of the genes belonging to the same pathway were then log scaled and summed for each cell, and the sum served as the pathway activity score of that cell. The activity scores of each pathway were scaled by taking their tenth roots for better visual representation.
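
The scoring steps can be sketched in Python as follows. The text does not specify the exact log transform, so this sketch assumes `log(1 + x)` on the normalized values, which keeps the sum non-negative before the tenth root is taken; the function name and the nested-dictionary input are also hypothetical.

```python
import math


def pathway_activity(counts, pathway):
    """Pathway activity score per cell.

    counts: dict cell -> dict gene -> raw count (hypothetical layout).
    pathway: iterable of gene names belonging to the pathway.
    """
    scores = {}
    for cell, gene_counts in counts.items():
        total = sum(gene_counts.values())  # library size of this cell
        if total == 0:
            scores[cell] = 0.0
            continue
        # Normalize by the cell total, log scale (assumed log1p), sum over pathway genes.
        summed = sum(math.log1p(gene_counts.get(g, 0) / total) for g in pathway)
        # Tenth-root scaling, as described for visual representation.
        scores[cell] = summed ** 0.1
    return scores
```

Because the tenth root is monotonic on non-negative values, it compresses the dynamic range of the scores without changing their ordering across cells.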

Diffusion map and pseudotime analysis

For the CD8⁺ T cells from BCC samples, the top 10% of genes with the highest expression variation across all cells were used to calculate the diffusion components. The R package ‘destiny’ (version 3.0.1) was used to compute a neighborhood graph using 40 neighbors and the first 20 principal components. We then employed SCINA⁴⁵ to detect naive CD8⁺ T cells with the marker gene IL7R and the five genes with the highest correlation with IL7R. Three randomly selected naive T cells were used as the root cells for diffusion pseudotime prediction with the ‘dpt’ function in the ‘destiny’ package, using all 20 diffusion components and a window width of 0.1 to decide the branch cutoff.

Calculating the variations of gene expression unexplained by TCRs

As described before in tessa, the TCRs were grouped into K networks. In each network, the TCR distances between the center TCR and the non-center TCRs were defined as d^{t}, and their expression distances were defined as d^{e}. tessa assumes a linear regression relationship between d^{t} and d^{e}, which is

d_{k}^{e} = a_{k} \times d_{k}^{t} + e_{k}

where k = 1, …, K represents the k-th network. We defined the unexplained variations as,

\frac{\sum_{k = 1}^{K} (d_{k}^{e} - a_{k} \times d_{k}^{t})^{2}}{\sum_{k = 1}^{K} (d_{k}^{e} - \frac{1}{K} \sum_{k^{'} = 1}^{K} d_{k^{'}}^{e})^{2}} .

The unexplained variations were calculated separately for each of the networks in each dataset.
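
As a worked illustration of the ratio defined above, the following Python sketch evaluates it for given distances and slopes. The function name and the list-based inputs are assumptions; the slopes a_k are taken as given (in the paper they come from the tessa fit), and nothing here reproduces the fitting itself.

```python
def unexplained_variation(d_e, d_t, a):
    """Residual sum of squares around a_k * d_k^t over total sum of squares of d_k^e.

    d_e, d_t, a: equal-length lists indexed by k, matching the symbols
    d_k^e, d_k^t and a_k in the formula above.
    """
    if not (len(d_e) == len(d_t) == len(a)) or not d_e:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_e = sum(d_e) / len(d_e)
    # Numerator: sum_k (d_k^e - a_k * d_k^t)^2
    residual = sum((e - ak * t) ** 2 for e, t, ak in zip(d_e, d_t, a))
    # Denominator: sum_k (d_k^e - mean(d^e))^2
    total = sum((e - mean_e) ** 2 for e in d_e)
    if total == 0:
        raise ValueError("expression distances have zero variance")
    return residual / total
```

A value near 0 means the expression distances are almost fully captured by the linear term in the TCR distances, while larger values indicate more variation left unexplained by the TCRs.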

Benchmarking analysis with GLIPH

In Extended Data Fig. 4, we performed a series of benchmarking analyses using GLIPH¹⁰. We performed the analyses on six datasets: the four Healthy-CD8 datasets from 10x Genomics, the Glanville¹⁰ dataset, and the Dash¹¹ dataset (Supplementary Table 1). The command ‘gliph --tcr TCR_TABLE --gccutoff = n’ was used to generate clusters from the TCR sequences of these datasets. We varied the value of the ‘gccutoff’ parameter from 0.5 to 5 in steps of 0.5 and calculated the ‘network purities’ for each choice of the parameter.

Statistical analyses

All computations and statistical analyses were carried out in the R computing environment (version 3.5.1). We employed SCINA⁴⁵ to detect the CD8⁺ T cells and CD4⁺ T cells from single T cell sequencing data, based on two gene signatures comprising genes specifically expressed in the CD8⁺ T cells and the CD4⁺ T cells, respectively. Within each single cell dataset to be analyzed, we defined the CD8 gene signature as the 10 genes with the highest correlation with CD8A, and the CD4 gene signature as the 10 genes with the highest correlation with CD4. For all boxplots appearing in this study, box boundaries represent interquartile ranges, whiskers extend to the most extreme data point that is no more than 1.5 times the interquartile range from the box, and the line in the middle of the box represents the median. The t-SNE analysis was performed with the ‘Rtsne’ package (version 0.15). Specifically, for Fig. 3a,b and Fig. 4a,b, we used the RNA expression of the T cells as the input. For Fig. 1e and Extended Data Fig. 5a,b, we used the embedded TCR sequences as the input. PCA preprocessing was applied to both types of data, and the first 50 principal components (the default parameter of the function ‘Rtsne’) were employed to calculate the 2-dimensional (default) t-SNE representations, which were plotted as ‘tSNE-1’ and ‘tSNE-2’. We applied Pearson correlation tests for all correlation analyses. Two-tailed Student’s t-tests were used to calculate all the P values (unless otherwise specified). The function ‘geom_smooth’ (method=‘lm’) in the package ‘ggplot2’ (version 3.1.0) was applied to calculate the regression trend lines and 95% confidence intervals. The one-sided Jonckheere trend test was applied to calculate the P value in the analysis of Fig. 2e, with the function ‘jonckheere.test’ in the package ‘clinfun’ (version 1.0.15). The hierarchical clustering was performed with the ‘hclust’ function from the package ‘stats’, using ‘manhattan’ distances.

Data availability

The bulk RNA-Seq datasets used for deriving TCRs and then for the auto-encoder training are publicly available at https://gdc.cancer.gov/about-data/publications/panimmune (TCGA²³), https://www.iedb.org/database_export_v3.php (IEDB), and http://friedmanlab.weizmann.ac.il/McPAS-TCR/ (McPAS²⁵). We made the Kidney-bulkRNA²⁴ dataset available in csv format at https://github.com/jcao89757/TESSA/tree/master/Tessa_released_data . All scRNA-seq/TCR-seq datasets are publicly available. The NSCLC-1 and healthy-PBMC-1 datasets are available on the 10x website https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0 . The healthy-CD8 1-4 datasets are available on https://www.10xgenomics.com/resources/application-notes/a-new-way-of-exploring-immunity-linking-highly-multiplexed-antigen-recognition-to-immune-repertoire-and-phenotype/ . The healthy-PBMC-2 dataset is also available on the 10x website https://support.10xgenomics.com/single-cell-vdj/datasets/3.0.0 . The NSCLC-2²⁶, CRC²⁷, and HCC²⁸ datasets were downloaded from the European Genome-Phenome Archive (EGA) under accession numbers EGAS00001002430, EGAS00001002791, and EGAS00001002072, respectively. The Breast 1-5²⁹ datasets are available on the Gene Expression Omnibus (GEO) under accession numbers GSE114727 and GSE114724. The Melanoma³⁰, BCC³¹ and ECCITE-seq¹⁶ datasets are also on the GEO database under accession numbers GSE123139, GSE113590 and GSE126310. The Glanville¹⁰ dataset was downloaded from https://doi.org/10.1038/nature22976 . The Dash¹¹ dataset is available in the NCBI Sequence Read Archive (SRA) under accession number SRP101659.
The details of the data used, including sample size, role in the analysis, and references, are shown in Supplementary Table 1 . All scRNA-seq data were involved in Fig. 2 (directly or indirectly mentioned), the BCC scRNA-seq data were used in Fig. 3 , and all scRNA-seq data were used in Fig. 4 .

Code availability

The tessa model: https://github.com/jcao89757/tessa (doi: 10.5281/zenodo.4161819 ) ⁴⁶

The SCINA model: https://github.com/jcao89757/SCINA (doi: 10.3390/genes10070531 ) ⁴⁵

Reporting Summary

Please refer to Life Sciences Reporting Summary regarding detailed information on experimental design.

Extended Data

Extended Data Fig. 1

Details of the stacked auto-encoder for TCR embedding. (a) The structure of the auto-encoder, with the configurations of each layer shown. (b) Typical examples of TCR CDR3β sequences, heatmaps of the initially embedded ‘Atchley’ matrices of TCRs, and heatmaps of the auto-encoder-reconstructed ‘Atchley’ matrices. The TCR sequence examples were not used in the training step of the auto-encoder. (c) Scatterplots showing the consistency between the ‘Atchley factor’ values of the original and reconstructed TCRs. Green points represent tiles in the heatmaps in (b).

Extended Data Fig. 2

Scatterplots showing the relationships between the distances of TCRs and the distances of RNA expression levels for several more datasets. Both distances are calculated in a pair-wise manner between all the T cell clonotypes of each dataset. Four example datasets are shown: Healthy-CD8-3 (a), Healthy-CD8-4 (b), Breast-1 (c), and Breast-2 (d) ( Supplementary Table 1 ). The P values indicate the significance of the Pearson correlation coefficients. The shaded areas denote the 95% confidence intervals for linear regressions.

Extended Data Fig. 3

The weights of the TCR embeddings learned from tessa. The X axis shows the digits of the 30-dimensional embeddings, and the Y axis shows the weights learned for all datasets. Each bar represents one digit of the weights and shows the values of that digit obtained from all the 19 scRNA datasets in the Supplementary Table 1 .

Extended Data Fig. 4

Benchmarking results using GLIPH. (a) Clustering rates of the four Healthy-CD8 datasets from 10x Genomics, the Glanville dataset, and the Dash dataset under different global convergence distance cutoff (‘gccutoff’) values (Supplementary Table 1). The dashed lines represent the tessa clustering rates of the corresponding datasets. (b) Clustering purities of GLIPH when ‘gccutoff’ equals 3. The cutoff value was selected so that the GLIPH clusters achieved clustering rates that are most similar to those of the tessa networks. The clustering purities were calculated with the same method as in Fig. 2. (c, d) The GLIPH network purities (c) and numbers of networks (d) with different ‘gccutoff’ values, compared with the tessa network purities and numbers of networks.

Extended Data Fig. 5

The antigen binding specificities of 207 human TCRβ chains from 704 T cells were profiled against two epitopes in the Dash dataset, and those of 276 TCRs from 415 T cells against three epitopes in the Glanville dataset. (a, b) t-SNE plots showing the TCR clonotypes in the space of the TCR embeddings, with the embeddings adjusted by the tessa-inferred weights. The hierarchical clustering tree cutoff used in the two plots is indicated by green dashed lines in c-f. Each point in the plots represents one TCR clonotype, and the size of the point reflects the clone size. Points are colored by the true antigens that the corresponding TCRs target according to the original report. Points are connected if they are clustered into the same network based on hierarchical clustering of the TCR embeddings. T cell clones with only one cell were deemed as having low confidence, and unclustered clones, which do not affect the calculation of the purities, were excluded from visualization. (c, d) The numbers of TCR networks and the clustering rates with different hierarchical tree cutoffs in the Dash dataset (c) and in the Glanville dataset (d). Clustering rates were calculated as the number of TCR clonotypes that are clustered with at least one other TCR clonotype, divided by the total number of TCR clonotypes. (e, f) The network purities and P values testing the significance of the purities with different hierarchical tree cutoffs in the Dash dataset (e) and the Glanville dataset (f). The network purity and P value calculations are described in the Online Methods section.

Extended Data Fig. 6

T cell pathway activity scores of the different T cell subsets in the BCC dataset. The naive and activated pathways are shown, to be compared against the inhibition, memory and exhausted pathways shown in Fig. 3. The T cell subsets are the same as those in Fig. 3e-g.

Extended Data Fig. 7

Pseudotime analysis of the different T cell subsets in the BCC dataset. The T cell subsets are the same as those in Fig. 3e-g.

Extended Data Fig. 8

A cartoon sketch showing how the unexplained variance in gene expression of the TCR networks was determined. Details are described in the Materials and Methods section.

Supplementary Material

Supplementary Note 1 Detailed description of tessa, along with simulation and diagnostic analyses

Supplementary Note 2 More bioinformatics analyses and discussion of tessa

Supplementary Table 1 Data cohorts and details.

Supplementary Table 2 The genes in the T cell pathways used in this study.

ACKNOWLEDGEMENTS

We would like to thank Dr. LHR Xu for his valuable input on the manuscript writing. This study was supported by the National Institutes of Health (NIH) [CCSG 5P30CA142543/TW, R15GM131390/XW], and Cancer Prevention Research Institute of Texas [CPRIT RP190208/TW].

Footnotes

COMPETING INTERESTS

The authors declare no conflicts of interest.

References

1. Oettinger MA V(D)J recombination: on the cutting edge. Curr. Opin. Cell Biol. 11, 325–329 (1999).

2. Jung D & Alt FW Unraveling V(D)J recombination; insights into gene regulation. Cell 116, 299–311 (2004).

3. Kappler J et al. The major histocompatibility complex-restricted antigen receptor on T cells in mouse and man: identification of constant and variable peptides. Cell 35, 295–302 (1983).

4. Haskins K et al. The major histocompatibility complex-restricted antigen receptor on T cells. I. Isolation with a monoclonal antibody. J. Exp. Med. 157, 1149–1169 (1983).

5. Staveley-O’Carroll K et al. Induction of antigen-specific T cell anergy: An early event in the course of tumor progression. Proc Natl Acad Sci USA 95, 1178–1183 (1998).

6. Skapenko A, Leipe J, Lipsky PE & Schulze-Koops H The role of the T cell in autoimmune inflammation. Arthritis Res. Ther. Suppl 2, S4–14 (2005).

7. Stubbington MJT et al. T cell fate and clonality inference from single-cell transcriptomes. Nat. Methods 13, 329–332 (2016).

8. Bolotin DA et al. Antigen receptor repertoire profiling from RNA-seq data. Nat. Biotechnol. 35, 908–911 (2017).

9. Eltahla AA et al. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. Immunol. Cell Biol. 94, 604–611 (2016).

10. Glanville J et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).

11. Dash P et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).

12. Tubo NJ et al. Single naive CD4+ T cells from a diverse repertoire produce different effector cell types during infection. Cell 153, 785–796 (2013).

13. Buchholz VR et al. Disparate individual fates compose robust CD8+ T cell immunity. Science 340, 630–635 (2013).

14. Picelli S et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).

15. Sheng K, Cao W, Niu Y, Deng Q & Zong C Effective detection of variation in single-cell transcriptomes using MATQ-seq. Nat. Methods 14, 267–270 (2017).

16. Mimitou EP et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409–412 (2019).

17. Atchley WR, Zhao J, Fernandes AD & Drüke T Solving the protein sequence metric problem. Proc Natl Acad Sci USA 102, 6395–6400 (2005).

18. Modular learning in neural networks. Proceedings of the sixth National conference on Artificial intelligence - Volume 1 at <https://dl.acm.org/doi/10.5555/1863696.1863746>

19. Ostmeyer J et al. Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis. BMC Bioinformatics 18, 401 (2017).

20. Ostmeyer J, Christley S, Toby IT & Cowell LG Biophysicochemical Motifs in T-cell Receptor Sequences Distinguish Repertoires from Tumor-Infiltrating Lymphocyte and Adjacent Healthy Tissue. Cancer Res. 79, 1671–1680 (2019).

21. Thomas N et al. Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence. Bioinformatics 30, 3181–3188 (2014).

22. Zhang AW et al. Interfaces of malignant and immunologic clonal dynamics in ovarian cancer. Cell 173, 1755–1769.e22 (2018).

23. Thorsson V et al. The immune landscape of cancer. Immunity 48, 812–830.e14 (2018).

24. Wang T et al. An empirical approach leveraging tumorgrafts to dissect the tumor microenvironment in renal cell carcinoma identifies missing link to prognostic inflammatory factors. Cancer Discov. 8, 1142–1155 (2018).

25. Tickotsky N, Sagiv T, Prilusky J, Shifrut E & Friedman N McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinformatics 33, 2924–2929 (2017).

26. Guo X et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat. Med. 24, 978–985 (2018).

27. Zhang L et al. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. Nature 564, 268–272 (2018).

28. Zheng C et al. Landscape of Infiltrating T Cells in Liver Cancer Revealed by Single-Cell Sequencing. Cell 169, 1342–1356.e16 (2017).

29. Azizi E et al. Single-Cell Map of Diverse Immune Phenotypes in the Breast Tumor Microenvironment. Cell 174, 1293–1308.e36 (2018).

30. Li H et al. Dysfunctional CD8 T Cells Form a Proliferative, Dynamically Regulated Compartment within Human Melanoma. Cell 176, 775–789.e18 (2019).

31. Yost KE et al. Clonal replacement of tumor-specific T cells following PD-1 blockade. Nat. Med. 25, 1251–1259 (2019).

32. Eduati F et al. Prediction of human population responses to toxic compounds by a collaborative competition. Nat. Biotechnol. 33, 933–940 (2015).

33. Bansal M et al. A community computational challenge to predict the activity of pairs of compounds. Nat. Biotechnol. 32, 1213–1222 (2014).

34. Costello JC & Stolovitzky G Seeking the wisdom of crowds through challenge-based competitions in biomedical research. Clin. Pharmacol. Ther. 93, 396–398 (2013).

35. Waugh KA et al. Molecular Profile of Tumor-Specific CD8+ T Cell Hypofunction in a Transplantable Murine Cancer Model. J. Immunol. 197, 1477–1488 (2016).

36. Wu AA, Drake V, Huang H-S, Chiu S & Zheng L Reprogramming the tumor microenvironment: tumor-induced immunosuppressive factors paralyze T cells. Oncoimmunology 4, e1016700 (2015).

37. Burkholder B et al. Tumor-induced perturbations of cytokines and immune cell networks. Biochim. Biophys. Acta 1845, 182–201 (2014).

38. Conley JM, Gallagher MP & Berg LJ T Cells and Gene Regulation: The Switching On and Turning Up of Genes after T Cell Receptor Stimulation in CD8 T Cells. Front. Immunol. 7, 76 (2016).

39. Cho J-H et al. Unique features of naive CD8+ T cell activation by IL-2. J. Immunol. 191, 5559–5573 (2013).

40. Iezzi G, Karjalainen K & Lanzavecchia A The duration of antigenic stimulation determines the fate of naive and effector T cells. Immunity 8, 89–95 (1998).

41. Moskophidis D, Lechner F, Pircher H & Zinkernagel RM Virus persistence in acutely infected immunocompetent mice by exhaustion of antiviral cytotoxic effector T cells. Nature 362, 758–761 (1993).

42. Kalergis AM et al. Efficient T cell activation requires an optimal dwell-time of interaction between the TCR and the pMHC complex. Nat. Immunol. 2, 229–234 (2001).

43. Corse E, Gottschalk RA, Krogsgaard M & Allison JP Attenuated T cell responses to a high-potency ligand in vivo. PLoS Biol. 8, (2010).

44. Mikolov T, Chen K, Corrado G & Dean J Efficient Estimation of Word Representations in Vector Space. Preprint at arXiv:1301.3781 (2013).

45. Zhang Z et al. SCINA: A Semi-Supervised Subtyping Algorithm of Single Cells and Bulk Samples. Genes (Basel) 10, (2019).

46. Zhang Z jcao89757/TESSA: Mapping the Functional Landscape of T Cell Receptor Repertoire by Single T Cell Transcriptomics. Zenodo (2020). doi: 10.5281/zenodo.4161819