Science | A single-cell atlas of human thymic development



The human thymus is the organ responsible for the maturation of many types of T cells, which are immune cells that protect us from infection. However, it is not well known how these cells develop with a full immune complement that contains the necessary variation to protect us from a variety of pathogens. By performing single-cell RNA sequencing on more than 250,000 cells, Park et al. examined the changes that occur in the thymus over the course of a human life. They found that development occurs in a coordinated manner among immune cells and with their developmental microenvironment. These data allowed for the creation of models of how T cells with different specific immune functions develop in humans.


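This article, from a team at the Sanger Institute, was published in Science on February 21, 2020 under the title A cell atlas of human thymic development defines T cell repertoire formation. It charts the development of human thymic cells and the maturation of T cells, and is an important resource for reconstructing the process of T cell development.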



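T cell development involves the stepwise differentiation of T lymphocytes in parallel with the acquisition of a diverse TCR repertoire for antigen recognition. This diversity arises through genomic recombination, which selects variable (V), joining (J), and in some cases diversity (D) gene segments from multiple genomic copies. V(D)J recombination can preferentially include certain gene segments, which skews the repertoire. (Background: V(D)J rearrangement is the process by which immunoglobulin variable-region gene segments are recombined during lymphocyte development into a complete variable-region sequence. The heavy-chain variable region is assembled from one V, one D, and one J segment, and the light-chain variable region from one V and one J segment. The V, D, and J segments all exist in multiple copies, and their random combination, that is, rearrangement, generates the diversity of antibody variable regions.) To date, most of what we know about VDJ recombination and repertoire bias comes from animal models and analyses of human peripheral blood, whereas comprehensive data on the TCR repertoire of the human thymus remain scarce.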

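Here, the authors applied scRNA-seq to generate a comprehensive transcriptomic atlas of thymic cells across embryonic, fetal, pediatric, and adult stages, and combined it with TCR repertoire analysis to reconstruct the T cell differentiation process.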


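sample: Fifteen embryonic and fetal thymi spanning the stages of thymic development were collected between 7 and 17 weeks post-conception, together with nine postnatal thymi from pediatric and adult individuals. (All tissue samples used in this study were obtained with written informed consent from all participants, in accordance with the guidelines of the Declaration of Helsinki 2000.) Cells were sorted by FACS on the surface markers CD45, CD3, and EpCAM. (CD45 is a marker of immune cells, CD3 of T cells, and EpCAM of epithelial cells.)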


  1. Tools: Single Cell 3′ and 5′ Reagent Kit (10x Genomics), Illumina Nextera XT kit.

  2. Alignment: Cell Ranger Single-Cell Software Suite (version 2.0.2 for 3′ chemistry and version 2.1.0 for 5′ chemistry, 10x Genomics Inc.), GRCh38.

  3. Filtering:

    Cells with fewer than 2000 UMI counts and 500 detected genes were considered as empty droplets and removed from the dataset. Cells with more than 7000 detected genes were considered as potential doublets and removed from the dataset.
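The thresholds above can be sketched as a simple per-cell filter. This is a minimal illustration, not the authors' pipeline: `CellQC` and `passes_qc` are hypothetical names, the barcodes and counts are made up, and the sketch assumes that a barcode counts as an empty droplet when it falls below either lower bound (the "and" in the quoted sentence could also be read as requiring both).

```python
from dataclasses import dataclass

# Thresholds quoted in the filtering step above.
MIN_UMI = 2000     # fewer UMIs: treated as an empty droplet
MIN_GENES = 500    # fewer detected genes: treated as an empty droplet
MAX_GENES = 7000   # more detected genes: treated as a potential doublet


@dataclass
class CellQC:
    """Per-barcode summary (hypothetical record, not a Cell Ranger type)."""
    barcode: str
    n_umi: int
    n_genes: int


def passes_qc(cell: CellQC) -> bool:
    # Assumption: failing EITHER lower bound marks an empty droplet.
    if cell.n_umi < MIN_UMI or cell.n_genes < MIN_GENES:
        return False
    # Cells with more than MAX_GENES detected genes are dropped as doublets.
    return cell.n_genes <= MAX_GENES


# Made-up barcodes and counts, for illustration only.
cells = [
    CellQC("AAAC", n_umi=1500, n_genes=800),    # too few UMIs
    CellQC("AAAG", n_umi=5200, n_genes=2100),   # within all bounds
    CellQC("AAAT", n_umi=40000, n_genes=7600),  # too many genes
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

Switching the first condition from `or` to `and` gives the stricter reading of the sentence, in which a barcode is removed only when both counts are low.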



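The authors performed scRNA-seq on 15 prenatal thymi ranging from 7 PCW (post-conception weeks), when the thymic rudiment can be dissected, to 17 PCW, when thymic development is complete (Fig. 1A and B). They also analyzed nine postnatal samples covering the entire period of thymic activity. Before single-cell transcriptome profiling, which was coupled with TCRαβ profiling, isolated single cells were sorted on the expression of CD45, CD3, or epithelial cell adhesion molecule (EpCAM) so that both thymocytes and non-thymocytes were captured. After QC (see the related commentary on a single-cell RNA review: choosing QC parameters for cells and genes), 138,397 cells were obtained from prenatal thymi and 117,504 cells from postnatal thymi.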




Figure 1. Cellular composition of the developing human thymus

(A) Schematic of single-cell transcriptome profiling of the developing human thymus.
(B) Summary of gestational stage/age of samples, organs (circles denote thymus; rectangles denote fetal liver or adult bone marrow, adult spleen, and lymph nodes), and 10x Genomics chemistry (colors).
(C) UMAP visualization of the cellular composition of the human thymus colored by cell type (DN, double-negative T cells; DP, double-positive T cells; ETP, early thymic progenitor; aDC, activated dendritic cells; pDC, plasmacytoid dendritic cells; Mono, monocyte; Mac, macrophage; Mgk, megakaryocyte; Endo, endothelial cells; VSMC, vascular smooth muscle cells; Fb, fibroblasts; Ery, erythrocytes).
(D) Same UMAP plot colored by age groups, indicated by post-conception weeks (PCW) or postnatal years (y).
(E) Dot plot for expression of marker genes in thymic stromal cell types. Here and in later figures, color represents maximum-normalized mean expression of marker genes in each cell group, and size indicates the proportion of cells expressing marker genes.
(F) RNA smFISH in human fetal thymus slides with probes targeting stromal cell populations. Top left: Fb2 population marker FBN1 and general fibroblast markers PDGFRA and CDH5. Top right: Fb1 marker GDF10, FBN1, and CDH5. Middle left: Fb1 marker COLEC11 and FBN1. Middle right: Fb1 marker ALDH1A2, VSMC marker ACTA2, and FBN1. Bottom left: TEC(myo) marker MYOD1. Bottom right: Epithelial cell marker EPCAM and TEC(neuro) marker NEUROG1. Data are representative of two experiments.
(G) Relative proportion of cell types throughout different age groups. Dot size is proportional to absolute cell numbers detected in the dataset. Statistical testing for population dynamics was performed by t tests using proportions between stage groups. The x axis shows age of samples, which are colored in the same scheme as (D).
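The two dot plot quantities described in (E) can be computed per gene as follows. This is a small sketch, not the code used for the figure: `dot_plot_stats` is a hypothetical helper, the values are made up, and it assumes that "expressing" means a count above zero.

```python
from collections import defaultdict


def dot_plot_stats(expr, groups):
    """expr: per-cell expression of ONE gene; groups: per-cell group labels.

    Returns {group: (color, size)} where
      color = group mean divided by the largest group mean (0 to 1),
      size  = fraction of cells in the group with expression > 0
              (assumption: "expressing" means a nonzero value).
    """
    by_group = defaultdict(list)
    for value, group in zip(expr, groups):
        by_group[group].append(value)

    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    top = max(means.values())
    stats = {}
    for g, values in by_group.items():
        color = means[g] / top if top > 0 else 0.0
        size = sum(1 for v in values if v > 0) / len(values)
        stats[g] = (color, size)
    return stats


# Toy values for a single marker gene in two groups (made-up numbers).
expr = [4.0, 2.0, 0.0, 0.0, 1.0, 0.0]
groups = ["Fb1", "Fb1", "Fb1", "Fb2", "Fb2", "Fb2"]
stats = dot_plot_stats(expr, groups)
print(stats)
```

Because the color is normalized to the maximum within each gene, a gene with low absolute expression can still appear at full color in the group where it peaks, so colors are comparable across groups but not across genes.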


2. Coordinated development of thymic stroma and T cells


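In early fetal samples (7 to 8 PCW), the lymphoid compartment contained NK cells, γδ T cells, and ILC3s, with very few differentiated αβ T cells. In the 7 PCW sample, differentiating T cells were found mainly at the DN stage. They then progressed through the DP stage to the SP stage, reaching equilibrium at around 12 PCW, while the proportion of innate lymphocytes declined.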

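Notably, the adult sample showed morphological evidence of thymic degeneration. Comparing the thymus with the spleen and lymph nodes of the same donor revealed terminally differentiated T cells in the thymus, which suggests either re-entry of terminally differentiated T cells into the thymus or contamination by circulating cells (Fig. 1G). In addition, cytotoxic CD4+ T lymphocytes (CD4+ CTLs) expressing IL10, perforin, and granzymes were enriched in the degenerated thymus sample. The trend toward more memory T cells and B cells was also confirmed in other samples (Fig. 1G; P = 9.3 × 10^(–6) for memory T cells, P = 0.0096 for memory B cells).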

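The trends in T cell development were mirrored by corresponding changes in thymic stromal cells. The authors observed that temporal changes in TEC subsets coincided with the onset of T cell maturation, shifting from a cTEC-enriched population to a balance of cTECs and mTECs (Fig. 1G; P = 0.0054). This supports the concept of "thymic cross-talk", in which interactions between epithelial cells and mature T cells support their mutual differentiation.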


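Finally, other immune cells also changed dynamically during gestation and postnatal life. Macrophages were abundant in early gestation, whereas DCs increased throughout development (Fig. 1G). DC1 predominated after 12 PCW, and the proportion of pDCs increased in postnatal life (macrophages, P = 2.7 × 10^(–8); DC1, P = 1.05 × 10^(–3); DC2, P = 4.86 × 10^(–5)).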




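To investigate the downstream T cell differentiation trajectory (see also the NBT comparison of 45 single-cell trajectory inference methods on 110 real and 229 synthetic datasets), the authors visualized differentiating T cells as a continuous trajectory. To validate this trajectory, they used hallmark genes of T cell differentiation: the CD4, CD8A, and CD8B genes (Fig. 2D), cell cycle (CDK1) and recombination (RAG1) genes (Fig. 2E), and fully recombined TCRα and TCRβ chains (Fig. 2F). The trajectory starts from CD4–CD8– DN cells, which gradually express CD4 and CD8 to become CD4+CD8+ DP cells, then pass through a CCR9high αβT(entry) stage and differentiate into mature CD4+ or CD8+ SP cells (Fig. 2D). DN and DP cells were each split into two stages by their expression of cell cycle genes (Fig. 2E): the authors named the early subset, which has a strong cell cycle signature, proliferating (P) and the late subset quiescent (Q) (Fig. 2C). Expression of the VDJ recombination genes (RAG1 and RAG2) rose from the late proliferating stage and peaked in the quiescent stage. This pattern reflects the proliferation of T cells before each round of recombination.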

Figure 2. Thymic seeding of early thymic progenitors (ETPs) and T cell differentiation trajectory

(A) UMAP visualization of ETP and fetal liver hematopoietic stem cells (HSCs) and early progenitors. NMP, neutrophil-myeloid progenitor; MEMP, megakaryocyte/erythrocyte/mast cell progenitor.
(B) The same UMAP colored by organ (liver in blue, thymus in yellow/red).
(C) UMAP visualization of developing thymocytes after batch correction (DN, double-negative T cells; DP, double-positive T cells; SP, single-positive T cells; P, proliferating; Q, quiescent). The data contain cells from all sampled developmental stages. Cells from abundant clusters are downsampled for better visualization. The reproducibility of structure is confirmed across individual samples. Unconventional T cells are in gray.
(D to F) The same UMAP plot showing CD4, CD8A, and CD8B gene expression (D), CDK1 cell cycle and RAG1 recombination gene expression (E), and TCRα, productive TCRβ, and nonproductive TCRβ VDJ genes (F).
(G) Heat map showing differentially expressed genes across T cell differentiation pseudotime. Top: The x axis represents pseudo-temporal ordering. Gene expression levels across the pseudotime axis are maximum-normalized and smoothed. Genes are grouped by their functional categories and expression patterns. Bottom: Cell type annotation of cells aligned along the pseudotime axis. Colors are as in (C).
(H) Scatterplot showing the rate of productive chain detection within cells in specific cell types (x axis) and the ratio of nonproductive/productive TCR chains detected in specific cell types (y axis). Left: TCRβ; right, TCRα.
(I) Graph showing correlation-based network of transcription factors expressed by thymocytes. Nodes represent transcription factors; edge widths are proportional to the correlation coefficient between two transcription factors. Transcription factors with significant association to specific cell types are depicted in color. Node size is proportional to the significance of association to specific cell types.
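The two axes of panel (H) can be illustrated with a short per-cell-type summary. This is a sketch under stated assumptions, not the authors' code: `chain_stats` is a hypothetical helper, the input tuples are made-up counts for a single locus, and the detection rate is taken to be the fraction of cells with at least one productive chain.

```python
from collections import defaultdict


def chain_stats(cells):
    """cells: iterable of (cell_type, n_productive, n_nonproductive), one locus.

    Returns {cell_type: (detection_rate, nonprod_to_prod_ratio)}.
    """
    n_cells = defaultdict(int)
    n_with_productive = defaultdict(int)
    productive = defaultdict(int)
    nonproductive = defaultdict(int)
    for cell_type, n_prod, n_nonprod in cells:
        n_cells[cell_type] += 1
        n_with_productive[cell_type] += 1 if n_prod > 0 else 0
        productive[cell_type] += n_prod
        nonproductive[cell_type] += n_nonprod

    stats = {}
    for ct in n_cells:
        rate = n_with_productive[ct] / n_cells[ct]
        # The ratio is undefined when no productive chain was seen at all.
        ratio = nonproductive[ct] / productive[ct] if productive[ct] else None
        stats[ct] = (rate, ratio)
    return stats


# Made-up TCRβ chain counts per cell, for illustration only.
toy = [
    ("DN(Q)", 0, 1), ("DN(Q)", 1, 1), ("DN(Q)", 0, 0), ("DN(Q)", 1, 0),
    ("CD4+T", 1, 0), ("CD4+T", 1, 1), ("CD4+T", 1, 0), ("CD4+T", 1, 0),
]
stats = chain_stats(toy)
for cell_type, (rate, ratio) in stats.items():
    print(cell_type, rate, ratio)
```

In this toy input the later stage has both a higher detection rate and a lower nonproductive-to-productive ratio, which is the direction one would expect if cells without a productive chain are removed by selection.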



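In addition to the conventional CD4+ or CD8+ T cells, which make up the majority of T cells in the developing thymus, grouping the single-cell data by marker gene expression identified multiple unconventional T cell types (Fig. 2I and Fig. 3A and B).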

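Next, the authors asked whether the development of these unconventional T cells depends on the thymus. They reasoned that if a cell population is thymus dependent, it should accumulate after thymic maturation (around 10 PCW) and be enriched in the thymus relative to other hematopoietic organs. A scatterplot of the relative abundance of each cell type between fetal liver and thymus, and before versus after thymic maturation (10 PCW), was consistent with this idea: all unconventional T cells were enriched in the thymus, particularly after thymic maturation, suggesting that they are thymus derived (Fig. 3D).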
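The reasoning in this paragraph amounts to two enrichment scores per cell type. The sketch below shows one way to compute them; it is an assumption-laden illustration rather than the published method. The helper names, the dictionary keys, the use of log2 ratios of proportions, and the pseudocount are all hypothetical choices, and the counts are invented.

```python
import math


def log2_enrichment(count_a, total_a, count_b, total_b, pseudo=1.0):
    """log2 of (proportion in A) / (proportion in B).

    A pseudocount (assumption) keeps cell types that are absent from
    one side from causing a division by zero.
    """
    prop_a = (count_a + pseudo) / (total_a + pseudo)
    prop_b = (count_b + pseudo) / (total_b + pseudo)
    return math.log2(prop_a / prop_b)


def thymic_dependency(counts, totals):
    """counts/totals are keyed by 'thymus_early', 'thymus_late', 'liver'.

    Returns (thymus vs liver, after vs before 10 PCW) log2 enrichments.
    Both values being positive matches the text's notion of a
    thymus-dependent population.
    """
    thymus = counts["thymus_early"] + counts["thymus_late"]
    thymus_total = totals["thymus_early"] + totals["thymus_late"]
    x = log2_enrichment(thymus, thymus_total, counts["liver"], totals["liver"])
    y = log2_enrichment(counts["thymus_late"], totals["thymus_late"],
                        counts["thymus_early"], totals["thymus_early"])
    return x, y


# Invented cell counts for one unconventional T cell type.
counts = {"thymus_early": 4, "thymus_late": 79, "liver": 0}
totals = {"thymus_early": 1999, "thymus_late": 1999, "liver": 999}
x, y = thymic_dependency(counts, totals)
print(round(x, 2), round(y, 2))
```

A cell type that lands in the upper right quadrant of such a plot (both scores positive) is enriched in the thymus and accumulates after 10 PCW.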


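Other unconventional T cell populations include CD8αα+ T cells, NKT-like cells, and TH17-like cells (Fig. 3B). There are three distinct CD8αα+ T cell populations: GNG4+CD8αα+ T(I) cells, ZNF683+CD8αα+ T(II) cells, and a CD8αα+ NKT-like subset marked by EOMES (Fig. 3E). GNG4+CD8αα+ T(I) and ZNF683+CD8αα+ T(II) cells share PDCD1 expression at early stages, which decreases in their terminally differentiated state. GNG4+CD8αα+ T(I) cells show a distinct trajectory from the late DP stage (αβT(entry) cells), whereas ZNF683+CD8αα+ T(II) cells carry a mixed αβ and γδ T cell signature and sit next to both GNG4+CD8αα+ T(I) cells and γδ T cells.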

Figure 3. Identification of GNG4+ CD8αα T cells in the thymic medulla

(A) UMAP visualization of mature T cell populations in the thymus. Axes and coordinates are as in Fig. 2C. (The cell annotation color scheme used here is maintained throughout this figure.)
(B) Dot plot showing marker gene expression for the mature T cell types. Genes are stratified according to associated cell types or functional relationship.
(C) Scatterplot showing the ratio of nonproductive/productive TCR chains detected in specific cell types in TCRα chain (x axis) and TCRβ chain (y axis). The gray arrow indicates a trendline for decreasing nonproductive TCR chain ratio in unconventional versus conventional T cells.
(D) Scatterplot showing the relative abundance of each cell type between fetal liver and thymus (x axis) and before and after thymic maturation (delimited at 10 PCW) (y axis). Gray arrow indicates trendline for increasing thymic dependency.
(E to H) Scatterplots comparing the characteristics of unconventional T cells based on CD8A versus CD8B expression levels (E), KLRB1 versus ZBTB16 expression levels (F), TCRα productive chain versus TRDC detection ratio (G), and TRDV1 versus TRDV2 expression levels (H). Gray arrows or lines are used to set boundaries between groups [(E), (G), (H)] or to indicate the trend of innate marker gene expression (F).
(I) RNA smFISH showing GNG4, TNFRSF9, and CD8A in a 15 PCW thymus. Lower right panel shows detected spots from the image on top of the tissue structure based on 4′,6-diamidino-2-phenylindole (DAPI) signal. Color scheme for spots is the same as in the image.
(J) FACS gating strategy to isolate CD8αα(I) cells (live/CD3+/CD4–/CD137+) and Smart-seq2 validation of FACS-isolated cells projected to the UMAP presentation of total mature T cells from the discovery dataset (lower left). GNG4 expression pattern is overlaid onto the same UMAP plot (lower right).


5. Discovery and characterization of GNG4+CD8αα T cells in the thymic medulla

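Having identified the unconventional T cells and their origin in thymic T cell development, the authors focused on the novel GNG4+CD8αα+ T(I) cells because of their distinctive gene expression profile (GNG4, CREB3L3, and CD72). This contrasts with CD8αα+ T(II) cells, which express known marker genes of CD8αα+ T cells such as ZNF683 and MME (53). In addition, KLF2, a regulator of thymic emigration, was expressed at very low levels in CD8αα+ T(I) cells, suggesting that they may be thymus resident (Fig. 3B). To locate and validate CD8αα+ T(I) cells, the authors performed RNA smFISH targeting GNG4 in fetal thymus tissue sections. The GNG4 RNA probes identified a group of cells that were enriched in the thymic medulla and colocalized with CD8A RNA (Fig. 3I).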

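Because CD137 is a surface marker of CD8αα+ T(I) cells and Tregs, the authors used it to enrich for these cells and then refined the gate to CD3+CD137+CD4– by FACS. This specifically enriched CD8αα+ T(I) cells, whose identity was confirmed by Smart-seq2 scRNA-seq, which also provided a deeper transcriptional phenotype of these cells (Fig. 3J).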

6. Recruitment and activation of DCs for thymocyte selection

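T cell selection is coordinated by specialized TECs and DCs. The authors first identified three previously well-characterized thymic DC subtypes: DC1 (XCR1+CLEC9A+), DC2 (SIRPA+CLEC10A+), and pDC (IL3RA+CLEC4C+). They also identified a previously undescribed population, which they called "activated DCs" (aDCs), characterized by LAMP3 and CCR7 expression (Fig. 4A and B). aDCs expressed high levels of chemokines and costimulatory molecules, together with transcription factors, suggesting that they may correspond to the AIRE+CCR7+ DCs previously described in human tonsil and thymus.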


Figure 4. Recruitment and activation of dendritic cells for thymocyte selection

(A and B) UMAP visualization of thymic DC populations (A) and dot plot of their marker genes (B).
(C) Heat map of chemokine interactions among T cells, DCs, and TECs, where the chemokine is expressed by the outside cell type and the cognate receptor by the inside cell type.
(D) Schematic model summarizing the interactions of TECs, DCs, and T cells. The ligand is secreted by the cell at the beginning of an arrow, and the receptor is expressed by the cell at the end of that arrow.
(E) Left: RNA smFISH detection of GNG4, XCR1, and FOXP3 in 15 PCW thymus. Right: Computationally detected spots are shown as solid circles over the tissue structure based on DAPI signal. Color schemes for circles are the same as in the image.
(F to H) Sequential slide sections from the same sample are stained for the detection of LAMP3, AIRE, and XCR1 (F), LAMP3, ITGAX, and CD80 (G), and LAMP3 and FOXP3 (H). Spot detection and representation are as in (E). Data are representative of two experiments.


7. Bias in human TCR repertoire formation and selection

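TCR chains were detected from TCR-enriched 5′ sequencing libraries, filtered for full-length recombinants, and linked to the cell type annotation, which made it possible to analyze patterns of TCR repertoire formation and selection (Fig. 5A and B).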



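To test whether TCR repertoire bias differs between cell types, the authors compared the TCR repertoires of different cell types by principal components analysis (Fig. 5E) and observed a clear separation of CD8+ T cells from the other cell types. Moreover, relative to other cell types, the TRAV-TRAJ repertoire of CD8+ T cells was biased toward distal V-J pairs (Fig. 5F). Because distal pairs are generated in the later rounds of TCRα recombination, this may reflect slower or less efficient commitment to the CD8+ T lineage (Fig. 5D).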

Figure 5. Intrinsic bias in human TCR repertoire formation and selection

(A) Heat map showing the proportion of each TCRβ V, D, and J gene segment present at progressive stages of T cell development. Gene segments are positioned according to genomic location.
(B) Same scheme as in (A) applied to TCRα V and J gene segments. Although there is a usage bias of segments at the beginning of development, segments are evenly used by the late developmental stages, indicating progressive recombination leading to even usage of segments.
(C and D) Schematics illustrating a hypothetical chromatin loop that may explain genomic location bias in recombination of TCRβ locus (C) and the mechanism of progressive recombination of TCRα locus leading to even usage of segments (D).
(E) Principal components analysis plots showing TRBV or TRAV and TRAJ gene usage pattern in different T cell types. Arrows depict T cell developmental order. For TRBV, there is a strong effect from beta selection, after which point the CD4+ and CD8+ repertoires diverge. The development for TRAV + TRAJ is more progressive, with stepwise divergence into the CD4+ and CD8+ repertoires.
(F) Relative usage of TCRα V and J gene segments according to cell type. The z-score for each segment is calculated from the distribution of normalized proportions stratified by the cell type and sample. P value is calculated by comparing z-scores in CD4+ T and CD8+ T cells using t test, and false discovery rate (FDR) is calculated using Benjamini-Hochberg correction: *P < 0.05, **FDR < 10%. Gene names and asterisks are colored by significant enrichment in CD4+ T cells (blue) or CD8+ T cells (orange).
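The false discovery rate step mentioned in (F) follows the standard Benjamini-Hochberg procedure, which can be written in a few lines. This is a generic sketch of that procedure, not the authors' script: `benjamini_hochberg` is a hypothetical helper and the p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up per-segment p-values, e.g. from CD4+ vs CD8+ t tests.
p = [0.001, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
passes_fdr = [qi < 0.10 for qi in q]  # the 10% FDR cutoff named in (F)
print(q, passes_fdr)
```

With many V and J segments tested at once, some raw P values below 0.05 are expected by chance, which is presumably why the caption reports the FDR threshold alongside the nominal one.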



