Abstract
Background/Aim: T-cell receptor recombination sequencing reads have been extracted from genomics files and have been characterized and correlated with patient outcomes for many different cancer types.
Materials and Methods: An immunogenomics approach was applied to endemic Burkitt lymphoma (BL) utilizing the Cancer Genome Characterization Initiative – Burkitt Lymphoma Genome Sequencing Project (CGCI-BLGSP), focusing on TCR V- or J- gene segment, HLA allele combinations and anti-HIV CDR3s.
Results: Five TCR V- or J- gene segment, HLA allele combinations were associated with overall survival (OS) distinctions where the individual arms of the combinations were not associated with survival. Cases with an anti-HIV TRA CDR3 had a higher OS probability than those without (median OS not reached vs. 215 days; log-rank p=0.0013), and cases with anti-HIV TRA or TRB CDR3s had a lower pathologic stage compared to those without.
Conclusion: T-cell response leads to early control of progression of endemic BL.
Introduction
Burkitt lymphoma (BL) is an aggressive B-cell lymphoma characterized by a translocation of c-MYC and an immunoglobulin gene. BL cases are divided into three variants: endemic (which occurs in equatorial Africa), immunodeficiency-associated (often in HIV patients and organ transplant recipients), and sporadic. Endemic BL comprises the majority of BL cases, and 30 to 50% of childhood cancers in equatorial Africa are attributed to BL. While there are many BL treatments, management of endemic BL is often hampered due to the resource-poor setting. For example, a 2016 cohort study focusing on Kenyan children found a one-year overall survival (OS) of 59% (1).
Investigations have therefore focused on the adaptive immune system, especially given that the Epstein-Barr virus (EBV) is present in nearly all cases of endemic BL. One major conundrum has been why BL is specifically endemic in sub-Saharan Africa even though EBV is present throughout the world. Accordingly, recent studies have focused on the role of EBV in tumorigenesis (2, 3), as well as on other endemic diseases that may have a cooperative role in tumorigenesis of BL. Another approach has focused on T-cell roles and functions in BL. BL cells have shown decreased CD4+ T-cell activation as a result of decreased immunogenicity of certain EBV antigens, allowing them to escape HLA class I presentation (4). Other studies have shown that T-cells appear to play a dichotomous role, both allowing for immune evasion and suppressing pre-BL cells (5). Translation of this knowledge to therapeutics has been attempted with studies focused on CAR-T cell therapy, particularly inspired by the success of CAR-T cell therapy for other hematologic malignancies (6). However, there is a notable gap in the translation of molecular knowledge to novel immunotherapies for BL, as first-line treatment continues to rely largely on cytotoxic agents rather than targeted therapies (7).
In other cancer settings, genomics sequencing data have facilitated the identification of apparently successful adaptive immune responses, i.e., by recovering adaptive immune receptor recombination sequencing reads from the genomics files and assessing outcome distinctions associated with such recombination read recoveries. In these studies, the focus has been on identifying immune receptor V-IDs and J-IDs, which yield diverse complementarity determining region-3s (CDR3s), to identify patterns related to cancer antigen recognition. For example, this immunogenomics approach has been utilized to identify and characterize patients with immunologically cold cancers, such as MYCN-amplified neuroblastoma, which has correspondingly poor outcomes (8).
Further, translational research has been done to identify targetable antigens which could be shared among patients with the same HLA alleles, especially for virus-induced cancers (9). One example is the first engineered peptide-TCR molecule targeting a specific HLA, tebentafusp, approved by the FDA in 2022 (10). Immunogenomics approaches have also been applied to meet the goal of identifying targetable antigens, with studies finding that particular TCR V- and J-gene segment usage, HLA allele combinations are associated with better outcomes in melanoma (11), head and neck cancers (12), and multiple myeloma (13, 14). In addition to approaches based on human-encoded tumor antigens, other approaches have focused on the adaptive immune response to potential or likely tumor viruses. Detection of TCR CDR3s matching known anti-viral TCR CDR3s has been associated with outcome differences for ovarian cancer (anti-EBV TCR CDR3s) (15), breast cancer [anti-cytomegalovirus (CMV) TCR CDR3s] (16, 17), and neuroblastoma (anti-CMV TCR CDR3s) (18). Thus, we aimed to characterize the role of T-cells in endemic BL utilizing an immunogenomic approach, focusing on identifying outcome differences in cases with specific V- or J-gene segment, HLA allele combinations, and in cases with TCR CDR3s matching known anti-viral CDR3s.
Materials and Methods
Downloading Burkitt lymphoma RNA-seq files. RNA-seq files provided by the Cancer Genome Characterization Initiative – Burkitt Lymphoma Genome Sequencing Project (CGCI-BLGSP, phs000235), available at the Genomic Data Commons (GDC) website, were used for this study. Specifically, RNA-seq files were downloaded as binary alignment map (BAM) files, utilizing the GDC transfer tool (https://gdc.cancer.gov/access-data/gdc-data-transfer-tool), to a University of South Florida Research Computing cluster. RNA-seq file access was authorized under database of genotypes and phenotypes (dbGaP) approved project number 20312 (with approval granted to Dr. George Blanck). This study represents only endemic BL cases from Uganda (19), comprising 160 RNA-seq files from primary tumor samples, originally from 105 unique cases [supporting online material (SOM), Table S1, GDC download manifest]. Clinical data obtained from the GDC included gender, race, age of diagnosis, days to death, days to last follow-up, vital status, and Ann Arbor pathologic stage (days to death or, when days to death was not available, days to last follow-up, were used for the Kaplan-Meier analyses, below). Ann Arbor pathologic stage is a widely used staging system for lymphomas, where stage I indicates single-region cancer, stage II indicates cancer in two or more regions on the same side of the diaphragm, stage III indicates involvement of both sides of the diaphragm, and stage IV indicates diffuse involvement.
Obtaining HLA alleles and T-cell receptor recombination reads. The 160 RNA-seq files used in this study were mined for HLA alleles for the class I genes, HLA-A, -B, -C and for the class II genes, HLA-DPB1, -DQB1, -DRB1, utilizing the xHLA software (20). Twenty-eight samples were noted to represent 13 cases and were utilized to verify replicability for HLA processing, which was found to be consistent at the level of 90% or above for all HLA genes (data not shown). The RNA-seq files were also mined for productive T-cell receptor (TCR) recombination reads for TRA, TRB, TRD, and TRG, yielding the V IDs, the J IDs, and the complementarity determining region-3 (CDR3) amino acid (AA) sequences, as described in (21) (see Table S2 for all TCRs from this study). This algorithm for mining the TCR recombination sequencing reads from the RNA-seq files first utilizes a low-stringency search sourcing germline 10-mers matching the TCR (e.g., TRA) V- and J-gene sequences. Then, a high-stringency search using a scoring algorithm selects V- and J- reads greater than 19 nucleotides and with greater than 90% nucleotide matches. Of the obtained TCR sequencing recombination reads, only reads with productive TCR CDR3s (i.e., no stop codon or out-of-frame V-J joining segment) were used for this report.
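The two-stage read screen and the productivity filter described above can be illustrated with a minimal Python sketch. This is not the published mining algorithm (21); all function names are hypothetical, the thresholds (10-mers, more than 19 nucleotides, more than 90% matches) are taken from the text, and treating "in frame" as a junction length divisible by three is a simplifying assumption.

```python
# Illustrative sketch only; helper names are hypothetical, not from the study's scripts.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmers(seq, k=10):
    """Return the set of all k-mers contained in a nucleotide sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def low_stringency_hit(read, germline_kmers, k=10):
    """Low-stringency screen: the read shares at least one germline k-mer."""
    return not kmers(read, k).isdisjoint(germline_kmers)


def passes_high_stringency(aligned_len, matches):
    """High-stringency screen: alignment longer than 19 nt with more than 90% matches."""
    return aligned_len > 19 and matches / aligned_len > 0.90


def is_productive(junction_nt):
    """Assumed definition: in frame (length divisible by 3) and no stop codon."""
    if len(junction_nt) % 3 != 0:
        return False
    codons = (junction_nt[i:i + 3] for i in range(0, len(junction_nt), 3))
    return not any(codon in STOP_CODONS for codon in codons)
```

In such a scheme the inexpensive 10-mer screen reduces the number of reads passed to the more costly scored alignment, and only junctions passing `is_productive` would be carried forward.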
Outcome distinctions based on TCR and HLA features. Initial analyses focused on whether recovering at least one productive CDR3 for each of the TCR genes (TRA, TRB, TRD, or TRG) was associated with OS distinctions. Then, for each TCR, the mean number of productive TCR recoveries among all samples was compared across Ann Arbor pathologic stage utilizing an analysis of variance (ANOVA) test. If a case had multiple samples in the ANOVA, the number of recoveries in each sample was averaged for that case. Analyses were then focused on the HLA alleles obtained above. HLA alleles that were associated with distinct clinical features were identified, first by assessing OS probability distinctions using a log-rank test, and then by assessing age of diagnosis distinctions using the Student’s t-test. Subsequently, specific TCR V- or J-gene segment, HLA allele combinations associated with survival distinctions were identified. TCR V- or J- gene segment, HLA allele combinations presented in this report include only those combinations where the sample size was greater than 10, and where neither the V- or J-gene segment arm nor the HLA allele arm was, by itself, statistically significantly associated with a survival distinction by log-rank test. Further, only TCR V- or J- gene segment, HLA allele combinations which were supported with a replicative set representing a randomly selected half of the original set were reported in this study. The above analyses were done utilizing the original script freely available at https://github.com/thudausf. The TCR V- or J- gene segment, HLA allele combination outcome analyses were repeated with R “survival” package v3.8.3. Figures for Kaplan-Meier (KM) analyses were produced using R “survminer” package v0.5.0 and the figure for the ANOVA analysis was produced using GraphPad Prism v10.
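As an illustration of the log-rank comparison and the reporting rule for combinations, a minimal Python sketch is given here. It is not the study's script or the R "survival" implementation; the function names are hypothetical, and the significance threshold of 0.05 is an assumption, since the text does not state the cutoff used.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value) on 1 degree of freedom.

    times_*: follow-up in days; events_*: 1 for death, 0 for censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e, _ in data if e}):
        n = sum(1 for ti, _, _ in data if ti >= t)                # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)   # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)          # deaths at time t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def combination_reportable(n_combo, p_combo, p_gene_arm, p_hla_arm, alpha=0.05):
    """Reporting rule from the text; alpha=0.05 is an assumed threshold."""
    return (n_combo > 10 and p_combo < alpha
            and p_gene_arm >= alpha and p_hla_arm >= alpha)
```

Under the assumed threshold, a combination with 33 cases, a combination p-value of 0.013 and arm p-values of 0.17 and 0.08 would satisfy `combination_reportable`; the replicative half-set check described above is not modeled here.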
Anti-human immunodeficiency virus (HIV) TRA and TRB CDR3 analyses. Further analyses focused on identifying whether there were outcome distinctions in endemic BL cases where the mined TRA or TRB CDR3s represented exact AA sequence matches to validated TRA or TRB CDR3s against HIV antigens, published in the VDJdb database (22). Note that in this study, only HIV-1 entries were utilized for analysis. OS analyses were performed for cases with exact AA sequence matches to anti-HIV TRA or TRB CDR3s versus cases with TRA or TRB recombination reads but without any anti-HIV CDR3 matches, using a log-rank test. Similarly, Ann Arbor pathologic stage was compared for cases with and without an anti-HIV TRA or TRB CDR3 match using a Mann-Whitney U-test. These analyses were done utilizing the original script at https://github.com/thudausf and repeated with IBM Statistical Product and Service Solutions (SPSS) v29. The figures for the Mann-Whitney U analysis were produced using IBM SPSS v29.
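The exact-match grouping described above can be sketched in Python as follows. The function name and data layout are hypothetical, and the CDR3 strings in the usage note are placeholders for illustration rather than actual VDJdb entries or sequences recovered in this study.

```python
def split_by_reference_match(case_cdr3s, reference_cdr3s):
    """Group cases by whether any mined CDR3 exactly equals a reference CDR3.

    case_cdr3s: dict mapping case ID to a list of CDR3 amino acid sequences.
    Cases with no recombination reads are excluded from both groups, mirroring
    the comparison against cases with reads but no match.
    """
    reference = set(reference_cdr3s)
    matched, unmatched = [], []
    for case_id, cdr3s in case_cdr3s.items():
        if not cdr3s:
            continue  # no TRA/TRB recovery: not part of either comparison arm
        if reference & set(cdr3s):
            matched.append(case_id)
        else:
            unmatched.append(case_id)
    return sorted(matched), sorted(unmatched)
```

For example, with `{"case1": ["CAVRDSNYQLIW", "CAASGGSYIPTF"], "case2": ["CAVNTGNQFYF"], "case3": []}` and a reference list containing only `"CAASGGSYIPTF"`, `case1` falls in the matched arm, `case2` in the unmatched arm, and `case3` is left out. Exact string equality is deliberately strict: a single amino acid difference is treated as no match.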
Results
Productive TCR recombination reads from RNA-seq files representing endemic BL. TCR recombinations were mined from 160 RNA-seq files representing 105 endemic BL cases. The recovery counts for the sequencing reads representing productive TRA, TRB, TRD, and TRG recombinations are indicated in Table I, grouped by Ann Arbor pathologic stage (the percentages of cases represented by at least one recombination sequencing read, for each of TRA, TRB, TRD, and TRG, are provided in Table S3). OS probability for cases with and without a particular T-cell receptor was assessed, with results indicating that the 95 endemic BL cases with a TRG recombination read recovery had a higher OS probability as compared to the 10 cases without any TRG recovery (median OS: 387 days vs. 11 days, p=0.0017, Table II, Table S4, Figure 1A). Because some cases were represented by multiple files, this OS test was repeated after removing all cases with duplicate files from the TRG recovery group, thus ensuring that a TRG recovery was not due to having an increased number of samples. This adjusted log-rank test compared 54 cases with TRG with the 10 cases without any TRG recovery and continued to show that cases with a TRG recovery had a higher OS probability (p=0.0097, Table II, Figure 1B). BL pathologic stage was then compared with the count of TCR recombination reads. TRB and TRG recombination read counts tended to decrease as the Ann Arbor pathologic stage increased (Table III; TRB and TRG). For example, Stage I BL primary tumor samples had an average of 25.7 TRG recombination reads, whereas Stage IV samples had an average of 11.1 TRG recombination reads (TRG ANOVA p=0.087, post-hoc Stage I vs. Stage IV p=0.054) (Table S5; Figure 2, TRG).
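The duplicate-file adjustment described above can be sketched in Python. The function name and input layout are hypothetical; the sketch assumes, as the text indicates, that cases with more than one RNA-seq file are dropped only from the TRG recovery group, while the group without any TRG recovery is left unchanged.

```python
from collections import Counter


def adjusted_trg_groups(file_to_case, case_has_trg):
    """Return (cases with TRG and a single file, all cases without TRG).

    file_to_case: dict mapping RNA-seq file ID to case ID.
    case_has_trg: dict mapping case ID to True if any file yielded a TRG read.
    """
    files_per_case = Counter(file_to_case.values())
    with_trg = sorted(case for case, n_files in files_per_case.items()
                      if case_has_trg[case] and n_files == 1)
    without_trg = sorted(case for case in files_per_case
                         if not case_has_trg[case])
    return with_trg, without_trg
```

The rationale is that a case sampled several times has more opportunities to yield a TRG read, so restricting the recovery group to single-file cases removes that sampling advantage before the log-rank test is repeated.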
Table I. TCR recombination read recovery counts extracted from 105 CGCI-BLGSP endemic cases.
Table II. Endemic BL cases with a productive TRG recovery had a higher OS probability.
Figure 1. Endemic BL cases with a TRG recombination read recovery have a higher OS probability. (A) Endemic BL cases with a TRG recombination read recovery (n=95, black) had a higher overall survival (OS) probability than those without a TRG recovery (n=10, grey); log-rank p=0.0017. (B) In a modified analysis, endemic BL cases with a TRG recombination (n=54, black) still had a higher OS probability than those without a TRG recombination (n=10, grey); log-rank p=0.0097. In this second analysis, unlike in (A) above, no cases that had duplicate samples were included, which reduced the sample size from 95 in (A) to 54 in (B). BL: Burkitt lymphoma; OS: overall survival.
Table III. Mean productive TCR recombination read counts decreased for TRB and TRG across BL Ann Arbor pathologic stage.
Figure 2. Tukey box plot demonstrating a decrease in TRG recombination read count as Ann Arbor pathologic stage progresses in endemic BL; ANOVA p=0.087. Post-hoc analysis indicated a borderline difference between Stage I recovery counts and Stage IV recovery counts, p=0.054. BL: Burkitt lymphoma.
TCR gene segment usage, HLA allele combinations and outcome distinctions for endemic BL. Using the available samples with HLA allele typing output, two HLA alleles representing OS distinctions, HLA-A*68:02 and HLA-DPB1*04:01, were identified (Table IV, Figure 3A and B). For example, 14 cases with the HLA-A*68:02 allele were compared to 91 remaining cases, whereby the cases with the HLA-A*68:02 allele had a median OS of 165 days compared to 477 days for the remaining cases (log-rank p=0.0026). Both HLA-A*68:02 and HLA-DPB1*04:01 were associated with worse outcomes. Next, four HLA alleles that were associated with a later age of diagnosis were identified, which are detailed in Table V. Lastly, OS distinctions for five TCR V- or J-gene segment usage, HLA allele combinations were identified (Table VI). These distinctions were not found when assessing the potential correlation with OS with either the HLA allele arm, or the V- or J-gene segment usage arm, independently. In other words, the HLA allele arm and the V- or J-gene segment usage arm were treated as controls for the results obtained with the V- or J-gene segment usage, HLA allele combinations (Table S6). For example, 33 cases had both a TRAV12-3 gene segment and the HLA-DQB1*05:01 allele. Median OS was not reached for these cases, whereas the remaining cases had a median OS of 199 days (p=0.013, Figure 4). The individual TRAV12-3 gene segment arm and the HLA-DQB1*05:01 allele arm did not have statistically significant correlations with OS (TRAV12-3 log-rank p=0.17; HLA-DQB1*05:01 log-rank p=0.08).
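To make the phrase "median OS not reached" concrete, a minimal Python sketch of a Kaplan-Meier median is given here. It is an illustration rather than the R "survival" implementation: the function name is hypothetical, and the median is taken as the first event time at which the estimated survival falls to 0.5 or below, which is one common convention.

```python
from fractions import Fraction


def km_median(times, events):
    """Kaplan-Meier median survival time, or None if the median is not reached.

    times: follow-up in days; events: 1 for death, 0 for censored.
    Exact fractions avoid floating-point rounding at the 0.5 boundary.
    """
    survival = Fraction(1)
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= Fraction(at_risk - deaths, at_risk)
        if survival <= Fraction(1, 2):
            return t
    return None  # the curve never drops to 0.5: median not reached
```

A return value of `None` corresponds to a group in which fewer than half of the cases, as estimated by the curve, had died by the end of follow-up.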
Table IV. Specific HLA alleles were associated with lower OS probability in endemic BL cases.
Figure 3. Cases with specific HLA alleles were associated with a lower OS probability in endemic BL. (A) KM analysis demonstrates that cases with HLA-A*68:02 (n=14, black) had a lower OS probability than those without (n=91, grey); p=0.0026. (B) KM analysis demonstrates that cases with HLA-DPB1*04:01 (n=14, black) had a lower OS probability than those without (n=91, grey); p=0.038. BL: Burkitt lymphoma; OS: overall survival; KM: Kaplan-Meier.
Table V. Specific HLA alleles were associated with a later age of diagnosis in endemic BL cases.
Table VI. TCR gene segment, HLA allele combinations were associated with OS in endemic BL.
Figure 4. Endemic BL OS association with specific TCR gene segment, HLA allele combinations. KM analyses demonstrating higher OS probability in cases with the TRAV12-3, HLA-DQB1*05:01 combination (n=33, black), as compared to all other cases with a TRA recombination read recovery (n=72, grey); p=0.013. There was no significant OS difference for either the TRAV12-3 gene segment or the HLA-DQB1*05:01 allele independently. BL: Burkitt lymphoma; OS: overall survival; KM: Kaplan-Meier.
Anti-HIV TRA or TRB CDR3s in endemic BL cases were associated with improved outcomes. Cases that had TCR recombination read recoveries where at least one of the CDR3s matched known anti-HIV TRA or TRB CDR3s were identified. Next, the cases with the anti-HIV CDR3s were assessed for OS, versus all remaining cases with TRA or TRB recombination reads (Table S7). Twenty-two cases with an anti-HIV TRA CDR3 were identified; median OS was not reached for these cases, compared to a median OS of 215 days for the 83 remaining cases (log-rank p=0.0013) (Table VII, Figure 5A). Additionally, 10 cases with an anti-HIV TRB CDR3 also showed a higher OS probability as compared to all remaining cases, although this difference did not reach statistical significance (log-rank p=0.24, Table VII, Figure 5B). Next, 99 cases which had Ann Arbor pathologic staging data were evaluated (that is, 6 cases from the preceding total of 105 cases did not have staging data). Cases with an anti-HIV CDR3 representing either TRA or TRB were found to have lower Ann Arbor pathologic staging in Mann-Whitney U analyses. For both the TRA and TRB analyses, cases with an anti-HIV TRA CDR3 or anti-HIV TRB CDR3 had a lower rank-sum, as compared to those cases without an anti-HIV TRA CDR3 or anti-HIV TRB CDR3 (Table VIII, Table S8). For example, for the 10 cases with an anti-HIV TRB CDR3, the median stage was II, whereas the median stage of the remaining 89 cases was III (Mann-Whitney U-test p=0.012, Figure 6). Similar analyses were done to identify any association between an immune response to EBV and survival or staging. Seventy-four cases with an anti-EBV TRA CDR3 were identified and had a median OS of 437 days, as compared to the 31 remaining cases which had a median OS of 165 days (log-rank p=0.005, Table VII). Cases with an anti-EBV TRA CDR3 showed a trend towards a lower Ann Arbor pathologic stage using a Mann-Whitney U-test, which did not reach statistical significance (data not shown).
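The stage comparison above relies on ranks because Ann Arbor stage is ordinal and heavily tied. A minimal Python sketch of a Mann-Whitney U-test on such data is given here; it is not the SPSS procedure used in the study. The function name is hypothetical, stages are assumed to be coded as integers 1 to 4, and the p-value uses a tie-corrected normal approximation without continuity correction, which may differ slightly from the output of a statistics package.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value) using midranks for ties."""
    pooled = sorted(list(x) + list(y))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of the 1-based ranks i+1..j
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = sum(ranks[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    tie_term = sum(c ** 3 - c for c in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))
```

A lower rank-sum, and hence a lower U, for the group with an anti-HIV CDR3 corresponds to that group tending towards lower stages.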
Table VII. Endemic BL cases with anti-viral CDR3s for HIV and EBV were associated with a higher OS probability.
Figure 5. Endemic BL cases with anti-HIV TRA CDR3s or anti-HIV TRB CDR3s had a higher OS probability. (A) In endemic BL, cases with an anti-HIV TRA CDR3 (n=22, black) had a higher OS probability than those with TRA recombination read recoveries but without anti-HIV TRA CDR3s (n=83, grey); p=0.0013. (B) In endemic BL, cases with an anti-HIV TRB CDR3 (n=10, black) had a higher OS probability than those with a TRB recombination read recovery but without anti-HIV TRB CDR3s (n=95, grey), although the difference did not reach statistical significance; p=0.24. BL: Burkitt lymphoma; HIV: human immunodeficiency virus; CDR3: complementarity determining region-3; OS: overall survival.
Table VIII. Cases with anti-HIV TCR recombinations were associated with a lower BL Ann Arbor pathologic stage.
Figure 6. Independent-samples Mann-Whitney U-test comparing the Ann Arbor pathologic stage of cases with anti-HIV TRB CDR3s (n=10, left) with cases without an anti-HIV TRB CDR3 (n=89, right) in endemic BL; p=0.012. BL: Burkitt lymphoma; HIV: human immunodeficiency virus; CDR3: complementarity determining region-3.
Discussion
Overall, these results indicate an association of T-cell features with the development and outcomes of endemic BL. The analyses above represent the first exploration of immunogenomics of endemic BL, to the authors’ knowledge. Focusing on T-cell receptors alone, it is interesting that those patients with a TRG recovery did better than those without and that TRG recombination read recoveries tended to decrease as pathologic stage worsened. This result is consistent with conclusions that γδ T-cells are more common in clinical settings where there is documented HLA class I downregulation, which has been noted to occur in EBV-infected B-cells. Further, tumor-infiltrating γδ T-cells have been shown to be favorable prognostically in many other cancers (23), and it is also well-known that these cells are durable. Clinical trials designed to exploit the positive impact of γδ T-cells, via stimulation by bisphosphonates and other strategies, are ongoing in other B-cell malignancies (24, 25). Most importantly, given that γδ T-cells can be stimulated by bisphosphonates, a further understanding of any potential positive role for γδ T-cells in the resource-limited areas where endemic BL occurs would seem to be a high priority.
While recent studies have assessed correlations of specific HLA alleles in other non-Hodgkin lymphoma settings (26), studies have not assessed HLA alleles with regard to BL clinical features. This study identified two HLA alleles associated with worse survival in endemic BL. One of the two HLA alleles associated with worse overall survival in this study was HLA-A*68:02, which is a highly prevalent HLA allele in sub-Saharan Africa (27). Notably, HLA-A*68 has been shown to be associated with a higher HIV viral load (28, 29), as well as a poor immune response in AIDS patients on antiretroviral therapy (30). Thus, having this HLA allele could imply poor responses not only to HIV, but also to EBV. Also, in one study, HLA-A*68 was found to be more frequent in Hodgkin’s lymphoma patients (31). Four other HLA alleles were associated with later age of diagnosis in this study; one of these was HLA-DRB1*13:02. This allele has been noted to be a protective allele for cervical squamous cell carcinoma risk (32), and for malaria and hepatitis B (33).
TCR V- or J-gene segment usage, HLA allele combinations, have been found to be associated with distinct OS probabilities in several different cancers. These conclusions have just begun to be translated into clinical settings. Preliminary research is ongoing to create tools that predict binding between the two receptors, TCR and HLA, to further advance this broad field (34, 35). Creating a repository of known binding pairs using this study and other studies representing known outcome associations for other cancers could potentially guide and validate future clinical trials.
Furthermore, this study adds to the knowledge base regarding the complex interactions HIV may have in endemic BL, as those who had evidence of an immune response against HIV had better OS probabilities and lower pathologic staging. Although HIV has a 6% prevalence in Uganda and viral suppression is reached in only 75% of individuals (https://phia.icap.columbia.edu/wp-content/uploads/2022/08/UPHIA-Summary-Sheet-2020.pdf), the HIV role in endemic BL has been understood to be limited. For example, one study identified only 5% of children with endemic BL having HIV (36), and another identified only one patient with HIV among 145 patients with endemic BL biopsies (37). However, cases with immunodeficiency-related BL in developed countries are highly correlated with HIV infection and represent worse outcomes (36, 38). Another report noted that in the in vivo setting HIV co-infection augments EBV tumorigenesis (39), and another indicated that those who were on anti-retroviral therapy for HIV had decreased co-infection with EBV and a decreased load of EBV (40). Thus, the question arises: how might an anti-HIV T-cell response be consistent with a better outcome in endemic BL? With regard to the EBV results from this study, the correlation between the detection of anti-EBV CDR3s and longer survival is consistent with the understanding that BL is driven by EBV and that the control of this virus by the immune system or therapeutics which are currently being developed may result in better control of the cancer.
With regard to limitations of this study, the study is primarily correlative. In addition, this study is limited given the lack of replicative sets for analysis. However, this study provides a starting point for identifying a T-cell response that appears to be strongly correlated with control of endemic BL progression, whether by increased anti-viral T-cell receptors or via effective combination of antigen presentation and TCR antigen binding.
Acknowledgements
The Authors gratefully acknowledge the contributions of USF research computing and the taxpayers of the State of Florida; and Ms. Lindsey Dickerson for extensive administrative support related to the NIH dbGaP approvals.
Footnotes
Authors’ Contributions
TIH: Conceptualization; Formal analysis; Methodology; Visualization; Writing – Original draft preparation; Writing - review & editing. RJ: Methodology; Visualization. TS: Methodology; Software. GB: Project administration; Resources; Supervision; Writing - review & editing.
Supplementary Material
SOM tables are available at: https://usf.box.com/s/m4ov678jz5nm82zswajntw5ai932u5u8
Conflicts of Interest
The Authors declared that they have no conflicts of interest.
Artificial Intelligence (AI) Disclosure
No artificial intelligence (AI) tools, including large language models or machine learning software, were used in the preparation, analysis, or presentation of this manuscript.
- Received January 24, 2026.
- Revision received March 30, 2026.
- Accepted April 21, 2026.
- Copyright © 2026 The Author(s). Published by the International Institute of Anticancer Research.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.












