Abstract
Background/Aim: Prompted by the increasing demand for non-invasive diagnostic tools for screening of gastric cancer (GC) risk conditions, i.e., atrophic gastritis (AG) and Helicobacter pylori (Hp) infection, the GastroPanel® test (GP: biomarker panel of PGI, PGII, G-17, Hp IgG ELISA), developed in the early 2000’s, was recently updated to a new-generation (unified GP) test version. This clinical validation study evaluated the diagnostic accuracy of the new-generation GP test in the detection of AG and Hp among gastroscopy referral patients in a University Clinic. Patients and Methods: Altogether, 522 patients were enrolled among the patients referred for gastroscopy at the Gastro Center, Oulu University Hospital (OUH). All patients underwent gastroscopy with biopsies classified using the Updated Sydney System (USS), and blood sampling for GP testing. Results: Biopsy-confirmed AG was found in 10.2% (53/511) of the patients. The overall agreement between the GP and the USS classification was 92.4% (95%CI=90.0-94.6%), with the weighted kappa (κw) of 0.861 (95%CI=0.834-0.883). In ROC analysis using moderate/severe AG of the corpus (AGC2+) as the endpoint, AUC=0.952 (95%CI=0.891-1.000) and AUC=0.998 (95%CI=0.996-1.000) for PGI and PGI/PGII, respectively. Hp IgG antibody ELISA detected biopsy-confirmed Hp-infection with AUC=0.993 (95%CI=0.987-0.999). Conclusion: The new-generation GastroPanel® is a precise test for non-invasive diagnosis of atrophic gastritis and Hp-infection in dyspeptic patients referred for diagnostic gastroscopy.
- Atrophic gastritis (AG)
- serological biomarker panel
- new-generation GastroPanel
- non-invasive test
- clinical validation
- gastroscopy
- biopsies
- updated Sydney System (USS)
- Helicobacter pylori
- pepsinogen I
- pepsinogen II
- gastrin-17
- Hp IgG antibody ELISA
- diagnostic accuracy
It is estimated that of the >1 million annual cases of gastric cancer (GC), nearly 80% in men and 70% in women are due to lifestyle and environmental factors (1-5). Two risk factors exceed all the others in importance in the pathogenesis of GC: i) Helicobacter pylori (Hp) infection and ii) atrophic gastritis (AG) (3, 6, 7). Hp itself is not a directly carcinogenic agent (8), but Hp-induced AG is the single most severe risk condition of GC (3, 7, 9). About 5-10% of all Hp-infected patients eventually develop moderate to severe AG, and the risk of GC increases in parallel with the severity of AG, reaching 90-fold in patients who have severe AG both in the corpus and in the antrum (so-called pan-gastritis; AGP) (3, 6, 7, 10).
The intestinal type of GC develops in atrophic gastric mucosa as a stepwise process known as the Correa cascade (3), through mild, moderate and severe AG, often accompanied by intestinal metaplasia (IM) and various degrees of dysplasia (7). This cascade may be interrupted by curative treatment of Hp-infection (3, 4, 9, 11-13). According to the Updated Sydney System classification (USS), AG is classified by its topographic location (antrum, corpus, or both) as AGA, AGC or AGP, respectively (14). AG and Hp-infection can cause upper abdominal symptoms known as dyspepsia (15). It is still controversial whether systematic Hp-eradication is effective in relieving dyspeptic symptoms (9, 10, 15).
The Correa cascade takes decades to progress into GC, which makes it possible to diagnose the precursor lesions, provided that a suitable screening test is available (16). AG has traditionally been diagnosed using gastroscopy and biopsies (3, 14, 17). However, this invasive method is expensive and is considered uncomfortable by most patients, who prefer an inexpensive and non-invasive diagnostic test (18-20). Development of non-invasive methods was initiated in the 1980’s by Miki et al. (21) and Samloff et al. (22), who introduced assays for pepsinogen (PG) measurement in blood samples. In the early 2000’s, a biomarker panel was developed in Finland (Biohit Oyj, Helsinki), combining serum pepsinogen I (PGI) and II (PGII), gastrin-17 (G-17) and Hp antibody (Hp IgG), as an ELISA test known as GastroPanel® (GP) (23). The prime indications of this biomarker panel include the first-line diagnosis of dyspeptic patients (24-26), as well as screening of the risk conditions of GC (i.e., Hp and AG) (27, 28). This biomarker combination 1) gives accurate measurements of the capacity of the corpus and antrum to produce gastric acid and G-17, respectively, 2) detects important gastric pathologies, such as inflammation, and 3) estimates the grade and topography of AG (20-22, 28-31).
Since its introduction, the GP test has been extensively tested in different clinical and screening settings worldwide (32-40). The GP literature accumulated until 2016/2017 was covered by two separate meta-analyses (31, 41), disclosing pooled sensitivity of 72-75% and pooled specificity close to 95% for the AGC endpoint. Since the appearance of these two meta-analyses, the number of relevant studies and tested patients has almost doubled (42-47, to cite only a few), indicating an increased global interest in this test (23, 25-29). Before finalizing the GastroPanel® quick (POC) test, a new-generation (unified) GastroPanel® test was introduced, harmonizing the ELISA processing conditions of the 4 biomarkers. This new test version was recently validated in patients at high risk for AG (48) and in those with high Hp-prevalence (49).
Unfortunately, the first study was subject to verification bias, i.e., only GP-test positive patients were examined by gastroscopy and biopsies (48), whereas the latter was compromised by an insufficient number of AG patients (49). With a 100% biopsy-confirmed design, the present study on gastroscopy referral patients provides unbiased estimates on the diagnostic accuracy of the GP test for both AG and Hp endpoints, and concludes the series of clinical validation studies of the unified GP test.
Patients and Methods
The patients were enrolled at the outpatient Department of Gastroenterology, Oulu University Hospital (OUH) Gastro Center, among the consecutive patients referred for gastroscopy with a wide variety of abdominal symptoms. The potentially eligible patients (18 years or older) were identified among the gastroscopy referral outpatients, and were asked to sign a written consent. All consented patients were interviewed using previously validated questionnaires (50). The exclusion criteria were the same as listed in the previous study (48). The study protocol followed the Declaration of Helsinki, and was approved by the Ethics Committee of the Northern Ostrobothnia Hospital District (DNo 060/2015). A cohort of 522 patients completed the study protocol. The key characteristics of the patients and their symptoms are summarized in Table I. Of the 522 patients, 66.7% (n=348) were women and 33.3% were men. The median age of the patients was 58 years (range 18-86 years).
Questionnaire of the symptoms. A GI-symptom questionnaire was completed prior to blood sampling (50). The results of the questionnaire (Table I) will be the subject of a separate analysis.
Preparation for GastroPanel® sampling. Proper performance of the GastroPanel® test requires that some preparatory measures be strictly followed, including discontinuation of PPI medication one week before testing, as described before (48). If this was not possible, a notice of PPI use and its eventual discontinuation should be included in the GP request form (23, 29, 30).
Sample processing for GastroPanel® test. GP test results are interpreted by the GastroSoft® application (Biohit Oyj, Helsinki), necessitating completion of the GP request form with pertinent clinical information (23, 29, 48, 49) (Table I). A minimum of 2 ml of EDTA plasma from a fasting blood sample was collected in an EDTA tube and frozen instantly (–70°C), as instructed by the manufacturer (23, 29, 48, 49).
Stimulated G-17 (G-17s). In addition to the fasting G-17 sample (G-17b), another blood sample was collected 20 min after intake of a special protein drink (Biohit Oyj) to measure the stimulated G-17 (G-17s) (26, 28, 30, 31).
GastroPanel® testing. All plasma samples were delivered to Biohit Oyj (Helsinki) for analysis with the new generation GP test following the instructions detailed elsewhere (23, 29, 48).
GastroPanel® results are interpreted by GastroSoft® application. GastroPanel® test is designed for use with the Updated Sydney System (USS) classification of gastritis (14, 17), both using the same diagnostic categories: a) normal mucosa, b) Hp-gastritis with no atrophy, c) atrophic gastritis of the antrum (AGA), d) atrophic gastritis of the corpus (AGC), and e) atrophic gastritis in both antrum and corpus (AGP) (23, 26, 28, 29, 48, 49).
Gastroscopy and biopsies. Gastroscopy biopsies followed the protocol of the USS, targeting both the antrum and corpus (14, 17). Macroscopic endoscopy findings were classified according to the established practice of the clinic (48), the endoscopist being blinded to the GP results. All biopsies were examined by expert pathologists at the Department of Pathology, OUH, and the diagnoses were classified using the USS classification (14, 17) and grading of the AGA, AGC and AGP as reported before (48).
Statistical analysis. Descriptive statistics were computed using conventional tests. Sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV) and their 95%CI for the GP test biomarkers were calculated using the algorithm of Seed et al. (2001) (51). ROC (Receiver Operating Characteristics) analysis was used to identify the optimal SE/SP balance for both endpoints (AGA and AGC), and AUC values were compared by the roccomp test (48, 49). The agreement between the different tests was calculated separately using overall agreement (OA) and the intra-class correlation coefficient (ICC) test for weighted kappa (κw). In addition, Fagan’s nomogram (52) was constructed to give the post-test predictions for AGC at a population level, based on the indicators calculated for the AGC2+ endpoint: i) the pre-test probability; ii) positive likelihood ratio (LR+), and iii) negative likelihood ratio (LR–). All statistical analyses were performed using the SPSS 27.0.1.0 for Windows (IBM, NY, USA) and STATA/SE 17.0 software (STATA Corp., TX, USA). All tests were deemed significant at the level of p<0.05.
Results
Table I summarises the age and gender of patients and their medical history requested in the GastroPanel® referral form. The symptoms recorded by the GI-questionnaire are not reported in this communication. The majority (66.7%) of patients were women. The mean age of patients was 55.7 years (SD=15.4 years). Of the specific items necessary for the GastroPanel® test, prior Hp-eradication was reported by 15.7%, continuous use of PPI medication by 42%, continuous symptoms of high acidity by 33.1%, and continuous use of NSAIDs by 20.7% of the study subjects. The correlation of GI symptoms with GastroPanel® results is to be reported in a subsequent paper. The biomarker values (M±SD) in the five diagnostic categories of the GastroPanel® test are summarized in Table II, with no unexpected findings. Table III gives the biomarker levels across the diagnostic categories of the USS classification. As compared to the GP test categories, the most visible differences are seen in Hp-antibody titres.
AGA was diagnosed in 16 patients, AGC in 23 cases and AGP in 13 patients in the biopsies (Table IV). The unadjusted overall agreement (OA) between the GastroPanel® test and the USS classification is 0.914 (i.e., 91.4%). When adjusted for the correctly diagnosed AGC component of the AGP by the GP test in 5/13 cases, the adjusted OA increases to 92.4% (95%CI=90.0-94.6%). The weighted kappa test (κw) for the two-test agreement is: non-adjusted κw=0.850 and adjusted κw=0.861 (95%CI=0.834-0.883).
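The two agreement measures reported above (OA and κw) can be illustrated with a minimal sketch. The cross-tabulation below is a hypothetical 3×3 toy table, not the 5×5 data of Table IV, and the weighting scheme (linear or quadratic) is an assumption, since the weights used in the study are not stated here.

```python
def overall_agreement(table):
    """Proportion of cases on the diagonal of a square cross-tabulation."""
    total = sum(sum(row) for row in table)
    agree = sum(table[i][i] for i in range(len(table)))
    return agree / total


def weighted_kappa(table, weights="linear"):
    """Cohen's weighted kappa for a k x k table of ordered categories.

    weights="linear" uses 1 - |i-j|/(k-1); any other value uses the
    quadratic form 1 - (|i-j|/(k-1))**2. Which scheme the study used
    is an assumption of this sketch.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    row_marg = [sum(row) / total for row in table]
    col_marg = [sum(table[i][j] for i in range(k)) / total for j in range(k)]

    def w(i, j):
        d = abs(i - j) / (k - 1)
        return 1 - d if weights == "linear" else 1 - d * d

    observed = sum(w(i, j) * table[i][j] / total
                   for i in range(k) for j in range(k))
    expected = sum(w(i, j) * row_marg[i] * col_marg[j]
                   for i in range(k) for j in range(k))
    return (observed - expected) / (1 - expected)


# Hypothetical toy table (rows: test A category, columns: test B category).
toy = [
    [40, 5, 0],
    [4, 30, 3],
    [0, 2, 16],
]
toy_oa = overall_agreement(toy)
toy_kappa = weighted_kappa(toy)
```

Unlike OA, the weighted kappa corrects for chance agreement and gives partial credit to disagreements between adjacent categories, which is why the two figures differ for the same table.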
Regarding the agreement in diagnosis of AG by GastroPanel® test and macroscopic gastroscopy findings (Table V), 480/520 cases were similarly diagnosed by both tests: OA=92.0% (95% CI=90.0-94.5%). After an AG diagnosis in the GP test, OR for detecting AG on gastroscopy is 23.23 (95%CI=10.78-50.03) (p=0.0001).
Table VI shows the reproducibility between gastroscopy and the USS classification in diagnosis of AG. Altogether, 467/512 cases are similarly diagnosed (AG+/AG-) by the two tests, with OA=91.2%. The OR for diagnosing AG in the biopsies after AG diagnosed in gastroscopy is 23.21, with the 95%CI ranging from 11.24 to 47.92 (p=0.0001).
Table VII depicts the precision of the GastroSoft® AGA- and AGC-profiles as predictors of AGA and AGC in the biopsies. For AGA, sensitivity and PPV of GastroSoft® AGA profile are poor but specificity and NPV are high, with AUC values ranging between 0.507 and 0.526. The AGC-profile predicts the biopsy-confirmed AGC2+ with 92.0% SE and 99% SP, with AUC=0.955 (95%CI=0.900-1.000). For any grade of AGC, AUC reaches the value of 0.859 (95%CI=0.785-0.933).
Table VIII summarises the diagnostic accuracy of GastroPanel® biomarkers (PGI, PGI/PGII) and G-17s in diagnosis of AGC and AGA, respectively. As expected, G-17s is of limited value in diagnosing AGA, whereas PGI and PGI/PGII ratio are highly accurate in diagnosis of AGC. Using the AGC2+ (moderate/severe AGC) as the endpoint, both PGI (30 μg/l) and PGI/PGII ratio (3.0) are almost 100% accurate diagnostic tests, with sensitivity of 92.0% and 100%, and specificity of 98.8% and 98.6%, respectively. Figure 1 shows the ROC curve for PGI using the AGC2+ endpoint, with the AUC=0.952 (95%CI=0.891-1.000). The ROC curve is even more impressive for PGI/PGII ratio, with the AUC=0.998 (95%CI=0.996-1.000) (Figure 2). Hp IgG ELISA of the GastroPanel® detects biopsy-confirmed Hp-infection (any topography) with AUC=0.993 (95%CI=0.987-0.999), as shown in Figure 3.
The Fagan’s nomogram (52) illustrated in Figure 4 was drawn by entering the diagnostic indicators of PGI for the AGC2+ endpoint (Table VIII), produced by STATA (diagti algorithm): i) the pre-test probability 0.049; ii) LR+ 74.4, and iii) LR- 0.081. As the post-test predictions of AGC in a population, Fagan’s nomogram indicates that an AGC diagnosis in the GP test predicts AGC2+ with a likelihood of 80%, whereas the likelihood is close to 0% (0.4%) if the GP test result is negative for AGC.
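The post-test figures read from the nomogram can be reproduced arithmetically, because Fagan's nomogram is a graphical form of Bayes' theorem on the odds scale: post-test odds = pre-test odds × likelihood ratio. The sketch below uses the three indicators quoted above; the function and variable names are illustrative only.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


PRE_TEST = 0.049  # pre-test probability of AGC2+ reported in the text
LR_POS = 74.4     # positive likelihood ratio reported in the text
LR_NEG = 0.081    # negative likelihood ratio reported in the text

after_positive = post_test_probability(PRE_TEST, LR_POS)  # about 0.79
after_negative = post_test_probability(PRE_TEST, LR_NEG)  # about 0.004
```

With these inputs the post-test probability is approximately 79% after a positive result and approximately 0.4% after a negative result, in line with the values of 80% and 0.4% quoted for Figure 4.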
Discussion
Before entering into the GastroPanel® validation data, a few remarks need to be made. First: The GP test biomarkers measure the function and structure of both the antrum (G-17b, G-17s) and the corpus (PGI, PGII, PGI/PGII ratio) separately (28-31, 53). The GP biomarker profiles reflect this topographic location of AG in the antrum (AGA) and corpus (AGC), and these two conditions should be kept separate while validating the diagnostic accuracy of the GP test (26, 28-31, 53). Second: Mild AGA and AGC are poorly reproducible histopathological diagnoses (14, 17, 26, 30-32, 34, 40, 41, 53), and should never be used as the endpoint while calculating the diagnostic accuracy of PGI (PGI/PGII) and G-17, respectively. Instead, only moderate/severe AG (AGC2+, AGA2+) should be used in these calculations (26, 28, 31, 41, 54). Third: Low G-17b values are due to high acid output of the corpus in the vast majority of cases, while AGA is a far more uncommon cause of low G-17b (23, 26-29, 31, 41, 53). Because of this dual regulation, G-17b cannot be an accurate biomarker of AGA only (30, 31, 41). In this respect, protein-stimulated G-17 (G-17s) is more helpful, but even then, truly low levels of G-17 are encountered only in moderate/severe AGA (AGA2+) when G-cells are absent (30, 53). Due to these inherent physiological principles, the accuracy of the GP test in diagnosis of AGA never reaches the level obtained in diagnosis of AGC (26, 31, 41, 48, 49).
According to a comprehensive review of computer models that included symptoms, clinical history, risk factors, and patient demographics, clinical symptoms are of limited value in making a distinction between organic and functional dyspepsia (15). The original idea of the GastroPanel® designers was to develop a non-invasive alternative to invasive gastroscopy for the routine diagnosis of dyspepsia (23, 26, 28, 29). To fulfil this intent, the diagnostic agreement between the GP test and gastroscopic examination is of vital importance. In the present cohort, the agreement between these two methods is excellent (92%); 480/520 cases are concordantly diagnosed with respect to AG (AG+/AG-), and the likelihood of disclosing AG on gastroscopy among the patients whose GP tests indicate AG has OR=23.2 (Table V). This close concordance between the two techniques provides strong support to the practice that all patients with the AG profile in the GP test should be referred for gastroscopy (23, 26, 28-31, 48, 49, 53). Notably, in experienced hands, gastroscopic diagnosis of AG also closely (91.2%) concurs with the AG diagnosis in the biopsies (Table VI). Because of the similar (91.2-92%) concordance of i) the GP test and ii) gastroscopy with the biopsy histology, this leaves room for weighing a non-invasive against an invasive test option in the primary diagnosis of dyspeptic symptoms.
Regarding classification of gastritis, the most widely used systems include the Updated Sydney System (USS) (14, 17) and OLGA/OLGIM staging (55). The GP test has been optimised for use with the USS, both including 5 diagnostic categories (14, 17, 23, 28-31, 41, 48, 49, 53), which makes it straightforward to calculate the reproducibility of the two tests by applying the weighted kappa (κw) test. When this was done (Table IV), the agreement calculated using the κw test is 0.850 and 0.861, as non-adjusted and adjusted, respectively. The corresponding figures for overall agreement (OA) between the GP test and the USS classification are 91.4% and 92.4%, respectively. Both κw values fall in the “almost perfect” category (0.8-1.0) of the scale used for grading reproducibility. This lends further support to the statement that the GP test bears a close concordance with the USS classification (23, 28-31, 41, 48, 49, 53). In this context, a word of caution must be stated against the use of OLGA staging (55) in validation of the GP test, because this is not feasible. OLGA staging combines varying grades of AGA and AGC into one and the same OLGA stage (55). This results in unpredictable values of the antrum- and corpus-specific GP biomarkers across the OLGA stages, which obscures an accurate linking of specified GP biomarker profiles to individual OLGA stages (30, 31, 41, 53).
GastroSoft® defines biomarker profiles distinct for AGA (low G-17b and G-17s, Hp+) and AGC (low PGI, PGI/PGII ratio, high G-17b) (23, 26, 28-31, 48, 49, 53), which were tested for their diagnostic accuracy for biopsy-confirmed AGA and AGC (Table VII). Not unexpectedly, the sensitivity of the AGA profile in diagnosis of AGA (n=10) in the biopsies is low (3.4-7.0%), whereas specificity is high (98%), resulting in AUC=0.526. This low sensitivity of the GP test for AGA is explained by the dual role of G-17 as a marker of i) AGA, and ii) high acid output. As the cause of low G-17, the latter is far more common than the former, e.g., in this cohort n=16 (Table IV) and n=173 (Table I), respectively. Because of this dual mode of regulation, G-17b can never be a highly sensitive biomarker of AGA (23, 26-29, 31, 41, 53). This is in sharp contrast to AGC, where the AGC-profile of GastroSoft® detects the biopsy-confirmed AGC2+ with 92.0% SE and 99.0% SP, equivalent to AUC=0.955.
When different cut-off values are used instead of the GastroSoft® profiles, the diagnostic accuracy remains unchanged (Table VIII). For PGI, 30 μg/l seems to be the optimal cut-off, being also used in the majority of the published studies (23, 29, 30, 31, 41). For the AGC2+ endpoint, the highest diagnostic accuracy is obtained with the PGI/PGII ratio (3.0 cut-off). These outstanding AUC values are confirmed by the ROC analysis (Figure 1 and Figure 2), with AUC=0.952 and AUC=0.998 for PGI and PGI/PGII, respectively. These AUC values are superior to the HSROC (hierarchical summary ROC) calculated by Zagari et al. (2017) in their recent meta-analysis, with the pooled SE of 74.7% and pooled SP of 95.6% for PGI in diagnosis of AGC (i.e., AUC=0.851) (41).
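The AUC quoted for the pooled meta-analysis estimates appears consistent with the area under a ROC curve that has a single operating point, i.e., the trapezoid through (0, 0), (1 − SP, SE) and (1, 1), which simplifies to (SE + SP)/2. The following minimal sketch checks that reading; the assumption that this is how the figure was derived is ours, and the function name is illustrative.

```python
def single_cutoff_auc(sensitivity, specificity):
    """Area under a ROC curve defined by one (1 - SP, SE) operating point."""
    false_positive_rate = 1 - specificity
    # Trapezoid from (0, 0) to (FPR, SE), then from (FPR, SE) to (1, 1).
    left = 0.5 * false_positive_rate * sensitivity
    right = 0.5 * (1 - false_positive_rate) * (sensitivity + 1)
    return left + right  # algebraically equal to (SE + SP) / 2


pooled_auc = single_cutoff_auc(0.747, 0.956)  # pooled SE/SP for PGI (41)
profile_auc = single_cutoff_auc(0.92, 0.99)   # GastroSoft AGC-profile, AGC2+
```

The first value is about 0.851 and the second about 0.955, matching the AUC figures given for the meta-analysis and for the AGC-profile, respectively. A single-point AUC of this kind is not directly comparable with a full ROC curve built from continuous biomarker values, which should be kept in mind when the figures are set side by side.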
Another important role of the GP test is in diagnosis of Hp infections (23, 29, 31). During the past several years, comprehensive reviews have been published on the limitations of the conventional Hp-tests in different clinical conditions (10, 56-59), and their shortcomings have been listed in all European Consensus Reports since 1996 (10, 58, 59). This includes both of the most widely used Hp-tests: the 13C-Urea Breath Test (UBT) and Stool Antigen test (SAT) (60). The accumulated evidence leaves little doubt that several clinical conditions seriously impede the diagnostic accuracy of the UBT and SAT tests, false-negative and false-positive results being recorded in up to 40% of cases (10, 56-60). Many of these caveats of the UBT and SAT tests can be evaded by the GastroPanel® test (26), which distinguishes 3 diagnostic profiles that specify i) an actively ongoing Hp-infection, ii) successful Hp-eradication, and iii) failed Hp-eradication (23, 26, 29, 30).
The Hp IgG ELISA component of the new-generation GP test was validated recently in a population with a high (64% biopsy-confirmed) prevalence of Hp-infection (49). In this population, the overall agreement between Hp IgG ELISA and gastric biopsies in Hp-detection was 91%, with AUC=0.978 (49). In the present study of a population with biopsy-confirmed Hp-prevalence of 7.8% (Table II and Table III), the diagnostic accuracy of the Hp IgG ELISA is even more impressive: AUC=0.993 (Figure 3), OA 497/511 (97.2%), SE of 95.0% (95%CI=83.1-99.4%) and SP of 97.5% (95%CI=95.6-98.7%), using biopsy-confirmed Hp as the reference.
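The accuracy indicators quoted for the Hp IgG ELISA can be checked with a short sketch. The 2×2 counts below are not taken from the tables; they are back-calculated assumptions chosen to match the reported OA (497/511), prevalence (7.8%), SE and SP. The interval method (exact Clopper-Pearson, solved here by bisection) is likewise an assumption about how the 95%CI was obtained.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability mass, computed on the log scale for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p); increasing in p."""
    return sum(binom_pmf(i, n, p) for i in range(x, n + 1))


def solve_increasing(func, target, iterations=60):
    """Bisection on [0, 1] for an increasing function of p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x/n."""
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: upper_tail(x, n, p), alpha / 2)
    upper = 1.0 if x == n else solve_increasing(
        lambda p: upper_tail(x + 1, n, p), 1 - alpha / 2)
    return lower, upper


# Assumed (back-calculated) counts: biopsy Hp+ = 40, biopsy Hp- = 471.
TP, FN, FP, TN = 38, 2, 12, 459

sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
se_low, se_high = clopper_pearson(TP, TP + FN)
sp_low, sp_high = clopper_pearson(TN, TN + FP)
```

Under these assumed counts, SE is 95.0% with an exact interval of roughly 83% to 99%, and SP is about 97.5% with an interval of roughly 96% to 99%, which agrees with the figures reported above.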
In 1975, Fagan introduced a nomogram to quantify a post-test probability for individuals to be affected by a condition, based on the probability of the condition before the test (pre-test probability) (52). When applied to the present data (LR+, LR–), the Fagan’s nomogram (Figure 4) indicates that a GP test result of AGC predicts the diagnosis of AGC2+ in a population with a likelihood of 80%, whereas this likelihood is close to 0% (0.4%) if the GP test result is negative for AGC. Assuming that the study sample is representative of the entire population, an estimate of the pre-test probability reflects the global prevalence of this disorder (52).
Conclusion
The present clinical validation study confirms the diagnostic accuracy of the unified GastroPanel® test in gastroscopy referral patients, supporting our previous results in different settings (48, 49). GastroPanel® biomarkers PGI and PGI/PGII ratio as well as Hp IgG ELISA are equally accurate in diagnosing AGC and Hp-infection in the biopsies, respectively. Diagnosis of AGA by G-17 is less accurate, however, because the dual physiological role of G-17 as a biomarker of i) antrum atrophy and ii) high-acid output of the corpus limits its diagnostic accuracy for AGA. Being closely concordant with biopsy histology, the GastroPanel® test offers a non-invasive alternative to invasive gastroscopy in the diagnosis of dyspeptic patients. When the AG-profile of the GP test is used as the indication for gastroscopy, substantial cost savings can be achieved by cancelling unnecessary gastroscopies, particularly in populations with low to moderate prevalence of AG (and Hp).
Acknowledgements
This study was funded by a non-restricted research grant from Biohit Oyj (Helsinki, Finland). The skilful technical assistance of the following persons is gratefully acknowledged for their important input in the different phases of the study: Mrs. Leena Ukkola, Mrs. Heidi Häikiö, Mrs. Saara Korhonen, Mrs. Saija Kortetjärvi, Mrs. Marita Koistinen, Mrs. Katja Eronen, Mrs. Anita Mikkola, Mrs. Kaisa Friberg, Ms. Milla Mikkola, Mrs. Pia Rinkinen, Dr. Tapani Tiusanen, PhD, Dr. Minna Mäki, PhD, Suvi Elomaa, B.Sc. and Mrs. Heli Holopainen.
Footnotes
This article is freely accessible online.
Authors’ Contributions
All Authors have met all the following four criteria: i) Substantial contributions to the conception or design of the work or the acquisition, analysis, or interpretation of data for the work. ii) Drafting of the work or revising it critically for important intellectual content. iii) Final approval of the version to be published. iv) Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Conflicts of Interest
The Authors declare no conflicts of interest.
- Received September 4, 2021.
- Revision received October 7, 2021.
- Accepted October 21, 2021.
- Copyright © 2021 International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.