Assessment of the inter-platform reproducibility of ultrasound attenuation examination in nonalcoholic fatty liver disease
Article information
Abstract
Purpose
This study aimed to assess the inter-platform reproducibility of ultrasound attenuation examination in patients with nonalcoholic fatty liver disease (NAFLD).
Methods
Between March 2021 and April 2021, patients with clinically suspected or known NAFLD were prospectively enrolled; each patient underwent ultrasound attenuation examinations with three different platforms (Attenuation Imaging [ATI], Canon Medical Systems; Tissue Attenuation Imaging [TAI], Samsung Medison; and Ultrasound-Guided Attenuation Parameter [UGAP], GE Healthcare) on the same day. The mean attenuation coefficient (AC) values of the three platforms were compared using repeated-measures analysis of variance with the Bonferroni correction. To evaluate inter-platform reproducibility, the AC values obtained for each platform were compared using Bland-Altman analysis with the calculation of 95% limits of agreement (LOA), intraclass correlation coefficients (ICCs), and coefficients of variation (CVs).
Results
Forty-six patients (23 men; mean age±standard deviation, 52.3±12.4 years) were enrolled. The mean AC values showed significant differences among the three platforms (0.75±0.12, 0.80±0.11, and 0.74±0.09 dB/cm/MHz for ATI, TAI, and UGAP, respectively; P<0.001). For inter-platform reproducibility, the 95% LOAs were -0.22 to 0.11 dB/cm/MHz between ATI and TAI, -0.17 to 0.18 dB/cm/MHz between ATI and UGAP, and -0.08 to 0.20 dB/cm/MHz between TAI and UGAP, respectively. The pairwise ICCs were 0.790-0.797 in terms of absolute agreement among the three platforms; the CVs were 8.23%-9.47%.
Conclusion
The AC values obtained from different ultrasound attenuation examination platforms showed significant differences, with significant inter-platform variability. Therefore, the AC values measured using different ultrasound attenuation examination techniques should not be used interchangeably for longitudinal follow-up of patients with NAFLD.
Introduction
The prevalence of nonalcoholic fatty liver disease (NAFLD) is increasing globally, and NAFLD is emerging as the most common type of chronic liver disease in many parts of the world [1]. NAFLD comprises a spectrum of conditions ranging from simple steatosis to nonalcoholic steatohepatitis (NASH), which can progress to cirrhosis [2,3]. The progression of NAFLD to NASH is known to be stopped or reversed by the early detection and treatment of hepatic steatosis [4]. Although liver biopsy remains the current gold standard for the diagnosis of hepatic steatosis, the invasiveness of biopsy methods and the occurrence of sampling errors underscore the need for noninvasive diagnostic tools [5]. Several imaging modalities have been used for the noninvasive assessment of hepatic steatosis [6-8]. Magnetic resonance spectroscopy and chemical shift-encoded magnetic resonance imaging-based proton density fat fraction (MRI-PDFF) are used as validated reference standards with excellent accuracy and reproducibility for the evaluation of hepatic steatosis [6,7]. However, MRI-based techniques may not be cost-effective or easily accessible for the clinical screening of NAFLD considering the high prevalence of this disease [9]. Although B-mode ultrasonography is a commonly used test for hepatic steatosis and is advantageous in terms of its wide availability, the results of this test are subjective, operator-dependent, and not quantifiable [8,10].
In recent years, various quantitative ultrasound techniques, including the speed of sound, ultrasound attenuation, and backscatter coefficient, have been developed for the quantification of hepatic fat [11-14]. Among these techniques, quantification of the attenuation of ultrasound beams, which increases as hepatic steatosis progresses, has been implemented in software by several ultrasound manufacturers [15,16]. The controlled attenuation parameter, which is measured by transient elastography, can provide objective measurements of hepatic steatosis; however, it cannot provide B-mode ultrasound images and is known to be affected by the patients’ age, skin-to-liver capsule distance, and body mass index (BMI) [17,18]. Several ultrasound manufacturers have recently developed B-mode ultrasound-guided attenuation examination techniques. Compared to conventional ultrasonography, ultrasound-guided attenuation examination is less operator-dependent and still has the inherent advantages of ultrasonography, such as wide availability, real-time capability, and relatively low cost [19]. In previous studies, ultrasound attenuation examination techniques have demonstrated good accuracy in the diagnosis and grading of hepatic steatosis, with excellent inter-examiner or intra-examiner reliability [20-23].
However, little is known about the reproducibility of ultrasound attenuation coefficient (AC) values measured using multiple platforms from different scanner vendors. High inter-platform reproducibility of ultrasound attenuation examination is essential for its clinical application in patients with NAFLD for purposes including the diagnosis, monitoring, and follow-up of hepatic steatosis. Therefore, this study aimed to assess the inter-platform reproducibility of ultrasound attenuation examination in patients with NAFLD.
Materials and Methods
Compliance with Ethical Standards
This single-center, prospective study was approved by the institutional review board of Seoul National University Hospital (IRB No. 2102-044-1195), and written informed consent was obtained from all participants.
Study Population
Patients who met the eligibility criteria between March 2021 and April 2021 and provided written informed consent were prospectively enrolled in this study. The inclusion criteria were as follows: (1) age ≥18 years and (2) referral to the radiology department for ultrasonographic evaluation of the liver because of suspected or known NAFLD. The exclusion criteria were as follows: (1) age less than 18 years; (2) the presence of clinical, laboratory, or histological evidence of liver disease other than NAFLD; (3) excessive alcohol consumption (≥14 and ≥7 drinks per week for men and women, respectively); (4) steatogenic or hepatotoxic medication use; and (5) history of liver surgery.
B-Mode Ultrasound and Ultrasound Attenuation Examinations
For each patient, conventional B-mode ultrasound and ultrasound attenuation examinations were performed by one of two body radiologists (S.K.J. and J.M.L., each with more than 7 years of experience in abdominal ultrasound examinations). All patients were requested to fast for more than 4 hours prior to the ultrasound examination.
B-mode ultrasound examinations
First, a conventional B-mode ultrasound examination was performed using an ultrasound system (Aplio i900, Canon Medical Systems, Tochigi, Japan) with a 1-8 MHz convex probe. The visual score of hepatic steatosis was recorded by the operator as follows: 0, no steatosis; 1, mild steatosis; 2, moderate steatosis; and 3, severe steatosis. These scores were based on Hamaguchi’s scoring system using the following ultrasound imaging features: bright liver, increased hepatorenal echo contrast, deep attenuation, and vessel blurring [24]. Additionally, the skin-to-liver capsule distance (mm) was measured.
Ultrasound attenuation examinations
For each patient, after the B-mode ultrasound examination, two sessions of ultrasound attenuation examinations were performed by the same radiologist on the same day. Each session was performed using three different ultrasound attenuation examination platforms: Attenuation Imaging (ATI) using Aplio i900 (Canon Medical Systems), Tissue Attenuation Imaging (TAI) using RS 85 (Samsung Medison, Seoul, Korea), and Ultrasound-Guided Attenuation Parameter (UGAP) using LOGIQ E10 (GE Healthcare, Chicago, IL, USA). The ultrasound attenuation examinations were performed in the right lobe of the liver through an intercostal plane near the level of the hepatic hilum during breath-holding. The radiologists tried to perform the examinations at the same liver location in each patient. In addition, although there were no specific instructions on the timing of patients’ breathing, the examination was performed with a constant timing of breathing and at the point where the same view was seen to the extent possible in each patient.
In the ATI examination, with activation of the ATI mode, a fan-shaped color-coded sampling box was positioned in the hepatic parenchyma at least 2 cm below the liver capsule while avoiding areas with large vessels, focal fat sparing or deposition, and reverberation artifacts or shadowing. Structures other than the hepatic parenchyma, such as vascular structures, were automatically excluded from the sampling box. Thereafter, a 2×4 cm fan-shaped region of interest (ROI) was placed within the sampling box, and the AC value (dB/cm/MHz) was calculated and displayed (Fig. 1A).
For the TAI examination, with the selection of a function key for TAI, a 2×3 cm fan-shaped ROI with a color-coded map was generated (Fig. 1B). The ROI box was placed in the right lobe of the liver at least 2 cm below the liver capsule while avoiding areas with large vessels, focal fat sparing or deposition, and reverberation artifacts or shadowing. Areas with significant errors in the calculation of parameters, such as vascular structures, were automatically excluded from the maps. The AC value (dB/cm/MHz) was automatically calculated and provided.
For the UGAP examination, with activation of the UGAP mode, a color-coded map was generated where the signal quality was sufficiently high to perform a measurement (quality map). An ROI with a length of 65 mm was placed within the color-coded map while avoiding areas with large vessels, focal fat sparing or deposition, and reverberation artifacts or shadowing, and the AC value was automatically calculated (Fig. 1C).
For both ATI and TAI examinations, the reliability of measurement was described as an R² value, and the operator attempted to obtain AC values with an R² value ≥0.6. For the UGAP examination, an unreliable area was presented as a vacancy on the quality map, and the operator attempted to obtain AC values by avoiding that area. Five consecutive measurements were performed for each examination, and the mean values of the five measurements were used for analysis.
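The averaging step described above can be illustrated with a minimal sketch. The tuple layout, the function name `summarize_acquisitions`, and the sample values are hypothetical. The text states only that the operator aimed for R² ≥0.6, not that lower-quality acquisitions were discarded, so the sketch merely flags them.

```python
from statistics import mean

R2_THRESHOLD = 0.6  # reliability target stated in the text for ATI and TAI


def summarize_acquisitions(acquisitions, r2_threshold=R2_THRESHOLD):
    """Return the mean AC of five acquisitions and those below the R² target.

    `acquisitions` is a list of (ac_value, r2) tuples; this layout is a
    hypothetical convenience, not the export format of any scanner.
    """
    if len(acquisitions) != 5:
        raise ValueError("expected five consecutive measurements")
    flagged = [(ac, r2) for ac, r2 in acquisitions if r2 < r2_threshold]
    mean_ac = mean(ac for ac, _ in acquisitions)
    return mean_ac, flagged


# Made-up values for one patient on one platform: (AC in dB/cm/MHz, R²)
example = [(0.74, 0.82), (0.76, 0.79), (0.75, 0.91), (0.73, 0.66), (0.77, 0.88)]
mean_ac, flagged = summarize_acquisitions(example)
print(f"mean AC = {mean_ac:.2f} dB/cm/MHz; acquisitions below target: {flagged}")
```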
Statistical Analysis
Continuous values were summarized as means with standard deviations, and categorical variables were summarized as counts with percentages. Pearson correlation coefficients were calculated between the AC values of ATI and TAI, ATI and UGAP, and TAI and UGAP. Repeated-measures one-way analysis of variance (ANOVA) with the Bonferroni post-hoc test was performed to compare the mean AC values of different ultrasound attenuation examination platforms. In addition, scatter plots and Bland-Altman plots were generated for the AC values of the different platforms.
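As a rough illustration of the Bonferroni step, the sketch below adjusts raw pairwise P-values for the three platform comparisons. The raw P-values are invented for illustration and are not taken from this study; the statistical package used here may apply the correction differently in its details.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw P-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P-values for the three pairwise platform comparisons
raw = {"ATI vs TAI": 0.0002, "ATI vs UGAP": 0.60, "TAI vs UGAP": 0.0001}
adjusted = dict(zip(raw, bonferroni_adjust(list(raw.values()))))

for pair, p in adjusted.items():
    print(f"{pair}: adjusted P = {p:.4f}")
```

Because adjusted values are capped at 1, a comparison with a large raw P-value can end up with an adjusted P-value at or very near 1.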
To evaluate the inter-platform reproducibility of the different ultrasound attenuation examination platforms and inter-session reproducibility of each platform, a Bland-Altman analysis with 95% limits of agreement (LOAs) was employed for the mean AC values of the different platforms. Intraclass correlation coefficients (ICCs) with 95% confidence intervals (CIs) and coefficients of variation (CVs, %) were calculated [25]. ICC values were calculated based on a single-unit two-way mixed-effects ANOVA model in which the patient was treated as a random effect and the platform was treated as a fixed effect. The ICC for absolute agreement was also reported. Agreement using ICCs was classified using the following criteria: ≥0.90, excellent; ≥0.75 to <0.90, good; ≥0.50 to <0.75, moderate; and <0.50, poor [26]. Pearson correlation coefficients were calculated to evaluate the correlations between ultrasound attenuation examination platforms and were categorized using the following criteria: 0-0.19, very weak; 0.2-0.39, weak; 0.40-0.59, moderate; 0.60-0.79, strong; and 0.80-1.0, very strong [27]. In addition, to evaluate the inter-platform reproducibility of different ultrasound attenuation examination platforms according to the visual grade of hepatic steatosis, the Bland-Altman 95% LOAs, ICCs, CVs, and Pearson correlation coefficients were calculated in patients with no or mild steatosis and moderate to severe steatosis (Supplementary Table 1). To evaluate whether inter-platform variability is affected by the patient’s BMI and skin-to-liver capsule distance, Pearson correlation coefficients were calculated between the absolute inter-platform differences in AC values and BMI, and between the absolute inter-platform differences in AC values and the skin-to-liver capsule distance.
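The single-measure ICC for absolute agreement can be sketched from two-way ANOVA mean squares. This assumes the standard formula ICC = (MSR − MSE) / (MSR + (k − 1)·MSE + k·(MSC − MSE)/n), where MSR, MSC, and MSE are the mean squares for patients, platforms, and residual error. It is an illustration and not a reproduction of the SPSS or MedCalc output, and the function names and sample data are hypothetical.

```python
def icc_absolute_single(data):
    """Single-measure, absolute-agreement ICC from a two-way layout.

    `data` is a list of rows, one per patient, each holding one AC value
    per platform (hypothetical layout).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # platforms
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_icc(icc):
    """Agreement categories as listed in the text."""
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Made-up AC values (dB/cm/MHz) for five patients on three platforms
patients = [
    [0.62, 0.68, 0.63],
    [0.71, 0.78, 0.70],
    [0.80, 0.84, 0.79],
    [0.88, 0.93, 0.85],
    [0.75, 0.79, 0.76],
]
icc = icc_absolute_single(patients)
print(f"ICC = {icc:.3f} ({classify_icc(icc)})")
```

Because the platform mean square enters the denominator, a constant offset between platforms lowers this ICC even when the platforms rank patients identically.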
All statistical analyses were performed using commercially available software (SPSS version 25, IBM Corp., Armonk, NY, USA; MedCalc version 18, MedCalc Software, Mariakerke, Belgium), with P-values <0.05 considered to indicate a statistically significant difference.
Results
Forty-six patients (23 men; mean age, 52.3±12.4 years; range, 24 to 80 years) were enrolled in the study, and their data were included in the analyses. The demographic characteristics of the study cohort are summarized in Table 1. The patients had an average BMI of 26.2±3.0 kg/m² and an average skin-to-liver capsule distance of 19.9±4.1 mm. On visual assessment, patients had either no (n=7), mild (n=14), moderate (n=17), or severe hepatic steatosis (n=8).
Inter-platform Reproducibility of AC Values between ATI, TAI, and UGAP
The mean AC values were 0.75±0.12 for ATI, 0.80±0.11 for TAI, and 0.74±0.09 dB/cm/MHz for UGAP, and these values were strongly correlated with each other (ATI and TAI, r=0.73; 95% CI, 0.56 to 0.84; P<0.001; ATI and UGAP, r=0.68; 95% CI, 0.48 to 0.81; P<0.001; and TAI and UGAP, r=0.78; 95% CI, 0.63 to 0.87; P<0.001). The mean AC value for TAI was significantly higher than that for ATI and UGAP (P<0.001 for both), whereas there was no significant difference between the mean AC values for ATI and UGAP (P>0.99) (Table 2, Fig. 2).
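The strength categories applied to these correlations can be written as a small lookup. The sketch below uses the category boundaries given in the Statistical Analysis section. Treating each band's lower bound as the cut point, which closes the gaps between bands such as 0.19 and 0.2, is an assumption, as is the function name.

```python
def classify_pearson(r):
    """Map |r| to the strength categories listed in the text."""
    magnitude = abs(r)
    if magnitude >= 0.80:
        return "very strong"
    if magnitude >= 0.60:
        return "strong"
    if magnitude >= 0.40:
        return "moderate"
    if magnitude >= 0.20:
        return "weak"
    return "very weak"


# Pairwise correlations reported in the Results
reported = {"ATI vs TAI": 0.73, "ATI vs UGAP": 0.68, "TAI vs UGAP": 0.78}
for pair, r in reported.items():
    print(f"{pair}: r = {r:.2f} ({classify_pearson(r)})")
```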
The Bland-Altman analysis showed a bias across AC values for different ultrasound attenuation examination platforms, with a mean difference of -0.05 dB/cm/MHz between ATI and TAI, 0.01 dB/cm/MHz between ATI and UGAP, and 0.06 dB/cm/MHz between TAI and UGAP. The 95% LOAs of the mean AC values ranged from -0.22 to 0.11 dB/cm/MHz for ATI and TAI, from -0.17 to 0.18 dB/cm/MHz for ATI and UGAP, and from -0.08 to 0.20 dB/cm/MHz for TAI and UGAP (Table 3, Fig. 3).
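For readers less familiar with the method, a minimal Bland-Altman sketch follows. It assumes the conventional definition of the 95% LOAs as bias ± 1.96 × SD of the paired differences. The paired values are invented, and the function names are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower LOA, upper LOA) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def implied_sd(lower, upper):
    """Back-calculate the SD of differences from a reported pair of 95% LOAs."""
    return (upper - lower) / (2 * 1.96)


# Made-up paired AC values (dB/cm/MHz) for two platforms
platform_a = [0.62, 0.71, 0.80, 0.88, 0.75, 0.69]
platform_b = [0.68, 0.78, 0.84, 0.93, 0.79, 0.77]
bias, lower, upper = bland_altman(platform_a, platform_b)
print(f"bias = {bias:.3f}, 95% LOA = {lower:.3f} to {upper:.3f}")

# SD of differences implied by the reported ATI-TAI limits (-0.22 to 0.11)
print(f"implied SD = {implied_sd(-0.22, 0.11):.3f} dB/cm/MHz")
```

Applied to the reported ATI and TAI limits, the back-calculation gives an SD of differences of roughly 0.08 dB/cm/MHz, subject to rounding in the published values.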
The pairwise ICCs of AC values for different ultrasound attenuation examination platforms ranged from 0.790 to 0.797 (0.797 for ATI and TAI; 0.794 for ATI and UGAP; and 0.790 for TAI and UGAP), indicating good agreement. The CVs were 9.47% for ATI and TAI, 8.27% for ATI and UGAP, and 8.23% for TAI and UGAP (Table 3).
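The article does not spell out which CV formula was used. The sketch below assumes one common choice, the root-mean-square within-subject CV, in which each patient's SD across platforms is divided by that patient's mean and the squared ratios are averaged before taking the square root. The function name and sample data are hypothetical, and results from this definition may differ from those reported.

```python
from math import sqrt
from statistics import mean, stdev


def within_subject_cv(data):
    """Root-mean-square within-subject CV in percent.

    `data` holds one row per patient with one AC value per platform.
    This definition is an assumption; the study's exact formula is not stated.
    """
    squared_ratios = [(stdev(row) / mean(row)) ** 2 for row in data]
    return 100 * sqrt(mean(squared_ratios))


# Made-up paired AC values (dB/cm/MHz) for two platforms
pairs = [[0.62, 0.68], [0.71, 0.78], [0.80, 0.84], [0.88, 0.93], [0.75, 0.79]]
print(f"within-subject CV = {within_subject_cv(pairs):.2f}%")
```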
Inter-platform Reproducibility and Potential Confounding Factors
BMI was not correlated with the absolute inter-platform difference in AC values: ATI and TAI, r=0.008, P=0.957; ATI and UGAP, r=0.031, P=0.839; and TAI and UGAP, r=0.077, P=0.077. Additionally, the skin-to-liver capsule distance was not correlated with the absolute inter-platform difference in AC values: ATI and TAI, r=0.091, P=0.548; ATI and UGAP, r=-0.054, P=0.721; and TAI and UGAP, r=-0.043, P=0.778. The inter-platform reproducibility improved as the number of averaged acquisitions increased from one to five. When five acquisitions were used, the inter-platform ICCs for AC were 0.79-0.80 in terms of the absolute agreement, and the CVs were 8.23%-9.47% (Supplementary Table 2).
Inter-session Reproducibility of Each Ultrasound Attenuation Examination Platform
The inter-session reproducibility of each ultrasound attenuation examination platform is summarized in Table 4. For each platform, the overall inter-session reproducibility was excellent, with ICCs of 0.962 (95% CI, 0.931 to 0.979) for ATI, 0.957 (95% CI, 0.922 to 0.976) for TAI, and 0.962 (95% CI, 0.931 to 0.979) for UGAP. The CVs were 4.9% (95% CI, 0.9 to 7.0) for ATI, 3.9% (95% CI, 1.2 to 5.3) for TAI, and 3.4% (95% CI, 2.6 to 4.0) for UGAP.
The Bland-Altman analysis showed a slight bias across AC values between the two sessions of each ultrasound attenuation examination platform, with a mean difference of 0 dB/cm/MHz for ATI and UGAP and -0.01 dB/cm/MHz for TAI. The 95% LOAs of the mean AC values ranged from -0.03 to 0.03 dB/cm/MHz for ATI, from -0.10 to 0.08 dB/cm/MHz for TAI, and from -0.07 to 0.06 dB/cm/MHz for UGAP.
Discussion
In clinical practice, it is highly likely that several different ultrasound systems are used for the follow-up of patients with NAFLD. Therefore, excellent inter-platform reproducibility is an essential aspect of the wide clinical use of ultrasound attenuation examination for longitudinal follow-up in the monitoring of treatment response. In this study, although each ultrasound attenuation examination platform had excellent inter-session reproducibility, significant inter-platform variability was observed in the mean AC values measured using the different platforms. Thus, AC values measured using different ultrasound attenuation examination techniques should not be used interchangeably for longitudinal follow-up, and different cutoff values for steatosis grading should be applied to different ultrasound attenuation examination platforms.
Only a few studies have investigated the inter-platform reproducibility of ultrasound attenuation examination. Previous studies by Han et al. [28,29] assessed the inter-platform reproducibility of AC values using ultrasound radiofrequency data obtained from two different clinical ultrasound systems and reported an ICC of 0.77, a Pearson correlation coefficient of 0.81, and a small standard deviation of measurements (<0.07 dB/cm/MHz); therefore, the authors concluded that the inter-platform reproducibility of AC values was good. However, in those studies, although ultrasound radiofrequency data were obtained using different ultrasound platforms, all AC values were computed with a single offline MATLAB program that used the same calculation method. This workflow is quite different from routine clinical practice, in which each clinical ultrasound machine uses its own system and software method for calculating AC values. In contrast, the present study used three clinical ultrasound machines, each with its own ultrasound attenuation examination platform, which may more closely reflect the real clinical application of ultrasound attenuation examination. In the present study, significant inter-platform variability was observed, which was not affected by BMI or skin-to-liver capsule distance. As this inter-platform variability could be attributed to several system-related factors and the software’s method of calculating ACs, it could be consistently observed across the different ultrasound attenuation examination systems.
In the present study, the pairwise ICCs of AC values from different ultrasound attenuation examination platforms were 0.790-0.797, indicating good agreement, which corresponds to the findings of a previous study by Han et al. [28]. However, the 95% LOAs of the absolute difference in mean AC values were quite large and were thought to be clinically unacceptable (-0.22 to 0.11 dB/cm/MHz for ATI and TAI, -0.17 to 0.18 dB/cm/MHz for ATI and UGAP, and -0.08 to 0.20 dB/cm/MHz for TAI and UGAP). Many previous studies have reported high diagnostic performance of ultrasound attenuation examination for diagnosing hepatic steatosis and grading severity, but there is wide variation in the optimal cutoff values for diagnosing hepatic steatosis reported in each study, ranging from 0.59 to 0.69 dB/cm/MHz for ATI (0.59 to 0.69 dB/cm/MHz with MRI-PDFF as the reference standard; 0.64 to 0.69 dB/cm/MHz with histopathology as the reference standard); 0.88 dB/cm/MHz for TAI (with MRI-PDFF as the reference standard); and from 0.53 to 0.60 dB/cm/MHz for UGAP (0.53 dB/cm/MHz with histopathology as the reference standard; 0.60 dB/cm/MHz with MRI-PDFF as the reference standard) [20-22,30]. Moreover, the optimal cutoff values for discriminating each grade of hepatic steatosis (mild [S1], moderate [S2], and severe [S3]) showed minimal gaps (range, 0.02 to 0.10 dB/cm/MHz) [20-22,30]. Considering the minimal gradual change in AC values according to the grade of hepatic steatosis, the calculated 95% LOAs of the absolute difference in AC values in the present study were thought to be too large to be clinically acceptable.
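The argument above reduces to simple arithmetic: the width of each 95% LOA interval can be compared with the largest reported gap between adjacent grade cutoffs (0.10 dB/cm/MHz). The sketch below uses only values quoted in this article; the variable names are illustrative.

```python
# 95% limits of agreement reported in this study (dB/cm/MHz)
loas = {
    "ATI vs TAI": (-0.22, 0.11),
    "ATI vs UGAP": (-0.17, 0.18),
    "TAI vs UGAP": (-0.08, 0.20),
}
MAX_CUTOFF_GAP = 0.10  # largest gap between adjacent grade cutoffs cited above

# Width of each LOA interval, rounded to the precision of the reported values
widths = {pair: round(upper - lower, 2) for pair, (lower, upper) in loas.items()}

for pair, width in widths.items():
    ratio = width / MAX_CUTOFF_GAP
    print(f"{pair}: LOA width = {width:.2f} dB/cm/MHz ({ratio:.1f}x the largest cutoff gap)")
```

Each interval is more than twice as wide as the largest gap between adjacent grade cutoffs, which is the basis for judging the limits too large for interchangeable use.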
Our study results demonstrated excellent inter-session reproducibility of each ultrasound attenuation examination platform. Of note, the mean bias and 95% LOAs were small (95% LOAs of mean AC values, -0.03 to 0.03 dB/cm/MHz for ATI, -0.10 to 0.08 dB/cm/MHz for TAI, and -0.07 to 0.06 dB/cm/MHz for UGAP). These results are in line with previous studies that reported excellent intra-examiner and inter-examiner reproducibility of each ultrasound attenuation examination platform [23,31-33]. In previous studies, all three platforms (ATI, TAI, and UGAP) showed high intra-examiner and inter-examiner reproducibility (ICCs for intra-examiner and inter-examiner reproducibility: 0.93 and 0.79 for ATI; 0.99 and 0.98 for TAI; and 0.86 and 0.64 for UGAP, respectively) [23,31-33]. Considering the high inter-session reproducibility of each platform, each ultrasound attenuation examination technique can be used as a screening test or as a tool for monitoring treatment in patients with hepatic steatosis.
The present study had several limitations. First, this was a single-center study comprising a small number of patients; further multicenter and multi-platform studies are needed. Second, a histological diagnosis or MRI-PDFF as a noninvasive reference standard for hepatic steatosis was not performed in this study, as its primary goal was to evaluate the inter-platform reproducibility of ultrasound attenuation examination platforms and not to compare the diagnostic performance of each platform. Third, as one radiologist performed two sessions with each ultrasound system for each patient, inter-examiner reproducibility could not be assessed.
In conclusion, the results of this study indicate significant inter-platform variability across different ultrasound attenuation examination platforms, although the inter-session reproducibility of AC values of each platform was excellent. Therefore, AC values measured using different ultrasound attenuation examination techniques should not be used interchangeably for the longitudinal follow-up of patients with NAFLD.
Notes
Author Contributions
Conceptualization: Jeon SK, Lee JM. Data acquisition: Jeon SK, Lee JM. Data analysis or interpretation: Jeon SK, Joo I, Yoon JH. Drafting of the manuscript: Jeon SK. Critical revision of the manuscript: Lee JM, Joo I, Yoon JH. Approval of the final version of the manuscript: all authors.
Conflict of Interest
Jeong Min Lee received grants from Samsung Medison, grants from Philips Healthcare, grants from GE Healthcare, grants from Canon Medical, personal fees and non-financial support from Siemens Healthcare, grants from RF MEDICAL, grants and personal fees from Bayer Healthcare, and grants and personal fees from Guerbet, outside the submitted work. Sun Kyung Jeon, Ijin Joo, and Jeong Hee Yoon have declared no conflicts of interest.
Acknowledgements
This study was supported by the Research Fund of the Korean Society of Ultrasound in Medicine for 2021.
Supplementary Material
References
Key point
Attenuation coefficient values obtained from different ultrasound attenuation examination platforms showed significant differences, with substantial inter-platform variability.