Abstract Details
Suboptimal reliability of liver biopsy evaluation has implications for randomized clinical trials
J Hepatol. 2020 Jun 28;S0168-8278(20)30399-8. doi: 10.1016/j.jhep.2020.06.025. Online ahead of print.
Beth A Davison1, Stephen A Harrison2, Gad Cotter3, Naim Alkhouri4, Arun Sanyal5, Christopher Edwards3, Jerry R Colca6, Julie Iwashita6, Gary G Koch7, Howard C Dittrich6
Author information
1Momentum Research, Inc., Durham, NC, USA. Electronic address: bethdavison@momentum-research.com.
2Hepatology, Radcliffe Department of Medicine, University of Oxford, UK.
3Momentum Research, Inc., Durham, NC, USA.
4Texas Liver Institute, San Antonio, TX, USA.
5Virginia Commonwealth University School of Medicine, Richmond, Virginia, USA.
6Cirius Therapeutics, Inc., San Diego, CA, USA.
7University of North Carolina, Chapel Hill, NC, USA.
Abstract
Background & aims: Liver biopsies are a critical component of pivotal studies in nonalcoholic steatohepatitis (NASH) constituting main inclusion criteria, risk stratification factors and endpoints. We evaluated the reliability of NASH Clinical Research Network scoring of liver biopsies in a NASH clinical trial.
Methods: Digitized slides from 678 biopsies for 339 patients with paired biopsies randomized into the EMMINENCE study examining a novel insulin sensitizer (MSDC-0602K) in NASH were read independently by three hepatopathologists blinded to treatment code and scored using the NASH CRN Histological Scoring System. Various endpoints were computed from these scores.
Results: Inter-reader linearly weighted kappas were 0.609, 0.484, 0.328, and 0.517 for steatosis, fibrosis, lobular inflammation, and ballooning, respectively. Inter-reader unweighted kappas were 0.400 for the diagnosis of NASH, 0.396 for NASH resolution without worsening fibrosis, and 0.366 for fibrosis improvement without worsening NASH. In the current study, 46.3% of the patients included in the study based on one hepatopathologist's qualifying reading were deemed by at least one of the three hepatopathologists as not meeting the study's histologic inclusion criteria. The MSDC-0602K treatment effect was lowest for those histologic features with lower inter-reader reliability. Simulations show that the lack of reliability of endpoints and inclusion criteria can drastically reduce study power, from >90% in a well-powered study to as low as 40%.
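For readers less familiar with the agreement statistics quoted in the Results, the following is a minimal sketch of Cohen's kappa for two readers, in both its unweighted form and its linearly weighted form (where a disagreement of one grade counts less than a disagreement of several grades). The function name `cohen_kappa`, the two-reader setup, and the toy scores on a 0 to 3 ordinal scale are assumptions made for illustration only. The abstract does not state how agreement was combined across the three hepatopathologists, so this should not be read as the study's actual computation.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b, n_categories, linear_weights=True):
    """Cohen's kappa for two readers scoring the same items on 0..n_categories-1.

    With linear_weights=True, a disagreement between grades i and j is
    penalised by |i - j| / (n_categories - 1); otherwise every disagreement
    counts as 1 (the ordinary unweighted kappa).
    """
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must score the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("need at least two categories")

    n = len(reader_a)
    k = n_categories

    def penalty(i, j):
        if linear_weights:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    # Observed disagreement: average penalty over the paired scores.
    pair_counts = Counter(zip(reader_a, reader_b))
    observed = sum(penalty(i, j) * c for (i, j), c in pair_counts.items()) / n

    # Disagreement expected by chance, from each reader's marginal frequencies.
    marg_a, marg_b = Counter(reader_a), Counter(reader_b)
    expected = sum(
        penalty(i, j) * marg_a[i] * marg_b[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)

    if expected == 0:
        raise ValueError("kappa is undefined when both readers use a single identical grade")
    return 1.0 - observed / expected


# Hypothetical scores (not study data): two readers grading eight biopsies 0-3.
scores_a = [0, 1, 2, 3, 3, 2, 1, 0]
scores_b = [0, 1, 2, 3, 2, 2, 1, 1]

kappa_weighted = cohen_kappa(scores_a, scores_b, n_categories=4)
kappa_unweighted = cohen_kappa(scores_a, scores_b, n_categories=4, linear_weights=False)
```

In this toy case both disagreements are off by a single grade, so the weighted kappa comes out higher than the unweighted one. The unweighted form is the natural choice for yes/no readings such as the diagnosis of NASH, where there is no ordering of grades to weight.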
Conclusions: Reliability of hepatopathologists' liver biopsy evaluation using currently accepted criteria is suboptimal. This lack of reliability may affect NASH pivotal studies by introducing patients who do not meet NASH study entry criteria, misclassifying fibrosis subgroups, and attenuating apparent treatment effects.
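The attenuation of apparent treatment effects can be illustrated with a short calculation. The sketch below is not the authors' simulation: it assumes a binary responder endpoint, non-differential misclassification described by a hypothetical sensitivity and specificity of the biopsy reading, and a standard normal-approximation power formula for comparing two proportions. All response rates, the sample size, and the function names are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def observed_rate(true_rate, sensitivity, specificity):
    """Responder rate as it would be read when responses are misclassified.

    True responders are read as responders with probability `sensitivity`;
    true non-responders are read as responders with probability 1 - specificity.
    """
    return true_rate * sensitivity + (1.0 - true_rate) * (1.0 - specificity)


def two_proportion_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the usual normal approximation and ignores the negligible
    probability of rejecting in the wrong direction.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    p_bar = (p_control + p_treated) / 2.0
    se_null = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_arm)
    se_alt = sqrt(
        p_control * (1.0 - p_control) / n_per_arm
        + p_treated * (1.0 - p_treated) / n_per_arm
    )
    delta = abs(p_treated - p_control)
    return nd.cdf((delta - z_crit * se_null) / se_alt)


# Hypothetical design values (assumptions, not taken from the study).
TRUE_CONTROL, TRUE_TREATED, N_PER_ARM = 0.20, 0.40, 130
SENSITIVITY, SPECIFICITY = 0.70, 0.80

# Power if every biopsy were read without error.
power_ideal = two_proportion_power(TRUE_CONTROL, TRUE_TREATED, N_PER_ARM)

# Power when the endpoint is read with the assumed misclassification.
read_control = observed_rate(TRUE_CONTROL, SENSITIVITY, SPECIFICITY)
read_treated = observed_rate(TRUE_TREATED, SENSITIVITY, SPECIFICITY)
power_noisy = two_proportion_power(read_control, read_treated, N_PER_ARM)

print(f"power with error-free reads: {power_ideal:.2f}")
print(f"power with misclassified reads: {power_noisy:.2f}")
```

Under these assumptions, misclassification pulls the two observed response rates toward each other, which shrinks the apparent difference between arms and with it the power of a fixed-size trial. Unreliable reads at screening, which the study also examined, are not modelled here.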