
ReplicateBE: linear mixed effect model solution for replicated bioequivalence design with accordance to FDA guideline (EMA model type C)


Abstract

The objective of this work is to provide an instrument for bioequivalence analysis with the type C model and to develop demonstrative code that clarifies the mixed model computation procedure step by step for interested developers. ReplicateBE is based on REML minimization with direct inversion of the variance-covariance matrix and forward-mode automatic differentiation of the REML function.
VLADIMIR ARNAUTOV <MAIL@PHARMCAT.NET>
Keywords: bioequivalence, mixed model, replicate design
Introduction
A replicate bioequivalence design is a powerful approach for obtaining more information about
variation. In some cases the number of subjects required to demonstrate bioequivalence can be
reduced by up to about 50% (Van Peer, 2010). For a highly variable product, replication
improves precision and provides a more complete estimate of intra-individual
variation. A replicate design can also be used for reference-scaled average bioequivalence
(RSABE) to demonstrate bioequivalence of highly variable drugs (HVDs).
According to the US FDA guideline, linear mixed-effects model procedures, available in
PROC MIXED in SAS or equivalent software, should be used for the analysis of replicated
crossover studies for average BE (US FDA).
At present, linear mixed-effects model analysis can be done with proprietary (SPSS,
SAS, Stata) and open-source (R:nlme, R:lme4, Julia:MixedModels) software. However, not all
mixed model packages support fitting flexible covariance structures
such as “heterogeneous compound symmetry” (CSH) or FA0(2). This does not mean that lme4 or
MixedModels cannot be used for bioequivalence estimation, but the CSH structure is not available in
these packages, so comparing their results with those obtained in SAS/SPSS can be
problematic.
The objective of this work is to provide an instrument for bioequivalence analysis with the type
C model and to develop demonstrative code that clarifies the mixed
model computation procedure step by step for interested developers.
Materials and Methods
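The FDA-recommended model can be described by the following equation (Patterson & Jones, 2002; US
FDA):
$$Y_{ijkl} = \mu_k + \gamma_{ikl} + \delta_{ijk} + \varepsilon_{ijkl}$$
where $i = 1, \dots, s$ indicates sequence, $j = 1, \dots, n_i$ indicates subject within sequence $i$,
$k = R, T$ indicates treatment, and $l = 1, \dots, p_{ik}$ indicates replicate on treatment $k$ for
subjects within sequence $i$. $Y_{ijkl}$ is the response of replicate $l$ on treatment $k$ for subject $j$
in sequence $i$, $\gamma_{ikl}$ represents the fixed effect of replicate $l$ on treatment $k$ in sequence $i$,
$\delta_{ijk}$ is the random subject effect for subject $j$ in sequence $i$ on treatment $k$, and
$\varepsilon_{ijkl}$ is the random error for subject $j$ within sequence $i$ on replicate $l$ of
treatment $k$. The $\varepsilon_{ijkl}$'s are assumed to be mutually independent and identically distributed as
$$\varepsilon_{ijkl} \sim N(0, \sigma^2_{Wk})$$
The random subject effects $\delta_{ij} = (\mu_R + \delta_{ijR}, \mu_T + \delta_{ijT})'$ are assumed to be
mutually independent and distributed as
$$\delta_{ij} \sim N_2\left(\begin{pmatrix}\mu_R \\ \mu_T\end{pmatrix}, \begin{pmatrix}\sigma^2_{BR} & \rho\sigma_{BT}\sigma_{BR} \\ \rho\sigma_{BT}\sigma_{BR} & \sigma^2_{BT}\end{pmatrix}\right)$$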

The following program statements illustrate how to run the average
bioequivalence analysis using PROC MIXED in SAS:
PROC MIXED;
CLASSES SEQ SUBJ PER TRT;
MODEL Y = SEQ PER TRT/ DDFM=SATTERTH;
RANDOM TRT/TYPE=FA0(2) SUB=SUBJ G;
REPEATED/GRP=TRT SUB=SUBJ;
ESTIMATE 'T vs. R' TRT 1 -1/CL ALPHA=0.1;
The option TYPE=CSH can also be used to match the model described above.
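In matrix notation a mixed effect model can be represented as:
$$y = X\beta + Zu + \varepsilon$$
This gives Henderson's «mixed model equations»:
$$\begin{pmatrix}X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1}\end{pmatrix}\begin{pmatrix}\hat{\beta} \\ \hat{u}\end{pmatrix} = \begin{pmatrix}X'R^{-1}y \\ Z'R^{-1}y\end{pmatrix}$$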
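As a minimal sketch of how Henderson's mixed model equations can be solved numerically, the Julia snippet below builds the block system for a toy dataset and cross-checks the fixed effects against generalized least squares with V = ZGZ' + R. All data values, the matrices G and R, and the variable names are hypothetical and chosen only for illustration; in a real analysis G and R are estimated rather than known.

```julia
using LinearAlgebra

# Hypothetical toy data: 2 subjects with 2 observations each.
y = [4.1, 4.5, 3.9, 4.4]
X = [1.0 0.0; 1.0 1.0; 1.0 0.0; 1.0 1.0]   # intercept and treatment indicator
Z = [1.0 0.0; 1.0 0.0; 0.0 1.0; 0.0 1.0]   # one random intercept per subject
G = Matrix(0.3I, 2, 2)                     # covariance of random effects (assumed known)
R = Matrix(0.1I, 4, 4)                     # covariance of residual errors (assumed known)

Ri = inv(R)
# Blocks of the left-hand side of the mixed model equations.
A11 = X' * Ri * X
A12 = X' * Ri * Z
A21 = Z' * Ri * X
A22 = Z' * Ri * Z + inv(G)
lhs = [A11 A12; A21 A22]
rhs = vcat(X' * Ri * y, Z' * Ri * y)

sol = lhs \ rhs
beta_mme = sol[1:2]   # fixed effect estimates
u_mme = sol[3:4]      # predicted random effects

# Cross-check: GLS with the marginal covariance V = ZGZ' + R.
V = Z * G * Z' + R
beta_gls = (X' * (V \ X)) \ (X' * (V \ y))
```

The two estimates of the fixed effects coincide, which is why the fixed effects can be obtained from the marginal covariance matrix V alone.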
 
 


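The solution to the mixed model equations is a maximum likelihood estimate when the
errors are normally distributed. PROC MIXED in SAS uses the restricted maximum
likelihood (REML) approach by default. The REML function can be written as follows
(Henderson et al., 1959; Laird & Ware, 1982; Jennrich & Schluchter, 1986; Lindstrom & Bates, 1988;
Gurka, 2006):
$$\log REML(\theta, \beta) = -\frac{N-p}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{n}\log|V_i| - \frac{1}{2}\log\left|\sum_{i=1}^{n}X_i'V_i^{-1}X_i\right| - \frac{1}{2}\sum_{i=1}^{n}(y_i - X_i\beta)'V_i^{-1}(y_i - X_i\beta)$$
where
$$\beta = \left(\sum_{i=1}^{n}X_i'V_i^{-1}X_i\right)^{-1}\sum_{i=1}^{n}X_i'V_i^{-1}y_i$$
and
$$V_i = Z_iGZ_i' + R_i$$
Here $N$ is the total number of observations, $n$ the number of independent sampling units
(subjects), $p$ the number of fixed effects parameters, $\theta$ the vector of variance-covariance
parameters, $y_i$ the individual response vector, $X_i$ the individual design matrix of fixed effects,
$\beta$ the vector of fixed effects parameters, $V_i$ the individual covariance matrix for the response vector,
$Z_i$ the individual design matrix of random effects, $G$ the covariance matrix of $u$ (random effect),
and $R_i$ the individual covariance matrix of $\varepsilon$ (residual error).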
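The REML objective can be sketched in a few lines of Julia. The function below evaluates -2 log REML in its usual form, (N - p) log(2π) + Σ log|V_i| + log|Σ X_i'V_i⁻¹X_i| + Σ r_i'V_i⁻¹r_i with r_i = y_i - X_iβ, by direct inversion of each subject's V_i. This is an illustration under stated assumptions, not the package's code: the ordering of theta, the function names `subject_V` and `m2logreml`, the reduced fixed-effect design (intercept and treatment only) and the data are all hypothetical.

```julia
using LinearAlgebra

# Covariance matrix of one subject's responses: V = Z*G*Z' + R.
# Z has one row per observation; column 1 flags the reference, column 2 the test,
# so Z * s2w picks the within-subject variance of each observation's treatment.
subject_V(Z, G, s2w) = Z * G * Z' + Diagonal(Z * s2w)

# -2 log REML; theta = [σ²_WR, σ²_WT, σ²_BR, σ²_BT, ρ] (assumed ordering).
function m2logreml(theta, ys, Xs, Zs)
    s2w = theta[1:2]
    c = theta[5] * sqrt(theta[3] * theta[4])     # between-subject covariance
    G = [theta[3] c; c theta[4]]
    Vs = [subject_V(Z, G, s2w) for Z in Zs]
    N = sum(length, ys)                          # total number of observations
    p = size(first(Xs), 2)                       # number of fixed effects
    XtViX = sum(X' * (V \ X) for (X, V) in zip(Xs, Vs))
    XtViy = sum(X' * (V \ y) for (X, V, y) in zip(Xs, Vs, ys))
    beta = XtViX \ XtViy                         # GLS estimate for this theta
    quad = sum(dot(y - X * beta, V \ (y - X * beta)) for (X, V, y) in zip(Xs, Vs, ys))
    return (N - p) * log(2π) + sum(logdet, Vs) + logdet(XtViX) + quad
end

# Hypothetical data: four subjects in sequences TRTR and RTRT.
Ztrtr = [0.0 1.0; 1.0 0.0; 0.0 1.0; 1.0 0.0]
Zrtrt = [1.0 0.0; 0.0 1.0; 1.0 0.0; 0.0 1.0]
Zs = [Ztrtr, Zrtrt, Ztrtr, Zrtrt]
Xs = [hcat(ones(4), Z[:, 2]) for Z in Zs]        # intercept and test indicator
ys = [[4.6, 4.2, 4.8, 4.1], [3.9, 4.4, 4.0, 4.3],
      [5.0, 4.9, 5.2, 4.7], [4.4, 4.5, 4.2, 4.6]]

val = m2logreml([0.1, 0.08, 0.4, 0.4, 0.9], ys, Xs, Zs)
```

Because each V_i is at most 4×4 in a four-period study, solving with V_i directly is cheap; the cost grows with the size of V_i, not with the number of subjects.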
Minimizing $-2\log REML$ with respect to the variance-covariance parameters $\theta$ can be done with
Newton-family methods. ReplicateBE uses two-step optimization with the Optim.jl package.
The first step is box-constrained optimization with the Broyden-Fletcher-Goldfarb-Shanno
method ((L)-BFGS) (Fletcher, 1987; Wright & Nocedal, 2006). The second step is Newton's
method. The BFGS step comes first because $\rho$ is bounded (the admissible range
differs between the SAS implementation of CSH and ReplicateBE), while the standard
optimization method Newton() does not support box constraints (the first step can be disabled for
performance reasons).
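The two-step scheme can be sketched with Optim.jl and ForwardDiff as follows. The objective `f` is a hypothetical smooth function standing in for -2 log REML, and the bounds and starting point are arbitrary illustrative values, so this shows only the shape of the approach and not the package's actual settings.

```julia
using Optim, ForwardDiff

# Hypothetical objective standing in for -2 log REML (minimum at [0.3, 0.8]).
f(x) = (x[1] - 0.3)^2 + (x[2] - 0.8)^2 + 0.5 * (x[1] * x[2] - 0.24)^2

# Gradient and Hessian obtained by forward-mode automatic differentiation.
g!(G, x) = ForwardDiff.gradient!(G, f, x)
h!(H, x) = ForwardDiff.hessian!(H, f, x)

lower = [1e-4, 0.0]   # illustrative box: first parameter positive, second in [0, 1]
upper = [Inf, 1.0]
x0 = [1.0, 0.5]

# Step 1: box-constrained L-BFGS keeps the iterates inside the bounds.
step1 = optimize(f, g!, lower, upper, x0, Fminbox(LBFGS()))

# Step 2: unconstrained Newton refinement started from the step 1 solution.
step2 = optimize(f, g!, h!, Optim.minimizer(step1), Newton())
theta = Optim.minimizer(step2)
```

Starting Newton's method from a point that already lies well inside the feasible region is what makes the missing box constraints in the second step acceptable.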
All steps use differentiable functions, with derivatives obtained by forward-mode automatic
differentiation using the ForwardDiff package. ForwardDiff is a Julia package for forward-mode automatic
differentiation (AD) featuring performance competitive with low-level languages like C++.
Unlike recently developed AD tools in other popular high-level languages such as Python and
MATLAB, ForwardDiff takes advantage of just-in-time (JIT) compilation to transparently
recompile AD-unaware user code, enabling efficient support for higher-order differentiation
and differentiation using custom number types (including complex numbers). The field of
automatic differentiation provides methods for automatically computing exact derivatives (up
to floating-point error) given only the function itself (Revels et al., 2016; Mogensen et al.,
2018).
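The idea behind forward-mode AD can be illustrated with a hand-written dual number type. This is a deliberately simplified sketch and not ForwardDiff's implementation: the `Dual` type, the `derivative` helper and the test function `f` are hypothetical names, and only the operations needed for the example are defined.

```julia
# A dual number carries a value together with its derivative.
struct Dual <: Number
    val::Float64
    der::Float64
end

# Arithmetic rules propagate derivatives alongside values.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:-(a::Dual, b::Dual) = Dual(a.val - b.val, a.der - b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.:/(a::Dual, b::Dual) = Dual(a.val / b.val, (a.der * b.val - a.val * b.der) / b.val^2)
Base.log(a::Dual) = Dual(log(a.val), a.der / a.val)

# Real constants are lifted to duals with zero derivative.
Base.convert(::Type{Dual}, x::Real) = Dual(x, 0.0)
Base.promote_rule(::Type{Dual}, ::Type{<:Real}) = Dual

# Seed the input with derivative 1 and read the derivative of the output.
derivative(f, x) = f(Dual(x, 1.0)).der

# One-observation analogue of a Gaussian -2 log-likelihood in the variance s.
f(s) = log(s) + 0.5 / s

d = derivative(f, 2.0)   # analytically 1/s - 0.5/s^2 at s = 2
```

The function `f` is written without any knowledge of `Dual`; the derivative falls out of ordinary evaluation, which is the property the REML code relies on.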
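After the optimization problem is solved, other statistical parameters can be found (Giesbrecht
& Burns, 1985; Fai & Cornelius, 1996; Schaalje et al., 2002):
$$F = \frac{\beta'L'(LCL')^{-1}L\beta}{rank(LCL')}$$
where
$$C = \left(\sum_{i=1}^{n}X_i'V_i^{-1}X_i\right)^{-1}$$
and
$$se = \sqrt{LCL'}$$
Degrees of freedom (DF) are computed with the Satterthwaite approximation or with the “contain”
method ($N - rank(XZ)$). The Satterthwaite approximation is
$$df = \frac{2(LCL')^2}{g'Ag}$$
where
$$A = 2H^{-1}, \quad g = \nabla_\theta(LC_\theta L')$$
Here $L$ is a vector of known constants, $C$ the variance-covariance matrix of fixed effects
($var(\beta)$), $H$ the Hessian matrix of the REML function, and $N$ the total number of observations.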
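These quantities can be sketched in Julia as below, assuming the standard forms se = sqrt(LCL'), F = (Lβ)'(LCL')⁻¹(Lβ) / rank(LCL'), the Satterthwaite approximation df = 2(LCL')² / (g'Ag) with A = 2H⁻¹ (H taken as the Hessian of -2 log REML with respect to θ, and g the gradient of LCL' with respect to θ), and the "contain" value N - rank([X Z]). The function names and all numeric inputs are hypothetical.

```julia
using LinearAlgebra

# L is a 1×p contrast row, C the covariance matrix of the fixed effects.
lcl(L, C) = first(L * C * L')
contrast_se(L, C) = sqrt(lcl(L, C))

# F statistic for the contrast L*beta.
function contrast_F(L, C, beta)
    Lb = L * beta
    M = L * C * L'
    return dot(Lb, M \ Lb) / rank(M)
end

# Satterthwaite DF: g is the gradient of L*C*L' with respect to theta,
# H the Hessian of -2 log REML, so 2*inv(H) approximates cov(theta).
satterthwaite_df(L, C, g, H) = 2 * lcl(L, C)^2 / dot(g, 2 * (H \ g))

# "Contain" DF: N minus the rank of the combined design matrix [X Z].
contain_df(X, Z) = size(X, 1) - rank(hcat(X, Z))

# Hypothetical inputs for illustration.
L = [0.0 1.0]
C = [1.0 0.0; 0.0 0.04]
beta = [4.0, 0.5]
g = [0.02, 0.01]
H = [4.0 0.0; 0.0 2.0]
X = [1.0 0.0; 1.0 1.0; 1.0 0.0; 1.0 1.0]
Z = [1.0 0.0; 1.0 0.0; 0.0 1.0; 0.0 1.0]

se = contrast_se(L, C)
Fstat = contrast_F(L, C, beta)
df_s = satterthwaite_df(L, C, g, H)
df_c = contain_df(X, Z)
```

For a single contrast the F statistic is the square of the t statistic, which matches the relation between the F and t columns in the package output.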
Validation
ReplicateBE was validated with 6 public reference datasets, 22 generated datasets and
a simulation study. ReplicateBE versions 0.1.4 and 0.2.0 agree with SAS/SPSS on the values
checked: REML estimate, variance component estimates, fixed effect estimates, and standard errors
of the fixed effect estimates. The validation procedures are included in the package test suite; they
run each time a new version is released and can be run at any time on the user's machine.
The 95% confidence interval for the type I error (alpha) is 0.048047 - 0.050733 (10000 iterations).
No statistically significant difference from the acceptable rate (0.05) was found (version 0.1.4).
Installation and usage
Installation:
using Pkg; Pkg.add("ReplicateBE")
Basic usage:
using ReplicateBE
be = ReplicateBE.rbe!(df, dvar = :var, subject = :subject, formulation = :formulation, period = :period,
sequence = :sequence);
ci = confint(be, 0.1)
Standard output:
Bioequivalence Linear Mixed Effect Model (status: converged)
-2REML: 329.257    REML: -164.629
Fixed effect:
───────────────────────────────────────────────────────────────────────────────────────────
Effect           Value        SE          F          DF        t          P|t|
───────────────────────────────────────────────────────────────────────────────────────────
(Intercept)      4.42158      0.119232    1375.21    68.9135   37.0838    2.90956e-47*
sequence: 2      0.360591     0.161776    4.96821    63.0139   2.22895    0.0293917*
period: 2        0.027051     0.0533388   0.257206   159.645   0.507155   0.612746
period: 3        -0.00625777  0.0561037   0.012441   199.483   -0.111539  0.911301
period: 4        0.036742     0.0561037   0.428886   199.483   0.654894   0.51329
formulation: 2   0.0643404    0.0415345   2.39966    207.651   1.54908    0.122884
───────────────────────────────────────────────────────────────────────────────────────────
Intra-individual variation:
formulation: 1   0.108629
formulation: 2   0.0783544
Inter-individual variation:
formulation: 1   0.377846
formulation: 2   0.421356
ρ: 0.980288   Cov: 0.391143
Confidence intervals(90%):
formulation: 1 / formulation: 2
99.5725 - 114.221 (%)
formulation: 2 / formulation: 1
87.5496 - 100.4293 (%)
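Two relations between the printed numbers can be checked by hand; the short Julia sketch below does the arithmetic. It assumes that Cov is the inter-individual covariance ρ·sqrt(σ²_B1·σ²_B2) and that the second confidence interval is the reciprocal of the first on the ratio scale. The variable names are hypothetical; the numbers are copied from the output above.

```julia
# Inter-individual variances and correlation from the output.
s2_b1, s2_b2, rho = 0.377846, 0.421356, 0.980288

# Covariance between the random subject effects of the two formulations.
cov12 = rho * sqrt(s2_b1 * s2_b2)

# Covariance matrix of the random subject effects.
G = [s2_b1 cov12; cov12 s2_b2]

# The interval for the inverse ratio: swap the limits and take reciprocals (in %).
ci_12 = (99.5725, 114.221)
ci_21 = (100 / ci_12[2] * 100, 100 / ci_12[1] * 100)
```

Both results reproduce the printed values (Cov: 0.391143 and 87.5496 - 100.4293) up to rounding.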
Results
ReplicateBE was developed to obtain a mixed model solution for bioequivalence clinical trials.
Discussion
ReplicateBE is not designed for general-purpose modeling, but it can be used in situations with
a similar structure. ReplicateBE is based on direct inversion of the variance-covariance matrix V,
so computing $V^{-1}$ may be time-consuming if the matrix is large. This does not happen in
a bioequivalence study, where V is no larger than 4×4 (4 periods), but in general it can be
a serious disadvantage. It can be avoided by using sweep-based transformations
(Wolfinger et al., 1994). In ReplicateBE the variance structure is fixed and cannot be
changed, although making it configurable may become a goal of further package development. The Satterthwaite
degrees of freedom (DF) computed by ReplicateBE do not match the SAS/SPSS DF estimates in all datasets; the reason and
the need for adjustments remain to be clarified.
Acknowledgments
D.Sc. in Physical and Mathematical Sciences Anastasia Shitova a.shitova@qayar.ru
Literature Cited
1. FDA Guidance for Industry: Statistical Approaches to Establishing Bioequivalence, 2001
2. Fletcher, Roger (1987), Practical methods of optimization (2nd ed.), New York: John Wiley & Sons,
ISBN 978-0-471-91547-8
3. Giesbrecht, F. G., and Burns, J. C. (1985), "Two-Stage Analysis Based on a Mixed Model: Large-
sample Asymptotic Theory and Small-Sample Simulation Results," Biometrics, 41, 853-862.
4. Gurka, Matthew. (2006). Selecting the Best Linear Mixed Model under REML. The American Statistician.
60. 19-26. 10.1198/000313006X90396.
5. Henderson, C. R., et al. “The Estimation of Environmental and Genetic Trends from Records Subject to
Culling.” Biometrics, vol. 15, no. 2, 1959, pp. 192–218. JSTOR, www.jstor.org/stable/2527669.
6. Hrong-Tai Fai & Cornelius (1996) Approximate F-tests of multiple degree of freedom hypotheses in
generalized least squares analyses of unbalanced split-plot experiments, Journal of Statistical Computation
and Simulation, 54:4, 363-378, DOI: 10.1080/00949659608811740
7. Jennrich, R., & Schluchter, M. (1986). Unbalanced Repeated-Measures Models with Structured
Covariance Matrices. Biometrics, 42(4), 805-820. doi:10.2307/2530695
8. Laird, Nan M., and James H. Ware. “Random-Effects Models for Longitudinal Data.” Biometrics, vol. 38,
no. 4, 1982, pp. 963–974. JSTOR, www.jstor.org/stable/2529876.
9. Lindstrom, M. J., & Bates, D. M. (1988). Newton-Raphson and EM Algorithms for Linear Mixed-Effects
Models for Repeated-Measures Data. Journal of the American Statistical Association. 83. 1014.
10.1080/01621459.1988.10478693.
10. Mogensen et al., (2018). Optim: A mathematical optimization package for Julia. Journal of Open Source
Software, 3(24), 615,doi: 10.21105/joss.00615
11. Patterson, S. D. and Jones, B. (2002), Bioequivalence and the pharmaceutical industry. Pharmaceut.
Statist., 1: 83-95. doi:10.1002/pst.15
12. Revels, Jarrett & Lubin, Miles & Papamarkou, Theodore. (2016). Forward-Mode Automatic
Differentiation in Julia.
13. Schaalje GB, McBride JB, Fellingham GW. Adequacy of approximations to distributions of test statistics
in complex mixed linear models. J Agric Biol Environ Stat. 2002;7:512–24.
14. Van Peer, A. (2010), Variability and Impact on Design of Bioequivalence Studies. Basic & Clinical
Pharmacology & Toxicology, 106: 146-153. doi:10.1111/j.1742-7843.2009.00485.x
15. Wolfinger et al., (1994) Computing gaussian likelihoods and their derivatives for general linear mixed
models doi: 10.1137/0915079
16. Wright, Stephen, and Jorge Nocedal (2006) "Numerical optimization." Springer