Error models and objective functions
Objectives and introduction
Objectives
The objectives are:
- To recognise the influence of the residual error model on parameter estimation.
- To understand how the residual error model contributes to the objective function.
- To appreciate how an equation solver is used to minimize a function.
Introduction
"In Latin, error means "wandering" or "going away" (1).
Errors are not necessarily incorrect. The language term error may have different concrete and metaphorical meanings: a mistake; a difference, deviation or departure from an expected value or intended procedure; uncertainty or inaccuracy; or even deception (2). Different interpretations of the language term error may be misleading. Error may represent variation and does not necessarily imply that a value is false. Statistically and mathematically the term error refers to a difference from the expected value. This may be due to random chance, biological variability or a mistake. The Latin definition above encompasses themes of randomness and deviation.
The term ‘Mixed’ in NONMEM refers to the consideration of both fixed effects and random effects. NONMEM considers two main types of variability: between subject variability (BSV) and residual unexplained variability (RUV). RUV can be thought of as random and unexplained variability. Random error is always present and unpredictable, and RUV is inflated by using the wrong structural model. One of the aims of modelling is to minimize RUV by explaining as much of the variability as possible (for example, as BSV).
The nomenclature used to describe error in modelling and error models can be confusing for the novice trying to understand the semantics or precise meanings. The terms variability and variance can cause some confusion because their uses in statistics and modelling differ from general English usage. In everyday English, variance means a difference or disagreement, but in statistics it refers to the average squared difference of a value from the mean of the distribution of the values. Pharmacometricians use variability to refer generally to the differences of a value from its mean. It is often described by the standard deviation (i.e. the square root of the variance) or a coefficient of variation (standard deviation divided by the mean). The variability of a parameter, such as clearance, in a population is called the population parameter variability (PPV).
Residual refers to what is ‘left over’. Residual error (RE) is used to describe what is left over after all other sources of variability have been accounted for. This typically means the difference between an observation and the model prediction of the observation. A residual is the difference between the observed and predicted values. The distribution of residuals can be described by the statistical variance. The term variability can refer to the variance, the standard deviation or the coefficient of variation of the residuals. This variability is known as the residual unexplained variability (RUV). The RUV term describes variations that arise from several sources (measurement error and model misspecification being the most obvious). Analysis of residuals is important for choosing the correct error model, and forms part of model building. Residual error modelling involves the analysis of residuals. Residual plots can be assessed as part of model diagnostics to aid in discriminating between error models and as part of goodness of fit evaluation.
To some extent the difficulties in understanding the semantics may come from misconceptions about error and how it is measured or estimated, as well as from broader philosophical differences between the statistical and pharmacometric approaches to measurement and perceptions of error. Understanding error is fundamental to pharmacometrics and all systems of ‘-metrics’ where explanation and understanding is the ultimate goal of quantification rather than mere description.
In pharmacometrics error is measured and modelled. The measurement of error in itself can involve estimation (and hence error in the form of imprecision and uncertainty). So there is error in error, imprecision in variance. All models involve some form of regression, and more sophisticated models for complex systems use nonlinear regression.
There are different terms used to describe concepts of error or deviation in statistics and modelling. Measuring errors involves measuring the difference between two values. The two values compared can be within the same individual, the same sample or the same population. Alternatively, comparisons can be made between the individual and the overall sample or the ‘rest of the sample’ (e.g. the mean of all observations, or the mean of all observations except the individual). Or we can compare the means (or other parameters) of two populations, or of a sample and a population. Descriptive measures describe the sample parameters and distribution. Inferential statistics make inferences from the sample to the population.
Individual observations can be compared to individual predicted values (Yobs-Ypred). Or individual values can be compared to the mean or other measures of central tendency, in this way incorporating variation. The value may be the observed or the predicted or calculated value.
Deviation describes the difference between the individual value and the mean, and is a measure of the spread of values (variance). Residual describes the difference between observed and predicted values (Yobs-Ypred).
Statistical terms used to describe error can encompass measures of spread or dispersion (SD, variance) and measures of precision (SE, CI). Statistics makes a distinction between terms that refer to the sample data (statistics) and terms that refer to the parameters of the distribution assumed for the population data: for example sample mean (x̄) and population mean (μ); sample variance (s²) and population variance (σ²); sample SD (s) and population SD (σ).
In pharmacometric modelling standard deviation, coefficient of variation and confidence intervals provide information about variability. But this variability is “an estimate of how well data is described by the parameters of the specified model” (3, p. 96). It is not an estimate of population variability in the parameters but is a kind of measure of goodness of fit, or reliability.
Variance in statistics describes spread, or variation from central tendency. In other words, spread is how far away each observation is from the mean. Other descriptions of spread include the range, percentiles and SD. Variance is SD², and is computed as the average squared deviation of each number from its mean. Population parameters are estimated from samples.
Sample variance is:
s² = Σ(x − x̄)² / (n − 1)
Standard deviation (SD) is one of the most common measures of spread in statistics. Deviation is a term used to describe the spread or dispersion of a collection of observations. In statistics SD is a property of a normally distributed population that describes the spread of a population around the mean. SD is the square root of the variance (s = √s² for a sample, σ = √σ² for a population). SD has the advantage of being expressed in the same units as the data. If there are extreme scores, SD is less sensitive to them than the range.
Error measures and differences may be absolute or relative. Standard deviation and variance are absolute, and are not standardized to the mean. Accuracy is referenced to the true value and is expressed as a percentage. Coefficient of variation, standard error and precision are standardised to the mean and so are relative and expressed as percentages.
Coefficient of variation (CV) is a measure of dispersion or variability. Typically reported as a percentage, it is the ratio of standard deviation to the mean. CV = SD/mean x 100%
As CV relates SD to the mean it is a form of relative or proportional error.
Since it is dimensionless, it can be used to compare the spread of data for datasets with different units of measurement. It is generally part of the Wings for NONMEM output reported alongside parameter estimates. The disadvantage of CV is that it cannot be used to construct confidence intervals. If the mean is very close to zero then the CV becomes meaningless, being very sensitive to small changes in the mean. A high CV (greater than 10-20%) could be caused by a number of different problems: over- or under-parameterization, an incorrect compartmental model, sparse data, or suboptimal sampling times (3, p. 96).
Standard error is an estimate of precision, and is reported as a percentage. Standard error does not describe the variability of the sample data. Standard error estimates the standard deviation of the error, that is, the standard deviation of the difference between the true values and the estimated or measured values. It assumes that if multiple samples were taken, the parameter of interest from each sample would be normally distributed, and it indicates the spread of that distribution. As the true value is usually unknown, standard error is an estimate. The size of the standard error provides an observable estimate of an unobservable error. The standard error of the mean is calculated from the SD and sample size and is the standard deviation of the sampling distribution of the mean (SEM = SD/√n). There is approximately a 68% chance that the true population mean lies within 1 SEM of the sample mean. Standard error can be used to calculate confidence intervals. There are different formulas for different standard errors. Wings for NONMEM output provides a % standard error for each parameter estimate.
Confidence intervals (CI) define a range of values that are likely to include the population parameter. CI can only be calculated from SE if the sample sizes are large; otherwise a ‘t’ distribution is used. CI do not define the sample data distribution. Confidence intervals illustrate the accuracy of sample estimates and show the magnitude of effect in the original units of measurement.
Significance may be clinical or statistical. The Z transformation underlies one test of statistical significance: Z = mean difference / (σ/√n).
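As a minimal illustration of these summary measures, the Python sketch below computes the sample mean, variance, SD, CV%, SEM and an approximate 95% confidence interval. The data and variable names are invented for illustration, and the 1.96 multiplier is the large-sample approximation; for a small sample like this a t-based interval would normally be used.

```python
import math

# Hypothetical sample of clearance values (L/h); for illustration only.
x = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.2]
n = len(x)
mean = sum(x) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((xi - mean) ** 2 for xi in x) / (n - 1)
sd = math.sqrt(variance)                        # same units as the data
cv_percent = sd / mean * 100                    # relative (dimensionless) spread
sem = sd / math.sqrt(n)                         # SD of the sampling distribution of the mean
ci95 = (mean - 1.96 * sem, mean + 1.96 * sem)   # large-sample approximation

print(f"mean={mean:.2f} var={variance:.2f} SD={sd:.2f} "
      f"CV={cv_percent:.1f}% SEM={sem:.2f} 95% CI=({ci95[0]:.2f}, {ci95[1]:.2f})")
```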
Sources of error
Why does error occur inevitably in pharmacometric models? Real life constraints mean that experiments typically involve only a limited number of observations, drawn at a limited number of times from a limited sample of the population of interest. Modelling involves assumptions and simplifications, and all models involve some form of misspecification. Errors that occur in model building can include data entry errors, incorrect structure, biased covariate selection and visual perceptual errors in assessing goodness of fit.
Execution errors result from differences between the nominal protocol and the actual execution of the study. They include mistakes in dose amount, timing, duration, route, formulation and recording; mistakes in the timing of samples; mislabelling of samples with incorrect subject or timing information; and errors in sample preparation and storage.
Systematic errors lead to bias in measured values. Systematic errors are relatively common in data collection, and may result from a change in the environment or imperfections in the measurement technique. Drift is a form of systematic error that occurs over time and is often easily detectable. Calibration of the measurement technique aims to minimize systematic error. Systematic error can occur in any direction and may be proportional or constant. It is particularly important that the independent variable is measured accurately, because most error models and forms of regression assume that all the error lies in the dependent variable.
Random errors can be decreased by taking multiple measurements and increasing the sample size. Measurement error accounts for a significant amount of residual variability in models. Measurement errors can include random and systematic errors. The assumption that the dependent variable can be accurately measured is rarely correct in the laboratory or in the clinical setting. Calibration curves may help to minimize and quantify systematic error (by measuring known quantities). Accuracy describes the closeness of a measurement to its true value. Accuracy is calculated from bias and expressed as a percentage relative to the true value. Bias is the difference between the true and measured value (bias = true value − measured value). If there are only two repeated measurements then mean bias = (bias a + bias b)/2. Accuracy is thus a relative term, related to the true value.
Precision is concerned with reproducibility and measures the error of repeated measurements. Precision is calculated from the standard deviation of repeated measurement values (replicates) divided by their mean, expressed as a % (SD/mean * 100%). In scientific measurement precision is used to define the lower limit of quantitation (LLOQ). The lowest intra-assay control measurement with a precision of less than or equal to 20% is used to define the LLOQ. LLOQ is used because error is more common in the low concentration range due to the way that dilution is involved in creating standards. Precision is a relative term, related to the mean.
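The bias, accuracy and precision calculations described above can be written out directly. The sketch below uses hypothetical replicate measurements of a quality-control standard with a known nominal concentration; all numbers are invented for illustration.

```python
import statistics

# Hypothetical replicates of a QC standard with a known ("true") concentration.
true_value = 10.0
replicates = [9.6, 10.3, 9.8, 10.1, 9.9]

mean_measured = statistics.mean(replicates)

bias = true_value - mean_measured                      # bias = true value - measured value
accuracy_percent = mean_measured / true_value * 100    # accuracy relative to the true value
precision_percent = statistics.stdev(replicates) / mean_measured * 100  # SD/mean * 100%

print(f"bias={bias:.2f} accuracy={accuracy_percent:.1f}% precision={precision_percent:.1f}%")
```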
Types of error models
Error modelling is used to describe the residual variability, that is, how well the data are described by the model and its parameters.
Residual error models describe the differences between observed and expected or predicted values, which the fitting process aims to minimize. There are two main categories of error models: parametric and nonparametric. Parametric methods are generally preferred: although they require more assumptions, they are more powerful and robust.
Error may be absolute or relative. Absolute error refers to the size or magnitude of the deviation. Relative error is standardized (usually to the mean). Error can be constant or proportional. For example if a clock used for sample timing is 30 minutes fast then this is a constant systematic error.
Three types of error models are commonly used: additive, Poisson and proportional. Poisson models are rarely used in pharmacometrics as they only apply to a particular distribution in which the variance equals the mean, for example the rate of events per time period. Poisson distributions have only one parameter (the mean). Combined error models use both additive and proportional error. Proportional models are often used for RUV in NONMEM. The data will determine the appropriate error model to use.
| Error model | WLS | ELS | NONMEM |
|---|---|---|---|
| Additive | W = 1 | Var = SD² | Y = Y + EPS |
| Poisson | W = 1/Y | Var = SD²·Y | Y = Y + SQRT(Y)*EPS |
| Proportional | W = 1/Y² | Var = SD²·Y² | Y = Y + Y*EPS |
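The NONMEM column of the table can be read as a recipe for simulating observations under each residual error model. The short Python sketch below (hypothetical predictions and an assumed residual SD; not part of the workshop spreadsheets) shows the three forms side by side, together with the corresponding WLS weights.

```python
import numpy as np

rng = np.random.default_rng(0)
sd = 0.5                                         # assumed residual SD (EPS ~ N(0, sd^2))
y_pred = np.array([0.5, 1.0, 2.0, 5.0, 10.0])    # hypothetical model predictions
eps = rng.normal(0.0, sd, size=y_pred.shape)

# Simulated observations under each error model (cf. the NONMEM column).
y_additive     = y_pred + eps                    # Y = Y + EPS
y_poisson      = y_pred + np.sqrt(y_pred) * eps  # Y = Y + SQRT(Y)*EPS
y_proportional = y_pred + y_pred * eps           # Y = Y + Y*EPS

# Corresponding WLS weights (cf. the WLS column).
w_additive, w_poisson, w_proportional = 1.0, 1.0 / y_pred, 1.0 / y_pred ** 2
```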
Regression and error
Regression is the term given to techniques for the analysis of numerical data that involve modelling the dependent variable as a function of independent variables, with parameters and an error term. The error term represents random unexplained variation in the dependent variable.
Regression analysis may be simple or multiple, linear or nonlinear. Linear regression finds values of parameters that define the line that minimizes the differences between the line and the points of observation (4). Linear regression can estimate parameters in a single step, while nonlinear regression starts with initial estimates and then requires a series of steps or iterations (3, p. 57). Regression analysis can be performed in simple spreadsheets such as Excel or with more complex software. Mathematical modelling of data aims to find parameter estimates that reduce the differences between the observed data points and the predictions. The eyeball method of drawing a line of best fit is the simplest example. Computer-based algorithms use criteria for best fit, commonly based on least squares.
Regression using the least squares criterion means that the aim is “to reduce the square of the difference between the observed and the calculated y-values (the difference in the vertical direction)” (3, p. 16). Squared residual = (Yobserved − Ycalculated)².
Summing the squared residuals (SSR) gives the ordinary least squares (OLS) criterion. Squaring the residuals results in only positive values, but gives more weight to large deviations.
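To make the OLS criterion concrete, the sketch below fits a straight line by minimizing the sum of squared residuals with a general-purpose numerical solver (scipy.optimize.minimize); the data points, starting values and parameter names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical (x, y) observations roughly following y = a + b*x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_obs = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

def ssr(params):
    a, b = params
    y_calc = a + b * x
    return np.sum((y_obs - y_calc) ** 2)   # ordinary least squares criterion

result = minimize(ssr, x0=[0.0, 1.0], method="Nelder-Mead")
print(result.x, result.fun)                # parameter estimates and minimized SSR
```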
Error models involve different types of regression analysis: least squares, Bayesian linear regression and nonparametric methods. Different least squares methods exist to evaluate the fit of the model to the data: extended least squares (ELS), iteratively reweighted least squares (ILS) and weighted least squares (WLS).
Transformation and error
Data may be transformed by different methods to allow linear regression or to allow parametric methods of analysis. Examples include Z transformation (multiples of SD or SEM); log transformation; or taking the square root of data. Transformed data may not reflect biology but can be easier to analyse using simpler mathematics.
Weighting
Weighting assigns a factor to each data point to account for the accuracy of that data point and determine the relative importance of a value in the fitting process. The weighting factor in WSS is the reciprocal of the variance for that data point.
The shape of the data plots may suggest if a weighting scheme is needed to deal with overestimation of large concentrations and underestimation of small concentrations. Log transformation provides more weighting to lower values.
If the right weighting method is used there is a better chance of getting the right answer; however, weights may be mis-specified, which may increase error. Different methods of least squares regression give different weight to some observations. ELS is better able to deal with the weighting problem than OLS, which pays more attention to high values.
There are several assumptions for regression analysis using least-squares methods:
- sample is representative of population
- model represents true relationship (expected residual error zero)
- no tracking or correlation between parameters and the independent variable (linear independence)
- parameters and residual variables vary over complete range.
- residual errors are normally distributed
- Homoscedasticity – the variance of the residual error should be constant for all values of the independent variables.
Iteratively reweighted least squares (ILS) assigns a weight to each data point and recalculates the weights at each iteration, based on the current model predictions.
Weighted least sum of squares (WSS) is a minimization criterion calculated by pharmacometric software. As with all objective criteria, WSS should not be considered in isolation, as a model may have different WSS values for equally good fits. The assumptions required are that the measurement error variance is known and independent of the parameters. Weights are calculated as the inverse of the measurement error variance. This compensates for violation of homoscedasticity, since the method differentially weights cases so that those with smaller error variances contribute more to the fit of the regression line (5). Standard errors are typically smaller, although the parameter estimates may be similar. WLS is only appropriate when there is relatively low noise in the data. Sampling error may be magnified because measured values are used to quantify the error variance.
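A minimal WLS sketch, assuming a monoexponential prediction and a proportional measurement-error variance (so the weights are 1/Y²); the data and starting estimates are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical concentration-time observations.
t = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
y_obs = np.array([9.2, 8.1, 6.4, 3.9, 1.5])

# Weights as the reciprocal of an assumed proportional error variance: Var ~ Y^2, so W = 1/Y^2.
weights = 1.0 / y_obs ** 2

def wss(params):
    a, k = params
    y_calc = a * np.exp(-k * t)            # simple monoexponential prediction
    return np.sum(weights * (y_obs - y_calc) ** 2)

fit = minimize(wss, x0=[10.0, 0.3], method="Nelder-Mead")
print(fit.x, fit.fun)
```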
Extended least squares (ELS) is a commonly used iterative technique in pharmacometrics, developed about 30 years ago by Stuart Beal and used in NONMEM. Other software using ELS was developed by Nick Holford (MKMODEL). The coefficients of the variance model are determined during data analysis and parameter estimation, rather than before the regression or fitting process starts. At least two or more data points are required for each parameter (3, p. 88). ELS can estimate the SD (whereas WLS cannot). ELS gives more flexibility, can estimate a power term and gives more accurate parameter estimates. The measurement error variance can depend on the model parameters. ELS is more appropriate for noisy data than WLS, but requires more data points. The equation for the objective function is altered accordingly.
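One common form of the ELS objective is OF = Σ[(Yobs − Ypred)²/Var + ln(Var)], with the variance model parameters estimated together with the structural parameters. The sketch below assumes a power variance model, Var = SD²·Ypred^(2·power), and uses the same hypothetical data as the WLS sketch above; it illustrates the idea rather than the exact NONMEM implementation.

```python
import numpy as np
from scipy.optimize import minimize

t = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
y_obs = np.array([9.2, 8.1, 6.4, 3.9, 1.5])

def els_objective(params):
    a, k, log_sd, power = params
    y_pred = a * np.exp(-k * t)
    # Variance model estimated jointly with the structural parameters.
    var = np.exp(log_sd) ** 2 * np.abs(y_pred) ** (2 * power)
    return np.sum((y_obs - y_pred) ** 2 / var + np.log(var))

fit = minimize(els_objective, x0=[10.0, 0.3, np.log(0.5), 1.0], method="Nelder-Mead")
print(fit.x)   # structural parameters plus the estimated variance model coefficients
```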
Objective functions
Objective functions (OF) are statistical criteria applied to nonlinear regression models as an objective measure of the differences between the observed and predicted values of the dependent variable.
Criteria are selected for the model, depending on the purpose of the model, the data, and the population the data attempt to describe. In WinNonLin and NONMEM there are three levels of objective function: ordinary least squares (OLS), weighted least squares (WLS) and extended least squares (ELS). These are iterative techniques of nonlinear regression with different advantages and disadvantages, as explained above. The iterative technique targets the minimum objective function value.
The concept that “smaller objectives are better” means that the aim is to find parameter estimates that minimize the objective function. However, the absolute value is meaningless; it is the reduction in OF that is important. A reduction of 3.84 corresponds to p < 0.05 (from the chi-squared distribution with one degree of freedom). Parameter estimates may be similar, but the standard error is reduced with the optimal method.
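The 3.84 threshold can be checked against the chi-squared distribution with one degree of freedom, for example:

```python
from scipy.stats import chi2

# Drop in objective function needed for p < 0.05 when one extra parameter is added
# (one degree of freedom), and the p-value corresponding to a drop of 3.84.
critical_drop = chi2.ppf(0.95, df=1)   # ~3.84
p_value = chi2.sf(3.84, df=1)          # ~0.05
print(critical_drop, p_value)
```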
Minimisation of the objective function value (OFV) is an important part of model discrimination and assessment of goodness of fit, but other methods of model evaluation are also necessary, and a model with a lower OFV should not be accepted if the graphical fit is unchanged or worse.
Dealing with outliers
Graphically ‘outlier’ is the term often used to describe values that may represent error. However, outliers and error are not synonymous. The term “outlier” is used to describe an observation that is unexpectedly different from the other values (compared with either the next closest values or the mean). Outlying values may be due to error or chance, or they may reflect variation and a ‘true’ difference. Depending on the nature and source of the observations the variability may exist because the observation comes from a different population to the others. Review of the experimental procedures and data handling may suggest the likelihood of the outlier being an erroneous value. Understanding the population the sample is drawn from may assist in assessing whether the outlier could plausibly be due to biological variation.

Various statistical methods exist to deal with outliers. In general terms the magnitude of the ‘difference’ is quantified (e.g. mean − outlier), and then divided by a measure of the spread of values (e.g. the range or standard deviation). Z scores are often used for this purpose (Z = (mean − value)/SD). Z scores work best if the mean and SD are already known for the population which was sampled. Statistics such as p values can be calculated to estimate the probability of the result occurring by chance (or randomly). Grubbs’ test uses the Z value of the largest single outlier and compares this with a table of critical Z values for the sample size, where calculated values greater than the critical value imply p < 0.05 (6). In general there are different approaches to dealing with outliers. If data points are felt to be erroneous they should be commented out rather than deleted.
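A minimal sketch of the Z-score and Grubbs' test calculations described above, using an invented data set containing one suspiciously large value; the critical value is computed from the usual t-based formula for the two-sided Grubbs' test.

```python
import numpy as np
from scipy import stats

# Hypothetical concentration data with one suspect value.
values = np.array([4.8, 5.1, 4.9, 5.3, 5.0, 5.2, 8.9])
n = len(values)
mean, sd = values.mean(), values.std(ddof=1)

# Z score of the most extreme observation.
g = np.max(np.abs(values - mean)) / sd

# Two-sided Grubbs' critical value at alpha = 0.05.
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / (2 * n), n - 2)
g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t_crit ** 2 / (n - 2 + t_crit ** 2))

print(f"G={g:.2f} critical={g_crit:.2f} outlier suspected: {g > g_crit}")
```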
Introduction by Anita Sumpter (2008).
References
1. http://simple.wikipedia.org/wiki/Error
2. http://www.encyclo.co.uk/define/Error
3. Bourne D. Mathematical Modeling of Pharmacokinetic Data. CRC Press, Boca Raton, 1995.
4. http://curvefit.com/linear_vs__nonlinear.htm
5. Spilker ME, Vicini P. An evaluation of extended vs weighted least squares for parameter estimation in physiological modelling. J Biomed Inform. 2001;34(5):348-64.
6. Motulsky H. Detecting Outliers. http://www.graphpad.com/articles/outlier.htm
Excel
In order to use the Excel Solver function you may need to install it on your computer. First see if it is installed by looking in the Data Analysis section of the ribbon control. If you don't see the word Solver you need to install it as follows:
Click on the Excel circular button at the top left hand side of the Excel menu.
Click Excel Options, Add-Ins, Go...
Check the Solver Add-in box then click OK.
Parameter estimation
Find the file Pharmacometrics Data\Error Models and Objective Functions\olsels.xls
- Open olsels.xls in Excel.
- Find the linols worksheet.
- Examine the linols worksheet.
- How is the value of Ytrue computed?
- What are the parameters of this model?
- Find the cells of the sheet that specify the parameters (Pest column). Change the parameter values until you think you have got the predicted Y line (blue) to 'fit' the Yobs (yellow triangles) as best as you can using the 'eyeball' method.
- Note your parameter value estimates and the value of the OLSi sum of squares (ss).
- Use the Excel Solver (found in the Data Analysis section) to minimize the OLSi ss.
- Experiment with the Solver Options to find out what they do.
- Compare the Solver parameters and OLSi ss with your 'eyeball' estimates.
- Repeat steps 3.1 to 3.6 with the linels, bo1ols, bo1els and ka1ss worksheets. Use the OLSi ss for the ols sheets and the ELSi ss for the els and ka1ss sheets.
- Copy the linels worksheet to a new worksheet emaxels and implement an Emax model instead of the linear model. Save this worksheet as a separate file. You may choose to implement any one (or all 3) of the objective function methods with the Emax model; a Python sketch of an Emax fit is given after this list for comparison.
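For comparison with the spreadsheet exercise, the sketch below fits a hypothetical Emax model by minimizing an ordinary least squares objective with a numerical solver (scipy.optimize.minimize). The concentration-effect data, starting estimates and parameter names are invented; the olsels.xls worksheets remain the reference for the workshop.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical concentration-effect data.
conc = np.array([0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
effect_obs = np.array([2.1, 6.0, 15.8, 30.5, 41.0, 47.2])

def emax_model(params, c):
    emax, ec50 = params
    return emax * c / (ec50 + c)

def ols(params):
    # Ordinary least squares: sum of squared differences between observed and predicted effect.
    return np.sum((effect_obs - emax_model(params, conc)) ** 2)

fit = minimize(ols, x0=[50.0, 2.0], method="Nelder-Mead")
print(fit.x, fit.fun)   # compare with the Solver estimates from your emaxels worksheet
```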
Learning
- Summarise the results you got from the Workshop Parameter Estimation steps.
- You may find the article Detecting Outliers (6) of interest in relation to understanding the observations that error models try to describe.