# User-defined models

### Objective and introduction

# Objective

The objective is to learn how to define common pharmacological models using Microsoft Excel and four dedicated modelling applications: Berkeley Madonna, Monolix and NONMEM

# Introduction

"All models are wrong, some models are useful" George Box 1979 (1)

A model is a representation of observed data or an observed system. Models may be useful to describe a collection of observations and their relationship with one another. More importantly, in attempting to explain and predict observations and their relationships, modelling can be the mechanism to connect description and understanding. Pharmacometrics involves the analysis and interpretation of data and seeks to bridge stamp collecting and physics (to paraphrase Ernest Rutherford quote (2)).

Metrics is a term used to denote a system of measurement. It involves a certain way of assessing and interpreting parameters, and is used in a variety of fields including economics, education, politics as well as pharmacology. However in pharmacometrics, as with any metrics system, quantitative measurement and estimation does not automatically imply validity and truth.

Models vary in complexity; but inherently modelling involves simplification. Not only does the model have to deal with the known knowns, the known unknowns, but perhaps also the unknown knowns, and the unknown unknowns (paraphrasing a confused/confusing quote from Donald Rumsfield (3)). So while no model can ever speak the truth, a model may be more or less valid depending on the perspective of the modeller and the problem to be solved. There is error inherent in the process of observation and collection of information. Error and variability exists in every step of model building and testing. More useful models attempt to quantify and classify error and variability.

"In pharmacology models are ultimately expressed by mathematical equations" (Bourne 1995 (4)). Equations can encapsulate and describe large amounts of information. Differential equations are used to describe rate of change of a variables over time. There are different numerical methods of solving differential equations such as the Runge-Kutta method (RK4) developed by mathematicians several decades ago. The robustness of this and other forms of iterative solution has been demonstrated by their successful use for satellite exploration of the solar system. Improvements in accuracy with which a model describes observations and estimatesparameters, may come from more powerful and robust mathematical functions and repeated observations.

Pharmacometrics uses mathematical modelling enabled by software applications to estimate parameters and define and predict relationships between variables. Several different computer software programs can be used for this purpose, all with unique features and limitations. These software programs can render complex mathematics useful to the nonmathematician in solving pharmacometric problems.

In order to use pharmacometric modelling, it is necessary to understand the components of model equations. Equations summarise large amounts of data with a small number of parameter values. A parameter is a constant that is estimated, e.g. volume of distribution, clearance, rate constants for elimination/absorption/equilibration, EC50. Parameters are adjusted during the modelling process to obtain the best fit to the data. Variables that are fixed and do not change in the modelling process are true constants e.g. dose, infusion time. The dependant variable is the variable whose value the equation describes, e.g. concentration in pharmacokinetic equations, effect in pharmacodynamic equations. Parameters and the dependant variable are related by mathematical operators. Time is the only truly independent variable, as it is the only variable we cannot control.

In pharmacology models can not only describe data such as dose-concentration, concentration-effect, but more usefully predict these relationships. This is most useful if the model can be used to define parameters not just for the data observed for particular individuals at particular times, but for different individuals at different times. Population modelling has been used since the 1970s to describe and predict pharmacokinetic, pharmacodynamic and linked PKPD relationships for a population rather than just individuals. NONMEM is the most widely used software tool for population modelling, and uses Fortran source code.

Simulation involves the generation of data from a proposed model equation. Simulation has a role in both model generation and model evaluation. Software programs such as Excel and Berkeley Madonna involve simulation of data. Software programs such as NONMEM and Monolix use nonlinear regression analysis to simulate data and estimate parameters. Observed values of collected data can be compared with predicted values for the dependant variables.

Differences in predicted and observed values may be due to error and variation. The goal of model development is to find parameter values that decrease the differences between predicted and observed values, in other words to develop a model that fits the data and describes it well. Models are evaluated by tabular and graphical representations and statistical methods. Error modelling is used to describe the variability in how well the data is described by a parameter. In fact measurement of error involves estimation. Statistical error describes the difference between observed and expected values. Residual error is an observable estimate of unobservable statistical error. Residuals is the term given to describe the difference between group mean and individual values in a sample. Standard error (SE) is a measure of precision. It is used to estimate a population parameter from a sample. Standard error is used to define a range in which the true population mean value should lie. Confidence intervals are calculated from standard error. Coefficient of variation is another measure of variability, and is equal to the standard deviation divided by the mean (multiplied by 100 to express as a percentage). All modelling solutions involve assumption. A common assumption is that the independent variable can be measured without significant error, however this is unlikely to be always true.

Modelling is used to quantify and summarise data and furthermore to explain and explore mechanism and predict values. The usefulness of a model depends on the problem to be solved, the population to which it is applied, and the perspective of modeller.

There are different types of structural models used in pharmacology: compartment models, mechanistic or physiological models, empirical models, and hybrid approaches. Traditional compartmental models describe a number of well stirred compartments. The number of compartments and structure of the model is determined by the data and route of administration. Noncompartmental modelling can be described as including nonparametric data, and a more general approach rather than a structural system. Mechanistic models are based on physical and physiological principles. For example physiological models may include factors describing blood flow, elimination rate, partition coefficients, diffusion, and the kinetics of receptor binding. Empiric modelling however requires few assumptions about the data generating mechanisms and is useful when little is known about underlying physiological processes (black boxes).

Model equations and parameter estimates are useful to summarise large amounts of data. Excel & Berkeley Madonna simulate data using initial estimates and a model equation. This is a potentially useful step for developing a model. MONOLIX and NONMEM are more powerful software packages that perform nonlinear regression analysis and can be used to estimate parameters, in addition to simulation.

Introduction by Anita Sumpter (2008).

## References

- Box GEP. Robustness in the strategy of scientific model building. In: Launer RL, Wilkinson GN, editors. Robustness in Statistics; 1979. p. 202.
- Reference to Ernest Rutherford quote "Science is either stamp collecting or physics". http://en.wikiquote.org/wiki/Ernest_Rutherford. All science is either physics or stamp collecting. As quoted in "Rutherford at Manchester" (1962) by J. B. Birks.
- Paraphrased Donald Rumsfield quote http://en.wikiquote.org/wiki/Donald_Rumsfield. "Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know. And if one looks throughout the history of our country and other free countries, it is the latter category that tend to be the difficult ones. " Department of Defense news briefing, February 12, 2002 [1].
- Bourne D. Mathematical Modeling of Pharmacokinetic Data. CRC Press, Boca Raton, 1995. Preface x.

### Workshop hints

Note: All files should be loaded from and saved to your Pharmacometrics Data\User Defined Models folder for this assignment. You may find it useful to create a separate folder for each program inside the User Defined Models folder as you will create several files during this assignment.

Some hints (shown in grey boxes) refer to changes needed for subsequent problems.

## You may find it easier to read this page if your click on the "[Hide Side Menu]" text at the top

### Excel

Find the file **Pharmacometrics
Data\User Defined Models\3models.xls**

- Open 3models.xls with Excel.
- Look at each of the 3 worksheets (bo1, k01, emaxc).
- Identify:
- The model equation
- The model parameters
- The independent variable

- Export the simulated data from each of the worksheets in 3models.xls to a format that other programs can read using the following steps
- Create
a new Excel workbook to make a data file for each worksheet.
Add the following headings in the top row of the data file. It
is important to put the # before ID so that the file can be
used
for NONMEM. The DV column name means Dependent Variable (i.e. the
observation).

#ID TIME DVFor emaxc you should use #ID CONC and DV headings for the data file columns.

- Fill
the data file #ID column down to row 12 with the value 1.
Copy the values from column C in the bo1 model worksheet into the data file TIME
column. Copy the
concentrations from column E (simulated observations) in the bo1 model
worksheet into the data file DV column ().
Important: use Paste Special Values rather than a simple paste to prevent the formulae in column E being copied to the data file.

The data file for the bo1 model should look like similar to Table 1, but your DV values will be different.

A B D 1 #ID TIME DV 2 1 0 9.9 3 1 1 7.2 4 1 2 5.3 5 1 3 4.6 6 1 4 2.4 7 1 5 3.2 8 1 6 2.6 9 1 7 1.1 10 1 8 -1.4 11 1 9 0.4 12 1 10 1.8 - Now save the data file using 'Save As' and choose CSV (Comma delimited, *.csv) format.
- Save in your Pharmacometrics Data\User Defined Models folder
- Name the file bo1.csv (Be careful -- use bo1.csv NOT b01.csv (oh NOT zero).
- Repeat these steps to create k01.csv and emaxc.csv files from the worksheets in 3models.xls.
- The k01.csv data file should look like this:
A B E 1 #ID TIME DV 2 100.53 117.84 1214.15 1319.66 1424.47 1525.48 1626.59 1729.110 1830.611 1930.712 11030.4 - The emaxc.csv data file should look like this. Check that the column B heading is CONC.
A B C 1 #ID CONC DV 2 10-1.53 1118.94 1232.95 1360.16 1451.97 1568.58 1669.09 1774.810 1877.111 1972.612 11069.3

### Berkeley Madonna

- Open Berkeley Madonna shortcut in folder "Pharmacometrics Programs".
- Add code to the Equations window defining the STARTTIME, STOPTIME, integration stepsize (DT), the output interval (DTOUT), the model parameters and the model equation (code shown in Figure 1 for the bo1 model).
- Click on Run to run the model.
- Save your model with the name bo1.mmd (in your Pharmacometrics Data\User Defined Models\Madonna folder).
- View a table of times and concentrations by clicking on the Table icon in the Run 1 graph window.

Repeat the above steps for the k01 and emaxc models. Use 3models.xls to find the formula to code the model equation for "Conc" for k01 and "Effect" for emaxc. Use parameter names and values which are the same as those in the 3models.xls worksheets. Berkeley Madonna is not case sensitive but it is good practice to try and use the same case for any variables you use in the code.

STARTTIME = 0

STOPTIME = 10

DT = 0.02

DTOUT = 1

Dose = 100

CL = 3

V = 10

Conc = Dose/V*EXP(-CL/V*Time)

Figure 1. Code for bo1.mmd (Berkley Madonna)

Hint: for the Emax model which uses Conc as the independent variable you may wish to rename the default variable called Time with a RENAME statement:

RENAME Time=ConcPut this statement after the DTOUT statement.

STARTTIME, STOPTIME and DTOUT should still used to define the starting, ending and output interval values for Conc.You will need to add Effect as a variable to the graph by clicking on the graph and using the Graph menu.

### Monolix

- Open the Monolix shortcut in the folder "Pharmacometrics Programs".
- If you are asked for a Monolix licence key then open the "Pharmacometrics Programs\license.txt" file. Then copy and paste the Monolix license key. You will only need to do this the first time you use Monolix.
- Click on the 'New Project' icon.
- In Monolix, click 'Data' and use 'Browse' to locate your bo1.csv file. This can be done by copying the path to the folder from Windows Explorer into the Monolix "Load data file File name:" box. e.g. P:\My Pharmacometrics\Pharmacometrics Data\User Defined Models.
- Select bo1.csv and click 'Open'. The data information window will appear.
- Monolix should identify that your input file has a header
row and your data should appear under the headings: _ID TIME OBSERVATION. Click "Accept" and "Next" if Monolix has assigned the
correct headings.

You can select the type of data associated with the headings manually by clicking the drop down menus. If the independent variable is not TIME then select REGRESSOR for the type. You will need to do this for the CONC variable for the emaxc.csv file.

- Instead of using the model library, we will use MLXTRAN to write out the bo1 model. Start by opening a text editor (e.g. Editplus) and enter the code shown in Figure 2.
- Save the model as bo1_mlxt.txt in a User Defined Models\Monolix folder.
- In Monolix, click 'Structural model' and then 'Browse'. Navigate to your "User Defined Models"
folder, select bo1_mlxt.txt and open it. You may have to wait a few seconds for the model to
compile.

If you get a compile error: Double check that your code is identical to that shown in the Figure and be sure to press ENTER after the last line of code - this is required to 'end' the last statement.

- Change the initial parameter estimates by editing their values at the bottom of the 'Initial Estimates' window that appears. A plot of predictions based on the model and initial parameter estimates will be displayed along with observed values. Once these are reasonable, click 'Set as initial values'.
- Click on 'Statistical Model and Tests'.
- Click on the small circle in the blue boxes to enable "Population Par", "Standard Err" and "Plots".
- Click on the grey bar next to the blue part of "Plots" and click to enable "Observed data", "Individual fits" and "SAEM". Disable all other options. Close the Plots options window.
- In the "Observation Model" section choose the "Error Model" called "COMBINED1" from the drop down list.
- In the "Individual Model" section click on "None" for Random Effects"

Because these are data from a single individual, there is no between subject variability (random effects). The SD of random effects is therefore 0.

- IMPORTANT: Save the project as bo1_project.mlxtran in your User Defined Models\Monolix folder before trying to "Run" the model.
- Estimate the parameters by clicking on 'Run'. This will take a while depending on the complexity of the model. During the estimation process you can see how the parameter estimates are being changed and settle down towards the final value.
- When the estimation finishes close the Monolix Scenario window.
- Click on 'Plots' and look at the individual fit. Use the "Display" options to show "Predicted Median" and "Prediction Interval" for the "Individual predictive check". You can alter the names of plot axes using the ‘Axes’ tab. When you are ready, you can export all of your plots together into your project folder automatically using the Export option in the main menu.
- View the parameter estimates by clicking 'Results'. A text file containing these results is saved in a 'pop_parameters.txt' in the project folder.
- Repeat the above steps for the k01 and emaxc models using the following MLXTRAN codes.
- When loading the data for the emaxc model set the column named CONC to REGRESSOR.
- Save each project as k01_project.mlxtran or emaxc_project.mlxtran before running the model.

INPUT:

parameter={V,CL} ; PK parameters

EQUATION:

dose=100 ; nominal dose

conc=dose/V*exp(-CL/V*t)

OUTPUT:

output = {conc}

IMPORTANT: MLXTRAN is case sensitive. Take care to be consistent with upper and lower case names.

INPUT:

parameter={V,CL} ; PK parameters

EQUATION:

dose=100 ; nominal dose rate

conc=dose/CL*(1-exp(-CL/V*t))

OUTPUT:

output = {conc}

Figure 3. Code for k01_mlxt.txt

INPUT:

parameter={Emax,C50} ; PD parameters

regressor={conc} ; Conc is the independent variable

EQUATION:

e=Emax*conc/(C50+conc) ; do not use 'effect'. This is a reserved word in Monolix.

OUTPUT:

output = {e}

Figure 4. Code for emaxc_mlxt.txt

### NONMEM

- Open the NONMEM shortcut in the folder "Pharmacometrics Programs".
- Change directory to the User Defined Models folder by
typing this command in the NONMEM window then press
<ENTER>.
cd User*

All commands shown in the green boxes should end by pressing <Enter>.

- Make a NONMEM directory then change to the NONMEM folder by
typing this command in the NONMEM window.
md NONMEM

You only have to make the directory once. You dont have to give the md NONMEM command when you are working on the K01 and emaxc problems.

- Change to the NONMEM directory from the User Defined Models
directory.
cd NONMEM
- Use the Windows Notepad (or EditPlus) to create the
following code in a file named bo1.ctl e.g. by using the command:

notepad bo1.ctl - Enter the code for bo1.ctl shown in Figure 2.

$PROB one compartment bolus

$INPUT ID TIME DV

$DATA ..\..\bo1.csv

$ESTIM METHOD=CONDITIONAL

$COV

$THETA

(0,3,) ; POP_CL

(0,10,) ; POP_V

$OMEGA

0 FIX ; ETA_CL

0 FIX ; ETA_V

$SIGMA 1 ; ERR_SD

$PRED

CL=THETA(1)*EXP(ETA(1))

V=THETA(2)*EXP(ETA(2))

CONC=100/V*EXP(-CL/V*TIME)

Y=CONC + ERR(1)

$TABLE ID TIME CL V Y

NOPRINT ONEHEADER FILE=bo1.fit

Figure 2. NM-TRAN code for bo1.ctl (NONMEM)

The $OMEGA parameters are FIXed to 0. This is because there is only one individual being modelled. These random effects parameters are used to describe between subject variability when there is more than one individual.

- Save the file. Check using Windows Explorer that you can find the file "bo1.ctl" in the User Defined Models\NONMEM folder.
- The data file name must match in the $DATA record
of the bo1.ctl file.

In order for NONMEM to find the data file the $DATA record has to include a path relative to the folder where NONMEM is executed. This is why the '..\..\' is put before the name of the data file which is located in the User Defined Models folder.

- Execute NONMEM with this command in the NONMEM window:

nmgo bo1

When you get errors from NONMEM with the nmgo command then please read the error message carefully and try to understand what it is telling you. The usual errors that occur with these example problems will give you some clues to what you might need to change in your ctl file.

Commands can be recalled by using the Up arrow on your keyboard. You can easily repeat the command without more typing or edit it to save the amount of typing you do.

- Use
the Windows Notepad or EditPlus to open this file and look at the saved
results in the '.smr' file. The results are saved in a folder with the
same name as the ctl file but with the extension '.reg' e.g. use the
following command or open the file using the Windows Explorer.

notepad bo1.reg\bo1.smr

- Create a plot of TIME versus Y (predicted conc) and DV ( observed conc) using the bo1.fit file in the bo1.reg results folder.
- Use Excel to open the fit file. You need to reformat the Excel file in order to see its contents.
- Select column A then click on the "Data" menu item then click on "Text to Columns". Click on "Finish". This will separate the values into separate columns.
- Select all cells in the worksheet (e.g. press ctrl-A) then right click on a cell and click on "Format cells". Click on the "General" option.. This will make the numeric values easier to read.
- Delete row 1 which contains the value "TABLE 1".
- Click on cell A1 then click on the "Data" menu item then click on "Data Filter". This will let you choose rows with specific values of DVID by clickiing on the arrow in the corner of the column with the heading "DVID".
- Filter the rows using DVID=1 then create a graph of TIME versus Y (predicted conc) and DV ( observed conc).
- Repeat the above steps for the K01 and emaxc models. For NONMEM you should change the bo1.ctl file and save it with a suitable name such as K01.ctl or emaxc.ctl.
- Change the NONMEM model code as follows for the K01 and emaxc data.

$INPUT ID TIME DV

$DATA ..\..\k01.csv

$ESTIM METHOD=CONDITIONAL

$COV

$THETA

(0,3,) ; POP_CL

(0,10,) ; POP_V

$OMEGA

0 FIX ; ETA_CL

0 FIX ; ETA_V

$SIGMA 1 ; ERR_SD

$PRED

CL=THETA(1)*EXP(ETA(1))

V=THETA(2)*EXP(ETA(2))

CONC=100/CL*(1-EXP(-CL/V*TIME))

Y=CONC + ERR(1)

$TABLE ID TIME CL V Y

NOPRINT ONEHEADER FILE=k01.fit

Figure 3. NM-TRAN code for k01.ctl (NONMEM)

$INPUT ID CONC DV

$DATA ..\..\emaxc.csv

$ESTIM METHOD=CONDITIONAL

$COV

$THETA

(0,100,) ; POP_EMAX

(0,3,) ; POP_C50

$OMEGA

0 FIX ; ETA_EMAX

0 FIX ; ETA_C50

$SIGMA 1 ; ERR_SD

$PRED

EMAX=THETA(1)*EXP(ETA(1))

C50=THETA(2)*EXP(ETA(2))

EFFECT=EMAX*CONC/(CONC+C50)

Y=EFFECT + ERR(1)

$TABLE ID CONC EMAX C50 Y

NOPRINT ONEHEADER FILE=emaxc.fit

Figure 4. NM-TRAN code for emaxc.ctl (NONMEM)

### Learning

- Show a graph of the Excel simulated data for each model.
- What do the model parameters mean?
- What are the kinds of experimental data they might be used to describe?
- Can you suggest a re-parameterization for each of the models?