How to define error

I don’t understand how to define the error model (relating dv to cp) in Pumas. Can anyone share a link where I can read about error concepts in Pumas?

Hi Sai,

Thank you for asking this. I am still trying to learn myself, but I will give my interpretation as I don’t see a full explanation in the documentation. I request the experts here to clarify.

  • Let us say that the measured/observed concentration in your dataset is dv.
  • We use a structural model that is defined by a set of parameters to come up with individual predictions for each subject. Let’s say that these individual predictions are derived to be cp. e.g. from a one compartment model, cp = Central/V.
  • These individual predictions cp will be different from the observed dv due to random errors that we usually model as random unexplained variability.
  • The distribution of these random errors has a mean and a variance, and we usually assume that these errors are normally distributed with zero mean and a variance of σ_prop^2.
  • Hence, we can say that the dv comes from a distribution with a mean and variance.
  • To put this into syntax as it is written in the
    @derived block => dv ~ @. Normal(cp, sqrt(cp^2*σ_prop))
  • Above, cp represents the mean of the distribution at each time point and σ_prop is the variance (here we are using a proportional error model, and hence we multiply cp^2 by σ_prop).
  • The reason why we take a sqrt of the variance term, sqrt(cp^2*σ_prop), is (I think) because the Normal function takes parameters of mean and standard deviation as per this link (note that sd = sqrt(variance)).
  • The one thing I don’t clearly understand is the @. before the actual distribution. @pkofod, can you please explain?
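The bullets above can be sketched numerically. This is a hedged illustration in plain Python rather than Pumas syntax; cp = 10 and σ_prop = 0.04 are placeholder values, not values from any real fit:

```python
import math
import random

random.seed(0)

cp = 10.0          # hypothetical individual prediction at one time point
sigma_prop = 0.04  # hypothetical proportional-error *variance* parameter

# dv ~ Normal(cp, sqrt(cp^2 * sigma_prop)): Normal takes a mean and a
# standard deviation, so the variance cp^2*sigma_prop is converted to
# an sd before sampling.
sd = math.sqrt(cp**2 * sigma_prop)
dv = random.gauss(cp, sd)

print(sd)  # 2.0, i.e. a 20% CV around cp = 10
```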

I apologize if this is a long explanation, but I was hoping that I would better understand it myself by writing it out in this manner. I would however request the other experts in this forum to correct me. I would love to know more about the other kinds of error models that we can usually specify in NONMEM and how to code them up.



Hi Madhu
Thank you very much for sharing this information

Thanks for writing this up. Let σ be the standard deviation, then the common error models are


Additive

dv ~ @. Normal(cp, σ)

Proportional

dv ~ @. Normal(cp, abs(cp)*σ)

(Taking the absolute value shouldn’t be necessary in most cases but when using an ODE solver, there is a small risk of a negative cp value if observed long after a dose.)

Proportional and additive

dv ~ @. Normal(cp, sqrt((cp*σ₁)^2 + σ₂^2))
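As a quick numeric sketch of the three Gaussian error models above (Python, with illustrative placeholder values cp = 10, σ = 0.5, σ₁ = 0.2, σ₂ = 0.5), the standard deviations they pass to Normal are:

```python
import math

cp = 10.0      # hypothetical prediction
sigma = 0.5    # additive sd
sigma1 = 0.2   # proportional sd in the combined model
sigma2 = 0.5   # additive sd in the combined model

sd_additive = sigma                                    # Normal(cp, σ)
sd_proportional = abs(cp) * sigma1                     # Normal(cp, abs(cp)*σ)
sd_combined = math.sqrt((cp * sigma1)**2 + sigma2**2)  # Normal(cp, sqrt((cp*σ₁)^2 + σ₂^2))

print(sd_additive, sd_proportional, sd_combined)
```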

Beyond the Gaussian error models

An advantage with the Pumas approach where we specify the conditional distribution of the dependent variable explicitly is that you are free to use other distributions as well. E.g. it can be argued that the (Gaussian) proportional error model is just an approximation of a Gamma model. Hence, instead of approximating the Gamma model with the proportional error model, you could write

dv ~ @. Gamma(ν, cp/ν)

where ν is the dispersion parameter. An advantage with this approach is that you avoid the risk of negative concentrations when simulating this model. A potential drawback is that you might need to remove any zero concentrations from your sample when estimating as zero is often unlikely in a Gamma distribution. For the time being, you’d have to use LaplaceI for estimation when using the Gamma distribution but there is no reason why we couldn’t add support for using FOCEI as well.
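A small sketch of the Gamma parameterization above, using Python’s standard library (ν = 25 and cp = 10 are illustrative values): with shape ν and scale cp/ν, the mean is ν·(cp/ν) = cp, the variance is ν·(cp/ν)² = cp²/ν, so 1/√ν plays the role of the CV, and draws are always positive:

```python
import random

random.seed(1)

cp = 10.0   # hypothetical prediction
nu = 25.0   # hypothetical dispersion parameter; CV = 1/sqrt(nu) = 20%

# Gamma(ν, cp/ν): mean = ν*(cp/ν) = cp, variance = ν*(cp/ν)^2 = cp^2/ν
draws = [random.gammavariate(nu, cp / nu) for _ in range(10_000)]

mean = sum(draws) / len(draws)
print(round(mean, 1))   # close to cp = 10
print(min(draws) > 0)   # Gamma draws never go negative
```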

Yet another alternative you could consider is LogNormal, i.e.

dv ~ @. LogNormal(log(cp), σ)

which is very similar to the Gamma and proportional error models for small σ, though the relationship between the conditional mean and cp is nicer for Gamma. For the LogNormal, both the FOCEI and LaplaceI approximations of the marginal likelihood are supported.
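A hedged sketch of the LogNormal model above (Python; cp = 10 and σ = 0.1 are illustrative values): LogNormal(log(cp), σ) has median cp, mean cp·exp(σ²/2), and CV √(exp(σ²) − 1) ≈ σ for small σ, which is why it behaves much like the proportional model there:

```python
import math
import random

random.seed(2)

cp = 10.0
sigma = 0.1  # small sigma: LogNormal ≈ proportional Normal

draws = [random.lognormvariate(math.log(cp), sigma) for _ in range(20_000)]

# Mean is slightly above cp; CV is approximately sigma for small sigma.
theoretical_mean = cp * math.exp(sigma**2 / 2)
cv = math.sqrt(math.exp(sigma**2) - 1)

print(round(theoretical_mean, 3))  # just above cp = 10
print(round(cv, 3))                # approximately sigma = 0.1
```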

The reason why you have to add the @. when specifying the distribution is that you are specifying the distribution along the time dimension for a single subject, i.e. dv is actually a vector of the observations for a subject. In Julia, @. means that the expression right after it is broadcast to each element of the vector. So effectively we loop through the cp vector and generate a distribution for dv for each element of cp.
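The broadcasting described above can be mimicked in plain Python as a conceptual sketch (in Julia the @. macro does this rewriting for you; the cp values and σ_prop here are placeholders):

```python
import math

cp = [12.0, 8.5, 4.2, 1.1]   # hypothetical vector of predictions over time
sigma2_prop = 0.04           # hypothetical proportional variance parameter

# dv ~ @. Normal(cp, sqrt(cp^2*σ_prop)) effectively builds one
# (mean, sd) pair per element of the cp vector:
dists = [(c, math.sqrt(c**2 * sigma2_prop)) for c in cp]

for mean, sd in dists:
    print(mean, sd)
```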

Please let us know if there are other models that you are interested in and we can try to update the post with examples. The documentation and tutorials will also get updated over time.


Thank you very much for sharing this information

Thank you so much for the write-up @andreasnoack. I will have many questions coming up, but can you please explain the error model below?

The reason we square the σ terms and then add them (and then take a square root to get the standard deviation back) — does this have anything to do with how variance terms can be added but standard deviation terms cannot? It would be helpful if you can expand on that here.


In NONMEM, you write a regression type expression like Y = F(1 + EPS(1)) + EPS(2) which in math would be

Y = f(\theta,\eta)(1 + \varepsilon_1) + \varepsilon_2

where \varepsilon_1 and \varepsilon_2 are Gaussians with variances \sigma_1^2 and \sigma_2^2 and typically uncorrelated. Hence you have that

\mathrm{E}(Y|\eta) = f(\theta,\eta) \\ \mathrm{Var}(Y|\eta) = f(\theta,\eta)^2 \sigma_1^2 + \sigma_2^2

In Pumas, you specify the conditional (on the random effect) distribution of the dependent variable and the Normal in Julia is parameterized by its standard deviation so you need to take the square root and you end up with

Normal(cp, sqrt((cp*σ₁)^2 + σ₂^2))

if you parameterize with two standard deviation parameters. I hope that answers your question.
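The algebra above can be checked numerically (Python; cp = 10, σ₁ = 0.2, σ₂ = 0.5 are just example values): variances of independent terms add, standard deviations do not, and the sd passed to Normal is the square root of the combined variance:

```python
import math

cp = 10.0
sigma1 = 0.2  # proportional sd
sigma2 = 0.5  # additive sd

# Var(Y|eta) = f^2*sigma1^2 + sigma2^2 because the two epsilon terms
# are independent, so their variances add.
var_y = cp**2 * sigma1**2 + sigma2**2
sd_y = math.sqrt(var_y)   # what Normal(cp, ...) expects

print(var_y, sd_y)
# Note sd_y != cp*sigma1 + sigma2: standard deviations are not additive.
```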


Thank you. That cleared it up. This post is going into my bookmarks :slight_smile:

Can I use this forum to ask for a layman’s explanation of what conditional (on the random effect) means? I frequently see conditional and marginal mentioned in terms of distributions and in NONMEM lingo. Is there a simple way you could map it to the code with examples?

My naive understanding: the conditional distribution of the dependent variable (on the random effect) is basically the predicted concentration distribution of the individual after taking into account the random errors? In that case, would the unconditional distribution be the individual concentration predictions of the subjects without the random epsilons?

In this context, the conditional distribution refers to the unsystematic part of the randomness in the dependent variable for a specific subject. There are (typically) two sources of randomness for a subject: one that is fixed within the subject but varies randomly across subjects, called a random effect with the subject id as the blocking factor, and one that varies completely randomly between any two samples, which is the type mentioned in the first sentence.

The two sources of randomness enter the model differently. First, the random effect enters to create individual-specific parameters and consequently individual predictions. These individual predictions are then (typically) the conditional mean of the noise distribution added at the second level, i.e. this distribution is specified conditionally on the first level of randomness, which is sometimes called the intersubject variability.

In short, when specifying a non-linear mixed effect model, the process can be described roughly as

  • Specify the distribution of the random effects, say η_i, which are assumed to be independent of everything else.
  • Specify the (nonlinear) conditional mean, say \mu_i = \mathrm{E}(y_i | \eta_i) = f(\theta, \eta_i, x_i), of the observed variable in terms of population parameters (\theta), covariates (x_i), and random effects (\eta_i).
  • Specify the conditional distribution of the dependent variable as a function of the mean, y_i | \eta_i \sim D(\mu_i, \phi), where \phi is called a dispersion parameter and is not always present.

In pharmacometrics it seems to be popular to set y_i | \eta_i \sim N(\mu_i, \mu_i^2 \sigma^2), which means that the dispersion depends on the data, which makes things much more complicated. As discussed above, there are other options such as the Gamma distribution where you can avoid that.



Hi Andreas,

I want to quickly pick this up from here:
Y = F(1 + EPS(1)) If this is my error model on NONMEM and to understand with an example with 20% variability : Y = 10(1+0.04) = 10.4 (10.2)

So in Pumas, if it is modeled as Normal(cp, sqrt((cp*σ)^2)), can you please explain with a similar example what values go in? Whether it is σ = 0.2, the SD, or σ = 0.04, the variance. I want to know how Pumas arrives at this value.

I’m afraid I don’t understand the example. Could you please try to rephrase?

How does a proportional error work in NONMEM and Pumas with IPRED (F) = 10 and 20% variability as the assumption?

The correct way to model it in Pumas would be Normal(cp, sqrt((cp*σ)^2)) or Normal(cp, sqrt(cp^2*σ))?

par = (σ = 0.04 or 0.2)?

You can do that either with

Normal(cp, cp*σ)

with σ = 0.2 or

Normal(cp, cp*sqrt(σ²))

with σ² = 0.04. I prefer the former. I’d also recommend adding the square to the name when it’s a variance, to avoid confusion.
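The equivalence of the two parameterizations can be checked directly (Python; cp = 10 matches the IPRED example above): with sd σ = 0.2 or variance σ² = 0.04, both forms give the same standard deviation of 2:

```python
import math

cp = 10.0        # IPRED (F) in the example
sigma = 0.2      # sd parameterization: Normal(cp, cp*σ)
sigma_sq = 0.04  # variance parameterization: Normal(cp, cp*sqrt(σ²))

sd_from_sd = cp * sigma
sd_from_var = cp * math.sqrt(sigma_sq)

print(sd_from_sd, sd_from_var)  # both 2.0
```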

Hi @andreasnoack
dv ~ @. Normal(cp, sqrt(cp^2*σ_prop)) — this is the error model defined in the workshop examples.

Which one is equivalent to that, Normal(cp, cp*σ) or Normal(cp, cp*sqrt(σ²))?

This is the same as dv ~ @. Normal(cp, cp*sqrt(σ²)), i.e. σ_prop is a variance parameter in that example. The more appropriate name would have been σ²_prop.

Thanks @andreasnoack.

thankyou @andreasnoack

Correct me if I’m wrong, but isn’t it sd = sqrt(variance), or put another way, variance = (sd)^2?
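That’s right, and it takes only a one-line check (Python, with an arbitrary example value):

```python
import math

sd = 0.2
variance = sd**2  # variance = sd^2, i.e. sd = sqrt(variance)

print(math.isclose(math.sqrt(variance), sd))  # True
```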