Negative PRED during estimation

Hello Experts,

I ran some estimation for my data. When I used predict() to get the prediction from the model estimation, I found there are five observations for one patient and one observation from another patient, six predictions in total, have negative values for population prediction, DV.pred, which is improbable. But, the DV.ipred are good. Do you happen to have any ideas about why the negative prediction would ever happen? Any help is appreciated. Thank you!

You are right that both the individual predictions \mathrm{E}(y|\eta=\eta^*) (for any choice of \eta^*) and the population predictions \mathrm{E}(y) are supposed to be positive when y models drug concentrations. However, E(y) is a complicated integral and typically we approximate it by linearizing the conditional expectation \mathrm{E}(y|\eta) since \mathrm{E}(y)=\mathrm{E}(\mathrm{E}(y|\eta)). So if we denote f(\eta) = \mathrm{E}(y|\eta) then we have

\mathrm{E}(y) = \mathrm{E}(\mathrm{E}(y|\eta)) = \mathrm{E}(f(\eta)) \approx \mathrm{E}(f(\eta^*) + f'(\eta^*)(\eta - \eta^*)) = f(\eta^*) - f'(\eta^*)\eta^*

i.e. the first order approximation of the population prediction is the individual prediction minus a first order term and this quantity might sometimes be negative even though \mathrm{E}(y) is positive.

Negative approximate population predictions suggest that a first-order approximation might be too rough and you might want to consider a Monte Carlo approximation of the integral instead. You can currently do that in Pumas per subject with

Pumas.epredict(fpm, subject, nsim)

where fpm is your model fit and nsim is the number of simulations used in the Monte Carlo integration. For a large enough nsim you shouldn’t see any negative predictions. (Be warned, though, that this API will most likely change soon.)

1 Like

I should add that if you believe that a first order approximation of the conditional expectation isn’t sufficient then it implies that your estimates probably aren’t trustworthy either if they are based on one of the approximate methods such as FOCE(I) or LaplaceI. In that case, you’d want to consider an estimation method with a better approximation of the marginal likelihood such a adaptive Gauss-Hermite or SAEM. We are currently testing these methods and it hopefully won’t be too long before we can include them in a release.

1 Like

@andreasnoack I understand now. Thank you very much Andreas!

I’m also interested in this topic and I got the same issue. I don’t know the detailed procedure of estimation with approximation in each estimation algorithm, but population prediction (PRED) is just function of population parameter of structural model. So I’m just wondering it would be possible to calculate directly using final parameter, rather than derived based on the above approximation equation.

I compared PRED and IPRED of Pumas and NONMEM as below.


It could look like this is because of differences in nomenclature between NONMEM and Pumas. In Pumas, we use the approximation from the estimation to compute the population predictions. It looks like the NONMEM population predictions that you are plotting are the ones based on \eta^*=0 since there are several identical predictions. In Pumas, you should be able to get the same value by passing a last Pumas.FO() argument to the predict function. I think that NONMEM’s CPRED or CPREDI would produce the same prediction as the Pumas population predictions in your plot since they are based on setting \eta^* to the mode of penalized likelihood.

Thank you. I tested CPRED in NONMEM and it matched exactly with dv_PRED in Pumas as below.

Also, dv_PRED obtained by predict(result, Pumas.FO()) is matched with PRED in NONMEM.

It would be better if the default value of population prediction is PRED obtained using Pumas.FO() argument.

I’m not so sure about that. It’s worth a discussion point.