Predict function after Bayesian estimation

Hello, I am using the predict or simobs functions to sample from the posterior distribution for each subject after Bayesian estimation, as in the following code:

Random.seed!(12345)
sims = vcat([DataFrame(predict(result_PKPD; subject = i, samples = 1000, obstimes = 0:0.5:125)) for i in 1:343]...)
Random.seed!(12345)
simulations = vcat([DataFrame(simobs(result_PKPD; subject = i, samples = 1000, simulate_error = false, obstimes = 0:0.5:125)) for i in 1:343]...)

This is how my derived block looks:

      @derived begin
          CP = @. Central / Vc
          E = @. alpha + (Beta * CP)
          CONC ~ @. Normal(CP, CP * σ_PK)
          PTT ~ @. Normal(E, σ_PD)
      end

In the output, both the CONC and CP results are exactly the same. I read in the documentation that for CONC, if I pass simulate_error = false, it will use the estimated mean rather than sampling the residual error. Is there a way to get CP in the output without the added residual error (as is the case for CONC)?

thanks

Hello @ahmed.salem. predict and simobs with simulate_error = false are identical in that they give the posterior predictive distribution of predictions. The prediction is the conditional mean of the random variables in the derived block (CONC and PTT). The conditional means in your model are CP and E. If you want to simulate the residual error in the derived block, you can use simobs with simulate_error = true. Does this make sense?
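To make the distinction concrete, here is a minimal sketch using the same result_PKPD fit object and keyword arguments from your code (output column names may vary by Pumas version):

    # Posterior predictive means, no residual error sampled:
    # CONC equals CP and PTT equals E in the output.
    sim_mean = simobs(result_PKPD; subject = 1, samples = 1000, simulate_error = false, obstimes = 0:0.5:125)

    # Posterior predictive draws with residual error:
    # CONC is sampled from Normal(CP, CP * σ_PK) and PTT from Normal(E, σ_PD).
    sim_err = simobs(result_PKPD; subject = 1, samples = 1000, simulate_error = true, obstimes = 0:0.5:125)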


Hi @mohamed82008, thanks for the reply, this is very helpful. I also have a couple of questions:

1- I saw in the tutorial that the Omega in the @param block is written as # Mode at diagm(fill(0.2, 3)) Ω ~ InverseWishart(6, diagm(fill(0.2, 3)) .* (6 + 3 + 1)). Is there a reason it is written in that format rather than just Ω ~ InverseWishart(6, diagm([2.0, 2.0, 2.0]))? Both should give the same matrix. I am not sure if this is something to make the computation more efficient.

2- Is there a statistical criterion to compare competing models (for example, a model with and without a certain covariate)? In NONMEM I get the Bayesian information criterion (BIC) for each iteration and the global BIC so that I can compare models. Can I get this statistic (or a similar one) in Pumas?

3- When I write the Pumas model, I specify the initial estimates for all parameters, but then I specify them again in a NamedTuple and pass that into the fit function. Which initial parameters does the fit function use? Or is the second one just there to make the fit function work, since a NamedTuple is generally required by fit()?

4- This is a general question about Bayesian estimation. In the tutorial, when the fixed effects are specified, they are written like this: θ ~ Constrained(MvNormal([2.0, 0.2, 0.8, 2.0], Diagonal(ones(4))), lower = zeros(4), upper = fill(10.0, 4), init = [2.0, 0.2, 0.8, 2.0]), so we consider the variance (or uncertainty) in our priors to be 1. In the literature, authors usually use terms like strong, moderate, and weak priors. Is it fair to consider a variance of 1 a strong prior, 10 a moderate prior, and 100 a weak prior? In other words, how would I decide whether my priors are weak, moderate, or strong?

Thanks!

Both are identical. The @param block is only evaluated once at model construction time. So both distributions evaluate to the same thing.
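For a quick check outside of Pumas, here is a minimal sketch using Distributions.jl and LinearAlgebra. The two scale matrices are numerically identical; the scaled form is just written so the comment can state the prior mode directly, since the mode of InverseWishart(ν, Ψ) with a p×p scale matrix is Ψ / (ν + p + 1):

    using Distributions, LinearAlgebra

    # Both scale matrices are the same: 0.2 * (6 + 3 + 1) == 2.0 on the diagonal
    Ψ1 = diagm(fill(0.2, 3)) .* (6 + 3 + 1)
    Ψ2 = diagm([2.0, 2.0, 2.0])
    Ψ1 == Ψ2                      # true

    # So the two priors are the same distribution, and the mode
    # Ψ / (ν + p + 1) = Ψ1 / 10 recovers diagm(fill(0.2, 3))
    mode(InverseWishart(6, Ψ1))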

Is there a statistical criterion to compare competing models (for example, a model with and without a certain covariate)? In NONMEM I get the Bayesian information criterion (BIC) for each iteration and the global BIC so that I can compare models. Can I get this statistic (or a similar one) in Pumas?

In Bayesian inference, the recommended metric is the expected log predictive density (ELPD), which is a measure of predictive accuracy on unseen data. In Pumas 2.3, you can do cross-validation followed by an estimate of the ELPD, which you can use to compare models. The reason this is better than AIC or BIC is that it directly penalises under-fitting and over-fitting by using predictive accuracy on unseen data as the model evaluation criterion. Unfortunately, you have to wait for Pumas 2.3 to get that. Documentation will be made public by ACoP.
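Conceptually, a cross-validated ELPD estimate is just the sum over held-out observations of the log of the average predictive density across posterior draws. Here is a minimal, Pumas-independent sketch (loglik is a hypothetical S × N matrix of held-out pointwise log-likelihoods, one row per posterior draw; it is not produced by any specific Pumas call shown here):

    # loglik[s, i] = log p(y_i | θ_s) for held-out observation i under posterior draw θ_s
    function elpd_estimate(loglik::AbstractMatrix)
        S = size(loglik, 1)
        # log-mean-exp over draws for each observation, computed stably, then summed
        pointwise = map(eachcol(loglik)) do col
            m = maximum(col)
            m + log(sum(exp.(col .- m)) / S)
        end
        return sum(pointwise)
    end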

Which initial parameters does the fit function use?

The parameters you pass to the fit function are used. The values hard-coded in the model are returned by the Pumas.init_params(model) function and can be considered the default initial parameters. Also, defining init directly in the model is optional, so you don’t have to pass init, e.g. in Constrained.
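So a common pattern is to pass those defaults straight through to fit. A sketch, assuming a model and population like yours (the parameter name tvcl and the bare BayesMCMC() call are placeholders; keep whatever sampler options you already use):

    # Initial values hard-coded in the model
    params = Pumas.init_params(model)

    # Optionally override individual entries before fitting
    params = (; params..., tvcl = 1.5)   # tvcl is a hypothetical parameter name

    result = fit(model, population, params, Pumas.BayesMCMC())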

In the literature, authors usually use terms like strong, moderate, and weak priors. Is it fair to consider a variance of 1 a strong prior, 10 a moderate prior, and 100 a weak prior? In other words, how would I decide whether my priors are weak, moderate, or strong?

Strength is relative. What matters is the relative strength of the prior and likelihood. The same prior can be considered strong if we have only 10 subjects but not that strong if we have 1000 subjects. The prior encodes domain knowledge about the parameters. This domain knowledge is typically obtained from previous similar studies’ posteriors. The more data we have, the less important the prior becomes because the likelihood term dominates the posterior.

If you want some more math, the probability density function (PDF) of the posterior distribution is proportional to the PDF of the joint distribution of the parameters and observations, which is the product of the prior PDF and likelihood function. Let \theta be the parameters and D be the observed data:

p(\theta | D) = \frac{p(D, \theta)}{p(D)} = \frac{p(D | \theta) \times p(\theta)}{p(D)}

p(\theta | D) \propto p(D | \theta) \times p(\theta)

The product of the prior PDF p(\theta) and likelihood function p(D | \theta) is what matters, therefore strength is relative.
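As a toy illustration of that relativity: for a Normal likelihood with known variance σ² and a Normal prior with variance τ² on the mean, the posterior mean is a precision-weighted average of the prior mean and the sample mean, so the prior’s weight shrinks as n grows. A minimal sketch (all numbers below are made up for illustration):

    # Weight of the prior in the posterior mean for a conjugate Normal–Normal model
    prior_weight(n, σ2, τ2) = (1 / τ2) / (1 / τ2 + n / σ2)

    prior_weight(10,   4.0, 1.0)   # ≈ 0.29  → the prior is influential
    prior_weight(1000, 4.0, 1.0)   # ≈ 0.004 → the likelihood dominates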

Typically, you want to check the prior predictive distribution and make sure it has decent coverage of the data. In Pumas 2.3, we will have an easy way to simulate from the prior predictive distribution and plot a VPC of the predictions next to the data. A prior that is too weak will give predictions with a very wide coverage of the data. A prior that is too strong might give predictions with low uncertainty compared to the data. The worst type of prior is a strong and heavily biased one that gives predictions far away from the data with low uncertainty. This is like walking into a burger shop with very high confidence that they sell pizzas; you will need to observe more than a few customers to realise that you might not be in a pizza shop.


Thanks so much for the detailed answers. Looking forward to seeing these nice features in Pumas 2.3.
