# SAEM algorithm parameters

Hi!
I’m currently using SAEM algorithm to perform estimation, and I had a question about the parameters we can modify in Pumas.SAEM(): does the “repeat” parameter correspond to the number of chains used in a Monte Carlo Markov Chain version of the algorithm (such as in saemix or Monlolix for instance)?

Lucie

No, `repeat` restarts SAEM `repeat-1` times, using the final values of the previous run as the initial for the next, except for the variance parameters of the random effects and dispersion parameters of the error model, which are reset to the initial values.
Additionally, the `λ` parameter is squared on each repeat, letting the reset parameters approach the MLE more quickly.

2 Likes

Ok, thanks for the clarification!

Dear Dr Elrod - What might be situation that calls for the use of FOCE vs SAEM algorithms? Is the model evaluation (model fit diagnosis) approach for SAEM identical to FOCE?
Bob

I am also interested in using SAEM. How it is better than FOCE. In what situations we should apply SAEM .

@bobbrown @Diptimangal I am no expert by any means, but from what I can tell SAEM is beneficial when you have complex models (PBPK, lots of metabolites, etc.) and sparse data. The paper I’m linking below did a full comparison of FOCEI and SAEM performance and talks a bit about the advantages & disadvantages of SAEM in the discussion. The two main points I took away was 1) SAEM can be useful for sparse data and 2) the SAEM sampling is highly adjustable and can therefore be tuned for optimal performance.

I need to employ SAEM estimations in an upcoming project and would be grateful for any other information people may have.

@adunn34 Thanks for sharing the paper. SAEM is new to me. But my understanding is:
SAEM can be used for large random effect error. In statistics course we came to know that if data has overdispersion, meaning error is larger than mean, then estimated parameter standard error could be questionable. To account the overdispersion we adjusted the dispersion parameter by using negative binomial distribution. Similarly, SAEM is like negative nominal distribution and FOCE is more like other distributions. FOCE can be used when error is not larger than mean.

1 Like

@Diptimangal @adunn34 Perhaps we learn SAEM together, as we might be by ourselves. Here is what I have done. If you followed my nibr data discussion. I used that dataset to run ML (FOCE) and EM (SAEM) estimations using this code. Might I request you to see if you are able to reproduce my results? rectify any errors? improve? I was unable to make some informative plots (FOCE estimates vs SAEM estimates; ipred FOCE vs ipred SAEM, pred…). May be you will be able to.

``````using Pumas
using PumasUtilities
using Dates
using CairoMakie
using AlgebraOfGraphics
using DataFramesMeta
using Random

#################
# TIME in hours
# CONC in ug/L
# AMT_UG in ug
# WEIGHTB in kg
# DOSE in mg
#################

observations   =   [:CONC],
id             =   :ID,
time           =   :TIME,
amt            =   :AMT_UG,
cmt            =   :CMT,
evid           =   :EVID,
covariates     =   [:WEIGHTB, :SEX, :DOSE]
)

ka1cmt = @model begin

@param begin
tvka     ∈ RealDomain(lower=0.001)
tvcl     ∈ RealDomain(lower=0.001)
tvvc     ∈ RealDomain(lower=0.001)
Ω        ∈ PDiagDomain(3)
end

@random begin
η        ~ MvNormal(Ω)
end

@covariates WEIGHTB DOSE SEX

@pre begin
Ka       = tvka  *  exp(η)
CL       = tvcl  *  exp(η)
Vc       = tvvc  *  exp(η)
end

@dynamics Depots1Central1

@derived begin
cp         = @. (Central/Vc)
end
end

####
#tvka = 1/hr
#tvcl = L/hr
#tvvc = L
###

param = (tvka    = 0.5,
tvcl    = 6000.0,
tvvc    = 200000.0,
Ω       = Diagonal([0.05,0.05,0.05]),

# Start initial estimates explorer app
ee_1cmp = explore_estimates(ka1cmt,
pop_nmle,
param)

ka1cmt_fit = fit(ka1cmt,
pop_nmle,
param,
#constantcoef=(tvka = 10,),
Pumas.FOCEI())

#
ka1cmt_inspect = inspect(ka1cmt_fit)
ka1cmt_evaluate = evaluate_diagnostics(ka1cmt_inspect)
ka1cmt_inspect_df = DataFrame(ka1cmt_inspect)
sort!(ka1cmt_inspect_df, ([:id,:time]))

ka1cmt_infer   = infer(ka1cmt_fit)
ka1cmt_icoef = reduce(vcat, DataFrame.(icoef(ka1cmt_fit)))
@rtransform! ka1cmt_icoef method = "foce"

################
## SAEM Model
################

ka1cmt_saem = @emmodel begin
@param begin
tvka ~ 1 | LogNormal # RealDomain(lower=0.01)
tvcl ~ 1 | LogNormal # RealDomain(lower=0.2)
tvvc ~ 1 | LogNormal # RealDomain(lower=0.1)
end
@random begin
η1 ~ 1 | Normal
η2 ~ 1 | Normal
η3 ~ 1 | Normal
end
@covariance (1, 1, 1)
@pre begin
Ka = tvka * exp(η1)
CL = tvcl * exp(η2)
Vc = tvvc * exp(η3)
end
@dynamics Depots1Central1
@post begin
cp = @. Central / Vc
end

#@derived begin # DERIVED BLOCK DOES NOT WORK IN SAEM - ERROR: LoadError: LoadError: "Macro block @derived not supported."
#  cp         = @. (Central/Vc)
#  CONC       ~ @. Normal(cp, sqrt(σ²_add)) #THIS FORMAT OF RESIDUAL ERROR MODEL DOES NOT WORK FOR SAEM
#end

@error begin
CONC ~ Normal(cp)
end
end

param_saem = (
tvka = 0.25,
tvcl = 6000.0,
tvvc = 200000.0,
# Random effects must be initialized as well
η1 = 0.0,
η2 = 0.0,
η3 = 0.0,
Ω       = (0.05,0.05,0.05),
)

## Fit base model with SAEM
ka1cmt_em_fit = fit(ka1cmt_saem, pop_nmle, param_saem, Pumas.SAEM())

ka1cmt_em_fit = fit(ka1cmt_saem, pop_nmle, param_saem, Pumas.SAEM(), ensemblealg = EnsembleThreads(), rng=rngv)

ka1cmt_em_inspect = inspect(ka1cmt_em_fit)
ka1cmt_em_evaluate = evaluate_diagnostics(ka1cmt_em_inspect)
ka1cmt_em_inspect_df = DataFrame(ka1cmt_em_inspect)
sort!(ka1cmt_em_inspect_df, ([:id,:time]))

ka1cmt_em_infer   = infer(ka1cmt_em_fit,level = 0.95)
ka1cmt_em_icoef = reduce(vcat, DataFrame.(icoef(ka1cmt_em_fit)))

@rtransform! ka1cmt_em_icoef method = "saem"

###########
ka1cmt_icoef = @chain ka1cmt_icoef begin
@transform(:CLfoce = :CL)
@transform(:Vcfoce = :Vc)
@transform(:Kafoce = :Ka)
@select(:id,:CLfoce,:Vcfoce,:Kafoce)
end

ka1cmt_em_icoef = @chain ka1cmt_em_icoef begin
@transform(:CLsaem = :CL)
@transform(:Vcsaem = :Vc)
@transform(:Kasaem = :Ka)
@select(:id,:CLsaem,:Vcsaem,:Kasaem)
end

icoef_foce_saem = innerjoin(ka1cmt_icoef,ka1cmt_em_icoef, on=:id,makeunique=true)

icoef_foce_saem = @chain icoef_foce_saem begin
@transform(:dCL = (:CLsaem - :CLfoce)*100 ./ :CLfoce)
@transform(:dVc = (:Vcsaem - :Vcfoce)*100 ./ :Vcfoce)
@transform(:dKa = (:Kasaem - :Kafoce)*100 ./ :Kafoce)
@select(:id,:dCL,:dVc,:dKa)
end

summarize(icoef_foce_saem;
parameters = [:dCL,:dVc,:dKa],
stats = [NCA.extrema, NCA.mean, NCA.std])

#
inspect_foce_saem = innerjoin(ka1cmt_inspect_df,ka1cmt_em_inspect_df, on=[:id,:time],makeunique=true)
inspect_foce_saem = @chain inspect_foce_saem begin
@transform(:dCONC_ipred = (:CONC_ipred - :CONC_ipred_1)*100 ./ :CONC_ipred)
@transform(:dCONC_pred  = (:CONC_pred - :CONC_pred_1)*100 ./ :CONC_pred)
@select(:id,:dCONC_ipred,:dCONC_pred)
end

summarize(inspect_foce_saem; # I CANNOT FIGURE OUT THE ERROR WITH THIS BLOCK
parameters = [:dCONC_ipred,:dCONC_pred],
stats = [NCA.extrema, NCA.mean, NCA.std])

``````

julia> ka1cmt_fit # FOCE FIT
FittedPumasModel

Successful minimization: true

Likelihood approximation: Pumas.FOCE
Log-likelihood value: -1291.1646
Number of subjects: 50
Number of parameters: Fixed Optimized
0 7
Observation records: Active Missing
CONC: 600 0
Total: 600 0

``````       Estimate
``````

## tvka 1.3204 tvcl 10227.0 tvvc 51450.0 Ω₁,₁ 0.21713 Ω₂,₂ 0.10226 Ω₃,₃ 0.22723 σ²_add 3.3067

julia> ka1cmt_em_fit # SAEM FIT
FittedPumasEMModel

``````     Estimate
``````

## tvka 1.2853 tvcl 8980.0 tvvc 47547.0 η1 -0.029223 η2 0.10481 η3 0.084683 Ω₁,₁ 0.023733 Ω₂,₂ 0.021677 Ω₃,₃ 0.1683 σ 2.0323

@Diptimangal @adunn34 Before I forget few notes, assuming my code above is accurate.

1. inspect() does not provide sorted data. The dosing records appeared at the end, hence I had to sort them.
2. FOCE and SAEM estimates and predictions have some serious deviations. We may need to probe more into these aspects once we get past the reproduction of the results step.

Bob

@bobbrown, I have few questions about the code.

1. Are you comparing FOCEI with SAEM or FOCE with SAEM.
2. In param block why do we need log normal in SAEM and in FOCE it`s Normal
3. Why are using 1 for CL, V and Ka, if you have initial estimates, I would prefer to take log of the initial estimates and then take exponential before using in the equation.
4. You have dose, sex and weight as covariates but pre block does not have covariates.
5. if in SAEM you are using lognormal than your continuous covariate would be like wt = log(wt)
I am having problem in accessing pumas and waiting fixing the problem. once fixed I will run the code.

@Diptimangal @bobbrown I have some of the same questions and am running it now. I need to do more research into how the model should be set up, because I am also confused on the use of LogNormal vs Normal in the param/random blocks.

Sometimes I will add covariates into the base model (even if i’m not including them in the model) just because they will output in the inspect dataframe. It just saves you a step if you are choosing to make covariate plots in R or any other analysis.

@bobbrown Parameters are implicit with the random effects, therefore you should remove the etas:

``````ka1cmt_saem = @emmodel begin
@random begin
tvka ~ 1 | Normal
tvcl ~ 1 | Normal
tvvc ~ 1 | Normal
end
@covariance (1, 1, 1)
@pre begin
Ka = tvka
CL = tvcl
Vc = tvvc
end
@dynamics Depots1Central1
@post begin
cp = @. Central / Vc
end

#@derived begin # DERIVED BLOCK DOES NOT WORK IN SAEM - ERROR: LoadError: LoadError: "Macro block @derived not supported."
#  cp         = @. (Central/Vc)
#  CONC       ~ @. Normal(cp, sqrt(σ²_add)) #THIS FORMAT OF RESIDUAL ERROR MODEL DOES NOT WORK FOR SAEM
#end

@error begin
CONC ~ Normal(cp)
end
end
``````

The model as you specified:

``````  @param begin
tvka ~ 1 | LogNormal # RealDomain(lower=0.01)
tvcl ~ 1 | LogNormal # RealDomain(lower=0.2)
tvvc ~ 1 | LogNormal # RealDomain(lower=0.1)
end
@random begin
η1 ~ 1 | Normal
η2 ~ 1 | Normal
η3 ~ 1 | Normal
end
@covariance (1, 1, 1)
@pre begin
Ka = tvka * exp(η1)
CL = tvcl * exp(η2)
Vc = tvvc * exp(η3)
end
``````

Means that we have

``````Ka = tvka * exp(η1mean + η1randeff) = exp(log(tvka) + η1mean + η1randeff)
CL = tvcl * exp(η2mean + η2randeff) = exp(log(tvcl) + η2mean + η2randeff)
Vc = tvvc * exp(η3mean + η3randeff) = exp(log(tvvc) + η3mean + η3randeff)
``````

Where `log(tvka)`, `η1mean`, `log(tvcl)`, `η2mean`, `log(tvvc)`, and `η3mean` are all population parameters.
This means that
`log(tvka) + η1mean`, `log(tvcl) + η2mean`, and `log(tvvc) + η3mean` are all non-identifiable. For this reason, I suggest you remove the redundant parameters.

if in SAEM you are using lognormal than your continuous covariate would be like wt = log(wt)

`LogNormal` exponentiates. If you’d like your parameters to be positive, use the `LogNormal`.
So, `theta ~ 1 + x + y | LogNormal` means we have `theta = exp(theta_mean + x*theta_x + y*theta_y)`.
When specifying initial values, you would need to specify `theta = (exp_theta_mean, theta_x, theta_y)`.

1 Like

@chriselrod Thank you. I will try your code.
@Diptimangal @adunn34 hope your questions are addressed by the above elegant post. I am no expert by any means on this topic.
Bob

While I was writing following code for SAEM

pk_2cmp = @emmodel begin
desc = “base model: 2comp”
timeu = u"hr"
end
@random begin
tvCL ~ 1
tvCL ~ 1
tvVc ~ 1
tvQ ~ 1
tvVp ~ 1
end
@covariance (1, 1, 1, 1, 1)
@covariates begin
“ascl”
ascl
“asv”
asv
“ascrcl”
ascrcl
end

@pre begin
CL = 0.7 * tvCL * ascl + 0.3 * tvCL * ascrcl
Vc = tvVc * asv
Ka = tvka
Vp = tvVp * asv
Q = 0.7 * tvQ * ascl + 0.3 * tvQ * ascrcl
end
@dynamics Depots1Central1Periph1
@post begin
cp := @.(Central/Vc)
end
@error begin
cobs ~ ProportionalNormal(cp)
end
end

I got following error

ERROR: LoadError: type Int64 has no field args
Stacktrace:
 getproperty(x::Int64, f::Symbol)
@ Base ./Base.jl:33
 build_emmodel(expr::Expr)
@ Pumas /builds/PumasAI/PumasSystemImages-jl/.julia/packages/Pumas/HDuXQ/src/dsl/emmodel_macro.jl:113
 var"@emmodel"(source::LineNumberNode, module::Module, expr::Any)
@ Pumas /builds/PumasAI/PumasSystemImages-jl/.julia/packages/Pumas/HDuXQ/src/dsl/emmodel_macro.jl:358
 eval
@ ./boot.jl:360 [inlined]

Please help me to find out what is the error and how can I correct . Thanks

@chriselrod I tried the code you suggested verbatim. For some reason, some parameters are not being optimized. the CL, Vc are not estimated. I also tried with etas=0.0 initial estimates.

``````(tvka = 0.25, tvcl = 6000.0, tvvc = 200000.0, η1 = 0.01, η2 = 0.01, η3 = 0.01, Ω = (0.05, 0.05, 0.05), σ²_add = 2)

FittedPumasEMModel

--------------------
Estimate
--------------------
tvka      3.8945
tvcl   5999.6
tvvc 200000.0
Ω₁,₁      0.02029
Ω₂,₂      0.0051455
Ω₃,₃      0.014775
σ         4.1736
--------------------
``````

I just want to through out an idea about the difference among the estimation algorithm in this discussion.

First order conditional estimation (FOCE) considered an approximation to the maximum likelihood. We take into consideration the random quantities with differentiation. This method has shown its efficacy since it got here. You can use it with intense or sparse data (consider it when you have parameters that do not change in population).

Expectation-Maximization (EM) have different flavors and stocastic approximation (SAEM) is one of those. They tend to have different phases with “stochasticity” being the first phase. EM alogirthm tend to have the exact maximum likelhood (unlike FOCE). In my humble opinion, I saw EM algorithm are efficient in all cases (more compuationally demanding in simple models than FOCE). When you have all parameters changing in population (adding variability among all parameters ) its more efficient than FOCE since it can handle full variance matrix (a.k.a OMEGA and SIGMA matrices).

The downfall with some EM methods are the sensitivity to initial guesses, but that can overcome by running FOCE with few subjects at the beginning then update the SAEM run.

@bobbrown I’m not sure if this was already resolved, but the error/missing values you get when trying to summarize your inspect foce/saem comparison is just because there are missing values from where no concentration data is present. You can fix this by removing rows with missing values:

inspect_compare = dropmissing(inspect_foce_saem)

summarize(inspect_compare; # FIXED
parameters = [:dLIDV_ipred, :dLIDV_pred],
stats = [extrema, mean, std])

@bobbrown I am still learning & trying to figure out SAEM, BUT the following model/code estimates for me. I am a bit unsure about how to pick the number of iterations / the general workflow with SAEM, however.

``````
ka1cmt_saem = @emmodel begin
@random begin
Ka ~ 1 | LogNormal # no bounds needed
CL ~ 1 | LogNormal
Vc ~ 1 | LogNormal
end

@dynamics Depots1Central1

@post begin
cp = Central / Vc
end

@error begin
LIDV ~ Normal(cp)
end

end

param_saem = (
Ka = 2,
CL = 20000,
Vc = 50000)
#

ka1cmt_em_fit = fit(ka1cmt_saem, pop_nmle, param_saem, Pumas.SAEM())
``````

@adunn34 I was able to run the code successfully upon dropping the initial estimates for the random effects.
Few learnings:

1. The FOCE and SAEM fits match - almost perfect. However, even the `ipreds` had an average deviation of about 6-7%. I could not figure out why. Until…I made the next graph - the primary source of the deviation is because of the very low concentrations. Percent deviation is not wise at all times.
2. I had to filter the inspect() rows for evid=0 and did not know where to find help. I used Pumas Labs/Tutorial Notebooks. The topics are very nicely organized- good job Pumas folks. It was very easy to find that I needed @subset and its syntax. The tutorials are very well organized, very professional.
3. Request: In the Plotting tutorial can you please add how to make basic plots using inspect() result? Especially, I did not know how to make trellis plots by id. The previous help you provided used read_nca which I cannot use for inspect().
4. To probe more re SAEM - I varied the SAEM arguments: (rapid exploration, convergence, smoothing) from (500,500,500) to (10,000, 10,000,10,000). I did not find any difference in the quality of the estimates. Not sure how to choose how many iterations to use. I also found Pumas SAEM is super fast - it was done is couple of seconds. I think you have a winner!

Bob

Thanks! We are really proud of how the tutorials are coming up. But is still a work in progress. We have a lot of new tutorials written and also in screencast (video) format. They will be included at tutorials.pumas.ai. Make sure to check it for new updated content.

Thanks for the feedback and request. I’ve included it in our list of written tutorials that we are developing. We’ll make sure to cover that in an upcoming “Customizing Pumas’ Plot” written tutorials.