# truncated versus Constrained functions

Hello,

I am reading through the Bayesian workflow documentation, and I noticed that when a distribution is assumed to lie within a certain domain, either the Constrained() or the truncated() function is used. I am wondering: do both functions serve the same purpose, or are there differences?

Thanks

Also, I wrote my model as follows:

BAYESIAN_PKPD = @model begin
    @param begin
        tvKTV1 ~ LogNormal(log(0.4), 1.0)
        tvKTV2 ~ LogNormal(log(0.1), 1.0)
        tvecmo ~ Normal(-0.7, 1.0)
        tvCL1 ~ LogNormal(log(0.9), 1.0)
        tvCL2 ~ LogNormal(log(0.8), 1.0)
        tvcircuit ~ LogNormal(log(1.2), 1.0)
        tvVc1 ~ LogNormal(log(1.8), 0.1)
        tvVc2 ~ LogNormal(log(3.0), 0.1)
        σ_prop_1 ~ Constrained(Normal(0.18, 1); lower = 0.0)
        σ_prop_2 ~ Constrained(Normal(0.02, 1); lower = 0.0)
        σ_prop_3 ~ Constrained(Normal(0.05, 1); lower = 0.0)
        σ_add ~ Constrained(Normal(5000, 1); lower = 0.0)
        ω²KTV ~ Constrained(Normal(1.8, 1); lower = 0.0)
        ω²CL ~ Constrained(Normal(0.2, 1); lower = 0.0)
        ω²Vc ~ Constrained(Normal(0.9, 1); lower = 0.0)
    end

    @random begin
        ηKTV ~ Normal(0.0, sqrt(ω²KTV))
        ηCL ~ Normal(0.0, sqrt(ω²CL))
        ηVc ~ Normal(0.0, sqrt(ω²Vc))
    end

    @covariates WT ECMO_DAYS IS_CIRCUIT_CHANGE site T

    @pre begin
        # site-specific residual error parameters
        siteeffect_1 = if site == 1
            σ_prop_1
        elseif site == 2
            σ_prop_2
        else
            σ_prop_3
        end
        siteeffect_2 = site == 2 ? σ_add : 0.0

        # PK parameters
        tvKTV = site == 1 ? tvKTV1 : tvKTV2 * (ECMO_DAYS / 6)^tvecmo
        KTV = tvKTV * exp(ηKTV)
        TVCL = site == 1 ? tvCL1 * (WT / 10)^0.75 : tvCL2 * (WT / 10)^0.75 * tvcircuit^IS_CIRCUIT_CHANGE
        CL = TVCL * exp(ηCL) * (1 - exp(-KTV * T))
        TVVc = site == 1 ? tvVc1 * (WT / 10)^1 : tvVc2 * (WT / 10)^1
        Vc = TVVc * exp(ηVc)
    end

    @dynamics Central1

    @derived begin
        CP = @. Central / Vc
        CONC = @. Normal(CP, sqrt(CP^2 * siteeffect_1 + siteeffect_2))
    end
end

When I checked the initial parameters from the model using the init_params() function, I found that the parameters for the random effects and sigmas are zeros:

init_params(BAYESIAN_PKPD)
(tvKTV1 = 0.4,
tvKTV2 = 0.10000000000000002,
tvecmo = -0.7,
tvCL1 = 0.9,
tvCL2 = 0.8,
tvcircuit = 1.2,
tvVc1 = 1.8,
tvVc2 = 3.0000000000000004,
σ_prop_1 = 0.0,
σ_prop_2 = 0.0,
σ_prop_3 = 0.0,
ω²KTV = 0.0,
ω²CL = 0.0,
ω²Vc = 0.0,)

I was expecting the mean value of the prior distribution to be the initial estimate (for example, 5000 for σ_add). Or is there another way to determine the initial estimates?

Thanks

Hi Ahmed -

Both Pumas.Constrained and Pumas.truncated have somewhat similar signatures (dist, lower, upper), where dist is a distribution and lower and upper are the bounds we want to apply. However, one fundamental difference is that Pumas.Constrained supports multivariate distributions for dist, whereas Pumas.truncated only supports univariate distributions. Look at the examples below:

## multivariate distribution
julia> mvnd = MvNormal(Diagonal([0.04, 0.04, 0.04]))
ZeroMeanDiagNormal(
dim: 3
μ: 3-element Zeros{Float64}
Σ: [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04]
)

## univariate
julia> unvd = Normal(0,1)
Normal{Float64}(μ=0.0, σ=1.0)

## truncated works on univariate
julia> Pumas.truncated(unvd, -0.5, 0.5)
Truncated(Normal{Float64}(μ=0.0, σ=1.0); lower=-0.5, upper=0.5)

## truncated does not work on multivariate
julia> Pumas.truncated(mvnd, 0.5, 0.5)
ERROR: MethodError: no method matching truncated(::ZeroMeanDiagNormal{Tuple{Base.OneTo{Int64}}}, ::Float64, ::Float64)
Closest candidates are:
truncated(::UnivariateDistribution, ::T, ::T) where T<:Real at ~/actions-runner/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/Pumas/89e4W/src/deps_compat/distributions.jl:9
truncated(::UnivariateDistribution, ::Real, ::Real) at ~/actions-runner/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/Pumas/89e4W/src/deps_compat/distributions.jl:5

## Constrained works on univariate
julia> Pumas.Constrained(unvd, lower = -0.5, upper = 0.5)
Constrained{Normal{Float64}, RealDomain{Float64, Float64, Float64}}(Normal{Float64}(μ=0.0, σ=1.0), RealDomain{Float64, Float64, Float64}(-0.5, 0.5, 0.0))

## Constrained works on multivariate
julia> Pumas.Constrained(mvnd, lower = -0.5, upper = 0.5)
Constrained{ZeroMeanDiagNormal{Tuple{Base.OneTo{Int64}}}, VectorDomain{Vector{Float64}, Vector{Float64}, Vector{Float64}}}(ZeroMeanDiagNormal(
dim: 3
μ: 3-element Zeros{Float64}
Σ: [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04]
)
, VectorDomain{Vector{Float64}, Vector{Float64}, Vector{Float64}}([-0.5, -0.5, -0.5], [0.5, 0.5, 0.5], [0.0, 0.0, 0.0]))

In general, it’s recommended to use Pumas.Constrained in the @param block and Pumas.truncated in the @random and @derived blocks (cc @mohamed82008, can you confirm this statement?)


Yes, I wrote this sentence in the Bayesian docs. For a detailed discussion, see Bayesian Workflow · Pumas.

@ahmed.salem Regarding the initial value for Constrained, you can use the init keyword argument of Constrained to set a default initial value. We should document this. The default value for init is 0, which is unfortunate and arbitrary.
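For example, a sketch against the model above (the value 5000.0 is illustrative, chosen to match the prior location):

```julia
# Sketch: supplying `init` so that init_params() returns this value
# instead of the default of 0
σ_add ~ Constrained(Normal(5000, 1); lower = 0.0, init = 5000.0)
```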


I remembered reading it somewhere, but I did not locate it in the docs earlier; that’s why I pinged you.


Thanks @mohamed82008 & @vijay for the answer. Yes, this part is written in the documentation; I may have scanned it too fast. I have a follow-up question on the prior distribution. In the documentation, the prior distribution for sigma was defined as follows:

σ ~ Constrained(Cauchy(0, 5); lower = 0.0)

My understanding is that the first value is μ (the location), which is equal to zero. I was expecting to use our belief about the sigma value, which should be higher than zero. Please correct me if my understanding is wrong. Similarly, for the `MvNormal()` for the omegas, it was written as follows:

ω ~ Constrained(
    MvNormal(zeros(5), Diagonal(0.4^2 * ones(5))),
    lower = zeros(5),
)

Also, my understanding is that the zeros(5) is our prior belief about the omega values, which could be a value other than zero. Also, in Constrained() we can supply initial estimates as follows:

ω ~ Constrained(
    MvNormal(zeros(5), Diagonal(0.4^2 * ones(5))),
    lower = zeros(5), init = ones(5),
)

I understand that the init_params() function will use the input from init = ones(5). In that case, what is the purpose of zeros(5)?

Thanks!

Just a simple way to generate a vector of zeros:

julia> zeros(5)
5-element Vector{Float64}:
0.0
0.0
0.0
0.0
0.0

An easy way to specify that the lower bound for all of these ωs is zero.

Hi @storopoli, thanks for the reply. I was asking about the first zeros(5), the one inside the MvNormal() function. I expect this to be the prior mean for the omegas, so my thinking is that it should be something other than zeros. Please correct me if I am wrong.

Yes, prior mean. Your intuition is correct.
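So if your prior belief is that the omegas sit around some positive value, you can put that value in the mean vector. A hedged sketch (the value 0.3 is purely illustrative, not a recommendation):

```julia
# Sketch: encoding a nonzero prior mean for the omegas;
# fill(0.3, 5) is an illustrative belief, not a recommendation
ω ~ Constrained(
    MvNormal(fill(0.3, 5), Diagonal(0.4^2 * ones(5))),
    lower = zeros(5),
)
```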

@ahmed.salem using Constrained or truncated is just a way to define a prior over a constrained domain. The “base distribution” can be unconstrained but it will be truncated to the constrained domain. Let the parameter of interest be x and the domain of x be D, e.g. the domain of positive numbers x > 0. Any non-negative function f(x) \geq 0 \, \forall x whose integral is

\int_{D} f(x) dx = 1

is a valid probability density function (PDF).

So we can use this general definition of a PDF to define our own distribution by simply truncating the PDF of another random variable and then re-normalising. Let’s say g is the PDF of a Gaussian x, so:

\int_{\mathcal{R}} g(x) dx = 1

where the domain of x here is all real numbers \mathcal{R}. We can truncate the PDF and compute the integral:

C = \int_{D} g(x) dx

which will be less than 1 because D \subset \mathcal{R}, that is D is a subset of \mathcal{R} so the area under the curve over D will be less than 1. Notice that C is a constant in x, not a function of x. So if we then define f(x) = g(x) / C, the area under the curve over D will be:

\int_{D} f(x) dx = \int_{D} \frac{g(x)}{C} dx = \frac{1}{C} \int_{D} g(x) dx = \frac{C}{C} = 1

So by defining f(x) as the “re-normalised” version of g(x) we are able to get a new PDF for a random variable that’s constrained to be in D, e.g. D = \mathcal{R}^+ is all the numbers x > 0.
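The re-normalisation argument above is easy to check numerically. The sketch below (plain Julia, no Pumas required) integrates the standard normal PDF g over D = (0, ∞) with a crude midpoint rule to get C ≈ 0.5, then confirms that f = g / C integrates to 1 over D:

```julia
# Numerical sanity check of the re-normalisation argument:
# compute C = ∫_D g(x) dx, then check that f = g / C integrates to 1.
g(x) = exp(-x^2 / 2) / sqrt(2π)   # standard normal PDF

# crude midpoint rule over (0, 10), wide enough for a standard normal
n, hi = 100_000, 10.0
h = hi / n
C = sum(g((i + 0.5) * h) for i in 0:n-1) * h         # ≈ 0.5

area_f = sum(g((i + 0.5) * h) / C for i in 0:n-1) * h  # ≈ 1.0
```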

The main difference between Constrained and truncated is in the calculation of C. In MCMC, we can drop any constant that is not a function of the model’s parameters. So if x is a parameter in the @param block, C will not be a function of the parameters in the @param or @random blocks, and it can be safely dropped when doing MCMC, saving computation. If x is a parameter in the @random block or an observed variable in the @derived block, then the re-normalisation constant C will likely be a function of some of the parameters in the model and cannot be ignored. truncated always computes C, so it is always safe to use. Constrained does not compute C, which is why you should only use it in the @param block, to save some computation.

When defining a prior over a positive parameter like the standard deviation, it is common to use a zero-centred Gaussian base distribution and then truncate it to the positive domain. You can visualise this as just the right-hand side of the Gaussian PDF scaled up by 1 / C, where C = 0.5 in this case. You can do the same with a Cauchy distribution as well. These distributions are common enough to have names: the half-normal and half-Cauchy.
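You can see the 1 / C = 2 scaling directly with Distributions.jl (a sketch; the Normal(0, 5) base distribution is illustrative):

```julia
using Distributions

base = Normal(0, 5)                # zero-centred Gaussian base distribution
half = truncated(base, 0, Inf)     # half-normal: same shape, re-normalised

# truncating at zero removes exactly half the mass, so C = 0.5 and
# the truncated density is the base density scaled by 1 / C = 2
pdf(half, 2.0) ≈ 2 * pdf(base, 2.0)   # true
```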

I hope this was useful.


@mohamed82008 Thanks so much for the detailed answer; it is quite helpful.