I am trying to determine whether there is a way to construct mixture error models in Pumas for on-the-fly contaminant / outlier handling. Likelihood-based methods are sensitive to outliers, so one common approach is to include a mixture component that flattens the exponential tails of the error distribution (beyond what the t-distribution does). Conceptually this looks like
data \sim (1 - w) \cdot truncated(N(\mu, \sigma), lb, ub) + w \cdot U(lb, ub)
where w is some small number, fixed or estimated. Is there a way to do this with MixtureModel from Distributions.jl? I've used this in both Turing and MTK models in the past with success, and I know those have the necessary logpdf methods and can be operated on by ForwardDiff, but I am not sure whether it is possible in Pumas. I've tried the following and a few variants but have run into errors when running the fit command; the error is a no-method-matching on the MixtureModel line.
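Roughly, the attempt looked like this (a reconstructed sketch: lb, ub, and w are placeholders for my actual bounds and mixture weight, and dv_obs / DV_comp stand in for my observed and model-predicted variable names):

@derived begin
    # residual between the observed and model-computed values
    diff = @. dv_obs - DV_comp
    # this is the line that throws the no-method-matching error
    diff ~ @. MixtureModel([truncated(Normal(0.0, σ), lb, ub), Uniform(lb, ub)], [1 - w, w])
end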
The way you write it out here seems a little confusing, but let's see if we can figure it out. You have diff on the left-hand side in two places. Can you maybe rewrite it in a form where dv_name_in_subject ~ @. dist is the last line, replacing the two names of course? You would also need to shift the bounds according to DV_comp, if I understand what you're trying to do correctly. A more complete minimal example would also help.
Okay, I looked into it with a simple example. Notice that you have to be a bit careful here, because we need to get the dimensions right: diff ~ needs a right-hand side that is a vector of MixtureModels, each built from a vector of two Normals and a vector of prior weights. So consider something like the following, adapted from a sample model I have in my own folder, so it's just an illustration:
using Pumas, PharmaDatasets

data = read_pumas(dataset("pumas/sim_data_model1"))

mdsl1 = @model begin
    @param begin
        θ ∈ VectorDomain(1, init = [0.5])
        "volume with these units"
        θvc ∈ RealDomain(init = 1.0)
        Ω ∈ PDiagDomain(init = [0.1])
        Ωvc ∈ RealDomain(lower = 0.01, init = 0.04)
        σ ∈ RealDomain(lower = 0.001, upper = 1.0, init = 0.1)
    end
    @random begin
        η ~ MvNormal(Ω)
        ηvc ~ Beta(Ωvc)
    end
    @pre begin
        CL = θ[1] * exp(η[1])
        Vc = θvc + ηvc
    end
    @vars begin
        conc = Central / Vc
    end
    @dynamics Central1
    @derived begin
        # one component distribution per observation time
        dvn1 = @. Normal(conc, σ)
        dvn2 = @. Normal(conc, 1.0)
        # build a vector of MixtureModels, one per observation
        dv ~ map(dvn1, dvn2) do d1, d2
            MixtureModel([d1, d2], [0.1, 0.9])
        end
    end
end

param = init_params(mdsl1)

ft = fit(
    mdsl1,
    data,
    param,
    LaplaceI();
    optim_options = (show_trace = true,),
    constantcoef = (:θvc,),
)
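If the fit runs through, you can inspect the estimates with coef(ft) as usual. For your Normal-plus-Uniform version I would try something along these lines, with the Uniform bounds shifted by the prediction as mentioned above (an untested sketch; the half-width 10.0 and the weight 0.1 are arbitrary placeholders):

@derived begin
    dvn = @. Normal(conc, σ)
    # center the Uniform on the prediction so the observations stay inside its support
    dvu = @. Uniform(conc - 10.0, conc + 10.0)
    dv ~ map(dvn, dvu) do dn, du
        MixtureModel([dn, du], [0.9, 0.1])
    end
end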
Thanks for the response. A little more explanation. In a typical model you will have DV_obs as data and the model will somehow compute DV_comp. A standard additive error model would be
DV_obs ~ @. Normal(DV_comp, sigma)
However, I want a slightly more complicated error model, where the Normal on the right-hand side is mixed with a Uniform, as in the snippet from my first post.
In the above code I just defined diff = DV_obs - DV_comp because it makes the code visually cleaner.
The purpose here is that, by mixing the usual Normal error model with the Uniform, the Uniform keeps the exponential tail of the Normal from assigning extremely low likelihoods to outlying points. This sort of on-the-fly outlier / contamination handling is common in other domains (though not in pharma, from what I can tell).
I have done this in MTK and Turing models, though it's been a while. MixtureModel should dispatch to a logpdf that is differentiable (thanks to the Distributions.jl people!).
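For instance, outside of Pumas the following minimal check works for me (the bounds, weight, and evaluation point are just illustrative):

using Distributions, ForwardDiff

lb, ub, w = -10.0, 10.0, 0.05
mix(μ) = MixtureModel([truncated(Normal(μ, 1.0), lb, ub), Uniform(lb, ub)], [1 - w, w])

logpdf(mix(0.0), 8.0)  # deep in the Normal tail, but the Uniform keeps the logpdf from collapsing
ForwardDiff.derivative(μ -> logpdf(mix(μ), 8.0), 0.0)  # and it differentiates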
When I try to put this in the @derived block of a Pumas model I get errors. It is possible that Pumas doesn’t recognize MixtureModel, but it is a core type in Distributions.jl.
Is this sort of thing possible (either this way or some other way)? I am not sure whether Pumas allows any distribution from Distributions.jl with a differentiable logpdf or only a select set of distributions.
I did get a version of what you suggested working for me. Thanks again!
One follow-up note. It appears your example works with LaplaceI() but not FOCE(); the error states that something about the MixtureModel is not supported for the FOCE method. This is not an issue, since I'm happy to proceed with LaplaceI, but it may point to some internal inconsistency between the two methods. I don't know whether there is some difference, above and beyond the simplified random-effects integral approximation, that could cause it.