Speeding up NCA on Simulations

donaldlee3 · August 27, 2021, 1:12am

I am performing NCA on simulations (x1000) with varying Nsubjects (x7) and sampling scenarios (x8), and it tests out successfully on 2 reps. At 100 reps, it fails to complete in 6 hrs.

Is there a way to speed up the following code?

cum_sim_df = CSV.read("./modeling/Depo-SubQ Provera 104/abs1st_PK_sims_SD_adjust_params.csv", DataFrame, missingstrings = [""])

cum_sim_NCAresults = map(Base.product(1:2, 1:length(NSubjs), 1:length(samp_names))) do i
    println(i[1])
    i_sim_df = filter([:rep, :NSubj, :samp] => (x, y, z) -> x==i[1] && y==NSubjs[i[2]] && z==samp_names[i[3]], cum_sim_df)
    
    i_sim_df[!, :route] .= "ev"

    i_sim_NCA = read_nca(i_sim_df,
                        id              =   :id,
                        time            =   :time,
                        observations    =   :dv,
                        amt             =   :amt,
                        # all subj's same grouping w/in each read_nca run, but to retain labels
                        group           =   [:rep, :NSubj, :samp],
                        route           =   :route)
 
    i_sim_tmax          =   NCA.tmax(i_sim_NCA)
    i_sim_cmax          =   NCA.cmax(i_sim_NCA)
    i_sim_auc           =   NCA.auc(i_sim_NCA)
    i_sim_auc_t         =   NCA.auc(i_sim_NCA, interval=(0, 90))
    x = zip(i_sim_NCA, NCA.tmax.(i_sim_NCA))
    i_sim_auc_0_tmax    =   map(x -> NCA.auc(x[1], interval=(0, x[2])), x)
    i_sim_auc_tmax_t    =   map(x -> NCA.auc(x[1], interval=(x[2], 90)), x)

    temp_df             =   DataFrame(auc_0_tmax = i_sim_auc_0_tmax, auc_tmax_t = i_sim_auc_tmax_t)
    i_sim_NCAresults_df =   hcat(i_sim_tmax, i_sim_cmax, i_sim_auc, i_sim_auc_t, temp_df, makeunique=true)

    return i_sim_NCAresults_df
end

cum_sim_NCAresults_df   =   vcat(cum_sim_NCAresults...)

vijay · August 27, 2021, 8:13am

Hi Donald,

Without knowing your dataset structure it is hard to guess, but my first suggestion is to break the function into smaller parts and bring the pieces together, like this:

function do_nca(sim)
  ncadf = nca_prep(sim)
  nca = rnca(ncadf)
  res = compute_results(nca)
  return res
end

The biggest slowdown in your code above is probably the filter statement, which rescans the full data frame for every rep/NSubj/samp combination, so having a separate nca_prep function will help isolate it. You may also want to think through whether there are alternative ways of passing the data in.

rnca just calls the read_nca function to get the data ready. Finally, compute_results does all your specific NCA-related computations.

We can help more if you have some example data or mock structure of what you are passing in.

Vijay

andreasnoack · August 27, 2021, 8:48am

It would be useful to know where the time is spent. If you define a normal function instead of using the do syntax to generate a closure, then you can profile that function on a single subject; see the FlameGraphs.jl documentation. Alternatively, you might be able to just use TimerOutputs.jl (KristofferC/TimerOutputs.jl on GitHub) to get some timer output.
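
As a rough illustration of the timer approach, here is a minimal sketch using TimerOutputs.jl on a named function. The helpers `select_rows`, `peak`, and `one_cell` and the toy data are hypothetical stand-ins for the real filter, read_nca, and NCA metric steps; only `TimerOutput`, `@timeit`, and `show` come from the package.

```julia
using TimerOutputs

# Shared timer that accumulates the labelled sections below.
const to = TimerOutput()

# Hypothetical stand-ins for the real steps (filter, read_nca, NCA metrics).
select_rows(data, rep) = filter(r -> r.rep == rep, data)
peak(rows) = maximum(r.dv for r in rows)

# A normal named function (rather than a do-block closure), so each
# stage can be timed separately and the function profiled on its own.
function one_cell(data, rep)
    rows = @timeit to "filter" select_rows(data, rep)
    cmax = @timeit to "metrics" peak(rows)
    return cmax
end

# Made-up data: 3 reps with 4 observations each.
data = [(rep = r, dv = Float64(r * t)) for r in 1:3 for t in 1:4]
cmaxes = [one_cell(data, r) for r in 1:3]

# Prints a table of call counts, time, and allocations per section.
show(to)
```

Running this on a handful of reps before scaling up should indicate whether the filter, the read_nca call, or the metric computations dominate.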

Btw, isn’t the filtering essentially a groupby operation? If so then using groupby might also be more efficient.
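
To make the groupby idea concrete, here is a minimal sketch with DataFrames.jl on made-up data. The column names follow the original post, but `summarize_cell` is a hypothetical placeholder that computes a simple peak instead of calling read_nca and the NCA functions.

```julia
using DataFrames

# Toy stand-in for cum_sim_df; the values are made up for illustration.
df = DataFrame(
    rep   = [1, 1, 1, 2, 2, 2],
    NSubj = fill(10, 6),
    samp  = fill("sparse", 6),
    time  = [0, 1, 2, 0, 1, 2],
    dv    = [0.0, 5.0, 3.0, 0.0, 2.0, 4.0],
)

# A single pass builds every (rep, NSubj, samp) group, instead of
# filtering the full data frame once per combination.
gdf = groupby(df, [:rep, :NSubj, :samp])

# Hypothetical per-group work; sdf is a SubDataFrame view of one group.
function summarize_cell(sdf)
    i = argmax(sdf.dv)
    return (cmax = sdf.dv[i], tmax = sdf.time[i])
end

# combine applies the function to each group and stacks the results,
# keeping the grouping columns as labels.
results = combine(summarize_cell, gdf)
```

In the real workflow the body of `summarize_cell` would hold the read_nca call and the NCA computations. Whether read_nca accepts a SubDataFrame directly is not confirmed in this thread, so copying the group with `DataFrame(sdf)` first is the safer assumption.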
