Pumas 2.5.0 bugs

I’ve noticed a few bugs I’d like to report.

  1. Sometimes a DataFrame created from inspect and saved with CSV.write cannot be read back with CSV.read because of the wres_approx column.
  2. Simulations generate a cmt column of type String7, which read_pumas does not accept.
  3. If you utilize LogNormal residual error, inspect will not calculate weighted residuals.

Here is the error for reading the inspect df:

julia> test = CSV.read("./programs/results/estimation/separated/parallel_absorption/inspect_zo_Erlang_R_logit.csv", DataFrame, missingstring=[""])
┌ Warning: thread = 27 warning: only found 17 / 45 columns around data row: 379. Filling remaining columns with `missing`
└ @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:577
┌ Warning: thread = 11 warning: only found 17 / 45 columns around data row: 234. Filling remaining columns with `missing`
└ @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:577
┌ Warning: thread = 2 warning: only found 17 / 45 columns around data row: 698. Filling remaining columns with `missing`
└ @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:577
┌ Warning: thread = 12 warning: only found 17 / 45 columns around data row: 176. Filling remaining columns with `missing`
└ @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:577
┌ Warning: thread = 32 warning: only found 17 / 45 columns around data row: 466. Filling remaining columns with `missing`
└ @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:577
┌ Warning: thread = 22 warning: only found 17 / 45 columns around data row: 843. Filling remaining columns with `missing`
└ @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:577
ERROR: TaskFailedException

    nested task error: thread = 8 fatal error, encountered an invalidly quoted field while parsing around row = 174, col = 29: ""FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}}(Optim.NewtonTrustRegion{Float64}(1.0, 100.0, 1.4901161193847656e-8, 0.1, 0.25, 0.75, false), Optim.Options(x_abstol = 0.0, x_reltol = 0.0, f_abstol = 0.0, f_reltol = 0.0, g_abstol = 1.0e-5, g_reltol = 1.0e-8, outer_x_abstol = 0.0, outer_x_reltol = 0.0, outer_f_abstol = 0.0, outer_f_reltol = 0.0, outer_g_abstol = 1.0e-8, outer_g_reltol = 1.0e-8, f_calls_limit = 0, g_calls_limit = 0, h_calls_limit = 0, allow_f_increases = false, allow_outer_f_increases = true, successive_f_tol = 1, iterations = 1000, outer_iterations = 1000, store_trace = false, trace_simplex = false, show_trace = false, extended_trace = false, show_every = 1, time_limit = NaN, )
    ", error=INVALID: OK | QUOTED | EOF | INVALID_QUOTED_FIELD , check your `quotechar` arguments or manually fix the field in the file itself
    
    Stacktrace:
     [1] fatalerror(buf::Vector{UInt8}, pos::Int64, len::Int64, code::Int16, row::Int64, col::Int64)
       @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:581
     [2] parsevalue!(#unused#::Type{String}, buf::Vector{UInt8}, pos::Int64, len::Int64, row::Int64, rowoffset::Int64, i::Int64, col::CSV.Column, ctx::CSV.Context)
       @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:789
     [3] parserow
       @ /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:631 [inlined]
     [4] parsefilechunk!(ctx::CSV.Context, pos::Int64, len::Int64, rowsguess::Int64, rowoffset::Int64, columns::Vector{CSV.Column}, #unused#::Type{Tuple{}})
       @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:550
     [5] multithreadparse(ctx::CSV.Context, pertaskcolumns::Vector{Vector{CSV.Column}}, rowchunkguess::Int64, i::Int64, rows::Vector{Int64}, wholecolumnslock::ReentrantLock)
       @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:360
     [6] macro expansion
       @ /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/WorkerUtilities/ey0fP/src/WorkerUtilities.jl:384 [inlined]
     [7] (::CSV.var"#34#39"{CSV.Context, Vector{Vector{CSV.Column}}, Int64, Int64, Vector{Int64}, ReentrantLock})()
       @ CSV ./threadingconstructs.jl:410

...and 5 more exceptions.

Stacktrace:
 [1] sync_end(c::Channel{Any})
   @ Base ./task.jl:445
 [2] macro expansion
   @ ./task.jl:477 [inlined]
 [3] CSV.File(ctx::CSV.Context, chunking::Bool)
   @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:240
 [4] File
   @ /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:227 [inlined]
 [5] #File#32
   @ /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/file.jl:223 [inlined]
 [6] read(source::String, sink::Type; copycols::Bool, kwargs::Base.Pairs{Symbol, Vector{String}, Tuple{Symbol}, NamedTuple{(:missingstring,), Tuple{Vector{String}}}})
   @ CSV /build/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/CSV/OnldF/src/CSV.jl:117
 [7] top-level scope
   @ ~/data/code/padagis_mibe/post_FDA_meeting_Nov2023/m5/analysis/programs/model_fitting/pilot1_3_estimation_RT_separated_(parallel absorption_zo_Erlang)_(logit).jl:248

Dear Donald,

Thank you for the feedback.

  1. Yes, this is indeed a regression from v2.4.x. We now expose the inner optimization options through the FOCE and LaplaceI constructors. This had the unintended consequence that a very verbose printout of the information in each type is written to the wres_approx column. If that printout contains the separator used in the CSV file, the file will fail to read later. We will include a fix for this in the next bug-fix release of the v2.5.x series of Pumas. Until then, the only workaround is to overwrite the column with "FOCE" or "LaplaceI", or to drop the column completely, before storing the CSV.

  2. Can you help me understand how this might happen? Do you mean if you create a data frame from a simulation? I will file an issue, but I may need a few more details for it to be actionable on our end. Thank you.

  3. This is correct. We currently do not output wres for LogNormal dependent variables, only Normal. To get something close to it, you would need to specify the model in the log-domain and evaluate the residuals that way. There is already an open issue to include this feature.
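
For the workaround in item 1, a minimal sketch might look like the following. The small DataFrame is a made-up stand-in for the inspect output (in practice it would come from something like DataFrame(inspect(...))), and the column values, variable names, and file path are assumptions for illustration only.

```julia
using CSV, DataFrames

# Made-up stand-in for the inspect DataFrame; only the column name
# wres_approx is taken from the report above.
df = DataFrame(
    id = [1, 1, 2],
    wres = [0.12, -0.40, 0.85],
    wres_approx = fill("FOCE{...}(..., Optim.Options(x_abstol = 0.0, ...))", 3),
)

# Option 1: replace the whole column with a short label.
labelled = copy(df)
labelled[!, :wres_approx] = fill("FOCE", nrow(labelled))

# Option 2: drop the column entirely.
dropped = select(df, Not(:wres_approx))

# Write to a temporary location and read it back to confirm the round trip.
path = joinpath(mktempdir(), "inspect.csv")
CSV.write(path, labelled)
roundtrip = CSV.read(path, DataFrame)
```

Replacing the column (rather than broadcasting into it) avoids a conversion error if the original column holds estimation-method objects instead of strings.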

Thank you, Patrick.

Yes, when I create a DataFrame from a simulation, the cmt column has type String7 instead of Int or Symbol, as read_pumas requires.
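
In the meantime, one possible way around this is to convert the column before calling read_pumas. This is only a sketch: the DataFrames below are made up to mimic a String7 cmt column, and the compartment names Depot and Central are hypothetical.

```julia
using DataFrames, InlineStrings

# Made-up stand-in for a simulation DataFrame with numeric compartments
# stored as String7.
sim_df = DataFrame(id = [1, 1], time = [0.0, 1.0], cmt = String7["1", "2"])

# Numeric compartments: parse the strings to Int.
sim_df.cmt = parse.(Int, sim_df.cmt)

# Named compartments (hypothetical names): convert the strings to Symbol.
named_df = DataFrame(id = [1, 1], time = [0.0, 1.0], cmt = String7["Depot", "Central"])
named_df.cmt = Symbol.(named_df.cmt)
```

After either conversion the cmt column has an element type of Int or Symbol, which is what read_pumas is described as requiring above.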