How to apply a function to selected rows of columns

KVTobin · August 17, 2022, 9:21pm

Hello,
I have a DataFrame and I want to take the average of specific sections of its columns.
I could use df = mean.(eachcol(_)) to average each column in full, but I want the average over only those rows of a column that share the same sigma value.

Here is what I want the final DataFrame to look like:

andreasnoack · August 18, 2022, 6:44am

The relevant part of the documentation is the Split-apply-combine section of the DataFrames.jl manual. You are looking for a groupby operation together with combine, i.e. something like

julia> using DataFrames, Statistics

julia> df = crossjoin(DataFrame(σ=0.1:0.1:0.3), DataFrame(x1=randn(5), x2=rand(5)))
15×3 DataFrame
 Row │ σ        x1          x2
     │ Float64  Float64     Float64
─────┼────────────────────────────────
   1 │     0.1  -0.0615925  0.86128
   2 │     0.1  -1.99798    0.424245
   3 │     0.1  -0.733117   0.0715064
   4 │     0.1   0.0684695  0.135994
   5 │     0.1   0.638693   0.113608
   6 │     0.2  -0.0615925  0.86128
   7 │     0.2  -1.99798    0.424245
   8 │     0.2  -0.733117   0.0715064
   9 │     0.2   0.0684695  0.135994
  10 │     0.2   0.638693   0.113608
  11 │     0.3  -0.0615925  0.86128
  12 │     0.3  -1.99798    0.424245
  13 │     0.3  -0.733117   0.0715064
  14 │     0.3   0.0684695  0.135994
  15 │     0.3   0.638693   0.113608

julia> combine(groupby(df, "σ"), Not("σ") .=> mean)
3×3 DataFrame
 Row │ σ        x1_mean    x2_mean
     │ Float64  Float64    Float64
─────┼──────────────────────────────
   1 │     0.1  -0.417105  0.321327
   2 │     0.2  -0.417105  0.321327
   3 │     0.3  -0.417105  0.321327

or

julia> gdf = groupby(df, "σ");

julia> combine(gdf, valuecols(gdf) .=> mean)
3×3 DataFrame
 Row │ σ        x1_mean    x2_mean
     │ Float64  Float64    Float64
─────┼──────────────────────────────
   1 │     0.1  -0.417105  0.321327
   2 │     0.2  -0.417105  0.321327
   3 │     0.3  -0.417105  0.321327
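
Note that in the crossjoin example every σ group contains the same x1 and x2 values, which is why the three rows of means are identical. As a rough sketch with made-up numbers (the data and the output name x1_avg below are purely illustrative, not taken from the original question), the same pattern on groups that actually differ could look like this. It also shows how to average just one column and choose the name of the result, and it passes sort=true so the groups come out ordered by σ:

```julia
using DataFrames, Statistics

# Hypothetical data: three σ groups whose values differ,
# so the per-group means differ as well.
df = DataFrame(σ  = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3],
               x1 = [1.0, 3.0, 10.0, 20.0, 5.0, 7.0],
               x2 = [2.0, 4.0, 6.0, 8.0, 0.0, 1.0])

# Split the rows by σ; sort=true orders the groups by the key.
gdf = groupby(df, :σ; sort=true)

# Average every non-grouping column within each σ group.
# Output columns are named x1_mean and x2_mean automatically.
all_means = combine(gdf, valuecols(gdf) .=> mean)

# Average only x1 and give the result column an explicit name.
x1_only = combine(gdf, :x1 => mean => :x1_avg)
```

Here all_means has one row per σ value with columns σ, x1_mean and x2_mean, while x1_only has just σ and x1_avg. The source => function => target form is handy when the default `_mean` suffix is not the name you want.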
