Comparing datasets

How do we compare subjects in two datasets? How do we check if there are any missing subjects in a new dataset?

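The antijoin function from DataFrames.jl can be used for this comparison. It returns the rows of its first argument whose key has no match in its second argument.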

antijoin(df1, df2; on = [:id])

This returns the rows of df1 whose id does not appear in df2, that is, the subjects present in df1 but missing from df2.

antijoin(df2, df1; on = [:id])

This returns the rows of df2 whose id does not appear in df1, that is, the subjects present in df2 but missing from df1.
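A minimal sketch of both directions, assuming DataFrames.jl is installed. The data frames old and new, the dose column, and the result names dropped and added are made up for illustration and are not from the original post:

```julia
using DataFrames

# Hypothetical data: an earlier dataset and a newer one, keyed by subject id.
old = DataFrame(id = [1, 2, 3, 4], dose = [10, 20, 30, 40])
new = DataFrame(id = [2, 3, 4, 5], dose = [20, 30, 40, 50])

# Rows of `old` whose id has no match in `new`:
# subjects that are missing from the new dataset.
dropped = antijoin(old, new; on = :id)

# Rows of `new` whose id has no match in `old`:
# subjects that appear only in the new dataset.
added = antijoin(new, old; on = :id)

println(dropped)
println(added)
```

The result keeps only the columns of the first data frame, so the order of the arguments decides both which rows and which columns come back. Running the call in both directions gives the full picture of which subjects were dropped and which were added.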
