include("utils.jl")
[ Info: loading success
df = @pipe CSV.File("./data/palmerpenguins.csv") |> DataFrame |> dropmissing
first(df, 10)
Row | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
---|---|---|---|---|---|---|---|
| | String15 | String15 | Float64 | Float64 | Int64 | Int64 | String7 |
1 | Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | male |
2 | Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | female |
3 | Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | female |
4 | Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | female |
5 | Adelie | Torgersen | 39.3 | 20.6 | 190 | 3650 | male |
6 | Adelie | Torgersen | 38.9 | 17.8 | 181 | 3625 | female |
7 | Adelie | Torgersen | 39.2 | 19.6 | 195 | 4675 | male |
8 | Adelie | Torgersen | 41.1 | 17.6 | 182 | 3200 | female |
9 | Adelie | Torgersen | 38.6 | 21.2 | 191 | 3800 | male |
10 | Adelie | Torgersen | 34.6 | 21.1 | 198 | 4400 | male |
describe(df)
Row | variable | mean | min | median | max | nmissing | eltype |
---|---|---|---|---|---|---|---|
| | Symbol | Union… | Any | Union… | Any | Int64 | DataType |
1 | species | | Adelie | | Gentoo | 0 | String15 |
2 | island | | Biscoe | | Torgersen | 0 | String15 |
3 | bill_length_mm | 43.9928 | 32.1 | 44.5 | 59.6 | 0 | Float64 |
4 | bill_depth_mm | 17.1649 | 13.1 | 17.3 | 21.5 | 0 | Float64 |
5 | flipper_length_mm | 200.967 | 172 | 197.0 | 231 | 0 | Int64 |
6 | body_mass_g | 4207.06 | 2700 | 4050.0 | 6300 | 0 | Int64 |
7 | sex | | female | | male | 0 | String7 |
```{julia}
#| label: fig-simpson-paradox
#| fig-cap: Simpson's paradox in the palmerpenguins data
#| fig-align: center
#| warning: false
axis = (width = 300, height = 300)
# Shared layer: bill length and depth, converted from mm to cm
penguin_bill = data(df) * mapping(
    :bill_length_mm => (t -> t / 10) => "bill_length",
    :bill_depth_mm => (t -> t / 10) => "bill_depth",
)
# Per-species regression lines
pipeline1 = penguin_bill * linear() * mapping(color = :species)
# Per-species scatter points with a black outline
pipeline2 = penguin_bill * mapping(color = :species) * visual(Scatter; strokewidth = 1, strokecolor = :black)
# Pooled regression line across all species
pipeline3 = penguin_bill * linear()
plt = (pipeline1 + pipeline2 + pipeline3) * visual(alpha = 0.5)
draw(plt; axis = axis)
```