include("utils.jl")
[ Info: loading success
df = @pipe CSV.File("./data/palmerpenguins.csv") |> DataFrame |> dropmissing
first(df, 10)

| Row | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|
| | String15 | String15 | Float64 | Float64 | Int64 | Int64 | String7 |
| 1 | Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | male |
| 2 | Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | female |
| 3 | Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | female |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | female |
| 5 | Adelie | Torgersen | 39.3 | 20.6 | 190 | 3650 | male |
| 6 | Adelie | Torgersen | 38.9 | 17.8 | 181 | 3625 | female |
| 7 | Adelie | Torgersen | 39.2 | 19.6 | 195 | 4675 | male |
| 8 | Adelie | Torgersen | 41.1 | 17.6 | 182 | 3200 | female |
| 9 | Adelie | Torgersen | 38.6 | 21.2 | 191 | 3800 | male |
| 10 | Adelie | Torgersen | 34.6 | 21.1 | 198 | 4400 | male |

describe(df)

| Row | variable | mean | min | median | max | nmissing | eltype |
|---|---|---|---|---|---|---|---|
| | Symbol | Union… | Any | Union… | Any | Int64 | DataType |
| 1 | species | | Adelie | | Gentoo | 0 | String15 |
| 2 | island | | Biscoe | | Torgersen | 0 | String15 |
| 3 | bill_length_mm | 43.9928 | 32.1 | 44.5 | 59.6 | 0 | Float64 |
| 4 | bill_depth_mm | 17.1649 | 13.1 | 17.3 | 21.5 | 0 | Float64 |
| 5 | flipper_length_mm | 200.967 | 172 | 197.0 | 231 | 0 | Int64 |
| 6 | body_mass_g | 4207.06 | 2700 | 4050.0 | 6300 | 0 | Int64 |
| 7 | sex | | female | | male | 0 | String7 |

```{julia}
#| label: fig-simpson-paradox
#| fig-cap: "Simpson's paradox in the Palmer penguins data"
#| fig-align: center
#| warning: false
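# Fixed axis size for the figure
axis = (width = 300, height = 300)

# Shared layer: bill length and depth, rescaled from mm to cm
penguin_bill = data(df) * mapping(
    :bill_length_mm => (t -> t / 10) => "bill_length",
    :bill_depth_mm => (t -> t / 10) => "bill_depth",
)

# Per-species regression lines
pipeline1 = penguin_bill * linear() * mapping(color = :species)
# Scatter points colored by species, with a black outline
pipeline2 = penguin_bill * mapping(color = :species) * visual(Scatter; strokewidth = 1, strokecolor = :black)
# Pooled regression line over all species
pipeline3 = penguin_bill * linear()

plt = (pipeline1 + pipeline2 + pipeline3) * visual(alpha = 0.5)
draw(plt; axis = axis)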
```
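
The figure compares one pooled regression line with per-species lines. A rough numeric companion to that comparison is sketched below: it computes the Pearson correlation once over all rows and once within each group. The helper name `grouped_cor` is hypothetical (it is not part of `utils.jl`), and the sketch uses only the `Statistics` standard library.

```julia
using Statistics

# Hypothetical helper (an assumption, not from utils.jl):
# Pearson correlation of x and y, pooled and within each group.
function grouped_cor(x, y, group)
    # Correlation over all observations, ignoring the grouping
    pooled = cor(x, y)
    # Correlation computed separately for the rows of each group
    within = Dict(g => cor(x[group .== g], y[group .== g]) for g in unique(group))
    return (pooled = pooled, within = within)
end
```

With the data frame loaded above, it could be called as `grouped_cor(df.bill_length_mm, df.bill_depth_mm, df.species)`. If the data show Simpson's paradox, the sign of `pooled` should differ from the signs of the values in `within`.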