4-NIR-Spectra-Milk

Author

math4mad

Introduction

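Use PCA to reduce the dimensionality of near-infrared (NIR) spectra measured on different kinds of milk.

Here the 601 spectral features are reduced to 2 and 3 dimensions, the reduced data are classified with SVC (through MLJ), and the decision boundary is plotted.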

Reference: Classification of NIR spectra using Principal Component Analysis in Python
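
The reduction step itself can be illustrated with the standard library alone. The following is a minimal sketch, assuming synthetic random data in place of the milk spectra. `pca_project` is a hypothetical helper name, not part of MLJ or MultivariateStats, and it uses a plain SVD of the centered data, which may differ from what the PCA model used later does internally.

```julia
using LinearAlgebra, Statistics, Random

# Hypothetical helper: project the rows of X (samples × features)
# onto the top-k principal components.
function pca_project(X::AbstractMatrix, k::Integer)
    μ  = mean(X, dims=1)      # 1 × p row of column means
    Xc = X .- μ               # center every feature
    F  = svd(Xc)              # Xc = U * Diagonal(S) * V'
    W  = F.V[:, 1:k]          # p × k matrix of loadings
    return Xc * W, W, μ       # scores, loadings, means
end

Random.seed!(1)
X = randn(50, 20)             # synthetic stand-in for the spectra
scores, W, μ = pca_project(X, 2)
size(scores)
```

Because the singular values are sorted in decreasing order, the first column of `scores` carries at least as much variance as the second.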

1. load packages

Code
include("../utils.jl")
import MLJ:transform,predict
using DataFrames,MLJ,CSV,MLJModelInterface,GLMakie

2. data digest

2.1 load CSV => DataFrame

Code
df=load_csv("NIR-spectra-milk")
first(df,10)
10×603 DataFrame
596 columns omitted
 Row │ Column1       labels  2        3        4        5        6        ⋯
     │ String15      Int64   Float64  Float64  Float64  Float64  Float64  ⋯
─────┼──────────────────────────────────────────────────────────────────────
   1 │ 1/02/2018     1       2.39753  2.3942   2.38895  2.38128  2.37191  ⋯
   2 │ 1/02/2018.1   1       2.39953  2.39672  2.39168  2.38328  2.37282  ⋯
   3 │ 1/02/2018.2   1       2.39647  2.3936   2.38845  2.38099  2.37132  ⋯
   4 │ 1/02/2018.3   1       2.40688  2.40424  2.3992   2.39114  2.38054  ⋯
   5 │ 1/02/2018.4   1       2.40988  2.40702  2.40131  2.39267  2.38137  ⋯
   6 │ 1/02/2018.5   1       2.41301  2.40883  2.40207  2.39301  2.38192  ⋯
   7 │ 1/02/2018.6   1       2.40641  2.40349  2.39797  2.38992  2.38015  ⋯
   8 │ 1/02/2018.7   1       2.4278   2.42489  2.41844  2.40855  2.39609  ⋯
   9 │ 1/02/2018.8   1       2.42194  2.41986  2.41463  2.40612  2.39449  ⋯
  10 │ 1/02/2018.9   1       2.41687  2.41362  2.40849  2.40075  2.39046  ⋯

2.2 coerce and split data

Code
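# Treat the integer labels as unordered classes
to_ScienceType(d)=coerce(d,:labels=>Multiclass)
df=to_ScienceType(df)
# Target = labels; features = every remaining column except Column1.
# Passing rng=123 shuffles the rows reproducibly.
ytrain, Xtrain=  unpack(df, ==(:labels),!=(:Column1), rng=123);
cat=ytrain|>levels        # the class levels
rows,cols=size(Xtrain)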
(450, 601)
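
The same coerce-and-unpack pattern can be tried on a toy table. This is a sketch under stated assumptions: the four-row DataFrame and the feature names `f1` and `f2` are invented for illustration, and only the column names `Column1` and `labels` mirror the real data. `rng` is omitted so the row order stays fixed.

```julia
using MLJ, DataFrames

# Toy table (invented values): a string ID column, an integer class label, two features.
toy = DataFrame(Column1 = ["a", "b", "c", "d"],
                labels  = [1, 2, 1, 2],
                f1      = [0.1, 0.2, 0.3, 0.4],
                f2      = [1.0, 2.0, 3.0, 4.0])

# Integer columns are interpreted as Count by default; mark the labels as unordered classes.
toy = coerce(toy, :labels => Multiclass)

# The first predicate selects the target; the second takes every remaining column except the ID.
y, X = unpack(toy, ==(:labels), !=(:Column1))

size(X), length(levels(y))
```

Columns are claimed by the first predicate that matches them, which is why `labels` does not reappear in `X` even though `!=(:Column1)` is also true for it.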

3. workflow

3.1 instantiate and train the models

Code
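SVC = @load SVC pkg=LIBSVM
PCA = @load PCA pkg=MultivariateStats
maxdim=2;nums=200                      # target dimension and grid resolution
model1=PCA(;maxoutdim=maxdim)
model2 = SVC()
# Fit PCA on the features, then train the SVC on the projected data
mach1 = machine(model1, Xtrain) |> fit!
Ytr =transform(mach1, Xtrain)
mach2 = machine(model2, Ytr, ytrain)|>fit!
# No separate test split is used: Yte is the same projection of Xtrain as Ytr
Yte=transform(mach1, Xtrain)
# boundary_data2 comes from utils.jl; it returns the grid axes and the grid points to classify
tx,ty,x_test=boundary_data2(Yte)
# Predict a class for every grid point and arrange the result as a nums×nums matrix
yhat = predict(mach2, x_test)|>Array|>d->reshape(d,nums,nums)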
import MLJLIBSVMInterface ✔
import MLJMultivariateStatsInterface ✔
[ Info: For silent loading, specify `verbosity=0`. 
[ Info: For silent loading, specify `verbosity=0`. 
[ Info: Training machine(PCA(maxoutdim = 2, …), …).
[ Info: Training machine(SVC(kernel = RadialBasis, …), …).
200×200 Matrix{Int64}:
 1  1  1  1  1  1  1  1  1  1  1  1  1  …  5  5  5  5  5  5  5  5  5  5  5  5
 1  1  1  1  1  1  1  1  1  1  1  1  1     5  5  5  5  5  5  5  5  5  5  5  5
 1  1  1  1  1  1  1  1  1  1  1  1  1     5  5  5  5  5  5  5  5  5  5  5  5
 1  1  1  1  1  1  1  1  1  1  1  1  1     5  5  5  5  5  5  5  5  5  5  5  5
 1  1  1  1  1  1  1  1  1  1  1  1  1     5  5  5  5  5  5  5  5  5  5  5  5
 1  1  1  1  1  1  1  1  1  1  1  1  1  …  5  5  5  5  5  5  5  5  5  5  5  5
 1  1  1  1  1  1  1  1  1  1  1  1  1     5  5  5  5  5  5  5  5  5  5  5  5
 1  1  1  1  1  1  1  1  1  1  1  1  1     5  5  5  5  5  9  9  9  9  9  9  9
 1  1  1  1  1  1  1  1  1  1  1  1  1     9  9  9  9  9  9  9  9  9  9  9  9
 1  1  1  1  1  1  1  1  1  1  1  1  1     9  9  9  9  9  9  9  9  9  9  9  9
 1  1  1  1  1  1  1  1  1  1  1  1  1  …  9  9  9  9  9  9  9  9  9  9  9  9
 1  1  1  1  1  1  1  1  1  1  1  1  1     9  9  9  9  9  9  9  9  9  9  9  9
 1  1  1  1  1  1  1  1  1  1  1  1  1     9  9  9  9  9  9  9  9  9  9  9  9
 ⋮              ⋮              ⋮        ⋱        ⋮              ⋮           
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2  …  6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2  …  6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
 2  2  2  2  2  2  2  2  2  2  2  2  2     6  6  6  6  6  6  6  6  6  6  6  6
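
The definition of `boundary_data2` lives in `utils.jl` and is not shown here. As a rough illustration of the idea, the sketch below builds a regular grid over the range of a two-column score matrix and reshapes per-point predictions into a matrix suitable for a contour plot. `grid_points` is a hypothetical stand-in, and the sign-based rule replaces the trained SVC; the real helper may differ in its return types and ordering.

```julia
# Hypothetical stand-in for a helper such as boundary_data2 (the real one may differ).
function grid_points(Z::AbstractMatrix; n::Int=200)
    xs = range(extrema(Z[:, 1])...; length=n)   # grid axis along the first component
    ys = range(extrema(Z[:, 2])...; length=n)   # grid axis along the second component
    grid = Matrix{Float64}(undef, n * n, 2)
    i = 1
    for y in ys, x in xs                        # x varies fastest
        grid[i, 1] = x
        grid[i, 2] = y
        i += 1
    end
    return xs, ys, grid
end

Z = [0.0 0.0; 1.0 2.0; -1.0 3.0]                # toy 2-D scores
xs, ys, grid = grid_points(Z; n=4)

# Toy decision rule standing in for predict(mach2, ...): class 2 where x > 0, else class 1.
labels = [row[1] > 0 ? 2 : 1 for row in eachrow(grid)]
M = reshape(labels, 4, 4)                       # M[i, j] is the class at (xs[i], ys[j])
```

Letting `x` vary fastest matters because Julia arrays are column-major: after the reshape, the first index of `M` runs along `xs`, which is the layout that `contourf!(ax, xs, ys, M)` expects.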

3.2 plot 2d results

Code
function plot_data()
    
    fig=Figure(resolution=(800,800))
    ax= maxdim==3 ? Axis3(fig[1,1]) : Axis(fig[1,1])
    colors=[:red, :yellow,:purple,:lightblue,:black,:orange,:pink,:blue,:tomato]
    contourf!(ax,tx,ty,yhat,levels=length(cat),colormap=:redsblues)
    for (c,color) in zip(cat,colors)
        data=Ytr[ytrain.==c,:]
        if maxdim==3
            scatter!(ax,data[:,1], data[:,2],data[:,3],color=(color,0.8),markersize=14)
        elseif maxdim==2
            scatter!(ax,data[:,1], data[:,2],color=(color,0.8),markersize=14)
        else
            return nothing
        end
    end
    fig
end
plot_data()

3.3 extend to 3D

Code
  let
    maxdim=3;nums=200
    model1=PCA(;maxoutdim=maxdim)
    model2 = SVC()
    mach1 = machine(model1, Xtrain) |> fit!
    Ytr =transform(mach1, Xtrain)
    mach2 = machine(model2, Ytr, ytrain)|>fit!
    Yte=transform(mach1, Xtrain)
    fig=Figure(resolution=(800,800))
    ax= maxdim==3 ? Axis3(fig[1,1]) : Axis(fig[1,1])
    colors=[:red, :yellow,:purple,:lightblue,:black,:orange,:pink,:blue,:tomato]
    
    for (c,color) in zip(cat,colors)
        data=Ytr[ytrain.==c,:]
        if maxdim==3
            scatter!(ax,data[:,1], data[:,2],data[:,3],color=(color,0.8),markersize=14;label=c)
        elseif maxdim==2
            scatter!(ax,data[:,1], data[:,2],color=(color,0.8),markersize=14;label=c)
        else
            return nothing
        end
    end
    
    fig
end
[ Info: Training machine(PCA(maxoutdim = 3, …), …).
[ Info: Training machine(SVC(kernel = RadialBasis, …), …).