In situ sampling design

Set directory

Set parameters

Set filenames

Open the prepared in situ data

Get "pixel ratio" for each polygon

Pixel ratio = number of 10m pixels belonging to a specific crop type divided by the total number of 10m pixels over all classes

$ pix\_ratio = \frac{crop\_pix}{total\_pix} $

All polygons of the same class have the same pixel ratio!
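
The pixel ratio can be illustrated with a minimal sketch. The polygon list, the crop names and the pixel counts below are hypothetical, and plain Python is used here instead of the notebook's own data structures.

```python
from collections import defaultdict

# Hypothetical polygons: (polygon id, crop type, number of 10m pixels).
polygons = [
    (1, "wheat", 400),
    (2, "wheat", 300),
    (3, "maize", 150),
    (4, "maize", 100),
    (5, "flax", 50),
]

# Sum the pixels of all polygons per crop type (crop_pix).
crop_pix = defaultdict(int)
for _, crop, n_pix in polygons:
    crop_pix[crop] += n_pix

# Total number of pixels over all classes (total_pix).
total_pix = sum(crop_pix.values())

# One pixel ratio per class, shared by all polygons of that class.
pix_ratio = {crop: n / total_pix for crop, n in crop_pix.items()}

for crop, ratio in pix_ratio.items():
    print(f"{crop}: {ratio:.2%}")
```

Because the ratio is computed per class and not per polygon, polygons 1 and 2 both inherit the wheat ratio regardless of their individual sizes.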

Assign sampling design strategy for each polygon

The polygons selected for the classification are used either to calibrate the classifier model or to validate the results. Although the classification is run per pixel, this split must be done at the polygon level so that pixels from the same polygon never end up in both sets, which ensures a properly independent validation following international standards.
This polygon-level split is done in 3 steps:

Strategy 1 — majority classes

By default, the majority classes are those whose number of 10m pixels is at least 5% (pix_ratio_threshold) of the total number of pixels over all classes to map.

A crop type belongs to strategy 1 if:

$pix\_ratio \geq pix\_ratio\_threshold$

In this case, for each majority class, the number of training pixels $crop\_target$ is 25% (sample_ratio_hi) of the number of pixels of the class (crop_pix):

$crop\_target = sample\_ratio\_hi \times crop\_pix$

Strategy 2 — minority classes

The minority classes are those whose number of 10m pixels is less than 5% (pix_ratio_threshold) of the total number of pixels over all classes to map.

A crop type belongs to strategy 2 if:

$pix\_ratio < pix\_ratio\_threshold$

In this case, for each minority class, the number of training pixels $crop\_target$ is 75% (sample_ratio_lo) of the number of pixels of the class (crop_pix):

$crop\_target = sample\_ratio\_lo \times crop\_pix$
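
The two strategies can be sketched together as follows. The function name `assign_cal_val`, the tuple layout and the example data are hypothetical. The way polygons are picked (shuffle, then send whole polygons to calibration until `crop_target` is reached) is an assumption made for illustration, not necessarily the procedure used in this notebook.

```python
import random

def assign_cal_val(polygons, pix_ratio_threshold=0.05,
                   sample_ratio_hi=0.25, sample_ratio_lo=0.75, seed=0):
    """Return {polygon id: "CAL" or "VAL"} for (id, crop, n_pix) tuples."""
    # Pixels per class (crop_pix) and over all classes (total_pix).
    crop_pix = {}
    for _, crop, n_pix in polygons:
        crop_pix[crop] = crop_pix.get(crop, 0) + n_pix
    total_pix = sum(crop_pix.values())

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    split = {}
    for crop, n_crop in crop_pix.items():
        # Strategy 1 (majority class) or strategy 2 (minority class).
        pix_ratio = n_crop / total_pix
        if pix_ratio >= pix_ratio_threshold:
            crop_target = sample_ratio_hi * n_crop
        else:
            crop_target = sample_ratio_lo * n_crop

        # Assumption: whole polygons go to calibration, in random order,
        # until the pixel target of the class is reached.
        members = [(pid, n) for pid, c, n in polygons if c == crop]
        rng.shuffle(members)
        cal_pix = 0
        for pid, n in members:
            if cal_pix < crop_target:
                split[pid] = "CAL"
                cal_pix += n
            else:
                split[pid] = "VAL"
    return split

# Hypothetical polygons: (polygon id, crop type, number of 10m pixels).
polygons = [
    (1, "wheat", 400), (2, "wheat", 300),
    (3, "maize", 150), (4, "maize", 100),
    (5, "flax", 30), (6, "flax", 10),
]
split = assign_cal_val(polygons)
print(split)
```

Since polygons are never cut, the calibration set of a class can overshoot `crop_target`, and a class with very few polygons may end up with no validation polygon at all. A real workflow would need to handle that case explicitly.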

Get summary of the CAL/VAL splitting
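
A summary of this kind could look like the sketch below, which counts polygons and pixels per class and per set. The records and the "CAL"/"VAL" labels are hypothetical stand-ins for the attributes of the split polygons.

```python
from collections import defaultdict

# Hypothetical split result: (crop type, number of 10m pixels, set label).
polygons = [
    ("wheat", 400, "VAL"),
    ("wheat", 300, "CAL"),
    ("maize", 150, "CAL"),
    ("maize", 100, "VAL"),
    ("flax", 40, "CAL"),
    ("flax", 10, "VAL"),
]

# For each crop and set, store [number of polygons, number of pixels].
summary = defaultdict(lambda: {"CAL": [0, 0], "VAL": [0, 0]})
for crop, n_pix, label in polygons:
    summary[crop][label][0] += 1
    summary[crop][label][1] += n_pix

for crop, counts in summary.items():
    cal_poly, cal_pix = counts["CAL"]
    val_poly, val_pix = counts["VAL"]
    cal_share = cal_pix / (cal_pix + val_pix)
    print(f"{crop}: CAL {cal_poly} polygons / {cal_pix} px, "
          f"VAL {val_poly} polygons / {val_pix} px, "
          f"CAL share {cal_share:.0%}")
```

Comparing the calibration share of each class with its sampling ratio shows how far the polygon-level split deviates from the pixel target.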

Plot calibration and validation in situ data

Write the GeoDataFrame to a shapefile