
Any classification, whether produced with a machine learning tool such as Random Forest or by thresholding a spectral index, should be validated against in situ data (ground data). When in situ data are also used to generate the classification, the data used for validation must be independent of the data used to generate it.

The validation data are used to assess map accuracy, defined as the agreement between the map output and the validation data, which are assumed to be the truth. The most common way to derive map accuracy is to analyse the confusion matrix, a square co-occurrence matrix that counts, for each land cover class on the map, how many samples fall into each class according to the validation data. Diagonal values count the samples on which the map output and the validation data agree, while off-diagonal values count the errors.
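As an illustration of how such a matrix is compiled, here is a minimal Python sketch. The function name `confusion_matrix`, the class names and the six sample labels are all hypothetical, and the orientation (rows for the map output, columns for the validation data) is an assumption, since the text does not fix one.

```python
def confusion_matrix(map_labels, validation_labels, classes):
    """Count co-occurrences of map class (row) and validation class (column).

    The row/column orientation is an assumption made for this sketch.
    """
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for mapped, reference in zip(map_labels, validation_labels):
        matrix[index[mapped]][index[reference]] += 1
    return matrix


# Hypothetical labels for six validation samples
classes = ["crop", "forest", "water"]
map_labels = ["crop", "crop", "forest", "forest", "water", "crop"]
validation_labels = ["crop", "forest", "forest", "forest", "water", "crop"]

cm = confusion_matrix(map_labels, validation_labels, classes)
for row in cm:
    print(row)
# [2, 1, 0]
# [0, 2, 0]
# [0, 0, 1]
```

The five samples on the diagonal are agreements. The single off-diagonal count is a sample mapped as crop that the validation data identify as forest.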

Among fourteen class-level and twenty map-level accuracy metrics, Liu et al. (2007) recommended user accuracy (UA), producer accuracy (PA) and overall accuracy (OA) as the primary accuracy measures. For binary maps such as a cropland mask, the OA depends to a large extent on the relative proportions of the two classes in the validation data set. In this case, the F-score, which has recently come into use, is a more informative accuracy metric.
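The dependence of OA on class proportions can be seen in a small worked case. The sketch below uses invented numbers: a validation set of 5 cropland and 95 non-cropland samples, and a map that labels every sample as non-cropland. The F-score is taken here as the harmonic mean of UA and PA. Setting a ratio to 0 when its denominator is empty is a convention chosen for this sketch, not something the text prescribes.

```python
def overall_accuracy(tp, fp, fn, tn):
    """Share of all validation samples that the map labels correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f_score(tp, fp, fn):
    """Harmonic mean of UA and PA for the class of interest.

    Ratios with an empty denominator are set to 0.0 (sketch convention).
    """
    ua = tp / (tp + fp) if (tp + fp) else 0.0
    pa = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * ua * pa / (ua + pa) if (ua + pa) else 0.0


# Hypothetical counts for the cropland class: the map finds no cropland at all
tp, fp, fn, tn = 0, 0, 5, 95

oa = overall_accuracy(tp, fp, fn, tn)
f_cropland = f_score(tp, fp, fn)
print(oa, f_cropland)  # 0.95 0.0
```

The map misses every cropland sample yet still reaches an OA of 95 percent, whereas the F-score for cropland is 0.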

The Overall Accuracy (OA) is the ratio of the number of correctly classified samples to the total number (N) of validation samples. A common target for the overall accuracy of a land cover map is 85 percent. In some cases, simple land cover maps that include very few classes can reach 90 percent.

The User Accuracy (UA) for a given land cover class i is the ratio of the number of samples correctly classified as class i to the number of all samples classified as class i.

The Producer Accuracy (PA) for a given class i is the ratio of the number of samples correctly classified as class i to the number of all samples that belong to class i according to the validation data.
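In confusion-matrix notation, these three measures can be written compactly. The symbols below are assumptions introduced for illustration: n_ij is the number of samples mapped as class i whose validation class is j (rows for the map, columns for the validation data), k is the number of classes, and N is the total number of validation samples.

```latex
OA = \frac{\sum_{i=1}^{k} n_{ii}}{N}, \qquad
UA_i = \frac{n_{ii}}{\sum_{j=1}^{k} n_{ij}}, \qquad
PA_i = \frac{n_{ii}}{\sum_{j=1}^{k} n_{ji}}
```

Under this orientation, the denominator of UA is the row total (all samples mapped as class i) and the denominator of PA is the column total (all validation samples of class i).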

The F-score for a given land cover class i combines PA and UA as their harmonic mean:

$$F_i = \frac{2 \cdot PA_i \cdot UA_i}{PA_i + UA_i}$$
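The four measures can be computed together from a confusion matrix, as in the following Python sketch. The function `accuracy_metrics`, the matrix orientation (rows for the map output, columns for the validation data) and the two-class counts are hypothetical. The F-score is computed as the harmonic mean of UA and PA, and ratios with an empty denominator are set to 0 by convention.

```python
def accuracy_metrics(matrix):
    """Return OA and per-class UA, PA and F-score.

    matrix[i][j] is the number of samples mapped as class i whose
    validation class is j (orientation assumed for this sketch).
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    oa = sum(matrix[i][i] for i in range(k)) / total

    per_class = []
    for i in range(k):
        mapped = sum(matrix[i])                           # row total
        reference = sum(matrix[j][i] for j in range(k))   # column total
        ua = matrix[i][i] / mapped if mapped else 0.0
        pa = matrix[i][i] / reference if reference else 0.0
        f = 2 * ua * pa / (ua + pa) if (ua + pa) else 0.0
        per_class.append({"UA": ua, "PA": pa, "F": f})
    return oa, per_class


# Hypothetical binary matrix: class 0 = cropland, class 1 = non-cropland
cm = [[40, 10],
      [5, 45]]

oa, per_class = accuracy_metrics(cm)
print(round(oa, 3))                 # 0.85
print(round(per_class[0]["UA"], 3))  # 0.8
print(round(per_class[0]["PA"], 3))  # 0.889
print(round(per_class[0]["F"], 3))   # 0.842
```

In this invented example the cropland class has a lower UA than PA: the map labels as cropland more samples than the validation data support.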


Handbook on remote sensing for agricultural statistics