Downscaling legacy soil information for hydrological soil mapping using multinomial logistic regression
Date
2023
Authors
Smit, I.E.
Van Zijl, G.M.
Riddell, E.S.
Van Tol, J.J.
Publisher
Elsevier
Abstract
In South Africa, there is a growing demand for large-scale, detailed hydrological soil maps for modelling and management purposes. However, legacy soil information often impedes the accurate creation of such maps because it is not representative of the environmental complexity of large catchments and contains imbalanced soil class distributions. This often results in the loss of minority soil classes, which are frequently of great hydrological importance (e.g., wetland and riparian soils). In this study, we proposed a new downscaling approach to handle spatially localised legacy soil data within a larger low-resolution legacy soil dataset, in order to create an accurate hydrological soil map of the macro-scale (5790 km²) Sabie-Sand catchment using multinomial logistic regression (MNLR). The spatially localised legacy data was downscaled using k-means clustering and added to the broader legacy dataset. Five levels of legacy soil data were analysed for their representation of environmental covariates, using QQ-plots and Welch's t-test, and for their mapping accuracy, using confusion matrices and Kappa coefficient statistics. Because MNLR also requires balanced soil classes, the best-performing legacy soil dataset was compared with the dataset containing all available soil information after both had their soil class distributions fully balanced using the Synthetic Minority Oversampling Technique (SMOTE). The 500 ha/observation-SMOTE dataset resulted in the most accurate hydrological soil map, with a validation point accuracy of 73% and a Kappa coefficient of 0.60, substantially outperforming the other downscaled soil maps as well as the SMOTE-balanced dataset using all available soil information. This was due to the decreased variation between observation and catchment means: the 500 ha/observation dataset yielded the least variation between the soil observation and catchment datasets, as well as reducing the class imbalance within the legacy soil data.
Downscaling spatially localised legacy soil data for environmental representation is an effective tool to improve digital soil mapping accuracy using MNLR.
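The two accuracy figures quoted in the abstract (validation point accuracy and the Kappa coefficient) are both derived from a confusion matrix. The following is a minimal sketch of that calculation, not the authors' code: the function name `accuracy_and_kappa`, the three class labels, and all counts are hypothetical and chosen only for illustration. They are not results from the study.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of validation points whose observed class
    is i and whose mapped (predicted) class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no observations")

    # Observed agreement: share of validation points on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total

    # Chance agreement: product of row and column marginals, summed per class.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts for three illustrative hydrological soil classes
# (rows: observed class, columns: mapped class). Not data from the study.
example = [
    [30, 5, 5],   # e.g. "recharge"
    [4, 20, 6],   # e.g. "interflow"
    [6, 5, 19],   # e.g. "responsive"
]

acc, kappa = accuracy_and_kappa(example)
print(f"accuracy={acc:.2f} kappa={kappa:.2f}")  # accuracy=0.69 kappa=0.53
```

Kappa discounts the agreement expected by chance from the class marginals, which is why it is reported alongside point accuracy when soil classes are imbalanced.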
Keywords
Hydropedology, K-means clustering, Digital soil mapping, SMOTE
Citation
Smit, I. E., Van Zijl, G. M., Riddell, E. S., & Van Tol, J. J. (2023). Downscaling legacy soil information for hydrological soil mapping using multinomial logistic regression. Geoderma, 436, 116568. https://doi.org/10.1016/j.geoderma.2023.116568