• Multiresolution digital soil mapping of permafrost soils using a random forest classifier: an investigation along the Dalton Highway corridor, Alaska

      Paul, Joshua D.; Ping, Chien-Lu; Prakash, Anupma; Rossello, Jordi Cristobal; Libohova, Zamir (2018-12)
      To complete soil inventories in the remote permafrost zones of Alaska, there is a need to develop efficient digital soil mapping tools that can be applied over large areas using a minimum of ground truth data. This investigation first used a random forest classifier to test combinations of environmental input data at multiple resolutions (10 m, 30 m, and 100 m). Five tiers of soil taxonomic units were predicted: Order, Suborder, Great Group, "Series Concept", and Particle Size Class. Model outputs were compared quantitatively via estimated out-of-bag accuracy, and qualitatively via visual inspection by soil scientists. Estimated out-of-bag accuracy ranged from ~45% to ~75%, with results improving when fewer classes were modeled. Model runs at 10 m and 30 m resolution performed comparably, with 100 m resolution performing ~5-10% worse in most cases. Increasing the number of trees used, including categorical environmental input data (e.g., landforms), and replacing environmental covariates with principal component analysis (PCA) bands did not significantly improve model performance. The random forest classifier was then used in a digital soil mapping pilot study along the Dalton Highway in northern Alaska. Parameters suggested in the initial study were used to predict multiple soil taxonomic classes from a basic collection of environmental covariates generated using high-resolution (10 m) satellite images and sparsely sampled pedon data. Covariates included maximum curvature, multiresolution valley bottom flatness, normalized height, potential incoming solar radiation, slope, terrain ruggedness index, and modified soil and vegetation index. Five tiers of soil taxonomic units were predicted: Order, Suborder, Great Group, "Series Concept", and Particle Size Class. Model outputs were compared quantitatively via estimated out-of-bag accuracy. Estimated out-of-bag accuracy ranged from ~45% to ~75%, with results improving when fewer classes were modeled.
We suggest future research into optimized sampling to ensure an adequate distribution of samples across the feature space, and the incorporation of expert knowledge into accuracy assessments. Overall, digital soil mapping with random forest classifiers appears to be a promising method for completing the soil survey of Alaska.
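
The out-of-bag accuracy estimate mentioned in the abstract can be illustrated with a small sketch. This is not the authors' workflow: the data are synthetic, the function names (`fit_stump`, `oob_accuracy`) and class labels are hypothetical, and a one-split "stump" stands in for a full decision tree to keep the example short. The sketch only shows the general idea, assuming a bagged ensemble: each tree is trained on a bootstrap sample, and each observation is scored using only the trees that did not see it during training.

```python
import random
from collections import Counter


def fit_stump(X, y, rng):
    """Fit a one-split 'tree' on one randomly chosen covariate (simplified stand-in)."""
    f = rng.randrange(len(X[0]))  # random feature choice, loosely as in a random forest
    best = None
    for t in sorted({row[f] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[f] <= t]
        right = [lab for row, lab in zip(X, y) if row[f] > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
        errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    _, t, l_lab, r_lab = best
    return lambda row: l_lab if row[f] <= t else r_lab


def oob_accuracy(X, y, n_trees=200, seed=0):
    """Estimate accuracy from votes cast only by trees that did not train on each sample."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, with replacement
        in_bag = set(idx)
        tree = fit_stump([X[i] for i in idx], [y[i] for i in idx], rng)
        for i in range(n):
            if i not in in_bag:  # sample i is out-of-bag for this tree
                votes[i][tree(X[i])] += 1
    # Score only samples that were out-of-bag for at least one tree.
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(votes) if v]
    return sum(pred == truth for pred, truth in scored) / len(scored)


# Synthetic "pedons": one informative covariate (slope-like) and one noise covariate.
rng = random.Random(42)
X, y = [], []
for label, mean_slope in (("class_a", 5.0), ("class_b", 15.0)):
    for _ in range(30):
        X.append([rng.gauss(mean_slope, 2.0), rng.uniform(0.0, 1.0)])
        y.append(label)

acc = oob_accuracy(X, y, n_trees=200, seed=1)
print(f"Estimated out-of-bag accuracy: {acc:.2f}")
```

Because every observation is evaluated only by trees that never trained on it, the estimate needs no separate validation set, which is one reason it suits sparsely sampled pedon data. In practice a library implementation of random forests would normally be used rather than a hand-written ensemble like this one.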