

0038-075X/08/17301-25-34    January 2008
Soil Science    Vol. 173, No. 1
Copyright © 2008 by Lippincott Williams & Wilkins, Inc.    Printed in U.S.A.

LINEAR REGRESSION MODELS TO ESTIMATE SOIL LIQUID LIMIT AND PLASTICITY INDEX FROM BASIC SOIL PROPERTIES

Cathy A. Seybold, Moustafa A. Elrashidi, and Robert J. Engel

In Soil Survey, there is a need to estimate liquid limit (LL) and plasticity index (PI) for areas where data are not available. The objectives were to determine whether LL and PI prediction equations could be developed from soil properties readily available in Soil Survey, and to test two data stratification approaches for improving predictability. Measured data in the National Soil Survey Characterization database and multiple linear regression were used for model development. Clay content (<2 μm) and cation exchange capacity were the primary variables used to predict both LL and PI. To predict LL, four equations were developed from 10 taxonomic soil order strata (an aggregate of seven soil order strata, plus Andisols, Spodosols, and Vertisols) that explained between 68% and 81% of the variation in LL, with the Andisols order having the lowest predictability. To predict PI, 10 unique taxonomic soil order equations were developed (Aridisols, Alfisols, Entisols, Inceptisols, Mollisols, Oxisols, Ultisols, Andisols, Spodosols, and Vertisols) that explained between 15% and 77% of the variation in PI, with the Andisols order having the lowest predictability. A few prediction equations were developed from the taxonomic mineralogy strata; these models had predictability similar to that of the soil order equations. Validation of the best-fitting models with an independent data set showed no significant difference from a slope of 1 and an intercept of 0. Predicting LL and PI from readily available soil properties resulted in mostly moderate to strong prediction equations. The most useful equations are those with R² > 0.60. These prediction equations can be useful in Soil Survey when no measured data are available. (Soil Science 2008;173:25-34)

Key words: Liquid limit, plasticity index, prediction, general linear models.

USDA-NRCS, National Soil Survey Center, 100 Centennial Mall North, Federal Bldg, Room 152, Lincoln, NE. Dr. Cathy A. Seybold is corresponding author. E-mail: [email protected]
Received Dec. 20, 2006; accepted Aug. 27, 2007.
DOI: 10.1097/ss.0b013e318159a5e1

The Atterberg limits are moisture content limits that divide the states of soil consistency, which is the degree of resistance to deformation. Three limits separate these states: the shrinkage limit, which separates the solid state from the semisolid state; the plastic limit (PL), which separates the semisolid state from the plastic state; and the liquid limit (LL), which separates the plastic state from the liquid state (PCA, 1992). The width of the plastic state (LL minus PL), in terms of moisture content, is the plasticity index (PI). Plasticity is the capability of a soil to undergo unrecoverable deformation at constant volume without cracking or crumbling (McBride, 2002). The LL and PI are used in soil survey for interpreting soils for engineering classifications and other engineering purposes. The Atterberg limits are important for classifying cohesive soil materials and are useful for interpreting soils for shear strength, bearing capacity, compressibility, and swelling potential (McBride, 2002). However, LL and PI determinations are cumbersome and time consuming, and they are not part of routine soil survey characterization analysis. They are carried out on an ad hoc basis, and such data are not


generally available. A need remains in soil survey for a quick and reliable method of estimating the LL and PI that can be applied to a wide range of soils. In addition, the LL and PI need to be predicted from soil properties that are generally available in soil survey.

There are a limited number of unpublished predictions of LL and PI from soil characterization data that were developed in the 1950s and the early 1970s. These models were developed using least squares estimates and focused on the relationship of clay content (<2 μm) to the Atterberg limits. However, these models can be applied only to a few soil types (e.g., kaolinitic clays) or to specific areas or regions (e.g., Iowa loess). In addition, the A horizons were usually eliminated from the analyses. In these unpublished models, clay was found to explain between 66% and 96% of the variability in the LL, and between 71% and 93% of the variability in the PI.

Others have also shown clay to be closely correlated to the Atterberg limits (De Jong et al., 1990; Mbagwu and Abeh, 1998; Odell et al., 1960; Smith et al., 1985). In B and C horizons, total clay content was found to be the most important independent variable for explaining variation in the Atterberg limits (De Jong et al., 1990). In general, the greater the quantity of total clay in a soil, the greater the plasticity and the shrink-swell potential (Mitchell, 1993). In addition to the amount of clay, the type of clay and the size and shape of the particles affect the Atterberg limits (Mitchell, 1993; Baver, 1930). In 26 representative soils of Illinois (including A, B, and C horizons), Odell et al. (1960) showed that the percentage of montmorillonite in the clay fraction is strongly correlated to the Atterberg limits. In addition, Odell et al. (1960) found that the organic C (OC), total clay, and the quantity of montmorillonite in the clay fraction could explain 86% and 94% of the variability in the LL and PI, respectively. Dumbleton and West (1966) showed that soils of the same mineralogical type but different origins can vary considerably in plasticity and other physical properties. As pointed out by Mitchell (1993), the LL and PI values for any one clay mineral can vary over a wide range.

In general, the greater the surface area, the greater the amount of water needed to reach the LL state (Seed et al., 1964; Mitchell, 1993). In 19 British clay soils, total surface area was highly correlated to the LL, and to a lesser extent to the PI (Farrar and Coleman, 1967). Smith et al. (1985) found the LL to be more closely correlated to the specific surface area than to the clay content in 66 soil samples taken from 32 sites across Israel.

To a lesser extent than clay, OC content has been shown to be correlated to the Atterberg limits (De Jong et al., 1990; Mbagwu and Abeh, 1998; Odell et al., 1960; Larney et al., 1988). De Jong et al. (1990) found OC content to be as important as the clay content in explaining the variation in the LL of Ap horizons. In the B and C horizons of the same study, OC was a poor predictor of the Atterberg limits. Mbagwu and Abeh (1998) found organic matter to be best correlated with the PL. In mainly kaolinitic clay soils of Florida, organic matter was found to be weakly correlated to the PI (De La Rosa, 1979). It has been shown that organic matter can increase the PL without increasing the magnitude of the PI (Baver, 1930; Smith et al., 1985). In other words, two soils may have the same PI but exhibit plasticity over entirely different moisture ranges.

Cation exchange capacity (CEC) can be an indication of mineral type and has been shown to be highly correlated to the LL (Mbagwu and Abeh, 1998; Odell et al., 1960; Farrar and Coleman, 1967) and to a lesser extent to the PI (Odell et al., 1960; Mbagwu and Abeh, 1998). In 30 samples from Nigeria, 79.9% of the variation in the LL was explained by the CEC alone (Mbagwu and Abeh, 1998). De La Rosa (1979) found that clay, CEC, organic matter, and their interactions explained 97% of the variation in the PI of 38 soil series (54 samples) from Florida. Exchangeable cations affect the PI by affecting the hydration of the clays (Baver, 1930; Mitchell, 1993). These studies indicate that CEC could be an important variable in predicting LL and PI.

In the National Soil Survey Handbook, there are LL and PI prediction equations that were developed from a broad range of soil properties (USDA-NRCS, 2005). However, these two prediction equations do not use CEC, and the accuracy of predicting LL and PI could be improved if this diverse data set were stratified into more homogeneous soil groups. The objectives of this study were (i) to determine the relationship between soil properties that are readily obtained in soil survey (e.g., CEC, clay, OC) and LL and PI; and (ii) to determine whether useful prediction models could be developed after stratifying by taxonomic order or taxonomic family mineralogy class. These prediction


models will benefit the Natural Resources Conservation Service field soil scientists making entries into the National Soils Information System, which is the United States' national soil survey database. More importantly, these models should improve the accuracy of estimated LL and PI data, which will benefit all users of soil survey data and their interpretations.

MATERIALS AND METHODS

The pre-1999 data in the National Soil Survey Laboratory Characterization Database at Lincoln, Nebraska, were used to develop the LL and PI models. This database contains about 10,000 horizons with measured LL and PI data, representing soils from across the continental United States, Hawaii, and Alaska. Relevant data in the database include taxonomic classifications, morphological descriptions, horizon designations, and analytical data such as OC, exchange properties, particle size separates, pH, and water retention characteristics.

Basic soil properties evaluated as potential predictor variables were pH in water and in 0.01 M CaCl2; total clay, silt, and sand (pipette method); OC (acid-dichromate digestion); water content at -1500 kPa (pressure-membrane extraction using sieved samples); CEC (1.0 N NH4OAc at pH 7); carbonate clay; bulk density at -33 kPa water content (clod method); and linear extensibility percent (LEP). All methods are described by Burt (2004). All determinations are from air-dried (30 °C-35 °C), crushed, and sieved (<2 mm) soil samples. Data are reported on an oven-dry basis (Burt, 2004). Liquid limit and PL were determined by American Society for Testing and Materials method D 4318 on a <0.4-mm basis. The PI is the difference between the LL and PL. If either the LL or PL could not be determined, or if the PL was greater than the LL, the soil was eliminated from the data set. In addition, the samples were further restricted to those exhibiting some degree of plasticity. A -1500-kPa water to clay ratio greater than 0.6 has been used to indicate poor dispersion in particle size determinations (Soil Survey Staff, 1995). Poorly crystalline materials also tend to increase this ratio. Where clay is used as a predictor variable, data with a -1500-kPa water to clay ratio greater than 0.6 were excluded from the data set before model development.
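The screening rules in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, argument names, and the choice to return `None` for excluded samples are hypothetical, and the water to clay filter is applied unconditionally here although the study applied it only where clay was a predictor variable.

```python
from typing import Optional


def screen_sample(ll: Optional[float], pl: Optional[float],
                  w1500: float, clay: float,
                  max_ratio: float = 0.6) -> Optional[float]:
    """Return PI (= LL - PL) for a usable sample, or None if it is screened out.

    ll, pl : measured liquid and plastic limits (%), None if not determinable
    w1500  : water content at -1500 kPa (%)
    clay   : total clay (%)
    All names are illustrative; they do not come from the source database.
    """
    # LL or PL could not be determined.
    if ll is None or pl is None:
        return None
    # PL greater than LL is treated as an invalid record.
    if pl > ll:
        return None
    pi = ll - pl
    # Keep only samples that exhibit some degree of plasticity.
    if pi <= 0:
        return None
    # A -1500-kPa water to clay ratio above 0.6 suggests poor clay dispersion.
    if clay <= 0 or w1500 / clay > max_ratio:
        return None
    return pi
```

For example, a sample with LL = 40, PL = 20, 12% water at -1500 kPa, and 30% clay passes (ratio 0.4) and yields PI = 20, whereas the same sample with 25% water at -1500 kPa would be excluded.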

Two different stratifications of the data (by taxonomic order and by taxonomic family mineralogy) were evaluated as a way to improve the accuracy of model estimations. Models were first developed from the whole data set, and then developed from strata of the taxonomic orders or family mineralogy. Very few soils classified as Histosols or Gelisols had LL and PI data, and thus these soils were excluded from the study. Only seven of the family mineralogy classes had enough data (n > 30) from which a regression model could be developed. Three of these mineralogy classes (i.e., carbonatic, parasesquic, and illitic) had fewer than 60 records, and the kaolinitic and siliceous mineralogy classes had about 300 and 200 records, respectively. The mixed and smectitic classes, which are the most common mineralogy classes, had the most data. The rest of the records did not have a family mineralogy class identified. Because of the limited data available for the mineralogy classes, models were developed only for the mixed, smectitic, and kaolinitic mineralogy classes.

Liquid limit and PI were estimated using general linear model procedures in SYSTAT Software (2002). For each stratum (data group), the best-fit regression model (the one with the highest R² and lowest root mean square error [RMSE]) was developed. Pearson correlations were performed to determine variable collinearity and to help in the selection of predictive variables. Only data elements that contributed significantly (P = 0.05) to predicting the LL or PI and that contributed more than 5% to the overall improvement of the R² were included in the equations. Scatter plots of the residuals versus the fitted values of each model were used to check for nonlinearity, unequal variances, and outliers in the data. All outliers, as identified by the studentized residual in SYSTAT Software (2002), were removed from the data groups. Variables were then added to and removed from the general linear model until the best model was found, one whose predictive variables were statistically significant, intuitively meaningful, and readily obtainable within the National Soils Information System of the Natural Resources Conservation Service. A dummy variable regressor (taxonomic mineralogy/order) was used to evaluate model redundancy between predictive equations (from different data strata) with the same variables (Fox, 1997). The post hoc Tukey test (a multiple mean comparison procedure) was used for comparison of equation intercepts (Zar, 1999). When intercepts between two equations were not significantly different, the slope coefficients were compared by checking the significance of the interaction terms (dummy variable and predictive variable). When redundant equations were indicated (no significant difference between slope coefficients and intercepts), the data groups (strata) were combined, and a new regression model was developed. Significant differences were determined at P = 0.05.

TABLE 1
The range in selected properties of soils used in developing LL and PI regression equations

Soil property            Range         Mean     SD
pH (water)               3.3-10.5       6.7     1.4
CEC (cmol(+) kg⁻¹)       0.4-98.5      19.5    12.4
Clay (%)                 4.0-94.7      33.4    16.6
Organic C (%)            0.01-11.06     0.77    0.85
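The two fit criteria named in the methods (R² and RMSE of a multiple linear regression) can be illustrated with a small, dependency-free least squares routine. This is a sketch, not the SYSTAT procedure the authors used: the function name is hypothetical, and RMSE is computed here as the residual standard deviation with n - k degrees of freedom, which is an assumption about the definition.

```python
import math


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows : list of predictor tuples, e.g. (clay, CEC), without an intercept
    y    : list of responses, e.g. measured LL
    Returns (coefficients with the intercept last, R2, RMSE).
    """
    X = [list(r) + [1.0] for r in rows]            # append intercept column
    k = len(X[0])
    # Augmented normal-equation matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / (len(y) - k))        # residual SD (assumed definition)
    return beta, r2, rmse
```

Candidate models for a stratum could then be compared by calling `fit_ols` with different predictor sets and keeping the one with the higher R² and lower RMSE, subject to the significance checks described above, which this sketch does not perform.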

Because of the limited amount of available data (n = 516), only four of the better-fitting predictive equations (i.e., the soil orders LL equation and the Alfisols, Mollisols, and Ultisols PI models) were validated with an independent data set. Measured LL and PI data were taken from the National Soil Survey Laboratory Characterization Database for the years 2000 to 2005. The independent data set represents pedons from across the United States, including Alaska and Hawaii. Models were evaluated by comparing measured versus predicted LL or PI values. Confidence intervals were calculated for the slope and intercept of the least squares regression line. Statistically significant differences were determined using P = 0.05.
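The validation check described here (whether the regression of measured on predicted values has a slope of 1 and an intercept of 0) might be sketched as follows. The function name and return structure are hypothetical, and the interval uses a normal quantile in place of the t quantile, an approximation that is reasonable only for large validation sets.

```python
import math
from statistics import NormalDist


def validate_one_to_one(predicted, measured, level=0.95):
    """Regress measured on predicted and test the line against y = x.

    Returns confidence intervals for slope and intercept and two flags
    saying whether they contain 1 and 0, respectively.
    """
    n = len(predicted)
    mx = sum(predicted) / n
    my = sum(measured) / n
    sxx = sum((x - mx) ** 2 for x in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(predicted, measured))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(predicted, measured))
    s2 = ss_res / (n - 2)                          # residual variance
    se_slope = math.sqrt(s2 / sxx)
    se_int = math.sqrt(s2 * (1.0 / n + mx * mx / sxx))
    # Normal approximation to the t quantile (assumption; fine for large n).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    slope_ci = (slope - z * se_slope, slope + z * se_slope)
    int_ci = (intercept - z * se_int, intercept + z * se_int)
    return {
        "slope_ci": slope_ci,
        "intercept_ci": int_ci,
        "unit_slope": slope_ci[0] <= 1.0 <= slope_ci[1],
        "zero_intercept": int_ci[0] <= 0.0 <= int_ci[1],
    }
```

A model would be considered validated in the sense used here when both flags are true.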

RESULTS AND DISCUSSION

The ranges in properties of soils used in developing the prediction equations are shown in Table 1. The clay content ranged from 4% to 95%, and OC ranged from 0.01% to 11%. Nonplastic soil layers (PI = 0) were not used. Clay content, CEC, and -1500 kPa water were the properties most highly correlated to the LL and also to the PI (Table 2). De Jong et al. (1990) also found water retention to be a good index for the Atterberg limits. In general, the correlation coefficients were lower for the soil property correlations with the PI than with the LL. Others have also reported lower correlations of soil properties with the PI than with the LL (De Jong et al., 1990). Bulk density was significantly and negatively correlated to the LL (r = -0.55). The sand content was significantly and negatively correlated to the PI (r = -0.45) and to a lesser extent to the LL (r = -0.38). Organic C was not significantly correlated to either LL or PI. In contrast, Larney et al. (1988) found organic matter to be highly correlated to the LL and PI. In their study, only Ap horizons were used, and their data set contained a very narrow range in clay contents. In the present study, clay content covers nearly the entire range found in soils (Table 1). Clay content was highly correlated to the -1500-kPa water (r = 0.93). These two variables would provide redundant information if both were included in a regression model. Clay content is an easily obtained property in soil survey and is the preferred predictor variable over -1500-kPa water. However, in cases where clay content is not accurately measured (because of clay dispersion problems in the particle size analysis), -1500-kPa water is preferred as a predictor variable over clay content.

TABLE 2
Correlations between soil properties and LL and PI

Soil property†        LL      PI      pH      CEC     Clay    Silt    Sand    Db      LEP     OC
PI                    0.759
pH                   -0.043  -0.180
CEC (cmol(+) kg⁻¹)    0.655   0.487   0.283
Clay (%)              0.838   0.610  -0.062   0.545
Silt (%)             -0.094  -0.195   0.045   0.047  -0.276
Sand (%)             -0.377  -0.454   0.153  -0.319  -0.633  -0.408
Db (g cm⁻³)          -0.547  -0.182  -0.019  -0.488  -0.470  -0.016   0.200
LEP (%)               0.020   0.053   0.061   0.100   0.048  -0.029  -0.054  -0.075
OC (%)                0.160   0.099  -0.095   0.233   0.071   0.121  -0.126  -0.466   0.027
-1500 kPa water       0.898   0.606  -0.038   0.643   0.927  -0.172  -0.543  -0.576   0.049   0.135

†Db: soil bulk density.
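The redundancy argument for clay content and -1500-kPa water rests on their pairwise Pearson correlation. A minimal sketch of such a collinearity screen is given below; the 0.9 threshold and the function names are assumptions chosen for illustration, since the text does not state a cutoff.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(columns, threshold=0.9):
    """List predictor pairs whose |r| exceeds the threshold.

    columns   : dict mapping predictor name to its list of values
    threshold : hypothetical cutoff, not taken from the source
    """
    names = sorted(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) > threshold:
                flagged.append((a, b, r))
    return flagged
```

With the correlation reported in Table 2 (r = 0.927), clay and -1500-kPa water would be flagged under this cutoff, so only one of the two would be offered to a given model.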

Carbonate clay content was not significantly correlated to the LL (r = -0.02), PL (r = -0.11), or PI (r = 0.01). Others have also reported that inorganic C has little or no effect on, or correlation with, the LL or PL (Odell et al., 1960; Smith et al., 1985). Conversely, in highly calcareous soils of Egypt, Stakman and Bishay (1976) showed increasing LL with increasing CaCO3 content up to about 35% CaCO3. In their study, the PL showed only a slight increase, and thus the PI showed the same tendency as the LL. Inorganic C generally reduced the Atterberg limits and the PI of B and C horizons in a study by De Jong et al. (1990). However, the impact of inorganic C was relatively small.

Prediction of LL

Clay content and CEC explained 81% of the variation in the LL of the entire data set (n = 6592), excluding data with -1500-kPa water to clay ratios greater than 0.6. The regression equation is:

LL = 0.655 (clay) + 0.406 (CEC) + 12.459    (1)

The equation RMSE (SD about the regression line) is 6.8%. De La Rosa (1979) also used CEC and clay content, in addition to OC and their interactions, to explain nearly all the variation in the LL (R² = 0.97) of 34 soil samples from Florida. In the present study, no interactions significantly improved the prediction of LL from the entire data set.
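As a worked illustration, Eq. 1 can be written as a small function. The function names are hypothetical, and the plus-or-minus band built from the reported RMSE is only a rough indication of scatter about the regression line, not a formal prediction interval.

```python
RMSE_EQ1 = 6.8  # reported RMSE of Eq. 1, in LL percentage units


def predict_ll(clay_pct: float, cec: float) -> float:
    """Eq. 1: LL (%) from total clay (%) and CEC (cmol(+) kg^-1)."""
    return 0.655 * clay_pct + 0.406 * cec + 12.459


def ll_with_band(clay_pct: float, cec: float, k: float = 1.0):
    """Return (low, estimate, high) using +/- k * RMSE as a rough band.

    The band is an illustrative assumption, not part of the source model.
    """
    ll = predict_ll(clay_pct, cec)
    return ll - k * RMSE_EQ1, ll, ll + k * RMSE_EQ1
```

For the mean clay content and CEC in Table 1 (33.4% and 19.5 cmol(+) kg⁻¹), `predict_ll` returns an LL of about 42.3%.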

Using taxonomic order or taxonomic mineralogy alone as a dummy variable regressor explained 20% and 24% of the variation in the LL, respectively. This indicates that stratifying the data set by mineralogy or taxonomic order and developing a regression equation for each stratum could improve the prediction of LL.
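The share of variation explained by a grouping factor alone can be computed without fitting a regression explicitly, because a least squares model with one dummy variable per group fits each sample with its group mean. The sketch below uses that identity; the function name is hypothetical.

```python
from collections import defaultdict


def r2_from_groups(groups, y):
    """R2 of a regression on group dummy variables alone.

    With one dummy per group, each fitted value equals the group mean,
    so R2 = 1 - SS_within / SS_total.
    """
    by_group = defaultdict(list)
    for g, v in zip(groups, y):
        by_group[g].append(v)
    grand_mean = sum(y) / len(y)
    ss_total = sum((v - grand_mean) ** 2 for v in y)
    ss_within = sum((v - sum(vs) / len(vs)) ** 2
                    for vs in by_group.values() for v in vs)
    return 1.0 - ss_within / ss_total
```

Passing the soil order labels as `groups` and the measured LL values as `y` would give the kind of figure quoted above for taxonomic order.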

The database was stratified by taxonomic order, and equations were developed for each soil order (excluding Histosols and Gelisols). The resulting regression equations for the orders Entisols, Aridisols, Alfisols, Mollisols, Inceptisols, Oxisols, and Ultisols were not significantly different from each other (data not shown). Therefore, the data from these seven soil orders were combined, and an overall equation was developed (Table 3). Clay content and CEC explained 79% of the variation in LL when the data from the seven soil orders were combined.

TABLE 3
Prediction equations for LL and PI with their R², RMSE, and n values

Grouping       Prediction†                                                         R²     RMSE      n

Predicting LL
Orders‡        0.656 (clay) + 0.409 (CEC) + 12.154                                 0.79    6.694   4332
Vertisols      0.338 (clay) + 0.194 (CEC) + 1.462 (LEP) + 21.273                   0.78    6.244    310
Andisols       2.752 (OC) + 0.559 (w15bar) + 30.517                                0.68   10.755    121
Spodosols      0.986 (CEC) + 18.999                                                0.81    5.489     41
Kaolinitic     0.545 (clay) + 0.493 (CEC) - 19.552 (Db) + 41.876                   0.63   10.555    246

Predicting PI
Entisols       0.517 (clay) + 0.309 (CEC) - 2.326 (OC) - 1.533                     0.77    5.646    210
Aridisols      0.5 (clay) + 0.181 (CEC) - 1.156                                    0.64    5.631    635
Alfisols       0.685 (clay) - 0.003 (clay)² + 0.442 (CEC) - 6.118                  0.74    6.526   1022
Mollisols      0.917 (clay) - 0.005 (clay)² + 0.259 (CEC) - 8.199                  0.72    5.632   1459
Vertisols      0.132 (clay) + 1.53 (LEP) + 16.237                                  0.51    7.793    319
Inceptisols    0.741 (clay) - 0.005 (clay)² + 0.246 (CEC) - 4.402                  0.61    6.794    295
Oxisols        0.327 (clay) + 1.428                                                0.42    6.821    237
Ultisols       0.68 (clay) - 0.003 (clay)² - 1.292                                 0.59    5.997    480
Spodosols      0.355 (CEC) + 1.425                                                 0.69    2.207     40
Andisols       0.237 (w15bar) + 1.162 (OC) + 5.398                                 0.15   16.371     40
Kaolinitic     0.232 (clay) + 0.35 (CEC) - 16.604 (Db) + 33.273                    0.50    7.863    246
Mixed          0.75 (clay) - 0.004 (clay)² + 0.341 (CEC) - 1.848 (OC) - 8.895      0.69    5.429   2797

†clay: total clay (%); OC: organic carbon (%); w15bar: -1500 kPa water content; CEC: cation exchange capacity (cmol(+) kg⁻¹); LEP: linear extensibility percent (%); Db: bulk density (g cm⁻³).
‡Includes taxonomic soil orders Entisols, Aridisols, Alfisols, Mollisols, Inceptisols, Oxisols, and Ultisols.


Bulk density did not significantly improve the prediction of the combined soil order equation. A unique LL prediction equation was developed for each of the Vertisols, Andisols, and Spodosols soil orders (Table 3). For the Vertisols order, clay content, CEC, and LEP explained 78% of the variation in the LL. Linear extensibility percent indicates the amount of swelling and shrinkage of the soil. Greater plasticity of the soil indicates a greater shrink-swell potential (Mitchell, 1993). The CEC alone explained 81% of the variation in LL for the Spodosols order. However, only a small number of soil samples were used in the development of the equation (n = 41), which may limit its overall application. For the Andisols, the water content at -1500 kPa was used instead of clay content because clay dispersion is a problem in the particle size analyses of these soils. Organic C and -1500 kPa water explained 68% of the variation in LL for the Andisols. The LL in the Andisols order was the most difficult to predict and had the largest RMSE. This could be caused by variation in the degree of andic properties between soil samples or between layers within a pedon. Some horizons may have andic properties, some may not, and others fall in between. It may be useful in future analyses to further stratify the Andisols (or the entire data set) by texture modifier: hydrous, medial, and ashy soil layers (Soil Survey Staff, 1999). Spodic horizons could also be grouped with soils with andic soil properties because they respond similarly.

The RMSE values of the models developed from the taxonomic order strata, excluding Andisols, ranged from 5.49% to 6.69% and were lower than the RMSE for Eq. 1 (the overall LL prediction equation). The lower RMSE values indicate that stratifying the large data set and separating out the Vertisols, Andisols, and Spodosols increased the accuracy of predicting the LL.
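Applying the stratified LL equations amounts to choosing an equation by soil order. A minimal sketch follows, with coefficients transcribed from the soil order rows of Table 3; the function name, argument names, and error handling are hypothetical.

```python
# Orders pooled into the combined "Orders" equation of Table 3.
COMBINED_ORDERS = {
    "Entisols", "Aridisols", "Alfisols", "Mollisols",
    "Inceptisols", "Oxisols", "Ultisols",
}


def predict_ll_by_order(order, clay=None, cec=None, lep=None,
                        oc=None, w15bar=None):
    """Estimate LL (%) with the Table 3 equation for the given soil order.

    clay, oc, lep, w15bar in %; cec in cmol(+) kg^-1.
    Only the inputs used by the selected equation need to be supplied.
    """
    if order in COMBINED_ORDERS:
        return 0.656 * clay + 0.409 * cec + 12.154
    if order == "Vertisols":
        return 0.338 * clay + 0.194 * cec + 1.462 * lep + 21.273
    if order == "Andisols":
        # Uses -1500 kPa water instead of clay because of dispersion problems.
        return 2.752 * oc + 0.559 * w15bar + 30.517
    if order == "Spodosols":
        return 0.986 * cec + 18.999
    raise ValueError(f"no LL equation was developed for order: {order}")
```

Histosols and Gelisols raise an error in this sketch because they were excluded from the study.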

Comprehensive coverage of prediction equations for the mineralogy classes was not possible because of insufficient data. The mineralogy class strata provided a different way of grouping the soils. Which stratification would provide the best-fit models and the most accurate predictions cannot be determined from this study. The smectitic and mixed mineralogy models for predicting LL were not significantly different from the "Orders" model in Table 3. The kaolinitic mineralogy model is unique in that bulk density was useful in explaining some of the variation in LL when clay and CEC were used as predictor variables. The LL of the kaolinitic mineralogy stratum was as difficult to predict as that of the Andisols.

Plasticity Index

Clay content and CEC explained 71% of the variation in the PI of the entire data set (n = 6592), excluding data with -1500-kPa water to clay ratios greater than 0.6. The prediction equation is:

PI = 0.408 (clay) + 0.434 (CEC) - 1.525    (2)

The equation RMSE is 6.72%. The PI was more difficult to predict than the LL. One reason could be that the PI can be the same for two soils that exhibit plasticity over entirely different moisture ranges (Baver, 1930). In other words, soil properties may vary while the PI remains the same in some cases, making the PI less predictable.
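Eq. 2 and the point about identical PI values over different moisture ranges can be illustrated together. The function names are hypothetical, and the two example soils are made-up numbers chosen only to show the arithmetic.

```python
def predict_pi(clay_pct: float, cec: float) -> float:
    """Eq. 2: PI (%) from total clay (%) and CEC (cmol(+) kg^-1)."""
    return 0.408 * clay_pct + 0.434 * cec - 1.525


def plastic_range(ll: float, pl: float):
    """Return (lower bound, upper bound, width) of the plastic state.

    The width is the plasticity index, PI = LL - PL.
    """
    return pl, ll, ll - pl


# Two hypothetical soils with the same PI but different moisture ranges.
soil_a = plastic_range(ll=40.0, pl=20.0)   # plastic between 20% and 40% water
soil_b = plastic_range(ll=70.0, pl=50.0)   # plastic between 50% and 70% water
```

Both hypothetical soils have a PI of 20 even though their LL and PL differ by 30 percentage points, which is the situation the paragraph above describes.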

Using taxonomic order or taxonomic mineralogy alone as a dummy variable regressor explained 16% and 17% of the variation in the PI, respectively. Taxonomic mineralogy or soil order explained less variation in PI than in the LL, probably for the same reasons as stated previously. However, stratifying the data set by taxonomic mineralogy or soil order could improve the prediction of PI.

Plasticity index prediction equations were developed for 10 soil orders (Table 3). All of the soil order PI equations were determined to be unique. In general, the PI was more difficult to predict than the LL, as indicated by the lower R² and, in some cases, higher RMSE (Table 3). The CEC and clay contents were the primary predictor variables used in predicting PI. Clay is a major contributor to the plasticity of soils. Organic C was a useful predictor variable of the PI for the Entisols, and LEP was a useful predictor variable for the Vertisols. For the Alfisols, Mollisols, Inceptisols, and Ultisols soil orders, the squared clay content (clay²) was found to be significant in predicting the PI, which indicates nonlinearity (curvature) in the relationship between clay content and the PI.
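The curvature implied by the squared clay term can be made concrete with the Alfisols PI equation from Table 3. In this sketch the function names are hypothetical; the slope function is simply the derivative of the equation with respect to clay content.

```python
def pi_alfisols(clay: float, cec: float) -> float:
    """Table 3 PI equation for Alfisols (clay in %, CEC in cmol(+) kg^-1)."""
    return 0.685 * clay - 0.003 * clay ** 2 + 0.442 * cec - 6.118


def clay_slope_alfisols(clay: float) -> float:
    """Change in predicted PI per 1% clay, d(PI)/d(clay) = 0.685 - 0.006 * clay."""
    return 0.685 - 2 * 0.003 * clay
```

Because the squared term has a negative coefficient, each additional percent of clay adds less to the predicted PI as clay content increases: about 0.63 PI units at 10% clay versus about 0.33 at 60% clay.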

Clay, CEC, and OC explained 77% of the variation in PI for the Entisols order. Clay, squared clay, and CEC explained 74%, 72%, and 61% of the variation in PI for the Alfisols, Mollisols, and Inceptisols, respectively. Clay content and CEC explained 64% of the variation in PI for the Aridisols. Clay content and LEP explained 51% of the variation in the PI for the Vertisols. The CEC alone explained 69% of the variation in PI for the Spodosols, and clay content alone explained 42% of the variation in PI for the Oxisols. Clay content and squared clay explained 59% of the variation in PI for the Ultisols. The PI of the Andisols order was the most difficult to predict, with -1500 kPa water and OC explaining only 15% of the variability in the PI. As indicated previously, the magnitudes of the LL and PL may vary while the PI does not change, so that the PI shows no relationship to the varying soil property. All soils probably exhibit this to some degree, but it may be more of a factor in the Andisols order. In addition, the factors that affect the plasticity of soils for the most part act simultaneously, and it can therefore be difficult to isolate the effect of individual factors (Dumbleton and West, 1966).

The RMSE for the PI prediction equations ranged from 2.2% to 7.79%, excluding that for the Andisols, which was 16.37%. The RMSE of the overall PI equation (Eq. 2) was at the higher end of this range (6.27%). By stratifying the data set, lower RMSE and/or higher R² were obtained for the Entisols, Aridisols, Alfisols, Mollisols, Ultisols, and Spodosols soil orders than for Eq. 2. This suggests that the accuracy of predicting PI will improve if the soil order equations are used instead of Eq. 2.

The prediction of P1 for the three mineral-ogy classes (mixed, sniectitic, and kaolmitic)resulted in two unique equations (Table 3). Theequation for smectitic mineralogy was notsignificantly different from the Molhsols model.

[Fig. 1A plot: measured versus predicted LL (%) for Eq. 1; plot annotation r² = 0.675, n = 466.]

Clay, CEC, and bulk density explained only 50% of the PI variability in the kaolinitic mineralogy data group. Clay, squared clay, CEC, and OC were able to explain 69% of the PI variation in the mixed mineralogy strata. As indicated in the prediction of LL, there are not enough data to evaluate models from all the mineralogy class strata. However, the three models developed from the mineralogy class strata do not seem to provide any better estimates than stratifying by taxonomic order.

Validation

The measured versus predicted LL or PI values for the Eq. 1 and Eq. 2 regression models are shown in Figs. 1A, B. The 95% confidence intervals about the slope of the regression line for prediction of LL (Eq. 1) indicate no significant difference from unity; the confidence intervals contain one (Table 4). For the same equation, there is no significant difference from a 0 intercept; the confidence intervals contain zero. This means that more than 95% of the time, similarly constructed intervals will contain a slope of 1 and an intercept of 0. This indicates that the LL prediction equation validated against the independent data set. For PI (Eq. 2), the 95% confidence intervals about the slope of the regression line indicate a significant difference from unity; the confidence intervals do not contain one (Table 4). The slope of the regression line is significantly less than 1, indicating

[Fig. 1B plot: measured versus predicted PI (%) for Eq. 2; plot annotation r² = 0.49, n = 466. Horizontal axes: Predicted LL (%) in panel A and Predicted PI (%) in panel B.]

Fig. 1. Scatter plot of measured versus predicted (A) LL and (B) PI. The prediction equations were developed using the entire data set with no data stratification (Eq. 1 and Eq. 2). The one-to-one slope lines are marked on the plots with solid lines.


[Fig. 2 plots, panels A to C: horizontal axes are Predicted LL (%) and Predicted PI (%).]


TABLE 4
The slope and intercept of the measured versus predicted LL or PI plots are presented with 95% confidence intervals in parentheses

Prediction Eq.    Slope                     Intercept
Eq. 1 (LL)        0.996 (0.933 to 1.058)    2.607 (-0.070 to 5.284)
Eq. 2 (PI)        0.846 (0.767 to 0.924)    1.068 (-0.713 to 2.848)
Orders (LL)       0.996 (0.939 to 1.053)    1.609 (-0.841 to 4.059)
Mollisols (PI)    0.916 (0.827 to 1.005)    -0.247 (-2.325 to 1.831)
Alfisols (PI)     0.963 (0.804 to 1.122)    0.766 (-2.393 to 3.925)
Ultisols (PI)     1.005 (0.757 to 1.254)    -1.837 (-6.767 to 3.093)

that the predicted PI for the independent data set will be slightly overestimated. The degree of overestimation increases as the PI gets larger. Conversely, for the same equation, there is no significant difference from a 0 intercept. Overall, the PI prediction equation did not validate against the independent data set. The lack of validation may, in part, be caused by the relatively small validation data set (n = 466) compared with the large data set (n = 6592) that was used to develop the model. The linear models and validation results are optimized for the data set. If the data set compositions are changed (only slightly), a different slope and

Fig. 2. Scatter plot of measured versus predicted LL or PI for the (A) Orders (LL), (B) Alfisols (PI), (C) Mollisols (PI), and (D) Ultisols (PI) in Table 3. The Orders includes the Entisols, Aridisols, Alfisols, Mollisols, Inceptisols, Oxisols, and Ultisols. The one-to-one slope lines are marked on the plots with solid lines.


intercept can result. If a larger validation data set is used, one that more closely represents the original size that was used in model development, a different validation result could occur. However, these results suggest that PI is not easily predicted from readily available soil properties in Soil Survey when one prediction equation is used for all soils.
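The validation test discussed above regresses measured values on predicted values and asks whether the 95% confidence intervals for the slope and intercept contain 1 and 0. The Python sketch below shows one way such intervals can be computed. It is not the authors' SYSTAT procedure: the function name is hypothetical, the data are synthetic, and a normal quantile is used as an approximation to the Student t quantile, which is close when n is in the hundreds.

```python
import math
import random
from statistics import NormalDist


def slope_intercept_ci(predicted, measured, level=0.95):
    """Regress measured on predicted; return (estimate, lower, upper) per term."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, measured))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(predicted, measured))
    resid_var = sse / (n - 2)  # residual variance with n - 2 degrees of freedom
    se_slope = math.sqrt(resid_var / sxx)
    se_intercept = math.sqrt(resid_var * (1.0 / n + mean_x ** 2 / sxx))
    # Normal approximation to the t quantile (an assumption of this sketch).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return {
        "slope": (slope, slope - z * se_slope, slope + z * se_slope),
        "intercept": (intercept, intercept - z * se_intercept,
                      intercept + z * se_intercept),
    }


# Synthetic validation set of the same size as the one in the study (n = 466).
random.seed(1)
pred = [random.uniform(10.0, 80.0) for _ in range(466)]
meas = [p + random.gauss(0.0, 8.0) for p in pred]
for term, (est, low, high) in slope_intercept_ci(pred, meas).items():
    print(f"{term}: {est:.3f} ({low:.3f} to {high:.3f})")
```

A model is taken to validate when the slope interval contains 1 and the intercept interval contains 0.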

The measured versus predicted LL or PI values for four of the prediction models in Table 3 are shown in Figs. 2A-D. The 95% confidence intervals about the slope of the regression line for prediction of LL (soil orders equations) and PI for Mollisols, Alfisols, and Ultisols soil orders indicate no significant difference from unity; the confidence intervals contain one (Table 4). There is no significant difference from a 0 intercept for the same four prediction equations. These results suggest that LL and PI can be predicted from readily available soil properties in Soil Survey.
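The decision rule applied to Table 4 can be checked mechanically. The Python sketch below transcribes the six rows of Table 4 and reports whether each slope interval contains 1 and each intercept interval contains 0. One assumption is made in the transcription: the lower intercept bounds for Eq. 2 (PI) and Mollisols (PI) are entered as negative (-0.713 and -2.325), because each interval is then symmetric about its estimate, as the other intervals in the table are. The names `TABLE_4`, `contains`, and `validates` are illustrative.

```python
# (estimate, lower, upper) for slope and intercept, transcribed from Table 4.
# Assumption: minus signs restored on two intercept lower bounds (see lead-in).
TABLE_4 = {
    "Eq. 1 (LL)": {"slope": (0.996, 0.933, 1.058), "intercept": (2.607, -0.070, 5.284)},
    "Eq. 2 (PI)": {"slope": (0.846, 0.767, 0.924), "intercept": (1.068, -0.713, 2.848)},
    "Orders (LL)": {"slope": (0.996, 0.939, 1.053), "intercept": (1.609, -0.841, 4.059)},
    "Mollisols (PI)": {"slope": (0.916, 0.827, 1.005), "intercept": (-0.247, -2.325, 1.831)},
    "Alfisols (PI)": {"slope": (0.963, 0.804, 1.122), "intercept": (0.766, -2.393, 3.925)},
    "Ultisols (PI)": {"slope": (1.005, 0.757, 1.254), "intercept": (-1.837, -6.767, 3.093)},
}


def contains(interval, value):
    """True when value lies inside the (estimate, lower, upper) interval."""
    _, lower, upper = interval
    return lower <= value <= upper


def validates(row):
    """Slope interval must contain 1 and intercept interval must contain 0."""
    return contains(row["slope"], 1.0) and contains(row["intercept"], 0.0)


for name, row in TABLE_4.items():
    print(f"{name}: slope CI contains 1 = {contains(row['slope'], 1.0)}, "
          f"intercept CI contains 0 = {contains(row['intercept'], 0.0)}")
```

Under this rule only Eq. 2 (PI) fails, and it fails on the slope alone, which matches the text.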

SUMMARY AND CONCLUSIONS

Clay content and CEC were the most highly correlated and most important independent variables in predicting both LL and PI. Organic C was not significantly correlated to either LL or PI. However, OC was a useful predictor variable in predicting LL for the Andisols and PI for the Entisols order when in the presence of other variables. The LEP was an important independent variable for predicting LL and PI in the Vertisols soil order. Carbonate clay content was not significantly correlated to either LL or PI. An LL and a PI prediction equation were developed from the entire range in soils used in this study. The accuracy of predicting LL and PI was improved by stratifying the data set by taxonomic order. The R² ranged from 0.68 to 0.80 for prediction of LL and from 0.15 to 0.77 for prediction of PI. The PI of the Andisols was the most difficult to predict (R² = 0.15). Stratifying the data set by taxonomic family mineralogy did not provide a comprehensive coverage of soils (because of lack of data). Only equations for three of the most common mineralogy classes (mixed, smectitic, and kaolinitic) were developed. Based on the limited data and models, stratifying by mineralogy produced similar levels of prediction accuracy. In conclusion, predicting LL and PI from readily available soil properties (e.g., clay and CEC) resulted in mostly moderate, and some strong, prediction equations. Weak PI prediction equations resulted for the Andisols strata. Some of the better fit models (R² > 0.60) will be useful in Soil Survey when no measured data or better means of estimating the LL and PI are available. Other techniques such as the nonparametric nearest neighbor approach (Nemes et al., 2006) or other data stratifications may be needed as the next step in improving the prediction of LL or PI. Stratifying the data set by taxonomic family mineralogy class needs further exploration.

REFERENCES

Baver, L. D. 1930. The Atterberg consistency constants: factors affecting their values and a new concept of their significance. J. Am. Soc. Agron. 22:935-948.

Burt, R. 2004. Soil survey laboratory methods manual. Soil Survey Investigations Report No. 42. Version 4.0. U.S. Gov. Print. Office, Washington, DC.

De Jong, E., D. F. Acton, and H. B. Stonehouse. 1990. Estimating the Atterberg limits of southern Saskatchewan soils from texture and carbon contents. Can. J. Soil Sci. 70:543-554.

De La Rosa, D. 1979. Relation of several pedological characteristics to engineering qualities of soil. J. Soil Sci. 30:793-799.

Dumbleton, M. J., and G. West. 1966. Some factors affecting the relation between the clay minerals in soils and their plasticity. Clay Miner. 6:179-193.

Farrar, D. M., and J. D. Coleman. 1967. The correlation of surface area with other properties of nineteen British clay soils. J. Soil Sci. 18:118-124.

Fox, J. 1997. Applied regression analysis, linear models, and related methods. Sage Publications, Thousand Oaks, CA.

Larney, F. J., R. A. Fortune, and J. F. Collins. 1988. Intrinsic soil physical parameters influencing intensity of cultivation procedures for sugar beet seedbed preparation. Soil Tillage Res. 12:253-267.

Mbagwu, J. S. C., and O. G. Abeh. 1998. Prediction of engineering properties of tropical soils using intrinsic pedological parameters. Soil Sci. 163:93-102.

McBride, R. A. 2002. Atterberg limits. In: Methods of Soil Analysis, Part 4: Physical Methods. SSSA No. 5. J. H. Dane and G. C. Topp (eds.). Soil Science Society of America, Madison, WI, pp. 389-398.

Mitchell, J. K. 1993. Fundamentals of soil behavior, 2nd ed. John Wiley and Sons, New York, NY.

Nemes, A., W. J. Rawls, Ya. A. Pachepsky, and M. Th. van Genuchten. 2006. Sensitivity analysis of the nonparametric nearest neighbor technique to estimate soil water retention. Vadose Zone J. 5:1222-1235.

Odell, R. T., T. H. Thornburn, and L. J. McKenzie. 1960. Relationships of Atterberg limits to some other properties of Illinois soils. Soil Sci. Soc. Am. Proc. 24:297-300.

PCA. 1992. PCA Soil Primer. Portland Cement Association, Skokie, IL.

Seed, H. B., R. J. Woodward, and R. Lundgren. 1964. Fundamental aspects of the Atterberg limits. J. Soil Mech. Found. Div. ASCE 90:75-105.

Smith, C. W., A. Hadas, J. Dan, and H. Koyumdjisky. 1985. Shrinkage and Atterberg limits in relation to other properties of principal soil types in Israel. Geoderma 35:47-65.

Soil Survey Staff. 1995. Soil survey laboratory information manual. Soil Survey Investigations Report No. 45. Version 1.0. U.S. Gov. Print. Office, Washington, DC.

Soil Survey Staff. 1999. Soil taxonomy: a basic system of soil classification for making and interpreting soil surveys. 2nd ed. Agric. Handb. 436. U.S. Gov. Print. Office, Washington, DC.

Stakman, W. P., and B. C. Bishay. 1976. Moisture retention and plasticity of highly calcareous soils in Egypt. Neth. J. Agric. Sci. 24:43-57.

SYSTAT Software. 2002. SYSTAT for Windows. Version 10.2. SYSTAT Software Inc., Richmond, CA.

USDA-NRCS. 2005. National Soil Survey Handbook, title 430-VI. [Online] Available from http://soils.usda.gov/technical/handbook/.

Zar, J. H. 1999. Biostatistical analysis. 4th ed. Prentice Hall, Upper Saddle River, NJ.