Nonlinear neural-based modeling of soil cohesion intercept



KSCE Journal of Civil Engineering (2011) 15(5):831-840
DOI 10.1007/s12205-011-1154-4


www.springer.com/12205

Geotechnical Engineering

Nonlinear Neural-Based Modeling of Soil Cohesion Intercept

Ali Mollahasani*, Amir Hossein Alavi**, Amir Hossein Gandomi***, and Azadeh Rashed****

Received March 1, 2010/Revised May 30, 2010/Accepted October 17, 2010


Abstract

A new model was derived to estimate the undrained cohesion intercept (c) of soil using the Multilayer Perceptron (MLP) type of artificial neural network. The proposed model relates c to the basic soil physical properties, including coarse- and fine-grained contents, grain size characteristics, liquid limit, moisture content, and soil dry density. The experimental database used for developing the model was established from a series of unconsolidated-undrained triaxial tests conducted in this study. A Nonlinear Least Squares Regression (NLSR) analysis was performed to benchmark the proposed model. The contributions of the parameters affecting c were evaluated through a sensitivity analysis. The results indicate that the developed model is capable of effectively estimating the c values for a number of soil samples. The MLP model provides a significantly better prediction performance than the regression model.

Keywords: soil cohesion intercept; soil physical properties; artificial neural networks; nonlinear modeling


1. Introduction

One of the most important engineering properties of soil is its ability to resist sliding along internal surfaces within a mass. The stability of structures built on soil depends on the shearing resistance offered by the soil along probable surfaces of slippage. The shear strength of geotechnical materials is generally represented by the Mohr-Coulomb theory. According to this theory, the soil shear strength varies linearly with the applied stress through two shear strength components known as the cohesion intercept and the angle of shearing resistance. The values of these empirical parameters for any soil depend on several factors such as the soil textural properties, past history of the soil, initial state of the soil, and permeability characteristics of the soil (Poulos, 1989; Al-Shayea, 2001; Murthy, 2008). If the cohesion intercept and angle of shearing resistance are determined using the total stresses, they are termed the total or undrained cohesion intercept (c) and angle of shearing resistance (φ). If the pore water pressures are measured during the test, the effective strength parameters (c' and φ') are obtained.
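For reference, the linear relationship described above is commonly written in the form below. The symbols τ_f (shear strength on the failure plane), σ (total normal stress on that plane), and u (pore water pressure) are notation assumed here and do not appear in the original text.

$$\tau_f = c + \sigma \tan\varphi$$

$$\tau_f = c' + (\sigma - u)\tan\varphi'$$

The first line uses total stresses and yields the undrained parameters c and φ; the second uses effective stresses and yields c' and φ'.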

Accurate determination of c is a major concern in the design of different geotechnical structures such as foundations, slopes, underground chambers, and open excavations. This key parameter can be determined either in the field or in the laboratory. The triaxial compression and direct shear tests are the most common tests for determining the c values in the laboratory. The triaxial test is more suitable for clayey soils. The direct shear test is commonly used for sandy soils and requires a simpler test procedure than the triaxial test. The tests employed in the field include the vane shear test or any other indirect method (Murthy, 2008; El-Maksoud, 2006). However, experimental determination of c is extensive, cumbersome, and costly. Also, it is not always possible to conduct the tests for every new situation. To cope with such problems, numerical solutions have been developed to estimate the shear strength parameters. The fact that most of the available empirical models are based on limited experimental data raises doubts about their generality. Moreover, despite the multivariable dependency of soils, such correlations are mostly developed using only one soil index property. Incorporating simplifying assumptions into the development of the statistical and numerical methods may also lead to very large errors (Panwar and Seimens, 1972; Korayem et al., 1996; Terzaghi et al., 1996; Shahin et al., 2001).

With ongoing developments in computational software and hardware, several alternative computer-aided data mining approaches have been developed. Pattern recognition systems, as an example, learn adaptively from experience and extract various discriminators. Artificial Neural Networks (ANNs) (Haykin, 1999) are the most widely used pattern recognition methods. Some of the recent scientific efforts directed at applying ANNs to geotechnical engineering problems include extracting soil constitutive behavior (Hashash and Song, 2008), modeling of maximum dry density and optimum moisture content of stabilized soil (Alavi et al., 2009, 2010), simulating the saltwater intrusion process in coastal aquifers (Bhattacharjya et al., 2008), compressive strength prediction of stabilized soil (Heshmati et al., 2009), and prediction of the lateral behavior of single and group piles (Kim et al., 2008). Recently, Kayadelen et al. (2009) developed an ANN model to predict the φ' value of soils. Multilayer Perceptron (MLP) (Cybenko, 1989) is an alternative ANN approach. MLP is essentially capable of approximating any continuous function to an arbitrary degree of accuracy (Cybenko, 1989). This approach is useful in deriving an empirical model for characterizing the undrained shear strength parameters by directly extracting the knowledge contained in the experimental data.

*Researcher, Dept. of Civil Engineering, Ferdowsi University of Mashhad, Mashhad, Iran (E-mail: [email protected])
**Researcher, School of Civil Engineering, Iran University of Science and Technology, Tehran, Iran (Corresponding Author, E-mail: ah_alavi@hotmail.com)
***Lecturer, College of Civil Engineering, Tafresh University, Tafresh, Iran (E-mail: [email protected])
****Researcher, Dept. of Civil Engineering, Ferdowsi University of Mashhad, Mashhad, Iran (E-mail: [email protected])

In this paper, the MLP technique has been utilized to obtain a precise model relating c to several physical properties of soils. The proposed model was developed based on triaxial tests conducted in this study. ANNs are commonly considered black box systems, as they are unable to explain the underlying principles of prediction. To overcome this limitation, a conventional calculation procedure is further proposed based on the fixed connection weights and bias factors of the best MLP structure.

2. Artificial Neural Network

Artificial Neural Networks (ANNs) have emerged as a result of simulating the biological nervous system. The ANN method was founded in the early 1940s by McCulloch and co-workers (Perlovsky, 2001). Early research focused on building simple neural networks to model simple logic functions. At present, ANNs can be applied to problems that do not have algorithmic solutions or problems with complex solutions. An ANN formulates a mathematical model for a system in which no clear relationship is available between the inputs and outputs. Unlike the majority of the conventional statistical methods, ANNs use the data alone to determine the structure of the model and the unknown model parameters. The ability of ANNs to learn by example makes them very flexible and powerful techniques. Thus, this approach has been widely applied to solving regression and classification problems in many fields.

2.1 Multilayer Perceptron Network

MLPs are a class of ANNs using a feedforward architecture. These networks are universal approximators because of their essential capability of approximating any continuous function to an arbitrary degree of accuracy (Cybenko, 1989). MLPs are usually applied to perform supervised learning tasks, which involve iterative training methods to adjust the connection weights within the network. They are usually trained with the Back Propagation (BP) algorithm (Rumelhart et al., 1986). Fig. 1 shows a schematic representation of an MLP network. An MLP network consists of an input layer, at least one hidden layer of neurons, and an output layer. Each of these layers contains a number of processing units (nodes), and each unit is fully interconnected through weighted connections to the units in the subsequent layer. Every input is multiplied by the interconnection weights of the nodes. Finally, the output ($h_j$) is obtained by passing the sum of the products through an activation function as follows:

$$h_j = f\left(\sum_i x_i w_{ij} + b\right) \qquad (1)$$

where $f(\cdot)$ is the activation function, $x_i$ is the activation of the ith hidden layer node, $w_{ij}$ is the weight of the connection joining the jth neuron in a layer with the ith neuron in the previous layer, and $b$ is the bias for the neuron. For nonlinear problems, sigmoid functions (hyperbolic tangent sigmoid or log-sigmoid) are usually adopted as the activation function. Adjusting the interconnections between layers will reduce the following error function:

$$E = \frac{1}{2}\sum_n \sum_k \left(t_k^n - h_k^n\right)^2 \qquad (2)$$

where $t_k^n$ and $h_k^n$ are respectively the calculated output and the actual output value, n is the number of samples, and k is the number of output nodes. Further details of MLPs can be found in Cybenko (1989).
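The two relations of this subsection, Eq. (1) for the output of a single neuron and Eq. (2) for the error function, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' MATLAB script; the function names and the numerical values are assumptions chosen only to make the arithmetic easy to follow.

```python
import math


def log_sigmoid(z: float) -> float:
    """Log-sigmoid activation, f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x, w, b, f=log_sigmoid):
    """Eq. (1): pass the weighted sum of the inputs plus the bias through f."""
    return f(sum(xi * wi for xi, wi in zip(x, w)) + b)


def sum_squared_error(targets, outputs):
    """Eq. (2): half the sum of squared differences over samples n and output nodes k."""
    return 0.5 * sum(
        (t - h) ** 2
        for t_row, h_row in zip(targets, outputs)
        for t, h in zip(t_row, h_row)
    )


# Illustrative values only (not taken from the paper).
h = neuron_output(x=[0.2, 0.8], w=[0.5, -0.25], b=0.1)   # weighted sum is zero
E = sum_squared_error(targets=[[1.0], [0.0]], outputs=[[0.5], [0.5]])
```

Here the weighted sum plus bias is zero, so the log-sigmoid returns its midpoint value, and the two samples each contribute a squared difference of 0.25 to the error.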

Fig. 1. A Schematic Representation of an MLP Network

3. Experimental Study

The triaxial compression test is presently the most widely used procedure for determining the shear strength parameters of soils. In the triaxial soil testing system, field conditions can be simulated efficiently, since the confining pressure is applied to the soil specimen according to the uniform lateral in-situ stresses. Depending on the problem statement, three types of testing techniques can be applied, namely Unconsolidated Undrained (UU), Consolidated Undrained (CU), and Consolidated Drained (CD) (Kayadelen et al., 2009). Within the scope of this study, a series of unconsolidated, undrained, and unsaturated triaxial (UU) tests were performed in accordance with ASTM D2850-87 (1987) to determine the shear strength parameters of 81 different undisturbed soil samples.

3.1 Sampling

A total of 50 drillings were performed at different locations in Khorasan and Khouzestan provinces, Iran. The soil samples were manually taken by divers from test pits using metal tubes, 4 and 6 inches in diameter. The samples were obtained at depths ranging from 5 to 30 m and contained no gravel or larger particles. After extraction, the cores were carefully taken to the geotechnical laboratory and maintained in a wet chamber to avoid loss of water content. Undisturbed sub-samples were then extracted from the cores for the geotechnical characterization tests. Also, several disturbed soil samples were taken from the sites for other testing purposes.

3.2 Basic Geotechnical Characterization Tests

Extensive geotechnical laboratory test programs were carried out for the basic characterization of the soil samples. These comprised determining the water (or moisture) content, natural unit weight of the soil, Atterberg limits (plastic and liquid limits), and grain size distribution. The grain size distribution of the disturbed samples was determined by sieving through No. 4, 8, 16, 30, 50, 100, and 200 sieves. For the finer soils (silt and clay) passing the No. 200 sieve, a sedimentation test by hydrometer analysis was also carried out. To avoid flocculation of the finer fraction, a dispersing agent (sodium hexametaphosphate) was added to the soil-water mixture before the sedimentation test. Fig. 2 illustrates the lower and upper limits of the grain size distribution of the samples tested. The soil types tested were gravelly silt with sand (ML), sandy silt (ML), silty clay with sand (CL-ML), silt with sand (ML), lean clay (CL), silty sand with gravel (SM), poorly-graded gravel with clay (GP-GC), sandy silty clay (CL-ML), lean clay with gravel (CL), lean clay with sand (CL), gravelly lean clay with sand (CL), sandy silt with gravel (ML), and silt (ML).

3.3 Triaxial Compression Tests

Soil samples 38-50 mm in diameter and 76-100 mm in height were used in the UU tests. The samples were enclosed in a thin rubber membrane and sealed tightly using plastic O-rings at the top platen and base pedestal. Then, they were placed inside the water-filled cell. Afterwards, a known confining pressure was applied through the cell fluid. To bring the soil to failure, an axial stress was applied to the top of the soil specimen by a frictionless ram passing through the top of the cell. Meanwhile, the applied axial stress and the axial displacement of the samples were measured by a load cell until the samples failed. This test was repeated for at least three confining pressures.
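The text does not state how c and φ were extracted from the repeated tests. One common reduction, sketched here under that assumption, fits a straight line to the major and minor principal stresses at failure, since the Mohr-Coulomb envelope implies σ1 = σ3 tan²(45° + φ/2) + 2c tan(45° + φ/2). The function name and the synthetic stresses below are hypothetical.

```python
import math


def fit_mohr_coulomb(sigma3, sigma1):
    """Least-squares fit of sigma1 = m * sigma3 + k over the failure states.

    For a Mohr-Coulomb envelope, m = tan^2(45 deg + phi/2) and k = 2 * c * sqrt(m).
    Returns (c, phi in degrees); c has the same stress units as the inputs.
    """
    n = len(sigma3)
    mean3 = sum(sigma3) / n
    mean1 = sum(sigma1) / n
    sxx = sum((s3 - mean3) ** 2 for s3 in sigma3)
    sxy = sum((s3 - mean3) * (s1 - mean1) for s3, s1 in zip(sigma3, sigma1))
    m = sxy / sxx                              # slope of the fitted line
    k = mean1 - m * mean3                      # intercept of the fitted line
    phi = math.asin((m - 1.0) / (m + 1.0))     # from m = (1 + sin phi) / (1 - sin phi)
    c = k / (2.0 * math.sqrt(m))
    return c, math.degrees(phi)


# Synthetic failure states generated from c = 0.5 and phi = 10 degrees (hypothetical).
phi_true = math.radians(10.0)
m_true = math.tan(math.pi / 4 + phi_true / 2) ** 2
sigma3 = [1.0, 2.0, 3.0]                       # three confining pressures
sigma1 = [m_true * s3 + 2 * 0.5 * math.sqrt(m_true) for s3 in sigma3]
c_fit, phi_fit = fit_mohr_coulomb(sigma3, sigma1)
```

With exact synthetic data the fit recovers the parameters used to generate it; with real test data, at least three confining pressures give the regression one degree of freedom to absorb scatter.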

The obtained database includes measurements of fine-grained (FC) and coarse-grained (CC) contents, the grain size for which 30 percent of the sample was finer (D30), coefficient of uniformity (Cu), liquid limit (LL), moisture content (W), dry density (γd), and bulk density (γ). The undrained cohesion intercept (c) was the measured soil shear strength parameter. A total of 81 data sets were considered for developing the prediction models. A major part of the database comprises the laboratory test results for fine-grained soil samples. The descriptive statistics of the data used in this study are given in Table 1. To visualize the distribution of the samples, the data are presented as frequency histograms (Fig. 3).

4. Modeling of Soil Cohesion Intercept

Precise estimation of the undrained cohesion intercept is an essential criterion in the design process of geotechnical tasks. Due to the complexity of the behavior of the soil strength parameters, it is not simple to identify a relationship between the involved parameters. Cohesion is mainly due to the intermolecular bond between the adsorbed water surrounding each grain, especially in fine-grained soils (Murthy, 2008; El-Maksoud, 2006). Soils with high plasticity, such as clayey soils, have higher cohesion and a lower angle of shearing resistance. Conversely, as the soil grain size increases, as in sands, the soil cohesion decreases.

Fig. 2. Lower and Upper Limits of the Grain Size Distribution of the Soil Samples

Table 1. Descriptive Statistics of the Variables used in the Model Development

| Parameter | FC (%) | D30 (mm) | Cu | LL (%) | W (%) | γd (g/cm³) | c (kg/cm²) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean | 80.945 | 0.051 | 78.543 | 28.171 | 16.643 | 1.607 | 0.367 |
| Standard Error | 2.112 | 0.026 | 20.083 | 0.634 | 0.637 | 0.012 | 0.031 |
| Standard Deviation | 19.124 | 0.239 | 181.858 | 5.739 | 5.765 | 0.109 | 0.284 |
| Minimum | 9.6 | 0.001 | 2.14 | 20 | 3.5 | 1.37 | 0.02 |
| Maximum | 99.1 | 1.9 | 1150 | 46 | 30 | 1.81 | 1.16 |

Fig. 3. Histograms of the Variables used in the Model Development

The main purpose of this study is to obtain a meaningful MLP-based relationship between c and the influencing parameters. The most important factors representing the c behavior were identified based on the literature review (Mayne, 2001; Al-Shayea, 2001; Murthy, 2008; El-Maksoud, 2006; Kayadelen et al., 2009). The undrained cohesion intercept (c) (kg/cm²) was considered to be a function of several parameters as follows:

$$c = f(FC, CC, D_{30}, C_u, LL, W, \gamma_d, \gamma) \qquad (3)$$

where,
FC (%) : Fine-grained content
CC (%) : Coarse-grained content
D30 (mm) : Grain size for which 30 percent of the sample was finer
Cu : Coefficient of uniformity (D60 / D10). D10 and D60 are the grain sizes, in millimeters, for which 10 and 60 percent of the sample was finer.
LL (%) : Liquid limit
W (%) : Moisture content
γd (g/cm³) : Soil dry density
γ (g/cm³) : Bulk density

The significant influence of the above parameters in determining c is well understood. It is known that the cohesion intercept is affected by the basic soil properties (fabric characteristics), the state of the soil, and its consolidation history. FC, CC, D30, Cu, and LL represent the intrinsic soil properties. W, γd and γ provide information on the state of the soil and its previous history. They are also indicators of void ratio. The Over-Consolidation Ratio (OCR) could have been included in the analysis. OCR was not used herein, as it has to be obtained from time-consuming laboratory tests. On the other hand, γd and γ can easily be calculated for a soil.
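As a small illustration of how easily these state and gradation parameters are obtained, the sketch below uses the standard relation γd = γ / (1 + W/100), assuming W is the gravimetric moisture content in percent, together with the definition Cu = D60 / D10 given above. The function names and sample values are hypothetical.

```python
def dry_density(bulk_density: float, moisture_percent: float) -> float:
    """gamma_d = gamma / (1 + W/100), with W the gravimetric moisture content in percent."""
    return bulk_density / (1.0 + moisture_percent / 100.0)


def coefficient_of_uniformity(d60_mm: float, d10_mm: float) -> float:
    """Cu = D60 / D10, both grain sizes in millimeters."""
    return d60_mm / d10_mm


# Hypothetical sample: bulk density 1.92 g/cm3 at 20 % moisture content.
gamma_d = dry_density(bulk_density=1.92, moisture_percent=20.0)   # 1.92 / 1.20
cu = coefficient_of_uniformity(d60_mm=0.30, d10_mm=0.05)          # 0.30 / 0.05
```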

4.1 Performance Measures

The correlation coefficient (R), root mean squared error (RMSE), and mean absolute percent error (MAPE) were used to evaluate the performance of the proposed models. R, RMSE and MAPE are given in the form of equations as follows:

$$R = \frac{\sum_{i=1}^{n}\left(h_i - \bar{h}_i\right)\left(t_i - \bar{t}_i\right)}{\sqrt{\sum_{i=1}^{n}\left(h_i - \bar{h}_i\right)^2 \sum_{i=1}^{n}\left(t_i - \bar{t}_i\right)^2}} \qquad (4)$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n}\left(h_i - t_i\right)^2}{n}} \times 100 \qquad (5)$$

$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{h_i - t_i}{h_i}\right| \times 100 \qquad (6)$$

where $h_i$ and $t_i$ are respectively the actual and predicted output values for the ith output, $\bar{h}_i$ and $\bar{t}_i$ are the averages of the actual and predicted outputs, and n is the number of samples. It is well known that the R value alone is not a good indicator of the prediction accuracy of a model. The reason is that R does not change when the values predicted by a model are all shifted by the same amount. RMSE is one of the most popular measures of error. It has the advantage that large errors receive much greater attention than small errors. MAPE is commonly used in quantitative prediction methods because it produces a measure of relative overall fit (Hecht-Nielsen, 1990). The mean absolute error (MAE) and the RMSE can be used together to diagnose the variation in the errors in a set of estimates. Higher R values and lower RMSE and MAPE values indicate a more precise model.
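The three measures can be sketched as follows. The `scale` argument of `rmse` is an assumption of this sketch: leave it at 1 for the conventional RMSE, or set it to 100 if the error is to be reported on a scaled basis. The sample values are hypothetical and chosen to show why R alone is not sufficient: every prediction is shifted by the same amount, so R stays at 1 while RMSE and MAPE do not vanish.

```python
import math


def correlation_coefficient(h, t):
    """Pearson correlation between actual values h and predicted values t."""
    n = len(h)
    h_mean = sum(h) / n
    t_mean = sum(t) / n
    num = sum((hi - h_mean) * (ti - t_mean) for hi, ti in zip(h, t))
    den = math.sqrt(
        sum((hi - h_mean) ** 2 for hi in h) * sum((ti - t_mean) ** 2 for ti in t)
    )
    return num / den


def rmse(h, t, scale=1.0):
    """Root mean squared error, optionally multiplied by `scale`."""
    return scale * math.sqrt(sum((hi - ti) ** 2 for hi, ti in zip(h, t)) / len(h))


def mape(h, t):
    """Mean absolute percent error relative to the actual values, in percent."""
    return 100.0 / len(h) * sum(abs(hi - ti) / abs(hi) for hi, ti in zip(h, t))


actual = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.5, 0.7, 0.9]     # every prediction shifted by +0.1

r = correlation_coefficient(actual, predicted)
error_rmse = rmse(actual, predicted)
error_mape = mape(actual, predicted)
```

Note that MAPE is undefined when an actual value is zero, which is one reason it is usually reported alongside an absolute measure such as RMSE.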

4.2 Data Preprocessing

Some of the soil property variables are fundamentally interdependent. The first step in the analysis of interdependency of the data is to make a careful study of what these variables are measuring, noting any highly correlated pairs. High positive or negative correlation coefficients between the pairs may lead to poor performance of the models and difficulty in interpreting the effects of the explanatory variables on the response. This interdependency can cause problems in the analysis, as it tends to exaggerate the strength of relationships between variables. This is a simple case of what is commonly known as the problem of multicollinearity (Dunlop and Smith, 2003). It is apparent that there is a high negative correlation between FC and CC. Also, γd and γ are highly correlated with each other. There is no advantage in having both variables of such a pair in the modeling process, as one can represent the other. Thus, the correlated parameters were removed in order to maximize the reliability of the final model.

For the MLP analysis, the data sets were randomly divided into training and testing subsets. The training data were used for learning. The testing data were used to measure the performance of the MLP models on data that played no role in building the models. Out of the available data, 69 data vectors were used for the training process and 12 were taken for the testing of the models. In order to obtain a consistent data division, several combinations of the training and testing sets were considered. Both the input and output variables were normalized in this study. After examining several normalization methods (Swingler, 1996; Mesbahi, 2000), the following method was used to normalize the variables to a range of [L, U]:

$$X_n = aX + b \qquad (7)$$

where,

$$a = \frac{U - L}{X_{max} - X_{min}} \qquad (8)$$

$$b = U - aX_{max} \qquad (9)$$

in which $X_{max}$ and $X_{min}$ are the maximum and minimum values of the variable $X$, and $X_n$ is the normalized value. In the present study, L = 0.05 and U = 0.95.
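The linear scaling of Eqs. (7) to (9) and its inverse can be sketched as below. The function names are hypothetical; the range used in the example is the minimum and maximum of c reported in Table 1.

```python
def normalization_coefficients(x_min, x_max, lower=0.05, upper=0.95):
    """Return (a, b) of Eqs. (8) and (9) for mapping [x_min, x_max] onto [lower, upper]."""
    a = (upper - lower) / (x_max - x_min)
    b = upper - a * x_max
    return a, b


def normalize(x, a, b):
    """Eq. (7): X_n = a * X + b."""
    return a * x + b


def denormalize(x_n, a, b):
    """Inverse of Eq. (7), used to map a normalized value back to physical units."""
    return (x_n - b) / a


# Range of c in Table 1: 0.02 to 1.16 kg/cm2.
a, b = normalization_coefficients(0.02, 1.16)
c_low = normalize(0.02, a, b)     # lower end of the range maps to L
c_high = normalize(1.16, a, b)    # upper end of the range maps to U
c_back = denormalize(normalize(0.5, a, b), a, b)
```

For this range, a evaluates to about 0.7895, which appears to correspond to the constant 0.7895 in the explicit formulation of Section 4.4, where the network output is mapped back to kg/cm².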

4.3 Model Development using MLP

The available database was used for establishing the MLP prediction models. After developing different models with different combinations of the input parameters, the final explanatory variables (FC, D30, Cu, LL, W, γd) were selected as the inputs of the optimal model. For the development of the MLP models, a script was written in the MATLAB environment using Neural Network Toolbox 5.1 (MathWorks, 2007). The performance of an ANN model mainly depends on the network architecture and parameter settings. According to the universal approximation theorem (Cybenko, 1989), a single hidden layer network is sufficient for the traditional MLP to uniformly approximate any continuous and nonlinear function. Choosing the number of hidden layers, hidden nodes, learning rate, epochs, and activation function type plays an important role in the model construction. Hence, several MLP network models with different settings for these characteristics were trained to reach the optimal configurations with the desired precision (Eberhart and Dobbins, 1990). The written program automatically tries various numbers of neurons in the hidden layer and reports the R, RMSE and MAPE values for each model. The model that provided the highest R and lowest RMSE and MAPE values on the training data sets was chosen as the optimal model. Various training algorithms were implemented for the training of the MLP network, such as the gradient descent (traingd), Levenberg-Marquardt (trainlm), and resilient (trainrp) back propagation algorithms. The best results were obtained by the Quasi-Newton back-propagation (trainbfg) method. Also, log-sigmoid was adopted as the transfer function between the input and hidden layer. The transfer function between the hidden layer and output layer was a linear transfer function (purelin).

The ANN toolbox in MATLAB randomly assigns the initial weights and biases for each run (MathWorks, 2007). These assignments considerably change the performance of a newly trained ANN even when all the previous parameter settings and the ANN architecture are kept constant. This leads to extra difficulties in the selection of the optimal ANN architecture and parameter settings. To overcome this difficulty, the weights and biases were frozen after the network was well trained, and the trained ANN model was then translated into explicit form (Guzelbey et al., 2006; Tapkın et al., 2009). For brevity, the detailed explanation of the procedure used to convert the optimal ANN model into a simple equation is not given.
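The search over hidden-layer sizes and the effect of random initial weights can be illustrated with a small pure-Python sketch. It is not the authors' MATLAB script: it uses plain per-sample gradient-descent back-propagation (comparable in spirit to traingd, not the quasi-Newton method reported as best), a toy one-input data set, and hypothetical function names. The log-sigmoid hidden layer, linear output, learning rate of 0.05, and 1,500 epochs follow the settings quoted in the text.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_net(n_in, n_hidden, rng):
    """Random initial weights; seeding `rng` makes a run reproducible."""
    u = lambda: rng.uniform(-0.5, 0.5)
    return {
        "W": [[u() for _ in range(n_in)] for _ in range(n_hidden)],  # input -> hidden
        "b": [u() for _ in range(n_hidden)],                         # hidden biases
        "V": [u() for _ in range(n_hidden)],                         # hidden -> output
        "b_out": u(),                                                # output bias
    }


def forward(net, x):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["W"], net["b"])]
    y = sum(v * h for v, h in zip(net["V"], hidden)) + net["b_out"]  # linear output
    return hidden, y


def rmse(net, X, T):
    return math.sqrt(sum((forward(net, x)[1] - t) ** 2 for x, t in zip(X, T)) / len(X))


def train(net, X, T, lr=0.05, epochs=1500):
    """Per-sample gradient descent on the squared error 0.5 * (y - t)^2."""
    for _ in range(epochs):
        for x, t in zip(X, T):
            hidden, y = forward(net, x)
            err = y - t
            for j, h in enumerate(hidden):
                grad_z = err * net["V"][j] * h * (1.0 - h)   # uses V[j] before its update
                net["V"][j] -= lr * err * h
                net["b"][j] -= lr * grad_z
                for i, xi in enumerate(x):
                    net["W"][j][i] -= lr * grad_z * xi
            net["b_out"] -= lr * err
    return net


# Toy data (hypothetical): one input in [0, 1], target already inside [0.05, 0.95].
X = [[i / 19.0] for i in range(20)]
T = [0.2 + 0.6 * x[0] for x in X]

results = {}
for n_hidden in (2, 4, 6):
    net = init_net(1, n_hidden, random.Random(0))   # fixed seed: same start every run
    before = rmse(net, X, T)
    train(net, X, T)
    results[n_hidden] = (before, rmse(net, X, T))

best = min(results, key=lambda k: results[k][1])    # hidden size with the lowest RMSE
```

Changing the seed passed to `random.Random` changes the starting weights and, in general, the final error, which is the run-to-run variability described above; storing the trained `net` dictionary plays the role of freezing the weights and biases.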

4.4 MLP-based Formulation for Soil Cohesion Intercept

The model architecture that gave the best results for the formulation of the undrained cohesion intercept (c) was found to contain:

● One invariant input layer, with 6 (n = 6) arguments (FC, D30, Cu, LL, W, γd) and a bias term;
● One invariant output layer with 1 node providing the value of c;
● One hidden layer having 6 (m = 6) nodes.

The explicit formulation of c is as follows:

$$c\ (\mathrm{kg/cm^2}) = \frac{1}{0.7895}\left(-3.1895 + \sum_{i}\frac{V_i}{1 + e^{-F_i}}\right) \qquad (10)$$

where,

$$F_i = FC_n \times W_{1i} + D_{30n} \times W_{2i} + C_{u,n} \times W_{3i} + LL_n \times W_{4i} + W_n \times W_{5i} + \gamma_{d,n} \times W_{6i} + Bias_i, \quad i = 1, \ldots, 6 \qquad (11)$$

in which $FC_n$, $D_{30n}$, $C_{u,n}$, $LL_n$, $W_n$, and $\gamma_{d,n}$ respectively represent the input variables normalized using Eq. (7), and i is the index of the hidden layer neurons. The input layer weights ($W_i$), input layer biases ($Bias_i$), and hidden layer weights ($V_i$) of the optimum MLP model are presented in Tables 2 and 3. The MLP model was built with a learning rate of 0.05 and trained for 1,500 epochs. Comparisons of the predicted versus experimental c values are shown in Fig. 4.

Table 2. Weight and Bias Values between the Input and Hidden Layer

| Weights | i = 1 | i = 2 | i = 3 | i = 4 | i = 5 | i = 6 |
| --- | --- | --- | --- | --- | --- | --- |
| W1i | 9.8772 | 4.1858 | 1.9700 | 9.8601 | 12.5035 | -11.1453 |
| W2i | 2.0136 | -2.1360 | -8.6685 | 1.9825 | -0.8749 | 2.0077 |
| W3i | -5.2599 | 13.5814 | 13.2740 | -3.1667 | 0.7696 | 3.8014 |
| W4i | 0.1009 | 8.8448 | 10.9255 | -1.7578 | 3.3352 | -7.7729 |
| W5i | 10.8502 | 2.7682 | -24.8648 | 36.8155 | 0.8320 | -13.6760 |
| W6i | -4.0653 | -4.4922 | 2.8089 | -0.3596 | -1.2803 | 15.0461 |
| Biasi | 0.3397 | -2.6089 | 1.9675 | 3.1940 | -1.5186 | -0.7549 |

(Columns: number of hidden neurons, i)

Table 3. Weight Values between the Hidden and Output Layer

| Weights | i = 1 | i = 2 | i = 3 | i = 4 | i = 5 | i = 6 |
| --- | --- | --- | --- | --- | --- | --- |
| Vi | 9.3120 | -7.3948 | -3.9038 | -3.9087 | 0.8777 | 2.1238 |

(Columns: number of hidden neurons, i)

Fig. 4. Predicted versus Experimental c Values using the Best MLP Model: (a) Training Data, (b) Testing Data
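The structure of the explicit formulation (a weighted sum of log-sigmoid hidden outputs, shifted by a constant and divided by a scale factor) can be sketched generically as below. The function name and the argument layout `W[k][i]` (weight from input k to hidden neuron i) are assumptions of this sketch, and the numerical example uses small made-up weights rather than the values of Tables 2 and 3.

```python
import math


def explicit_mlp(x_norm, W, bias, V, offset, scale):
    """Evaluate an explicit single-hidden-layer MLP of the form of Eqs. (10)-(11).

    x_norm  : normalized inputs
    W[k][i] : weight from input k to hidden neuron i (orientation assumed here)
    bias[i] : bias of hidden neuron i
    V[i]    : weight from hidden neuron i to the output
    offset  : constant added to the sum (the role of -3.1895 in Eq. (10))
    scale   : divisor applied to the result (the role of 0.7895 in Eq. (10))
    """
    total = offset
    for i in range(len(V)):
        F_i = sum(x_norm[k] * W[k][i] for k in range(len(x_norm))) + bias[i]
        total += V[i] / (1.0 + math.exp(-F_i))
    return total / scale


# Hypothetical 2-input, 2-neuron example with easily checked numbers.
W = [[0.0, 1.0],
     [0.0, -1.0]]
bias = [0.0, 0.0]
V = [2.0, 4.0]
y = explicit_mlp([0.3, 0.3], W, bias, V, offset=-1.0, scale=2.0)
```

In this example both hidden neurons receive a net input of zero, so each log-sigmoid returns 0.5 and the result is (-1 + 1 + 2) / 2. When applying such a function to published weights, the inputs must first be normalized with the same [L, U] scaling used in training, and the row and column orientation of the weight table should be checked against the equation.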


4.5 Model Development Using Regression Analysis

A multivariable nonlinear least squares regression (NLSR) (Ryan, 1997) analysis was performed to gauge the predictive power of the best MLP model in comparison with a classical statistical approach. NLSR is extensively used in regression analyses. Under certain assumptions, NLSR has some attractive statistical properties that have made it one of the most powerful and popular methods of regression analysis. NLSR extends linear least-squares regression for use with a much larger and more general class of functions. There are very few limitations on the way parameters can be used in the functional part of the nonlinear regression models (Ryan, 1997). The NLSR prediction equation relates c to the predictor variables as follows:

$$c = \alpha_1 FC^{\alpha_2} + \alpha_3 D_{30}^{\alpha_4} + \alpha_5 C_u^{\alpha_6} + \alpha_7 LL^{\alpha_8} + \alpha_9 W^{\alpha_{10}} + \alpha_{11}\gamma_d^{\alpha_{12}} + \alpha_{13} \qquad (12)$$

where α denotes the coefficient vector. The NLSR model was developed using the same training and testing data sets previously considered for developing the MLP model. The Eviews software package (Maravall and Gomez, 2004) was used to perform the regression analysis. The NLSR-based formulation of c in terms of FC, D30, Cu, LL, W, and γd is given below:

$$c_{NLSR}\ (\mathrm{kg/cm^2}) = -2500.89\,FC^{-0.00008} + 0.003\,D_{30}^{8.504} + 0.0692\,C_u^{0.213} - 10.108\,LL^{-0.851} + 3175.22\,W^{-0.0001} + 1892.33\,\gamma_d^{0.0006} - 2566.23 \qquad (13)$$

Comparisons of the predicted versus experimental c values are shown in Fig. 5.
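The functional form of the regression model, one coefficient and one exponent per predictor plus a constant, can be sketched as follows. The function name and the flat ordering of `alpha` are assumptions of this sketch, and the example uses made-up numbers rather than the fitted coefficients; estimating the coefficients themselves requires a nonlinear least-squares solver, which is not shown.

```python
def power_sum_model(x, alpha):
    """Evaluate alpha[0]*x[0]**alpha[1] + alpha[2]*x[1]**alpha[3] + ... + alpha[-1].

    `alpha` holds a (coefficient, exponent) pair per predictor followed by a constant,
    mirroring the layout of Eq. (12).
    """
    if len(alpha) != 2 * len(x) + 1:
        raise ValueError("alpha needs a coefficient and an exponent per input, plus a constant")
    total = alpha[-1]
    for k, xk in enumerate(x):
        total += alpha[2 * k] * xk ** alpha[2 * k + 1]
    return total


# Hypothetical two-predictor example: 1 * 4**0.5 + 3 * 2**2 - 1
value = power_sum_model([4.0, 2.0], [1.0, 0.5, 3.0, 2.0, -1.0])
```

When the fitted exponents are very close to zero, each power term is nearly constant over the data range, so the prediction depends on small differences between large numbers and the coefficients should be carried at full precision.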

5. Performance Analysis of the Models

A precise prediction model was developed for the soil cohesionintercept upon a reliable database. A comparison of the ratiobetween the predictions made by the MLP and NLSR modelsand the experimental c values is shown in Fig. 6. No rationalmodel has been found for the prediction of c that encompassesthe influencing variables considered in this study. Thus, it wasnot possible to conduct a comparative study between the resultsof this research and those in hand.

Based on a logical hypothesis (Smith, 1986; Kasabov, 1998), if a model gives R > 0.8 and the RMSE and MAPE values are at a minimum, there is a strong correlation between the predicted and measured values, and the model can be judged as very good. It can be observed from Fig. 4 that the MLP model, with high R and low RMSE and MAPE values, is able to predict the target values to an acceptable degree of accuracy. It is also noteworthy that the RMSE values are not only low but also similar for the training and testing sets. This suggests that the proposed model has both predictive ability (low values) and generalization performance (similar values) (Pan et al., 2009). The 95% confidence interval of the predicted soil cohesion values for the entire data was 0.058.
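The acceptance check described above can be sketched in a few lines. The example assumes the standard definitions of the correlation coefficient (R), root mean squared error (RMSE), and mean absolute percent error (MAPE); the function names and the four measured and predicted values are hypothetical and are not taken from the experimental database.

```python
import math


def pearson_r(measured, predicted):
    """Correlation coefficient R between measured and predicted values."""
    m_mean = sum(measured) / len(measured)
    p_mean = sum(predicted) / len(predicted)
    cov = sum((m - m_mean) * (p - p_mean) for m, p in zip(measured, predicted))
    m_var = sum((m - m_mean) ** 2 for m in measured)
    p_var = sum((p - p_mean) ** 2 for p in predicted)
    return cov / math.sqrt(m_var * p_var)


def rmse(measured, predicted):
    """Root mean squared error, in the units of the data."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mape(measured, predicted):
    """Mean absolute percent error, in percent."""
    n = len(measured)
    return 100.0 / n * sum(abs(m - p) / abs(m) for m, p in zip(measured, predicted))


# Hypothetical c values in kg/cm2, for illustration only.
measured = [0.30, 0.40, 0.50, 0.60]
predicted = [0.32, 0.38, 0.52, 0.58]

r_value = pearson_r(measured, predicted)
print(f"R = {r_value:.4f}, RMSE = {rmse(measured, predicted):.4f}, "
      f"MAPE = {mape(measured, predicted):.2f}%")
print("Meets the R > 0.8 criterion:", r_value > 0.8)
```

Computing the same three statistics separately for the training and testing subsets gives the comparison of similar RMSE values that the text uses as an indicator of generalization.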

The models derived using ANNs or other soft computing tools have a predictive capability within the data range used for their

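$$c = \alpha_1 FC^{\alpha_2} + \alpha_3 D_{30}^{\alpha_4} + \alpha_5 C_u^{\alpha_6} + \alpha_7 LL^{\alpha_8} + \alpha_9 W^{\alpha_{10}} + \alpha_{11} \gamma_d^{\alpha_{12}} + \alpha_{13} \tag{12}$$

$$c_{NLSR}\ (\text{kg/cm}^2) = -2500.89\, FC^{-0.00008} + 0.003\, D_{30}^{8.504} + 0.0692\, C_u^{0.213} - 10.108\, LL^{-0.851} + 3175.22\, W^{-0.0001} + 1892.33\, \gamma_d^{0.0006} - 2566.23 \tag{13}$$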

Fig. 6. Comparison of the c Predictions Made by Different Models for the Entire Database

Fig. 5. Predicted versus Experimental c Values using the NLSR Model: (a) Training Data, (b) Testing Data

Ali Mollahasani, Amir Hossein Alavi, Amir Hossein Gandomi, and Azadeh Rashed

− 838 − KSCE Journal of Civil Engineering

development. For these methods, the amount of data involved in the modeling process is an important issue, as it bears heavily on the reliability of the final models. To cope with this issue, Frank and Todeschini (1994) argue that the minimum ratio of the number of objects over the number of selected variables for model acceptability is 3. They also suggest that it is more reasonable to consider a ratio equal to 5. In the present study, this ratio is higher and is equal to 81/6 = 13.5. These facts support the validity of the derived prediction model and indicate that it is not a chance correlation.

In all cases, the MLP model performs remarkably better than the NLSR model. Empirical modeling based on statistical regression techniques has significant limitations. The most commonly used regression analyses can have large uncertainties, and they have major drawbacks pertaining to the idealization of complex processes, approximation, and averaging of widely varying prototype conditions. Contrary to MLP, the regression-based methods model the nature of the corresponding problem by a pre-defined linear or nonlinear equation.

6. Sensitivity Analysis

Sensitivity analysis is of utmost concern for selecting the important input variables. The contribution of each input parameter in the MLP model was evaluated through a sensitivity analysis. To achieve this, relative importance values of the predictor variables were calculated using Garson's algorithm (Garson, 1991). Fig. 7 presents a summary of Garson's protocol for determining the relative importance values. According to this algorithm, the input-hidden and hidden-output weights of the trained MLP model were partitioned and the absolute values of the weights were taken to calculate the relative importance values. The relative importance values of the input parameters of the MLP model are presented in Fig. 8. According to these results, c, for the ranges investigated, is more dependent on Cu and γd than on the other soil properties. Another observation from the sensitivity study is that W is less important in explaining the variations of the c values.
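The weight-partitioning idea can be sketched as follows. This is one common statement of Garson's (1991) formula for a network with a single output neuron, not necessarily the exact protocol of Fig. 7: within each hidden neuron the absolute input weights are converted to shares, each share is scaled by the absolute hidden-output weight, and the totals per input are normalized to 100%. The function name and the small 2-input, 2-hidden-neuron weight set are hypothetical.

```python
def garson_importance(w_input_hidden, w_hidden_output):
    """Relative importance (%) of each input from absolute connection weights.

    w_input_hidden[i][j]: weight from input i to hidden neuron j.
    w_hidden_output[j]:   weight from hidden neuron j to the single output.
    """
    n_inputs = len(w_input_hidden)
    n_hidden = len(w_hidden_output)
    totals = [0.0] * n_inputs
    for j in range(n_hidden):
        # Sum of absolute input weights entering hidden neuron j.
        col_sum = sum(abs(w_input_hidden[i][j]) for i in range(n_inputs))
        for i in range(n_inputs):
            share = abs(w_input_hidden[i][j]) / col_sum
            totals[i] += share * abs(w_hidden_output[j])
    grand_total = sum(totals)
    return [100.0 * t / grand_total for t in totals]


# Hypothetical weights, for illustration only.
weights_ih = [[1.0, -3.0],   # input 1 -> hidden 1, hidden 2
              [1.0, 1.0]]    # input 2 -> hidden 1, hidden 2
weights_ho = [2.0, -1.0]     # hidden 1, hidden 2 -> output

importance = garson_importance(weights_ih, weights_ho)
print([round(value, 2) for value in importance])
```

Because only absolute values are used, the result ranks the inputs by the magnitude of their influence and carries no information about its direction.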

7. Conclusions

In this research, a high-precision model was derived for assessing the undrained soil cohesion intercept, c, using the MLP paradigm. The proposed model was developed based on well-established and widely dispersed triaxial test results obtained through an experimental study. The following principal conclusions may be drawn based on the results presented:

• The developed MLP model gives reliable estimates of the c values. The results indicate that the proposed model clearly outperforms the nonlinear regression model.

• Contrary to the conventional models, the proposed model simultaneously takes into account the role of several important factors (FC, D30, Cu, LL, W, and γd) representing the behavior of the shear strength parameters. The results indicate that W and γd are efficient representatives of the initial state and consolidation history of the soil.

• The sensitivity analysis results indicate that Cu and γd are the most important parameters governing the behavior of c.

• The proposed model is mostly suitable for fine-grained soils with physical properties similar to the soil samples used in this study. The model can be improved to make more accurate predictions for a wider range by adding newer data sets for other soil types and test conditions.

• The tractable MLP-based design equation provides an analysis tool accessible to practicing engineers. The MLP calculation procedure outlined in the Appendix can readily be performed using a spreadsheet or hand calculations to give predictions of the c values.

• Using the MLP approach, the c values can be estimated without carrying out sophisticated and time-consuming laboratory or field tests.

• A major distinction of MLP for determining the c values lies in its powerful ability to model the mechanical behavior without assuming a prior form of the existing relationships.

Fig. 7. Determining the Relative Importance of Each Input Variable using Garson's Algorithm (Alavi et al., 2010)

Fig. 8. Contributions of the Predictor Variables in the MLP Model

Further research can be focused on analyzing the relationships between the angle of shearing resistance and the soil physical properties using the MLP technique. This could lead to the formulation of a powerful nonlinear model for the friction angle, like the one derived for cohesion in this work.

Appendix. Design Example

An illustrative design example is provided to further explain the implementation of the soil cohesion intercept formula. For this purpose, one of the soil samples used for testing the model was taken. The FC, D30, Cu, LL, W, and γd values for the sample are 58.4%, 0.023 mm, 80, 23%, 11.1%, and 1.73 g/cm3, respectively. The c of the soil is required. The calculation procedure can be divided into three sections: 1) normalization of the input data, 2) calculation of the hidden layer, and 3) prediction of c. The calculation procedure is outlined in the following steps:

• Step 1: Normalization of the input data (FC, D30, Cu, LL, W, γd) to lie in a range from 0.05 to 0.95 and calculation of the input neurons (FCn, D30,n, Cu,n, LLn, Wn, γd,n) for each input data vector using Eqs. (7) to (9). The input neurons are calculated as follows. For FC, the maximum and minimum values of the variable are 99.1 and 9.6, so that FCn = 0.5407:

(14)

Similarly,

D30,n = 0.0604, Cu,n = 0.1110, LLn = 0.1538, Wn = 0.3105, and γd,n = 0.7864.

• Step 2: Calculation of the hidden layer. The input value of each neuron in the hidden layer is determined for six neurons using the input layer weights and biases shown in Table 2. Given the information provided, the input values of the neurons (F1,…, F6) are calculated using Eq. (11):

F1 = 9.8772×0.5407+4.1858×0.0604+1.9700×0.1110+9.8601×0.1538+12.5035×0.3105−11.1453×0.7864+0.3397 = 2.7860 (15)

Similarly,

F2 = -0.9993, F3 = 4.1586, F4 = -0.3519, F5 = -3.0788, and F6 = 8.4668.

• Step 3: Prediction of c. The input value of each output neuron is calculated using an activation function (log-sigmoid function). The calculated values are multiplied by the hidden layer connection weights (Table 3) and the summation is obtained:

A = 9.3121f(F1)−7.3948f(F2)−3.9038f(F3)−3.9087f (F4) +0.8777f (F5)+2.1238 f (F6) = 3.4856 (16)

where f(x) is the log-sigmoid function of the form 1/(1 + e^(−x)). Using Eq. (10), the value of c is calculated as follows:

(17)

In this example, the result is in good agreement with the experimental value (0.37 kg/cm2) of c, as it yields a value 1.35% higher.
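The three steps above can be reproduced with a short script. The sketch below is a minimal check of the hand calculation, not a full implementation of the network: only the first hidden neuron is computed from its weights (taken from Eq. (15)), the remaining hidden inputs F2 to F6 and the normalized inputs other than FCn are entered as quoted in the text, and the constants −3.1895 and 0.7895 are used as they appear in Eq. (17). The function and variable names are hypothetical.

```python
import math


def normalize(x, x_min, x_max, lo=0.05, hi=0.95):
    """Linearly map x from [x_min, x_max] to [lo, hi], as in Eq. (14)."""
    slope = (hi - lo) / (x_max - x_min)
    return slope * x + (hi - slope * x_max)


def log_sigmoid(x):
    """Log-sigmoid activation, f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Step 1: normalized inputs (FCn computed, the rest quoted from the text).
fc_n = normalize(58.4, x_min=9.6, x_max=99.1)
inputs_n = [fc_n, 0.0604, 0.1110, 0.1538, 0.3105, 0.7864]

# Step 2: input of the first hidden neuron, weights and bias from Eq. (15).
w_1 = [9.8772, 4.1858, 1.9700, 9.8601, 12.5035, -11.1453]
bias_1 = 0.3397
f_1 = sum(w * x for w, x in zip(w_1, inputs_n)) + bias_1
# F2 to F6 are quoted from the text rather than recomputed.
hidden_inputs = [f_1, -0.9993, 4.1586, -0.3519, -3.0788, 8.4668]

# Step 3: weighted sum of hidden outputs (Eq. 16) and conversion to c (Eq. 17).
v = [9.3121, -7.3948, -3.9038, -3.9087, 0.8777, 2.1238]
a_sum = sum(v_j * log_sigmoid(f_j) for v_j, f_j in zip(v, hidden_inputs))
c_pred = (-3.1895 + a_sum) / 0.7895  # constants as printed in Eq. (17)

print(f"FCn = {fc_n:.4f}, F1 = {f_1:.4f}, A = {a_sum:.4f}, c = {c_pred:.3f} kg/cm2")
```

Small differences in the last digit relative to the hand calculation are expected because the script carries FCn at full precision instead of the rounded value 0.5407.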

References

Alavi, A. H., Gandomi, A. H., Mollahasani, A., Heshmati, A. A. R., and Rashed, A. (2010). “Modeling of maximum dry density and optimum moisture content of stabilized soil using artificial neural networks.” J. Plant. Nutr. Soil. Sci., Vol. 173, No. 3, pp. 368-379.

Alavi, A. H., Gandomi, A. H., Gandomi, M., and Sadat Hosseini, S. S. (2009). “Prediction of maximum dry density and optimum moisture content of stabilized soil using RBF neural networks.” The IES J. Part A: Civ. Struct. Eng., Vol. 2, No. 2, pp. 98-106.

Al-Shayea, N. A. (2001). “The combined effect of clay and moisture content on the behavior of remolded unsaturated soils.” Eng. Geol., Vol. 62, No. 4, pp. 319-342.

ASTM D2850-87. (1987). Standard test method for unconsolidated, undrained compressive strength of cohesive soils in triaxial compression.

Bhattacharjya, R. K., Datta, B., and Satish, M. G. (2009). “Performance of an artificial neural network model for simulating saltwater intrusion process in coastal aquifers when training with noisy data.” KSCE J. Civ. Eng., Vol. 13, No. 3, pp. 205-215.

Cybenko, G. (1989). “Approximation by superpositions of a sigmoidal function.” Math. Cont. Sign. Syst., Vol. 2, pp. 303-314.

Dunlop, P. and Smith, S. (2003). “Estimating key characteristics of the concrete delivery and placement process using linear regression analysis.” Civil Eng. Environ. Syst., Vol. 20, pp. 273-290.

$$FC_n = \left(\frac{0.95 - 0.05}{99.1 - 9.6}\right) FC + \left(0.95 - \frac{0.95 - 0.05}{99.1 - 9.6} \times 99.1\right) = 0.5407 \tag{14}$$

$$c = \frac{1}{0.7895}\left(-3.1895 + 3.4861\right) = 0.375\ \text{kg/cm}^2 \tag{17}$$


Eberhart, R. C. and Dobbins, R. W. (1990). Neural network PC tools, A Practical Guide, Academic Press, San Diego, CA.

El-Maksoud, M. A. F. (2006). “Laboratory determining of soil strength parameters in calcareous soils and their effect on chiseling draft prediction.” Proc. Energy Efficiency and Agricultural Engineering Int. Conf., Rousse, Bulgaria.

Frank, I. E. and Todeschini, R. (1994). “The data analysis handbook.” Elsevier, Amsterdam, The Netherlands.

Garson, G. D. (1991). “Interpreting neural-network connection weights.” AI Expert, Vol. 6, No. 7, pp. 47-51.

Guzelbey, I. H., Cevik, A., and Gogus, M. T. (2006). “Prediction of rotation capacity of wide flange beams using neural networks.” J. Constr. Steel Res., Vol. 62, No. 10, pp. 950-961.

Hashash, M. A. and Song, H. (2008). “The integration of numerical modeling and physical measurements through inverse analysis in geotechnical engineering.” KSCE J. Civ. Eng., Vol. 12, No. 3, pp. 165-176.

Haykin, S. (1999). Neural networks - A comprehensive foundation, 2nd Ed., Prentice Hall Inc., Englewood Cliffs.

Hecht-Nielsen, R. (1990). “Neurocomputing.” Reading, Mass: Addison-Wesley.

Heshmati, A. A. R., Alavi, A. H., Keramati, M., and Gandomi, A. H. (2009). “A radial basis function neural network approach for compressive strength prediction of stabilized soil.” Geotech. Spec. Pub. ASCE., Vol. 191, pp. 147-153.

Kasabov, N. K. (1998). Foundations of neural networks, fuzzy systems and knowledge engineering, MIT Press, Cambridge.

Kayadelen, C., Günaydın, O., Fener, M., Demir, A., and Özvan, A. (2009). “Modeling of the angle of shearing resistance of soils using soft computing systems.” Expert Syst. Appl., Vol. 36, pp. 11814-11826.

Kim, B. T., Kim, Y. S., and Lee, S. H. (2008). “Prediction of lateral behavior of single and group piles using artificial neural networks.” KSCE J. Civ. Eng., Vol. 5, No. 2, pp. 185-198.

Korayem, A. Y., Ismail, K. M., and Sehari, S. Q. (1996). “Prediction of soil shear strength and penetration resistance using some soil properties.” Mis. J. Agr. Res., Vol. 13, No. 4, pp. 119-140.

Maravall, A. and Gomez, V. (2004). Eviews software, Version 5, Quantitative Micro Software, LLC, Irvine, CA.

MathWorks, Inc. (2007). MATLAB: The language of technical computing, Version 7.4, Natick, MA, U.S.A.

Mayne, P. W. (2001). “Stress-strain-strength-flow parameters from enhanced in-situ tests.” Proc., In-Situ Measurement of Soil Properties & Case Histories, Bali, Indonesia, pp. 27-48.

Mesbahi, E. (2000). Application of artificial neural networks in modelling and control of diesel engines, PhD Thesis, University of Newcastle, U.K.

Murthy, S. (2008). Geotechnical engineering: Principles and practices of soil mechanics, 2nd Edition, Taylor & Francis, CRC Press, U.K.

Pan, Y., Jiang, J., Wang, R., Cao, H., and Cui, Y. (2009). “A novel QSPR model for prediction of lower flammability limits of organic compounds based on support vector machine.” J. Haz. Mater., Vol. 168, Nos. 2-3, pp. 962-969.

Panwar, J. S. and Seimens, J. C. (1972). “Shear strength and energy of soil failure related to density and moisture.” T. ASAE, Vol. 15, pp. 423-427.

Perlovsky, L. I. (2001). Neural networks and intellect, Oxford University Press.

Poulos, S. J. (1989). “Liquefaction related phenomena.” Advance Dam Engineering for Design, Construction, and Rehabilitation, Van Nostrand Reinhold, pp. 292-297.

Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). “Learning internal representations by error propagation.” Proc., Parallel Distributed Processing, MIT Press, Cambridge.

Ryan, T. P. (1997). Modern regression methods, Wiley, New York, N.Y.

Shahin, M. A., Maier, H. R., and Jaksa, M. B. (2001). “Artificial neural network applications in geotechnical engineering.” Aus. Geomech., Vol. 36, No. 1, pp. 49-62.

Smith, G. N. (1986). Probability and statistics in civil engineering, Collins, London.

Swingler, K. (1996). Applying neural networks: A practical guide, Academic Press, New York, N.Y.

Tapkın, S., Cevik, A., and Usar, Ü. (2009). “Accumulated strain prediction of polypropylene modified marshall specimens in repeated creep test using artificial neural networks.” Exp. Syst. Appl., Vol. 36, pp. 11186-11197.

Terzaghi, K., Peck, R. B., and Mesri, G. (1996). Soil mechanics in engineering practice, 2nd Ed., Wiley & Sons, Inc., New York, N.Y.