Applying Random Forests to Decide Ordering Policy Based on ...

Original Paper

Applying Random Forests to Decide Ordering Policy Based on Important Shipping Statistics

Keisuke NAGASAWA †1, Takashi IROHARA

†1, Yosuke MATOBA †2 and Shuling LIU

†2

Abstract: This study was motivated by challenges facing inventory managers when deciding the

ordering policy for various items. It is difficult to find an appropriate ordering policy for many types

of items. We propose a model that changes conventional multi-criteria ABC analysis so that it is

suitable for use by inventory managers. We indicate that categorizing items based on their statistical

characteristics leads to an ordering policy suitable for each item. We propose a method for deciding

the ordering policy based on important shipping statistics and a classification technique. For this

method, we analyze the relation between shipping statistics and the ordering policy for searching

important shipping statistics. We classify items by shipping statistics and then decide the ordering

policy for each item. In the numerical experiment, we used actual shipment data to calculate many

shipping statistics that represent the characteristics of each item. Next, we found the important

shipping statistics from Random Forests and applied them to decide the ordering policy. Finally, for

confirming the importance of important shipping statistics, we tested the performance of Random

Forests and other classifying methods using the important shipping statistics. It was found that the

performance of each classifying method was improved.

Key words: Inventory management, ordering policy, multivariable analysis, Random Forests

1 INTRODUCTION

Effective inventory management has played an

important role in the success of supply chain

management. Deciding the ordering policy has a big

effect on the performance of supply chain

management. If a single ordering policy is applied to

many different items with different shipping statistics,

the inventory would be a mix of items that had

frequent shortages and those having excess inventory.

On the other hand, even using multiple ordering

policies, it is not easy to find an appropriate policy

for many types of items, particularly when each has a

different shipping statistics. In addition, given a

particular shipping statistics, no method or criterion

has been specifically developed for choosing the

ordering policy to that would be applied to these

items. For these reasons, the proper relationship

between shipping statistics and the appropriate

ordering policies is not still clear. Practically, the

ordering policy should be decided by some

supposedly important shipping statistics such, as lead

time, purchase unit price or demand fluctuation.

The most important problem for to deciding the

appropriate ordering policy appropriately for many

items has not been clarified is not erected. Thus,

some researchers have dealt with this problem by an

approach called “item classification.” That is, items are

classified and, inventory managers change the ordering

policy according to items’ classification.

In this research, we use Random Forests. There are

business contributions and practical importance. First,

the inventory manager can easily change the number

that might be used for each item in the ordering policy

which might be used for each item. Next, important

shipping statistics, calculated by Random Forests, for

applying an appropriate ordering policy becomes clear.

Moreover, it may be possible we might be able to use

the important shipping statistics by Random forests

even if using another technique is used.

We propose a framework for deciding ordering

policy by classification. There are five steps. First, we

calculate the shipping statistics for each item. Second,

we divide the items into two groups: learning item

group and testing item group. Third, we assign an

optimal ordering policy for each item of the learning

item group. Fourth, for the learning item group, we

†1 Sophia University †2 Fairway Solutions Inc. Received: November 30, 2012

Accepted: September 12, 2013

J Jpn Ind Manage Assoc 64, 579-590, 2014

Vol.64 No.4E (2014) 579

analyze the relation between shipping statistics and

assigned optimal ordering policies, and find

significant shipping statistics for deciding the ordering

policy. Finally, we used these important shipping

statistics for deciding the ordering policy for each item

of the testing item group.

The remainder of this paper is organized as

follows. In Section 2, we review relevant literature.

In Section 3, we explain previous knowledge of

classification and classification techniques relating to

our proposed method. In Section 4, we give an

outline of the proposed method, including how to

convert actual shipping data to data matrices for

analysis. The details of the experiment and results of

analysis are reported in Section 5.

2 LITERATURE REVIEW

When there is a shortage of items, there are two

ways to cope with it: “backorder” and “lost-sales.”

Although backorder models have been studied by

many researchers and practitioners, there are few

papers that discuss lost-sales models.

Some researchers have investigated the difference

between backorder models and lost-sales models. In

the case of stock-outs, Gruen et al.[1] revealed that

only 15% of the customers who observe a stock-out

will wait for the item to be restocked. The other 85%

of the customers decide to buy a different product,

visit another store, or do not buy any product at all.

According to Zipkin[2], the cost deviations can run

up to 30% when the lost-sales system is

approximated by a backorder model. Janakiraman et

al.[3] compared the performance of optimal

replenishment policies in lost-sales and backorder

models. Huh et al.[4] compared the performance of

base-stock policies in lost-sales and backorder

models. Levi et al.[5] proposed a dual-balancing

policy, in which the risks of ordering and holding are

balanced in lost-sales models. Bijvank and Vis[6]

classified the models based on the characteristics of

the inventory system and reviewed the proposed

replenishment policies. Van Donselaar and

Broekmeulen[7] showed that the lost-sales system

can simply be approximated by a backordering

system if the target fill rate is at least 95%, which

may lead to serious approximation errors. So, even if

some method has meaningful results as a backorder

model, there is no guarantee of its significance as a

lost-sales model. Therefore, a common concern of

researchers and inventory managers is whether or not

the results of backorder models can be applied to

lost-sales models.

Some researchers tackle backorder models and

lost-sales models by applying an item classification

approach where inventory managers change the

ordering policy according to item classification. This

approach can also be classified into two categories:

“optimization” and “grouping.”

When the optimization approach is applied, the

ordering policy of each item is determined by

optimization even if important shipping statistics are

not revealed. Tsai and Yeh[8] used a particle swarm

optimization approach for an inventory classification

problem. There is another research study in which the

main point is reducing total cost by grouping similar

tendency items. Mohammaditabar et al.[9] used

simulated annealing to group similar items and find

the best ordering policy simultaneously. However,

when using that method, the optimal ordering policy

might be used for each item. There is also a demerit

when using optimization. For example, the result is

completely dependent on input data. If a new item is

added or deleted, the manager must recalculate again.

On the other hand, some inventory managers need

rules: if they follow the rules, they can obtain

optimal or satisfactory results. Inventory managers

often think that they would like to respond flexibly to

item addition and deletion using item shipping

statistics. Grouping is recommended for such

inventory managers.

A typical method of grouping is ABC analysis.

This method is based on the Pareto principal. Items

classified as Class A are commonly those items of

high importance but few in number. Items in Class C

are less important but large in number, and those in

Class B are between the other two classes. For

classical ABC analysis, “annual dollar usage” is

basically used for the evaluation value. In order to

simplify inventory control, many researchers have

created similar methods for grouping items or

ranking items by shipping statistics.

Although ABC analysis is famed for its ease of

use, it has been criticized for focusing on a single

criteria. Many researchers have used two or more

580 J Jpn Ind Manage Assoc

evaluation criterion for grouping, called “multi-

criteria ABC classification.” Item classification

problems based on multi-criteria ABC classification

are still being studied. Other criteria such as lead-

time, commonality, obsolescence, durability,

inventory cost and order size requirements have also

been claimed as being significant for inventory

classification (Flores and Whyback[10], Jamshidi and

Jain[11], Ng[12] and Ramakrishnan[13]). Ernst and

Cohen[14] used cluster analysis to group similar

items. Flores et al.[15] provided a matrix-based

methodology for multi-criteria ABC classification.

Gajpal et al.[16], Partovi and Burton[17], and Partovi

and Hopton[18] applied the analytical hierarchy

process (AHP) to ABC analysis. However, there is

little documentation discussing which shipping

statistics should be used.

Some researchers have given attention to the

method for deciding the ordering policy for each item

from shipping statistics. Ramakrishnan[13] proposed

a weighted linear optimization model and classified

inventory items using multiple criteria. Genetic

algorithms and artificial neural networks have also

been applied (Partovi and Anandarajan[19] and

Guvenir and Erel[20]). Yu[21] applied support vector

machines (SVM) and made comparisons using

traditional multiple discriminant analysis, back-

propagation networks, and the k-nearest neighbor

algorithm. (Support Vector Machine was originally

developed by Vapnik [22] at Bell Laboratories.)

Moreover, some research studies manage the

ordering policy by considering the shipping

relationship between items. Xiao et al.[23]

considered the importance of inventory items based

on a loss rule. Tiacci and Saetta[24] considered the

relationship between forecasting and ordering policy

using carrier capacity. If there is relation between the

shipment of items and the evaluation value of

inventory control, an evaluation value can be

predicted when an ordering policy is applied. Berling

and Marklund[25] used a linear regression technique

to obtain approximate values of the induced

backorder cost in a one-warehouse multiple-retailer

system.

Shipping statistics were used for deciding the

ordering policy. However, there are a few papers that

consider which shipping statistics are important. For

example, in a previous study (Nagasawa et al.[26]),

the authors applied canonical correlation analysis to

a set of shipping statistics and the evaluation values

of ordering policies. Canonical correlation analysis is

one method for searching the relationship between two

sets of variables; as between shipping statistics of each

item and evaluation values of ordering policies for

each item. According to Basu and Mandal[27], this

was initially developed by Hotelling[28]. The study

by Nagasawa et al. suggests that shipping statistics

have different importance or significance for

decidinge ordering policy. However, no

consideration has been given to how to classify items

based on the importance of shipping statistics and

how to apply ordering policies to each item group.

In our article, using multi-criteria ABC analysis,

we focus on applying an ordering policy for each

item using the shipping statistics. There are two

differences from typical multi-criteria ABC analysis.

First, typical multi-criteria ABC analysis assumes

that all shipping statistics have to be used. On the

other hand, we decide the priority for which shipping

statistics are important in deciding ordering policies.

Second, our approach is not limited to three classes

since multiple ABC analysis classifies items into

three classes.

3 CLASSIFICATION TECHNIQUE

We will now briefly explain decision tree and

Random Forests algorithms.

Random Forests is a more advanced decision tree

technique, which was proposed by Breiman[29]. Since

Random Forests is a collection of decision trees, first

of all, we briefly introduce decision trees.

3.1 Classification and Regression Tree

A decision tree is a useful technique and consists of

a series of if-then rules for dependent variables. When

using a decision tree, the classification and regression

tree (CART) is one of the data mining algorithms for

classification and regression (Breiman et al.[30]).

CART can handle both qualitative and quantitative

variables. The CART algorithm partitions the

independent variable space into two regions according

to the performance measures. The output of CART is a

hierarchical structure that consists of a series of if-

then rules to predict the outcome of the dependent

Vol.64 No.4E (2014) 581

variables. The image of applying a decision tree using

CART is shown in Figs. 1 and 2.

Figure 1 expresses the learning phase of the

decision tree. The illustration on the left represents how

items are plotted when shipping statistics x1 and x2 are

chosen. There are 10 items. An optimal ordering policy

was set for each item, represented as a square and a

hexagon. The illustration on the right represents the type

of tree structure created and the if-then rules made.

Fig. 1. The learning phase of the decision tree.

Fig. 2. The testing phase of the decision tree.

Figure 2 expresses the testing phase of the decision

tree. If a new item, i is given, the if-then rule is

applied to it. We can predict what ordering policy

should be used from where the item is assigned. In the

figure, item i should be applied to the ordering policy

square from the results of the decision tree.

There are many criteria by which node impurity is

minimized in a classification problem. The Gini index

is frequently used in practice. Random Forests

basically use the Gini index too. The Gini index

supposes there are a total of K classes, each indexed

by k. Let prk be the proportion of class k observations

in node r. The Gini index can then be written as

)1(1 rkrk

K

kpp

. In each node splitting, for most

decreasing this performance measures, if-then rules,

set using variables and set for border values.

3.2 Random Forests

Random Forests usually make many decision trees

using randomly selected statistics to split the nodes

and take a majority vote from each tree. Random

Forests have been increasing in popularity because the

performance is high. The method is robust against

over-fitting and is not very sensitive to parameter

settings. In literature[31], a survey and comparison

were carried out regarding the measure of variable

importance, the application method, and an example of

the application.

Fig. 3. The learning phase in Random Forests.

In our study, Random Forests is used to randomly

select shipping statistics to generate trees and

aggregate the classification results of these trees to

obtain the appropriate ordering policy through a

plurality vote. The image of applying Random Forests

is shown in Figs. 3 and 4. Figure 3 expresses the

learning phase of Random Forests. We pick l learning

items from n items. We set the optimal ordering policy

for each learning item, deciding o1. We apply the

x1

Items

x1 > a ?

x2 > b ?

x2

a

b

Yes

Yes

No

No

6

8

9

7

1

2

3

4

510

6

8 9

7

1 2 3 4 5

2

3

4

5

1

6 8 97

Shipping statistics

Shipping

statistics

10

10

New item

xi1 > a ?

xi2 > b ?

a

b

Yes

Yes

No

No

ii

i i

Optimal ordering policy for item i might be

x1Shipping statistics

x2

Shipping

statistics

…

…

Columns are randomly selected …

=

=

=


following procedure for the learning phase in each tree

construction:

1. Pick m shipping statistics, vector x, from X.

2. Then, m shipping statistics are used to split a

node of the tree, making an if-then rule at the node

and split items to optimal ordering policy using

shipping statistics

3. Take a bootstrap sample from the items.

Usually, we use the rest of the items to estimate the

error of the tree by predicting their classes.

4. For each node of the tree, calculate the best

split in the learning items from randomly chosen m

shipping statistics.

5. Since some branching algorithms terminate

branching even if there are multiple classes at the end

of the tree, each tree is fully grown (i.e., there is only

one ordering policy at the end of the tree).

Figure 4 expresses the testing phase of Random

Forests. In the testing phase, the if-then rule of each

tree is applied for each item. Prediction is assigned by

the learning tree in the terminal node. This procedure

is iterated over all trees. Then the results of majority

votes by all trees is reported as the prediction of

Random Forests. The number of trees is the parameter

of Random Forests.

Fig. 4. The testing phase in Random Forests.

We follow the conventionally recommended

parameter setting in the experiment. For the

classification parameters, fully grown Random Forests

recommend 1 (i.e., the end point of the node in Step 5

should be 1), and the number of bootstrap samples is

l/3. The number of shipping statistics to be used to

split a node of the tree, m, is set to be √v . In our case,

v is the number of all shipping statistics.

The Random Forests algorithm we used is the most

popular one. Therefore, shipping statistics were

selected at each branch for the Gini index that

decreases the most in the tree of Random Forests. At

each branching, we record which shipping statistics

were used and how the Gini index was reduced by the

branching. After applying Random Forests for the

learning data, we join the records. The Gini index-based

variable importance measure is then given by the average

decrease in the Gini index in the forest, where the variable

is used to branching a node. Then, we used the records

as the indicator of importance for the shipping

statistics. In our case, measurements are made using

the Gini index as well as the decision tree. We

measure the importance of shipping statistics using

these values. Since a shipping statistics variable is

chosen at random at each node, the importance of

measurement changes stochastically. However, the

importance of variable measurement does not change

dynamically.

4 PROPOSED METHODS

4.1 Problem Outline

This study considers the inventory management

problem for items having different shipping statistics.

We assume the situation where only one ordering

policy can be used for one item group. In that situation,

grouping items corresponds to deciding ordering

policy. Therefore, we represent grouping items as

deciding ordering policy. Here, it is assumed that the

result of applying an ordering policy is different

according to the shipping statistics. To achieve

efficient inventory management, we propose an

approach to evaluate the importance of shipping

statistics and then decide a suitable ordering policy for

each item based on its important shipping statistic.

In this section, mainly we describe a procedure to

find important shipping statistics for deciding the

ordering policy of each item.

4.2 Proposed method

Figure 5 shows the flowchart of the proposed

method with two stages, enclosed by dashed line, and

experimental procedure. In Stage 1, we find important

shipping statistic variables for deciding the ordering

policy. In stage 2, the ordering policy for each item is

decided using a classifier. In the numerical experiment,

3 VS 1

u

Majority vote

New ItemRandom Forests classifier

Vol.64 No.4E (2014) 583

we compare classification techniques by comparing

the optimal ordering policy and the decided ordering

policy using a classifier.

Fig. 5. The flowchart of the proposed method.

First, we assume that there are “Shipping data” for

each item. Since there is a problem that we have

obtained results that are suitable only for specific data

using the method we created, we separate items for

“learning, named learning item” and “testing, named

testing item.” We divide items to “Shipping data

(Learning data)” and “Shipping data (Testing data)”.

“Shipping data (Learning data)” is used for our

proposed method. Additionally “Shipping data

(Testing data)” is used for checking the performance

of the proposed method; this is mainly used in Section

5. The procedures not surrounded by the dashed line

are those for testing the proposed method as applied in

Section 5, and correspond to our following procedures

for finding important shipping statistics to decide the

ordering policy.

In Stage 1, we find the important shipping statistics.

From “Shipping data (Learning data)”, we attempt to

find important shipping statistics for deciding the

ordering policy. Since, in many cases, no results are

obtained when applying two or more ordering policies

to one item simultaneously, as in our case, we

simulate each ordering policy at “Simulate ordering

policies.” From the result of “Simulate ordering

policies,” we compare the result of each ordering

policy and set the optimal ordering policy for each

learning item. Then, we obtain the optimal ordering

policy for each learning item, named “Optimal

ordering policy (Learning data).” Next, we calculate

the shipping statistics of each learning item at

“Calculate shipping statistics.” We then obtain the

shipping statistics, which is a mixture of important

shipping statistics and unimportant shipping statistics ,

named “Shipping statistics (Learning data).”

From “Shipping statistics (Learning data)” and

“Optimal ordering policy (Learning data),” we attempt

to find important shipping statistics for deciding the

optimal ordering policy for each learning item. Then,

at “Apply Random Forests between shipping statistics

and optimal ordering policy,” we apply Random

Forests between “Shipping statistics (Learning data)”

of all the learning items and “Optimal ordering policy

(Learning data)” of all the learning items to obtain the

classifier and importance of shipping statistics, named

“Importance of shipping statistics.”

From “Importance of shipping statistics,” we

determine which shipping statistics were important in

“Shipping statistics (Learning data)”.

In Stage 2, we decide the ordering policy for each

item. In “Apply classifying technique between

shipping statistics and optimal ordering policy to make

classifier,” we are able to make a more precise

classifier, which predicts output, optimal ordering

policy for each item from input, shipping statistics

from “Optimal ordering policy (Learning data)” and

important shipping statistics, which are the result of

“Shipping statistics (Learning data)” and “Importance

of shipping statistics.”

The performances of the classifiers were tested in

Section 5. In Section 5, we compared three techniques,

Support Vector Machine (SVM), Decision Tree (DT)

and Random Forests (RF), with three situations, used

all shipping statistic variables, used important

shipping statistics by Random Forests, and used

unimportant shipping statistics by Random Forests.

Importance of

shipping statistics

Shipping data

(Testing data)Apply classifying technique

between important shipping statistics and

optimal ordering policy to make classifier

classifier

Apply classifier for shipping statistics(Testing data)

(Apply classifier for predicting optimal ordering policy)

Calculate shipping

statistics

Simulate ordering policies

Optimal ordering policy

(Testing data)

Shipping statistics

(Testing data)

Which ordering policy

should we use for testing data

Check hit rate of classifier

Shipping data

(Learning data)

Calculate shipping

statisticsSimulate ordering policies

Apply Random Forests

between shipping statistics and

optimal ordering policy

Optimal ordering policy

(Learning data)

Shipping statistics

(Learning data)

Stage 1

Stage 2

Shipping data


In Section 4.2.1, we will explain the details of

“Calculate shipping statistics.” In Section 4.2.2, we

will explain the details of how to find important

shipping statistics.

Our proposed model (framework) can also use even

if there is an outperforming technique which can

classify the optimal ordering policy from two or more

statistic values while the technique could reveal the

importance of shipping statistics.

4.2.1 Data preprocessing

First, we assume that there are shipping data that

provides the shipping date and volume of items, as in

shown in Table 1. From this shipping data, we

calculate shipping statistics and simulate each

ordering policy for evaluating some criteria.

Table 1 Shipping data

In the following steps, we repeatedly use

multivariate data matrices, as shown in Table 2.

Before we proceed to explaining the analysis method,

we will first explain the premise behind multivariate

data matrices.

Table 2 General variable of each item

Let X denote an n v multivariate data matrix, which

represents the shipping statistics. There are n items

and v shipping criteria. xij gives the value for the jth

shipping statistics of the ith item. For example, if the

index of average daily shipping is shipping statistics j,

we can calculate the value of the ith item and xij = (di1

+ di2 + … + dim)/m for all i in {1..., n} as in Table 1.

4.2.2 The data mining process

Let Y denote the multivariate data matrix, which

represents the evaluation values of ordering policies,

as the number of how many inventories or number of

how many shortages. Y is similar to the previously

mentioned X. For example, yizp gives the value for the

zth evaluation criteria, as labeling “inventories” or

labeling “shortages” of the pth ordering policy of the

ith item. So, yzp denotes the n 1 variable matrix of

the zth evaluation criteria of the pth ordering policy.

Then, Yz denotes the n q variable matrix of the zth

evaluation criteria.

It is necessary to find the relation between shipping

statistics and evaluation values of the ordering policy

since there are differences in the scale of numbers in

terms of sum of inventories and shortages. We should

standardize the evaluation value of each evaluation

criteria and item in order to compare ordering policies.

If yikp is the value for the kth evaluation criteria of the

pth ordering policy of the ith item, then let y’ikp be the

standardized evaluation value for this same

combination, where the y’ikp is calculated using Eq. (1)

and take values in [0,1].

pkiyy

yyy

ikpp

ikpp

ikpp

ikp

ikp ,,,)(min)(max

)(min'

(1)

From evaluation values of the previous steps, we

set the optimal ordering policy for each item. We can

then make a relational table between shipping

statistics and the optimal ordering policy of each item,

as shown in Table 3. Based on this table, we will find

which shipping statistics are important for deciding

the ordering policy.

Table 3 Shipping statistics and optimal ordering

policy

Most classifying methods need the matrix of

quantitative variable X and qualitative variable oz, as

shown in Table 3. If those data are available, we can

apply most classifying methods by prearranged

algorithm and a series of each classifying method. We

made Random Forests using the algorithm shown in

Section 3.2.

Item

No.

Date : T Average

(statistic j)day : 1 … day : m

1 d11 … d1m (d11+d12+…+d1m)/m=x1j… … … …n dn1 … dnm (dn1+dn2+…+dnm)/m=xnj

Item

No.

Shipping statistics : X Evaluation value of ordering policy : Y

Statistic 1

x1…

Statistic v

xv

Evaluation criteria 1 : Y1

…

Evaluation criteria z : Yz

Ordering

policy 1 : y11…

Ordering

policy q : y1q

Ordering

policy 1 : yz1…

Ordering

policy q : yzq

1 x11

…

x1v y111

…

y11q

…

y1z1

…

y1zq… … … … … … …

i xi1 xiv yi11 yi1q yiz1 yizq… … … … … … …

n xn1 xnv yn11 yn1q ynz1 ynzq

Item

No.

Shipping statistics : X Optimal ordering policy : O

x1 … xv

Evaluation

criteria 1 :

o1

…

Evaluation

criteria z :

oz

1 x11

…

x1v o11

…

o1z… … … … …

i xi1 xiv oi1 oiz… … … … …

n xn1 xnv on1 onz

Vol.64 No.4E (2014) 585

5 NUMERICAL EXPERIMENTS

Next, we calculate the shipping statistics from

actual shipping data and generate evaluation values for

ordering policies from the results of the simulating of

every ordering policy. From the results of each

ordering policy, we set an optimal ordering policy for

each item. These optimal ordering policies are

predicted using a classifying method. We then

compare the hit rate, which is the rate when the

classified ordering policy and specified optimal

ordering policy are same. Unfortunately, we didn't

have the actual data for applying ordering policies to

each item. And, it is impossible to have all the desired

data in any situation. Especially, managerial data tends

to be unavailable to outsiders in many cases. So, from

shipping data, we simulate each ordering policy and

check the performance of each ordering policy, and

the average inventories and shortages. Inventories and

shortage data are tentatively set for experiments in this

paper. This means that the proposed technology is

independent from specific industries. Next, we label a

specified optimal ordering policy for each item. We

apply Random Forests between the shipping statistics

and optimal ordering policies. From these results, we

can obtain the importance of the shipping statistics

that exhibit a relatively strong relation with the most

fitting ordering policy.

In a previous study, Yu [21] applied support vector

machines and outperform multiple discriminant

analysis (MDA), back-propagation networks, and a k-

NN algorithm. In this study, we apply and compared

three classification techniques: Support Vector

Machine (SVM), Decision Tree (DT) and Random

Forests (RF).

We apply DT, SVM and RF to shipping statistics

and the optimal ordering policy, and compare the

results of the method’s hit rate.

For testing the important shipping statistics

calculated from Random Forests, we make three

situations and test the usefulness of the importance of

shipping statistics. We also tested to determine if the

importance of shipping statistics based on Random

Forests is indeed meaningful for the other classifying

methods or not. We apply three classifying methods

(i.e., DM, SVM and RF), and compare the

performance of each classifying method in each

situation. The situations are as follows: 1) Use all

shipping statistics for classifying ordering policy. In

this case, significant and insignificant shipping

statistics are used for the learning and classifying

processes of each method. 2) Use a certain level of

significant shipping statistics for classifying ordering

policy. In this case, some meaningless shipping

statistics are ignored for the learning and classifying

processes. 3) Use many insignificant shipping

statistics for the learning and classifying processes.

From 3), we check the results to ensure they are not

results from an accident.

In this study, we used R (version 2.14.1), which is

a free software environment for statistical computing

and graphics, for DT, SVM and RF. The libraries are

“mvpart,” “e1071,” and "randomForest." The

experimental environment is an Intel® Core™ 2 Duo

CPU, E8400 at 3.00 GHz and with 4.00 GB of RAM;

we did not need long computational times for

processing.

5.1 Ordering policy and experiment

We used two ordering policies for this study: the

so-called (s, Q) policy and the (R, s, S) policy. These

policies are the most commonly used ordering policies

for inventory control.

For the (s, Q) policy, when the inventory position

declines to or below the reorder point, s, a batch

quantity of size Q is ordered. In this case, the

inventory levels never exceed the position s + Q .

For the (R, s, S) policy, at the end of each time

period of length R, if the inventory position declines

to or below the reorder point, s, an amount equal to the

difference between the order-up-to level S and the

current inventory position is ordered. In the special

case where s = S - 1, we call this the (R, S) policy.

In this study, we use the (s, Q) and (R, S) policies.

In particular, s, Q, and S of the ith item in the tth date

period are denoted by sit, Qit, and Sit, which are set as

follows:

TtIisQ itit , (2)

TtIiSP

d

d

t

SPtl

il

it

,

1

(3)

TtIissdRLTS iitit , (4)

dayth on the item of averageshipment :

item ofstock safety :

dayth on the item of level to-up-order :

dayth on the itemfor point reorder :

dayth on the item ofquantity ordering:

tid

iss

tiS

tis

tiQ

it

i

it

it

it


termreview periodic :

dayth on the item of shipments :

R

tid it

days ofset the:

items ofset the:

period sampling :

timelead:

T

I

SP

LT

Equation (2) shows that the ordering quantity of

item i is the same amount as the reorder point of item i.

Normally, when we can estimate ordering cost and

holding cost, we calculate the economic order quantity

for Qit. Equation (3) provides the sum average of daily

shipping volume in the sampling period. Equation (4)

shows that the replenishment level, Sit, was concerned

with lead-time and periodic review term.

5.2 Data Set

In numerical and simulation experiments, we used

the actual shipments data of a distribution company in

2010. These items are categorized broadly into 11

categories, such as beer, Japanese sake, foods, and

spices. There are no oblivious characteristic trends for

these categories, and we could not find any trends by

categorizing the items. There are no production items

in the inventory. We did not take up brand data in this

research because we were unable to get brand

information about the items in data. , But, if there is a

trend for brands, brands will be introduced into the

proposed method as a qualitative variable.

We simulate (s, Q) and (R, S) policies from the

shipping data, which are then used to calculate

shipping statistics. We set common parameters

between the set of (s, Q) and (R, S) policies as

described below. Sampling period SP was set to four

weeks, lead time LT was set to two days, and safety

stock, ssi was set individually to a special value, which

will not cause a stock-out of more than 1% of the total

number of shipments.

We compare the performance between three

ordering policies: one (s, Q) policy and two (R, S)

policies, for R=2 or 7 days. For simulation, we use the

first month of shipping data for calculating the initial

values of Sit. For the (R, S) policies, Sit values are

recalculated periodically, and sit and Qit are fixed at

the initial calculation values, which will not cause a

stock-out of more than 1% of the total number of

shipments.

For applying the classification method, we compare

the total storage of each ordering policy and set an

optimal ordering policy for each item. From the ratio

between the total storage of the specified ordering

policy and the other ordering policy, we select 100

items for each specified ordering policy. Because we

apply three ordering policies, 300 items are selected.

The other items were less frequently shipped, smaller

amounts, or the ratio between the total storage of the

specified ordering policy and the other ordering policy

is not so large. In order to compare each technique in

terms of whether or not prediction can be

appropriately done from shipping statistics, we select

two-thirds of 300 randomly selected items; so 200

items were selected, and used for learning data. The

other one-third (i.e., 100) items are used for testing,

and we compare the results of each method’s hit rate;

that is, how many optimal ordering policies of new

items can assigned using by each classifying method.

From 300 items selected, we prepare 100 data sets and

conduct learning and testing data randomly, and

applying each technique.

5.3 Experimental results

The results of applying Random Forests to the set

of shipping statistics and specified ordering policy for

the 300 items selected are shown in Fig. 6. In Fig. 6,

shipping statistics are ranked in descending order of

importance. The horizontal axis shows the rank of

shipping statistics, and the vertical axis shows the

importance of shipping statistics. From the results, we

obtain the importance of all shipping statistics, 165

shipping statistics, to be classified.

Fig. 6. The importance of each shipping statistics.

dayth on the item of averageshipment :

item ofstock safety :

dayth on the item of level to-up-order :

dayth on the itemfor point reorder :

dayth on the item ofquantity ordering:

tid

iss

tiS

tis

tiQ

it

i

it

it

it

0

1

2

3

4

5

6

1 11 21 31 41 51 61 71 81 91 101 111 121 131 141 151 161

Imp

ort

an

ce o

f sh

ipp

ing

sta

tist

ics

(cal

cula

ted

by

Ra

nd

om

Fo

rest

s)

index of shipping statistics

1)2)

3)

Vol.64 No.4E (2014) 587

From this result, for testing the importance of

shipping statistics, we select the number of shipping

statistics.

For case 1), we use all shipping statistics for

applying each method.

For case 2), we select the top 20 important shipping

statistics from Random Forests, which were indexed 1

to 20 in Fig. 6. In the selected shipping statistics,

those considered as important such as average daily

shipping interval, probability that a shipping interval

will be less than one week, average shipments ping

times in a monthly shipping by omitting the non-

shipping months, skewness of daily amount shipped,

kurtosis of the number of daily shipping lots,

maximum number of monthly shipments, skewness of

the number of daily shipping lots, skewness of amount

shipped daily, probability that the daily shipping

interval will be less than LT, average shipping times

per month, standard deviation of daily shipping

interval, standard deviation of (day's shipping

amount/interval from previous shipment to day's

shipment), average of (day's shipping amount/interval

from previous shipment to the day's shipment), total

shipments, maximum shipping interval, average

amount shipped weekly shipping amount, average

shipments in week by omitting the non-shipping weeks,

kurtosis of (day's shipping amount/interval from

previous shipment to the day's shipment), average lots

shipped lots in daily shipping by omitting the non-

shipping days and average of shipments per weekly

shipping times.

For case 3), we select the bottom 20 important

shipping statistics from Random Forests, which were

indexed 145 to 165 in Fig. 6. In the shipping statistics

selected, the shipping statistics clearly considered as

insignificant, such as minimum shipping volume and

month-long shipment count, are also included.

For cases 1) ~ 3), we apply DT, SVM and RF, and

compare each hit rate.

Figure 7 shows the results of applying Random

Forests and other methods for each of the three cases

explained as cases 1) ~ 3). The bars in the figure

represent the hit rate of each classifying method. The

black bar shows the confidence interval (95%) for

each method and case. Beside of case 2), in cases 1)

and 3), there is no overlap between the methods of

each case. RF achieved the best hit rate in all cases.

As some researchers claim that RF has a robust hit rate

from insignificant variables, these experimental results

are easy to accept.

Fig. 7. The results of applying each classifying

method.

When all the shipping statistics are used, in Case 1),

the hit rate of the SVM is inferior to the hit rate of RF.

This is believed to be because several insignificant

shipping statistics are included for classification.

Therefore, in Case 2), which uses the top 20 important

shipping statistics for classification by RF, the hit rate

of SVM improved. The hit rate for DT improved too.

So, the importance of shipping statistics from RF is

meaningful for other analysis or classifying methods.

6 CONCLUSION

Traditional ABC analysis should be replaced with

multi-criteria classification approaches in order to

manage inventory more efficiently.

But, conventional multi-criteria ABC analysis has

the following three problems. First, there are limited

classes for applying appropriate ordering policy and

the inventory manager cannot change parameters

easily. Next, the shipping statistics considered to be

important are almost all fixed. Finally, conventional

multi-criteria ABC analysis is not applicable if

important shipping statistics data is not on hand.

In our numerical experiments, actual shipping data

was used to calculate shipping statistics and simulate

ordering policies. To evaluate the ordering policies,

we considered the simulation results of (s, Q) and (R,

S) policies for each item. In addition, an optimal

ordering policy was set for each item from the results

of the simulation. We then applied Random Forests

45%

50%

55%

60%

65%

70%

75%

1)all 165

statistics

2)top20

(from RF)

3)bottom20

(from RF)

hit

ra

te

classifying method

DT

SVM

RF


analysis between the shipping statistics and optimal

ordering policy to find the important shipping

statistics. Next, we applied decision tree, support

vector machine and Random Forests between the

(un)important shipping statistics and predicted the

optimal ordering policy. Random Forests was used for

the model to find the important shipping statistics for

classification. Finally, we compared the results when

important shipping statistics were used and not used.

From the classifying results, we conclude that if

there are many insignificant shipping statistics,

Random Forests outperforms the support vector

machine. On the other hand, if insignificant shipping

statistics can be eliminated, Random Forests or other

methods would be useful for classifying and deciding

ordering policies.

REFERENCES

[1] Gruen, T.W., Corsten, D. and Bharadwaj, S.:

“Retail Out-of-stocks: A Worldwide

Examination of Extent Causes and Consumer

Responses,” Grocery Manufacturers of America,

Washington, D.C. (2002)

[2] Zipkin P.: “Old and New Methods for Lost-Sales

Inventory Systems,” Oper. Res., Vol. 56, No. 5,

pp. 1256-1263 (2008)

[3] Janakiraman, G., Seshadri, S. and Shanthikumar,

J.G.: “A Comparison of the Optimal Costs of

Two Canonical Inventory Systems,” Oper. Res.,

Vol. 55, No. 5, pp. 866-875 (2007)

[4] Huh, W.T., Janakiraman, G., Muckstadt, J.A. and

Rusmevichientong, P.: “Asymptotic Optimality

of Order-Up-To Policies in Lost Sales Inventory

Systems,” Manage. Sci., Vol. 55, Issue 3, pp.

404-420 (2009)

[5] Levi, R., Janakiraman, G. and Nagarajan, M.: “A

2-approximation Algorithm for Stochastic

Inventory Control Models with Lost Sales,”

Math. Oper. Res., Vol. 33, No. 2, pp. 351-374

(2008)

[6] Bijvank, M. and Vis, I.F.A.: “Lost-sales

Inventory Theory: A Review,” Eur. J. Oper. Res.,

Vol. 215, Issue 1, pp. 1-13 (2011)

[7] van Donselaar K.H. and Broekmeulen R.

A.C.M.: “Determination of Safety Stocks in a

Lost Sales Inventory System with Periodic

Review, Positive Lead-time, Lot-sizing and a

Target Fill Rate,” Int. J. Prod. Econ., Vol. 143,

Issue 2, pp. 440-448 (2013)

[8] Tsai, C.Y. and Yeh, S.W.: “A Multiple Objective

Particle Swarm Optimization Approach for

Inventory Classification,” Int. J. Prod. Econ. Vol.

114, Issue 2, pp. 656-666 (2008)

[9] Mohammaditabar, D., Ghodsypour S.H. and

O’Brien, C.: “Inventory Control System Design

by Integrating Inventory Classification and

Policy Selection,” Int. J. Prod. Econ., Vol. 140,

Issue 2, pp. 655-659 (2012)

[10] Flores, B.E. and Whyback, D.C.: “Implementing

Multiple Criteria ABC Analysis,” J. Oper.

Manage., Vol. 7, Issues 1-2, pp. 79-84 (1987)

[11] Jamshidi, H. and Jain, A.: “Multi-criteria ABC

Inventory Classification: With Exponential

Smoothing Weights,” J. Global Bus. Issues, Vol.

2, No.1, pp. 61-67 (2008)

[12] Ng, W.L.: “A Simple Classifier for Multiple

Criteria ABC Analysis,” Eur. J. Oper. Res., Vol.

177, Issue 1, pp. 344-353 (2007)

[13] Ramanathan R.: “ABC Inventory Classification

with Multiple-criteria Using Weighted Linear

Optimization,” Comput. Oper. Res., Vol. 33,

Issue. 3, pp. 695-700 (2006)

[14] Ernst, R. and Cohen, M.A.: “Operations Related

Groups (ORGs): A Clustering Procedure for

Production/Inventory Systems,” J. Oper.

Manage., Vol. 9, Issue 4, pp. 574-598 (1990)

[15] Flores, B.E., Olson, D.L. and Dorai, V.K.:

“Management of Multicriteria Inventory

Classification,” Math. Comput. Modelling, Vol.

16, Issue 12, pp. 71-82 (1992)

[16] Gajpal, P.P., Ganesh, L.S. and Rajendran, C.:

“Criticality Analysis of Spare Parts Using the

Analytic Hierarchy Process,” Int. J. Prod. Econ.,

Vol. 35 Issues 1-3, pp. 293-297 (1994)

[17] Partovi, F.Y. and Burton, J.: “Using the Analytic

Hierarchy Process for ABC Analysis,” Int. J.

Oper. Prod. Manage., Vol. 13, Issue 9, pp. 29-

44 (1993)

[18] Partovi, F.Y. and Hopton, W.E.: “The Analytic

Hierarchy Process as Applied to Two Types of

Inventory Problems,” Prod. Invent. Manage. J.,

Vol. 35, Issue 1, pp.13-19 (1994)

[19] Partovi, F.Y. and Anandarjan, M.: “Classifying

Inventory Using an Artificial Neural Network

Vol.64 No.4E (2014) 589

Approach,” Comput. Ind. Eng., Vol. 41, Issue 4,

pp. 389-404 (2002)

[20] Guvenir, H.A. and Erel, E.: “Multi Criteria

Inventory Classification Using a Genetic

Algorithm,” Euro. J. Oper. Res., Vol. 105, Issue

1, pp. 29-37 (1998)

[21] Yu, M.C.: “Multi-criteria ABC Analysis Using

Artificial-intelligence-based Classification

Techniques,” Expert Syst. Appl, Vol. 38, Issue 4,

pp. 3416-3421 (2011)

[22] Vapnik, V.N.: The Nature of Statistical Learning

Theory, Springer-Verlag (1995)

[23] Xiao, Y., Zhang, R. and Kaku, I.: “A New

Approach of Inventory Classification Based on

Loss Profit,” Expert Syst. Appl., Vol. 38, Issue 8,

pp. 9382-9391 (2011)

[24] Tiacci, L. and Saetta, S.: “An Approach to

Evaluate the Impact of Interaction between

Demand Forecasting Method and Stock Control

Policy on the Inventory System Performance,”

Int. J. Prod. Econ., Vol. 118, Issue 1, pp. 63-71

(2009)

[25] Berling, P. and Marklund, J.: “Heuristic

Coordination of Decentralized Inventory

Systems Using Induced Backorder Costs,” Prod.

Oper. Manag., Vol. 15, Issue 2, pp. 294-310

(2006)

[26] Nagasawa, K., Irohara, T., Matoba, Y. and Liu,

S.: “Selecting Ordering Policy and Items

Classification Based on Canonical Correlation

and Cluster Analysis,” Ind. Eng. Manage. Syst.,

Vol. 11, No. 2, pp. 134-141 (2012)

[27] Basu, A. and Mandal, A.: Canonical Correlation,

International Encyclopedia of Education,

Elsevier, pp. 52-57 (2010)

[28] Hotelling, H.: “Relations between Two Sets of

Variables,” Biometrika, Vol. 28, No. 3, pp. 321-

377 (1936)

[29] Breiman, L.: “Random Forests,” Mach. Learn.,

Vol. 45, No. 1, pp. 5-32 (2001)

[30] Breiman, L., Friedman, J.H., Olshen, R.A. and

Stone, C.J.: Classification and Regression Trees,

Chapman & Hall/CRC, New York (1984)

[31] Verikas, A., Gelzinis, A. and Bacauskiene, M.:

“Mining Data with Random Forests: A Survey

and Results of New Tests,” Pattern Recognition,

Vol. 44, Issue 2, pp. 330-349 (2011)


Applying Random Forests to Decide Ordering Policy Based on ...

Documents

Transcript of Applying Random Forests to Decide Ordering Policy Based on ...