Applying Random Forests to Decide Ordering Policy Based on ...
-
Upload
khangminh22 -
Category
Documents
-
view
2 -
download
0
Transcript of Applying Random Forests to Decide Ordering Policy Based on ...
Original Paper
Applying Random Forests to Decide Ordering Policy Based on Important Shipping Statistics
Keisuke NAGASAWA †1, Takashi IROHARA
†1, Yosuke MATOBA †2 and Shuling LIU
†2
Abstract: This study was motivated by challenges facing inventory managers when deciding the
ordering policy for various items. It is difficult to find an appropriate ordering policy for many types
of items. We propose a model that changes conventional multi-criteria ABC analysis so that it is
suitable for use by inventory managers. We indicate that categorizing items based on their statistical
characteristics leads to an ordering policy suitable for each item. We propose a method for deciding
the ordering policy based on important shipping statistics and a classification technique. For this
method, we analyze the relation between shipping statistics and the ordering policy for searching
important shipping statistics. We classify items by shipping statistics and then decide the ordering
policy for each item. In the numerical experiment, we used actual shipment data to calculate many
shipping statistics that represent the characteristics of each item. Next, we found the important
shipping statistics from Random Forests and applied them to decide the ordering policy. Finally, for
confirming the importance of important shipping statistics, we tested the performance of Random
Forests and other classifying methods using the important shipping statistics. It was found that the
performance of each classifying method was improved.
Key words: Inventory management, ordering policy, multivariable analysis, Random Forests
1 INTRODUCTION
Effective inventory management has played an
important role in the success of supply chain
management. Deciding the ordering policy has a big
effect on the performance of supply chain
management. If a single ordering policy is applied to
many different items with different shipping statistics,
the inventory would be a mix of items that had
frequent shortages and those having excess inventory.
On the other hand, even using multiple ordering
policies, it is not easy to find an appropriate policy
for many types of items, particularly when each has a
different shipping statistics. In addition, given a
particular shipping statistics, no method or criterion
has been specifically developed for choosing the
ordering policy to that would be applied to these
items. For these reasons, the proper relationship
between shipping statistics and the appropriate
ordering policies is not still clear. Practically, the
ordering policy should be decided by some
supposedly important shipping statistics such, as lead
time, purchase unit price or demand fluctuation.
The most important problem for to deciding the
appropriate ordering policy appropriately for many
items has not been clarified is not erected. Thus,
some researchers have dealt with this problem by an
approach called “item classification.” That is, items are
classified and, inventory managers change the ordering
policy according to items’ classification.
In this research, we use Random Forests. There are
business contributions and practical importance. First,
the inventory manager can easily change the number
that might be used for each item in the ordering policy
which might be used for each item. Next, important
shipping statistics, calculated by Random Forests, for
applying an appropriate ordering policy becomes clear.
Moreover, it may be possible we might be able to use
the important shipping statistics by Random forests
even if using another technique is used.
We propose a framework for deciding ordering
policy by classification. There are five steps. First, we
calculate the shipping statistics for each item. Second,
we divide the items into two groups: learning item
group and testing item group. Third, we assign an
optimal ordering policy for each item of the learning
item group. Fourth, for the learning item group, we
†1 Sophia University †2 Fairway Solutions Inc. Received: November 30, 2012
Accepted: September 12, 2013
J Jpn Ind Manage Assoc 64, 579-590, 2014
Vol.64 No.4E (2014) 579
analyze the relation between shipping statistics and
assigned optimal ordering policies, and find
significant shipping statistics for deciding the ordering
policy. Finally, we used these important shipping
statistics for deciding the ordering policy for each item
of the testing item group.
The remainder of this paper is organized as
follows. In Section 2, we review relevant literature.
In Section 3, we explain previous knowledge of
classification and classification techniques relating to
our proposed method. In Section 4, we give an
outline of the proposed method, including how to
convert actual shipping data to data matrices for
analysis. The details of the experiment and results of
analysis are reported in Section 5.
2 LITERATURE REVIEW
When there is a shortage of items, there are two
ways to cope with it: “backorder” and “lost-sales.”
Although backorder models have been studied by
many researchers and practitioners, there are few
papers that discuss lost-sales models.
Some researchers have investigated the difference
between backorder models and lost-sales models. In
the case of stock-outs, Gruen et al.[1] revealed that
only 15% of the customers who observe a stock-out
will wait for the item to be restocked. The other 85%
of the customers decide to buy a different product,
visit another store, or do not buy any product at all.
According to Zipkin[2], the cost deviations can run
up to 30% when the lost-sales system is
approximated by a backorder model. Janakiraman et
al.[3] compared the performance of optimal
replenishment policies in lost-sales and backorder
models. Huh et al.[4] compared the performance of
base-stock policies in lost-sales and backorder
models. Levi et al.[5] proposed a dual-balancing
policy, in which the risks of ordering and holding are
balanced in lost-sales models. Bijvank and Vis[6]
classified the models based on the characteristics of
the inventory system and reviewed the proposed
replenishment policies. Van Donselaar and
Broekmeulen[7] showed that the lost-sales system
can simply be approximated by a backordering
system if the target fill rate is at least 95%, which
may lead to serious approximation errors. So, even if
some method has meaningful results as a backorder
model, there is no guarantee of its significance as a
lost-sales model. Therefore, a common concern of
researchers and inventory managers is whether or not
the results of backorder models can be applied to
lost-sales models.
Some researchers tackle backorder models and
lost-sales models by applying an item classification
approach where inventory managers change the
ordering policy according to item classification. This
approach can also be classified into two categories:
“optimization” and “grouping.”
When the optimization approach is applied, the
ordering policy of each item is determined by
optimization even if important shipping statistics are
not revealed. Tsai and Yeh[8] used a particle swarm
optimization approach for an inventory classification
problem. There is another research study in which the
main point is reducing total cost by grouping similar
tendency items. Mohammaditabar et al.[9] used
simulated annealing to group similar items and find
the best ordering policy simultaneously. However,
when using that method, the optimal ordering policy
might be used for each item. There is also a demerit
when using optimization. For example, the result is
completely dependent on input data. If a new item is
added or deleted, the manager must recalculate again.
On the other hand, some inventory managers need
rules: if they follow the rules, they can obtain
optimal or satisfactory results. Inventory managers
often think that they would like to respond flexibly to
item addition and deletion using item shipping
statistics. Grouping is recommended for such
inventory managers.
A typical method of grouping is ABC analysis.
This method is based on the Pareto principal. Items
classified as Class A are commonly those items of
high importance but few in number. Items in Class C
are less important but large in number, and those in
Class B are between the other two classes. For
classical ABC analysis, “annual dollar usage” is
basically used for the evaluation value. In order to
simplify inventory control, many researchers have
created similar methods for grouping items or
ranking items by shipping statistics.
Although ABC analysis is famed for its ease of
use, it has been criticized for focusing on a single
criteria. Many researchers have used two or more
580 J Jpn Ind Manage Assoc
evaluation criterion for grouping, called “multi-
criteria ABC classification.” Item classification
problems based on multi-criteria ABC classification
are still being studied. Other criteria such as lead-
time, commonality, obsolescence, durability,
inventory cost and order size requirements have also
been claimed as being significant for inventory
classification (Flores and Whyback[10], Jamshidi and
Jain[11], Ng[12] and Ramakrishnan[13]). Ernst and
Cohen[14] used cluster analysis to group similar
items. Flores et al.[15] provided a matrix-based
methodology for multi-criteria ABC classification.
Gajpal et al.[16], Partovi and Burton[17], and Partovi
and Hopton[18] applied the analytical hierarchy
process (AHP) to ABC analysis. However, there is
little documentation discussing which shipping
statistics should be used.
Some researchers have given attention to the
method for deciding the ordering policy for each item
from shipping statistics. Ramakrishnan[13] proposed
a weighted linear optimization model and classified
inventory items using multiple criteria. Genetic
algorithms and artificial neural networks have also
been applied (Partovi and Anandarajan[19] and
Guvenir and Erel[20]). Yu[21] applied support vector
machines (SVM) and made comparisons using
traditional multiple discriminant analysis, back-
propagation networks, and the k-nearest neighbor
algorithm. (Support Vector Machine was originally
developed by Vapnik [22] at Bell Laboratories.)
Moreover, some research studies manage the
ordering policy by considering the shipping
relationship between items. Xiao et al.[23]
considered the importance of inventory items based
on a loss rule. Tiacci and Saetta[24] considered the
relationship between forecasting and ordering policy
using carrier capacity. If there is relation between the
shipment of items and the evaluation value of
inventory control, an evaluation value can be
predicted when an ordering policy is applied. Berling
and Marklund[25] used a linear regression technique
to obtain approximate values of the induced
backorder cost in a one-warehouse multiple-retailer
system.
Shipping statistics were used for deciding the
ordering policy. However, there are a few papers that
consider which shipping statistics are important. For
example, in a previous study (Nagasawa et al.[26]),
the authors applied canonical correlation analysis to
a set of shipping statistics and the evaluation values
of ordering policies. Canonical correlation analysis is
one method for searching the relationship between two
sets of variables; as between shipping statistics of each
item and evaluation values of ordering policies for
each item. According to Basu and Mandal[27], this
was initially developed by Hotelling[28]. The study
by Nagasawa et al. suggests that shipping statistics
have different importance or significance for
decidinge ordering policy. However, no
consideration has been given to how to classify items
based on the importance of shipping statistics and
how to apply ordering policies to each item group.
In our article, using multi-criteria ABC analysis,
we focus on applying an ordering policy for each
item using the shipping statistics. There are two
differences from typical multi-criteria ABC analysis.
First, typical multi-criteria ABC analysis assumes
that all shipping statistics have to be used. On the
other hand, we decide the priority for which shipping
statistics are important in deciding ordering policies.
Second, our approach is not limited to three classes
since multiple ABC analysis classifies items into
three classes.
3 CLASSIFICATION TECHNIQUE
We will now briefly explain decision tree and
Random Forests algorithms.
Random Forests is a more advanced decision tree
technique, which was proposed by Breiman[29]. Since
Random Forests is a collection of decision trees, first
of all, we briefly introduce decision trees.
3.1 Classification and Regression Tree
A decision tree is a useful technique and consists of
a series of if-then rules for dependent variables. When
using a decision tree, the classification and regression
tree (CART) is one of the data mining algorithms for
classification and regression (Breiman et al.[30]).
CART can handle both qualitative and quantitative
variables. The CART algorithm partitions the
independent variable space into two regions according
to the performance measures. The output of CART is a
hierarchical structure that consists of a series of if-
then rules to predict the outcome of the dependent
Vol.64 No.4E (2014) 581
variables. The image of applying a decision tree using
CART is shown in Figs. 1 and 2.
Figure 1 expresses the learning phase of the
decision tree. The illustration on the left represents how
items are plotted when shipping statistics x1 and x2 are
chosen. There are 10 items. An optimal ordering policy
was set for each item, represented as a square and a
hexagon. The illustration on the right represents the type
of tree structure created and the if-then rules made.
Fig. 1. The learning phase of the decision tree.
Fig. 2. The testing phase of the decision tree.
Figure 2 expresses the testing phase of the decision
tree. If a new item, i is given, the if-then rule is
applied to it. We can predict what ordering policy
should be used from where the item is assigned. In the
figure, item i should be applied to the ordering policy
square from the results of the decision tree.
There are many criteria by which node impurity is
minimized in a classification problem. The Gini index
is frequently used in practice. Random Forests
basically use the Gini index too. The Gini index
supposes there are a total of K classes, each indexed
by k. Let prk be the proportion of class k observations
in node r. The Gini index can then be written as
)1(1 rkrk
K
kpp
. In each node splitting, for most
decreasing this performance measures, if-then rules,
set using variables and set for border values.
3.2 Random Forests
Random Forests usually make many decision trees
using randomly selected statistics to split the nodes
and take a majority vote from each tree. Random
Forests have been increasing in popularity because the
performance is high. The method is robust against
over-fitting and is not very sensitive to parameter
settings. In literature[31], a survey and comparison
were carried out regarding the measure of variable
importance, the application method, and an example of
the application.
Fig. 3. The learning phase in Random Forests.
In our study, Random Forests is used to randomly
select shipping statistics to generate trees and
aggregate the classification results of these trees to
obtain the appropriate ordering policy through a
plurality vote. The image of applying Random Forests
is shown in Figs. 3 and 4. Figure 3 expresses the
learning phase of Random Forests. We pick l learning
items from n items. We set the optimal ordering policy
for each learning item, deciding o1. We apply the
x1
Items
x1 > a ?
x2 > b ?
x2
a
b
Yes
Yes
No
No
6
8
9
7
1
2
3
4
510
6
8 9
7
1 2 3 4 5
2
3
4
5
1
6 8 97
Shipping statistics
Shipping
statistics
10
10
New item
xi1 > a ?
xi2 > b ?
a
b
Yes
Yes
No
No
ii
i i
Optimal ordering policy for item i might be
x1Shipping statistics
x2
Shipping
statistics
…
…
Columns are randomly selected …
=
=
=
582 J Jpn Ind Manage Assoc
following procedure for the learning phase in each tree
construction:
1. Pick m shipping statistics, vector x, from X.
2. Then, m shipping statistics are used to split a
node of the tree, making an if-then rule at the node
and split items to optimal ordering policy using
shipping statistics
3. Take a bootstrap sample from the items.
Usually, we use the rest of the items to estimate the
error of the tree by predicting their classes.
4. For each node of the tree, calculate the best
split in the learning items from randomly chosen m
shipping statistics.
5. Since some branching algorithms terminate
branching even if there are multiple classes at the end
of the tree, each tree is fully grown (i.e., there is only
one ordering policy at the end of the tree).
Figure 4 expresses the testing phase of Random
Forests. In the testing phase, the if-then rule of each
tree is applied for each item. Prediction is assigned by
the learning tree in the terminal node. This procedure
is iterated over all trees. Then the results of majority
votes by all trees is reported as the prediction of
Random Forests. The number of trees is the parameter
of Random Forests.
Fig. 4. The testing phase in Random Forests.
We follow the conventionally recommended
parameter setting in the experiment. For the
classification parameters, fully grown Random Forests
recommend 1 (i.e., the end point of the node in Step 5
should be 1), and the number of bootstrap samples is
l/3. The number of shipping statistics to be used to
split a node of the tree, m, is set to be √v . In our case,
v is the number of all shipping statistics.
The Random Forests algorithm we used is the most
popular one. Therefore, shipping statistics were
selected at each branch for the Gini index that
decreases the most in the tree of Random Forests. At
each branching, we record which shipping statistics
were used and how the Gini index was reduced by the
branching. After applying Random Forests for the
learning data, we join the records. The Gini index-based
variable importance measure is then given by the average
decrease in the Gini index in the forest, where the variable
is used to branching a node. Then, we used the records
as the indicator of importance for the shipping
statistics. In our case, measurements are made using
the Gini index as well as the decision tree. We
measure the importance of shipping statistics using
these values. Since a shipping statistics variable is
chosen at random at each node, the importance of
measurement changes stochastically. However, the
importance of variable measurement does not change
dynamically.
4 PROPOSED METHODS
4.1 Problem Outline
This study considers the inventory management
problem for items having different shipping statistics.
We assume the situation where only one ordering
policy can be used for one item group. In that situation,
grouping items corresponds to deciding ordering
policy. Therefore, we represent grouping items as
deciding ordering policy. Here, it is assumed that the
result of applying an ordering policy is different
according to the shipping statistics. To achieve
efficient inventory management, we propose an
approach to evaluate the importance of shipping
statistics and then decide a suitable ordering policy for
each item based on its important shipping statistic.
In this section, mainly we describe a procedure to
find important shipping statistics for deciding the
ordering policy of each item.
4.2 Proposed method
Figure 5 shows the flowchart of the proposed
method with two stages, enclosed by dashed line, and
experimental procedure. In Stage 1, we find important
shipping statistic variables for deciding the ordering
policy. In stage 2, the ordering policy for each item is
decided using a classifier. In the numerical experiment,
3 VS 1
u
Majority vote
New ItemRandom Forests classifier
Vol.64 No.4E (2014) 583
we compare classification techniques by comparing
the optimal ordering policy and the decided ordering
policy using a classifier.
Fig. 5. The flowchart of the proposed method.
First, we assume that there are “Shipping data” for
each item. Since there is a problem that we have
obtained results that are suitable only for specific data
using the method we created, we separate items for
“learning, named learning item” and “testing, named
testing item.” We divide items to “Shipping data
(Learning data)” and “Shipping data (Testing data)”.
“Shipping data (Learning data)” is used for our
proposed method. Additionally “Shipping data
(Testing data)” is used for checking the performance
of the proposed method; this is mainly used in Section
5. The procedures not surrounded by the dashed line
are those for testing the proposed method as applied in
Section 5, and correspond to our following procedures
for finding important shipping statistics to decide the
ordering policy.
In Stage 1, we find the important shipping statistics.
From “Shipping data (Learning data)”, we attempt to
find important shipping statistics for deciding the
ordering policy. Since, in many cases, no results are
obtained when applying two or more ordering policies
to one item simultaneously, as in our case, we
simulate each ordering policy at “Simulate ordering
policies.” From the result of “Simulate ordering
policies,” we compare the result of each ordering
policy and set the optimal ordering policy for each
learning item. Then, we obtain the optimal ordering
policy for each learning item, named “Optimal
ordering policy (Learning data).” Next, we calculate
the shipping statistics of each learning item at
“Calculate shipping statistics.” We then obtain the
shipping statistics, which is a mixture of important
shipping statistics and unimportant shipping statistics ,
named “Shipping statistics (Learning data).”
From “Shipping statistics (Learning data)” and
“Optimal ordering policy (Learning data),” we attempt
to find important shipping statistics for deciding the
optimal ordering policy for each learning item. Then,
at “Apply Random Forests between shipping statistics
and optimal ordering policy,” we apply Random
Forests between “Shipping statistics (Learning data)”
of all the learning items and “Optimal ordering policy
(Learning data)” of all the learning items to obtain the
classifier and importance of shipping statistics, named
“Importance of shipping statistics.”
From “Importance of shipping statistics,” we
determine which shipping statistics were important in
“Shipping statistics (Learning data)”.
In Stage 2, we decide the ordering policy for each
item. In “Apply classifying technique between
shipping statistics and optimal ordering policy to make
classifier,” we are able to make a more precise
classifier, which predicts output, optimal ordering
policy for each item from input, shipping statistics
from “Optimal ordering policy (Learning data)” and
important shipping statistics, which are the result of
“Shipping statistics (Learning data)” and “Importance
of shipping statistics.”
The performances of the classifiers were tested in
Section 5. In Section 5, we compared three techniques,
Support Vector Machine (SVM), Decision Tree (DT)
and Random Forests (RF), with three situations, used
all shipping statistic variables, used important
shipping statistics by Random Forests, and used
unimportant shipping statistics by Random Forests.
Importance of
shipping statistics
Shipping data
(Testing data)Apply classifying technique
between important shipping statistics and
optimal ordering policy to make classifier
classifier
Apply classifier for shipping statistics(Testing data)
(Apply classifier for predicting optimal ordering policy)
Calculate shipping
statistics
Simulate ordering policies
Optimal ordering policy
(Testing data)
Shipping statistics
(Testing data)
Which ordering policy
should we use for testing data
Check hit rate of classifier
Shipping data
(Learning data)
Calculate shipping
statisticsSimulate ordering policies
Apply Random Forests
between shipping statistics and
optimal ordering policy
Optimal ordering policy
(Learning data)
Shipping statistics
(Learning data)
Stage 1
Stage 2
Shipping data
584 J Jpn Ind Manage Assoc
In Section 4.2.1, we will explain the details of
“Calculate shipping statistics.” In Section 4.2.2, we
will explain the details of how to find important
shipping statistics.
Our proposed model (framework) can also use even
if there is an outperforming technique which can
classify the optimal ordering policy from two or more
statistic values while the technique could reveal the
importance of shipping statistics.
4.2.1 Data preprocessing
First, we assume that there are shipping data that
provides the shipping date and volume of items, as in
shown in Table 1. From this shipping data, we
calculate shipping statistics and simulate each
ordering policy for evaluating some criteria.
Table 1 Shipping data
In the following steps, we repeatedly use
multivariate data matrices, as shown in Table 2.
Before we proceed to explaining the analysis method,
we will first explain the premise behind multivariate
data matrices.
Table 2 General variable of each item
Let X denote an n v multivariate data matrix, which
represents the shipping statistics. There are n items
and v shipping criteria. xij gives the value for the jth
shipping statistics of the ith item. For example, if the
index of average daily shipping is shipping statistics j,
we can calculate the value of the ith item and xij = (di1
+ di2 + … + dim)/m for all i in {1..., n} as in Table 1.
4.2.2 The data mining process
Let Y denote the multivariate data matrix, which
represents the evaluation values of ordering policies,
as the number of how many inventories or number of
how many shortages. Y is similar to the previously
mentioned X. For example, yizp gives the value for the
zth evaluation criteria, as labeling “inventories” or
labeling “shortages” of the pth ordering policy of the
ith item. So, yzp denotes the n 1 variable matrix of
the zth evaluation criteria of the pth ordering policy.
Then, Yz denotes the n q variable matrix of the zth
evaluation criteria.
It is necessary to find the relation between shipping
statistics and evaluation values of the ordering policy
since there are differences in the scale of numbers in
terms of sum of inventories and shortages. We should
standardize the evaluation value of each evaluation
criteria and item in order to compare ordering policies.
If yikp is the value for the kth evaluation criteria of the
pth ordering policy of the ith item, then let y’ikp be the
standardized evaluation value for this same
combination, where the y’ikp is calculated using Eq. (1)
and take values in [0,1].
pkiyy
yyy
ikpp
ikpp
ikpp
ikp
ikp ,,,)(min)(max
)(min'
(1)
From evaluation values of the previous steps, we
set the optimal ordering policy for each item. We can
then make a relational table between shipping
statistics and the optimal ordering policy of each item,
as shown in Table 3. Based on this table, we will find
which shipping statistics are important for deciding
the ordering policy.
Table 3 Shipping statistics and optimal ordering
policy
Most classifying methods need the matrix of
quantitative variable X and qualitative variable oz, as
shown in Table 3. If those data are available, we can
apply most classifying methods by prearranged
algorithm and a series of each classifying method. We
made Random Forests using the algorithm shown in
Section 3.2.
Item
No.
Date : T Average
(statistic j)day : 1 … day : m
1 d11 … d1m (d11+d12+…+d1m)/m=x1j… … … …n dn1 … dnm (dn1+dn2+…+dnm)/m=xnj
Item
No.
Shipping statistics : X Evaluation value of ordering policy : Y
Statistic 1
x1…
Statistic v
xv
Evaluation criteria 1 : Y1
…
Evaluation criteria z : Yz
Ordering
policy 1 : y11…
Ordering
policy q : y1q
Ordering
policy 1 : yz1…
Ordering
policy q : yzq
1 x11
…
x1v y111
…
y11q
…
y1z1
…
y1zq… … … … … … …
i xi1 xiv yi11 yi1q yiz1 yizq… … … … … … …
n xn1 xnv yn11 yn1q ynz1 ynzq
Item
No.
Shipping statistics : X Optimal ordering policy : O
x1 … xv
Evaluation
criteria 1 :
o1
…
Evaluation
criteria z :
oz
1 x11
…
x1v o11
…
o1z… … … … …
i xi1 xiv oi1 oiz… … … … …
n xn1 xnv on1 onz
Vol.64 No.4E (2014) 585
5 NUMERICAL EXPERIMENTS
Next, we calculate the shipping statistics from
actual shipping data and generate evaluation values for
ordering policies from the results of the simulating of
every ordering policy. From the results of each
ordering policy, we set an optimal ordering policy for
each item. These optimal ordering policies are
predicted using a classifying method. We then
compare the hit rate, which is the rate when the
classified ordering policy and specified optimal
ordering policy are same. Unfortunately, we didn't
have the actual data for applying ordering policies to
each item. And, it is impossible to have all the desired
data in any situation. Especially, managerial data tends
to be unavailable to outsiders in many cases. So, from
shipping data, we simulate each ordering policy and
check the performance of each ordering policy, and
the average inventories and shortages. Inventories and
shortage data are tentatively set for experiments in this
paper. This means that the proposed technology is
independent from specific industries. Next, we label a
specified optimal ordering policy for each item. We
apply Random Forests between the shipping statistics
and optimal ordering policies. From these results, we
can obtain the importance of the shipping statistics
that exhibit a relatively strong relation with the most
fitting ordering policy.
In a previous study, Yu [21] applied support vector
machines and outperform multiple discriminant
analysis (MDA), back-propagation networks, and a k-
NN algorithm. In this study, we apply and compared
three classification techniques: Support Vector
Machine (SVM), Decision Tree (DT) and Random
Forests (RF).
We apply DT, SVM and RF to shipping statistics
and the optimal ordering policy, and compare the
results of the method’s hit rate.
For testing the important shipping statistics
calculated from Random Forests, we make three
situations and test the usefulness of the importance of
shipping statistics. We also tested to determine if the
importance of shipping statistics based on Random
Forests is indeed meaningful for the other classifying
methods or not. We apply three classifying methods
(i.e., DM, SVM and RF), and compare the
performance of each classifying method in each
situation. The situations are as follows: 1) Use all
shipping statistics for classifying ordering policy. In
this case, significant and insignificant shipping
statistics are used for the learning and classifying
processes of each method. 2) Use a certain level of
significant shipping statistics for classifying ordering
policy. In this case, some meaningless shipping
statistics are ignored for the learning and classifying
processes. 3) Use many insignificant shipping
statistics for the learning and classifying processes.
From 3), we check the results to ensure they are not
results from an accident.
In this study, we used R (version 2.14.1), which is
a free software environment for statistical computing
and graphics, for DT, SVM and RF. The libraries are
“mvpart,” “e1071,” and "randomForest." The
experimental environment is an Intel® Core™ 2 Duo
CPU, E8400 at 3.00 GHz and with 4.00 GB of RAM;
we did not need long computational times for
processing.
5.1 Ordering policy and experiment
We used two ordering policies for this study: the
so-called (s, Q) policy and the (R, s, S) policy. These
policies are the most commonly used ordering policies
for inventory control.
For the (s, Q) policy, when the inventory position
declines to or below the reorder point, s, a batch
quantity of size Q is ordered. In this case, the
inventory levels never exceed the position s + Q .
For the (R, s, S) policy, at the end of each time
period of length R, if the inventory position declines
to or below the reorder point, s, an amount equal to the
difference between the order-up-to level S and the
current inventory position is ordered. In the special
case where s = S - 1, we call this the (R, S) policy.
In this study, we use the (s, Q) and (R, S) policies.
In particular, s, Q, and S of the ith item in the tth date
period are denoted by sit, Qit, and Sit, which are set as
follows:
TtIisQ itit , (2)
TtIiSP
d
d
t
SPtl
il
it
,
1
(3)
TtIissdRLTS iitit , (4)
dayth on the item of averageshipment :
item ofstock safety :
dayth on the item of level to-up-order :
dayth on the itemfor point reorder :
dayth on the item ofquantity ordering:
tid
iss
tiS
tis
tiQ
it
i
it
it
it
586 J Jpn Ind Manage Assoc
termreview periodic :
dayth on the item of shipments :
R
tid it
days ofset the:
items ofset the:
period sampling :
timelead:
T
I
SP
LT
Equation (2) shows that the ordering quantity of
item i is the same amount as the reorder point of item i.
Normally, when we can estimate ordering cost and
holding cost, we calculate the economic order quantity
for Qit. Equation (3) provides the sum average of daily
shipping volume in the sampling period. Equation (4)
shows that the replenishment level, Sit, was concerned
with lead-time and periodic review term.
5.2 Data Set
In numerical and simulation experiments, we used
the actual shipments data of a distribution company in
2010. These items are categorized broadly into 11
categories, such as beer, Japanese sake, foods, and
spices. There are no oblivious characteristic trends for
these categories, and we could not find any trends by
categorizing the items. There are no production items
in the inventory. We did not take up brand data in this
research because we were unable to get brand
information about the items in data. , But, if there is a
trend for brands, brands will be introduced into the
proposed method as a qualitative variable.
We simulate (s, Q) and (R, S) policies from the
shipping data, which are then used to calculate
shipping statistics. We set common parameters
between the set of (s, Q) and (R, S) policies as
described below. Sampling period SP was set to four
weeks, lead time LT was set to two days, and safety
stock, ssi was set individually to a special value, which
will not cause a stock-out of more than 1% of the total
number of shipments.
We compare the performance between three
ordering policies: one (s, Q) policy and two (R, S)
policies, for R=2 or 7 days. For simulation, we use the
first month of shipping data for calculating the initial
values of Sit. For the (R, S) policies, Sit values are
recalculated periodically, and sit and Qit are fixed at
the initial calculation values, which will not cause a
stock-out of more than 1% of the total number of
shipments.
For applying the classification method, we compare
the total storage of each ordering policy and set an
optimal ordering policy for each item. From the ratio
between the total storage of the specified ordering
policy and the other ordering policy, we select 100
items for each specified ordering policy. Because we
apply three ordering policies, 300 items are selected.
The other items were less frequently shipped, smaller
amounts, or the ratio between the total storage of the
specified ordering policy and the other ordering policy
is not so large. In order to compare each technique in
terms of whether or not prediction can be
appropriately done from shipping statistics, we select
two-thirds of 300 randomly selected items; so 200
items were selected, and used for learning data. The
other one-third (i.e., 100) items are used for testing,
and we compare the results of each method’s hit rate;
that is, how many optimal ordering policies of new
items can assigned using by each classifying method.
From 300 items selected, we prepare 100 data sets and
conduct learning and testing data randomly, and
applying each technique.
5.3 Experimental results
The results of applying Random Forests to the set
of shipping statistics and specified ordering policy for
the 300 items selected are shown in Fig. 6. In Fig. 6,
shipping statistics are ranked in descending order of
importance. The horizontal axis shows the rank of
shipping statistics, and the vertical axis shows the
importance of shipping statistics. From the results, we
obtain the importance of all shipping statistics, 165
shipping statistics, to be classified.
Fig. 6. The importance of each shipping statistics.
dayth on the item of averageshipment :
item ofstock safety :
dayth on the item of level to-up-order :
dayth on the itemfor point reorder :
dayth on the item ofquantity ordering:
tid
iss
tiS
tis
tiQ
it
i
it
it
it
0
1
2
3
4
5
6
1 11 21 31 41 51 61 71 81 91 101 111 121 131 141 151 161
Imp
ort
an
ce o
f sh
ipp
ing
sta
tist
ics
(cal
cula
ted
by
Ra
nd
om
Fo
rest
s)
index of shipping statistics
1)2)
3)
Vol.64 No.4E (2014) 587
From this result, for testing the importance of
shipping statistics, we select the number of shipping
statistics.
For case 1), we use all shipping statistics for
applying each method.
For case 2), we select the top 20 important shipping
statistics from Random Forests, which were indexed 1
to 20 in Fig. 6. In the selected shipping statistics,
those considered as important such as average daily
shipping interval, probability that a shipping interval
will be less than one week, average shipments ping
times in a monthly shipping by omitting the non-
shipping months, skewness of daily amount shipped,
kurtosis of the number of daily shipping lots,
maximum number of monthly shipments, skewness of
the number of daily shipping lots, skewness of amount
shipped daily, probability that the daily shipping
interval will be less than LT, average shipping times
per month, standard deviation of daily shipping
interval, standard deviation of (day's shipping
amount/interval from previous shipment to day's
shipment), average of (day's shipping amount/interval
from previous shipment to the day's shipment), total
shipments, maximum shipping interval, average
amount shipped weekly shipping amount, average
shipments in week by omitting the non-shipping weeks,
kurtosis of (day's shipping amount/interval from
previous shipment to the day's shipment), average lots
shipped lots in daily shipping by omitting the non-
shipping days and average of shipments per weekly
shipping times.
For case 3), we select the bottom 20 important
shipping statistics from Random Forests, which were
indexed 145 to 165 in Fig. 6. In the shipping statistics
selected, the shipping statistics clearly considered as
insignificant, such as minimum shipping volume and
month-long shipment count, are also included.
For cases 1) ~ 3), we apply DT, SVM and RF, and
compare each hit rate.
Figure 7 shows the results of applying Random
Forests and other methods for each of the three cases
explained as cases 1) ~ 3). The bars in the figure
represent the hit rate of each classifying method. The
black bar shows the confidence interval (95%) for
each method and case. Beside of case 2), in cases 1)
and 3), there is no overlap between the methods of
each case. RF achieved the best hit rate in all cases.
As some researchers claim that RF has a robust hit rate
from insignificant variables, these experimental results
are easy to accept.
Fig. 7. The results of applying each classifying
method.
When all the shipping statistics are used, in Case 1),
the hit rate of the SVM is inferior to the hit rate of RF.
This is believed to be because several insignificant
shipping statistics are included for classification.
Therefore, in Case 2), which uses the top 20 important
shipping statistics for classification by RF, the hit rate
of SVM improved. The hit rate for DT improved too.
So, the importance of shipping statistics from RF is
meaningful for other analysis or classifying methods.
6 CONCLUSION
Traditional ABC analysis should be replaced with
multi-criteria classification approaches in order to
manage inventory more efficiently.
But, conventional multi-criteria ABC analysis has
the following three problems. First, there are limited
classes for applying appropriate ordering policy and
the inventory manager cannot change parameters
easily. Next, the shipping statistics considered to be
important are almost all fixed. Finally, conventional
multi-criteria ABC analysis is not applicable if
important shipping statistics data is not on hand.
In our numerical experiments, actual shipping data
was used to calculate shipping statistics and simulate
ordering policies. To evaluate the ordering policies,
we considered the simulation results of (s, Q) and (R,
S) policies for each item. In addition, an optimal
ordering policy was set for each item from the results
of the simulation. We then applied Random Forests
45%
50%
55%
60%
65%
70%
75%
1)all 165
statistics
2)top20
(from RF)
3)bottom20
(from RF)
hit
ra
te
classifying method
DT
SVM
RF
588 J Jpn Ind Manage Assoc
analysis between the shipping statistics and optimal
ordering policy to find the important shipping
statistics. Next, we applied decision tree, support
vector machine and Random Forests between the
(un)important shipping statistics and predicted the
optimal ordering policy. Random Forests was used for
the model to find the important shipping statistics for
classification. Finally, we compared the results when
important shipping statistics were used and not used.
From the classifying results, we conclude that if
there are many insignificant shipping statistics,
Random Forests outperforms the support vector
machine. On the other hand, if insignificant shipping
statistics can be eliminated, Random Forests or other
methods would be useful for classifying and deciding
ordering policies.
REFERENCES
[1] Gruen, T.W., Corsten, D. and Bharadwaj, S.:
“Retail Out-of-stocks: A Worldwide
Examination of Extent Causes and Consumer
Responses,” Grocery Manufacturers of America,
Washington, D.C. (2002)
[2] Zipkin P.: “Old and New Methods for Lost-Sales
Inventory Systems,” Oper. Res., Vol. 56, No. 5,
pp. 1256-1263 (2008)
[3] Janakiraman, G., Seshadri, S. and Shanthikumar,
J.G.: “A Comparison of the Optimal Costs of
Two Canonical Inventory Systems,” Oper. Res.,
Vol. 55, No. 5, pp. 866-875 (2007)
[4] Huh, W.T., Janakiraman, G., Muckstadt, J.A. and
Rusmevichientong, P.: “Asymptotic Optimality
of Order-Up-To Policies in Lost Sales Inventory
Systems,” Manage. Sci., Vol. 55, Issue 3, pp.
404-420 (2009)
[5] Levi, R., Janakiraman, G. and Nagarajan, M.: “A
2-approximation Algorithm for Stochastic
Inventory Control Models with Lost Sales,”
Math. Oper. Res., Vol. 33, No. 2, pp. 351-374
(2008)
[6] Bijvank, M. and Vis, I.F.A.: “Lost-sales
Inventory Theory: A Review,” Eur. J. Oper. Res.,
Vol. 215, Issue 1, pp. 1-13 (2011)
[7] van Donselaar K.H. and Broekmeulen R.
A.C.M.: “Determination of Safety Stocks in a
Lost Sales Inventory System with Periodic
Review, Positive Lead-time, Lot-sizing and a
Target Fill Rate,” Int. J. Prod. Econ., Vol. 143,
Issue 2, pp. 440-448 (2013)
[8] Tsai, C.Y. and Yeh, S.W.: “A Multiple Objective
Particle Swarm Optimization Approach for
Inventory Classification,” Int. J. Prod. Econ. Vol.
114, Issue 2, pp. 656-666 (2008)
[9] Mohammaditabar, D., Ghodsypour S.H. and
O’Brien, C.: “Inventory Control System Design
by Integrating Inventory Classification and
Policy Selection,” Int. J. Prod. Econ., Vol. 140,
Issue 2, pp. 655-659 (2012)
[10] Flores, B.E. and Whyback, D.C.: “Implementing
Multiple Criteria ABC Analysis,” J. Oper.
Manage., Vol. 7, Issues 1-2, pp. 79-84 (1987)
[11] Jamshidi, H. and Jain, A.: “Multi-criteria ABC
Inventory Classification: With Exponential
Smoothing Weights,” J. Global Bus. Issues, Vol.
2, No.1, pp. 61-67 (2008)
[12] Ng, W.L.: “A Simple Classifier for Multiple
Criteria ABC Analysis,” Eur. J. Oper. Res., Vol.
177, Issue 1, pp. 344-353 (2007)
[13] Ramanathan R.: “ABC Inventory Classification
with Multiple-criteria Using Weighted Linear
Optimization,” Comput. Oper. Res., Vol. 33,
Issue. 3, pp. 695-700 (2006)
[14] Ernst, R. and Cohen, M.A.: “Operations Related
Groups (ORGs): A Clustering Procedure for
Production/Inventory Systems,” J. Oper.
Manage., Vol. 9, Issue 4, pp. 574-598 (1990)
[15] Flores, B.E., Olson, D.L. and Dorai, V.K.:
“Management of Multicriteria Inventory
Classification,” Math. Comput. Modelling, Vol.
16, Issue 12, pp. 71-82 (1992)
[16] Gajpal, P.P., Ganesh, L.S. and Rajendran, C.:
“Criticality Analysis of Spare Parts Using the
Analytic Hierarchy Process,” Int. J. Prod. Econ.,
Vol. 35 Issues 1-3, pp. 293-297 (1994)
[17] Partovi, F.Y. and Burton, J.: “Using the Analytic
Hierarchy Process for ABC Analysis,” Int. J.
Oper. Prod. Manage., Vol. 13, Issue 9, pp. 29-
44 (1993)
[18] Partovi, F.Y. and Hopton, W.E.: “The Analytic
Hierarchy Process as Applied to Two Types of
Inventory Problems,” Prod. Invent. Manage. J.,
Vol. 35, Issue 1, pp.13-19 (1994)
[19] Partovi, F.Y. and Anandarjan, M.: “Classifying
Inventory Using an Artificial Neural Network
Vol.64 No.4E (2014) 589
Approach,” Comput. Ind. Eng., Vol. 41, Issue 4,
pp. 389-404 (2002)
[20] Guvenir, H.A. and Erel, E.: “Multi Criteria
Inventory Classification Using a Genetic
Algorithm,” Euro. J. Oper. Res., Vol. 105, Issue
1, pp. 29-37 (1998)
[21] Yu, M.C.: “Multi-criteria ABC Analysis Using
Artificial-intelligence-based Classification
Techniques,” Expert Syst. Appl, Vol. 38, Issue 4,
pp. 3416-3421 (2011)
[22] Vapnik, V.N.: The Nature of Statistical Learning
Theory, Springer-Verlag (1995)
[23] Xiao, Y., Zhang, R. and Kaku, I.: “A New
Approach of Inventory Classification Based on
Loss Profit,” Expert Syst. Appl., Vol. 38, Issue 8,
pp. 9382-9391 (2011)
[24] Tiacci, L. and Saetta, S.: “An Approach to
Evaluate the Impact of Interaction between
Demand Forecasting Method and Stock Control
Policy on the Inventory System Performance,”
Int. J. Prod. Econ., Vol. 118, Issue 1, pp. 63-71
(2009)
[25] Berling, P. and Marklund, J.: “Heuristic
Coordination of Decentralized Inventory
Systems Using Induced Backorder Costs,” Prod.
Oper. Manag., Vol. 15, Issue 2, pp. 294-310
(2006)
[26] Nagasawa, K., Irohara, T., Matoba, Y. and Liu,
S.: “Selecting Ordering Policy and Items
Classification Based on Canonical Correlation
and Cluster Analysis,” Ind. Eng. Manage. Syst.,
Vol. 11, No. 2, pp. 134-141 (2012)
[27] Basu, A. and Mandal, A.: Canonical Correlation,
International Encyclopedia of Education,
Elsevier, pp. 52-57 (2010)
[28] Hotelling, H.: “Relations between Two Sets of
Variables,” Biometrika, Vol. 28, No. 3, pp. 321-
377 (1936)
[29] Breiman, L.: “Random Forests,” Mach. Learn.,
Vol. 45, No. 1, pp. 5-32 (2001)
[30] Breiman, L., Friedman, J.H., Olshen, R.A. and
Stone, C.J.: Classification and Regression Trees,
Chapman & Hall/CRC, New York (1984)
[31] Verikas, A., Gelzinis, A. and Bacauskiene, M.:
“Mining Data with Random Forests: A Survey
and Results of New Tests,” Pattern Recognition,
Vol. 44, Issue 2, pp. 330-349 (2011)
590 J Jpn Ind Manage Assoc