TY - GEN

T1 - Learning in infinite-horizon inventory competition with total demand observations

AU - Zeinalzadeh, Ashkan

AU - Alptekinoglu, Aydin

AU - Arslan, Gurdal

PY - 2012

Y1 - 2012

N2 - We consider single-period and infinite-horizon inventory competition between two firms that replenish their inventories as in the well-known newsvendor model. Each customer normally prefers to shop at one firm or the other. A fixed fraction of customers who encounter a stockout at their first-choice firm, however, visit the other firm. This substitution behavior makes the firms' replenishment decisions strategically interdependent. Our main contribution is to introduce a simple learning algorithm to inventory competition. The learning algorithm requires each firm (a) to know its own critical fractile, which the firm can calculate from its own per-unit revenue, ordering cost, and holding cost; and (b) to observe its own total demand realizations. The firms need not know their true demand distributions. They need not have any information about each other beyond the implicit information encoded in their own total demand realizations, which are affected by their competitor's inventory decisions. In fact, the firms need not even be aware that they are engaged in inventory competition. We prove that the inventory decisions generated by the learning algorithm converge, with probability one, to threshold values that constitute an equilibrium in pure Markov strategies for an infinite-horizon discounted-reward inventory competition game.

AB - We consider single-period and infinite-horizon inventory competition between two firms that replenish their inventories as in the well-known newsvendor model. Each customer normally prefers to shop at one firm or the other. A fixed fraction of customers who encounter a stockout at their first-choice firm, however, visit the other firm. This substitution behavior makes the firms' replenishment decisions strategically interdependent. Our main contribution is to introduce a simple learning algorithm to inventory competition. The learning algorithm requires each firm (a) to know its own critical fractile, which the firm can calculate from its own per-unit revenue, ordering cost, and holding cost; and (b) to observe its own total demand realizations. The firms need not know their true demand distributions. They need not have any information about each other beyond the implicit information encoded in their own total demand realizations, which are affected by their competitor's inventory decisions. In fact, the firms need not even be aware that they are engaged in inventory competition. We prove that the inventory decisions generated by the learning algorithm converge, with probability one, to threshold values that constitute an equilibrium in pure Markov strategies for an infinite-horizon discounted-reward inventory competition game.

UR - http://www.scopus.com/inward/record.url?scp=84869433668&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84869433668&partnerID=8YFLogxK

U2 - 10.1109/acc.2012.6315678

DO - 10.1109/acc.2012.6315678

M3 - Conference contribution

AN - SCOPUS:84869433668

SN - 9781457710957

T3 - Proceedings of the American Control Conference

SP - 1382

EP - 1387

BT - 2012 American Control Conference, ACC 2012

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2012 American Control Conference, ACC 2012

Y2 - 27 June 2012 through 29 June 2012

ER -