How should a firm modify its product assortment over time as it learns about consumer tastes? In this paper, we study dynamic assortment decisions in a horizontally differentiated product category in which consumers' diverse tastes can be represented as locations on a Hotelling line. We assume that the firm knows all possible consumer locations, which form a finite set, but does not know their probability distribution. We model this problem as a discrete-time dynamic program: in each period, the firm chooses an assortment and sets prices to maximize the total expected profit over a finite horizon, given its subjective beliefs about consumer tastes. Each consumer then chooses the product from the assortment that maximizes his or her own utility. The firm observes sales, which provide censored information on consumer tastes, and updates its beliefs in a Bayesian fashion. There is a recurring trade-off between the immediate profits from sales in the current period (exploitation) and the informational gains to be exploited in all future periods (exploration). We show that assortments can be (partially) ordered by their information content and that, in any given period, the optimal assortment cannot be less informative than the myopically optimal assortment. This result is akin to the well-known "stock more" result in censored newsvendor problems, in which the newsvendor learns about demand through sales and lost sales are not observable. We demonstrate that it can be optimal for the firm to alternate between exploration and exploitation, and even to offer assortments that lead to losses in the current period in order to gain information on consumer tastes. We also develop a Bayesian conjugate model that reduces the state space of the dynamic program, and we use this conjugate model to study the value of learning.
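The idea that sales give only censored information, and that some assortments are more informative than others, can be illustrated with a minimal Python sketch. It is not the model of the paper: the linear utility `value - price - travel_cost * distance`, the no-purchase option, the two candidate taste distributions, and every name and number below are assumptions chosen only for illustration.

```python
# Illustrative sketch only: the utility form, parameters, and names are assumptions.

def choice(location, assortment, prices, value=1.0, travel_cost=1.0):
    """Index of the product bought by a consumer at `location`, or None if no purchase."""
    best, best_utility = None, 0.0  # the outside option is assumed to give utility 0
    for j, (position, price) in enumerate(zip(assortment, prices)):
        utility = value - price - travel_cost * abs(location - position)
        if utility > best_utility:
            best, best_utility = j, utility
    return best


def update(prior, hypotheses, locations, assortment, prices, observed):
    """Bayes update over candidate taste distributions after one observed sale.

    The sale reveals only which product was bought, not the buyer's location,
    so the likelihood pools all locations that lead to the same purchase.
    """
    joint = []
    for belief, dist in zip(prior, hypotheses):
        likelihood = sum(
            p for loc, p in zip(locations, dist)
            if choice(loc, assortment, prices) == observed
        )
        joint.append(belief * likelihood)
    total = sum(joint)
    if total == 0:
        raise ValueError("observation has zero probability under every hypothesis")
    return [j / total for j in joint]


locations = [0.0, 0.4, 1.0]                      # known finite set of taste locations
hypotheses = [[0.6, 0.2, 0.2], [0.2, 0.2, 0.6]]  # hypothetical candidate distributions
prior = [0.5, 0.5]

# One central product: every consumer buys it, so a sale reveals nothing.
post_single = update(prior, hypotheses, locations, [0.4], [0.3], observed=0)

# Two products at the ends: which one sells separates left tastes from right tastes.
post_pair = update(prior, hypotheses, locations, [0.0, 1.0], [0.3, 0.3], observed=0)

print(post_single)
print(post_pair)
```

In this toy setting the single central product leaves the beliefs unchanged, whereas a sale of the left-hand product under the two-product assortment shifts belief toward the distribution that puts more mass on the left, which is the sense in which one assortment can be more informative than another.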
Publication status: Published - 2012