Application of Dynamic Programming Method to Marketing Decisions Based on Customer Database

Zhongqiu Zhao; Xiaofei Li; Baolong Ma; Jinlin Li

doi:10.21078/JSSI-2016-169-08

Published/Copyright: April 25, 2016

From the journal Journal of Systems Science and Information Volume 4 Issue 2

Abstract

This paper focuses on modeling longitudinal customer behavior and develops a dynamic programming (DP) model to show how a customer transaction database may be used to guide marketing decisions such as pricing and the design of customer reward programs. In this research, dynamic programming is used not as a tool for making marketing decisions but rather as a description of consumer behavior. The results show that the method provides a means of evaluating the effectiveness of marketing strategies such as customer reward programs. Moreover, the findings from the model estimation indicate that a reward program can increase the customer’s purchase level and stimulate repeat purchase behavior.

Keywords: dynamic programming (DP); customer reward programs; marketing decisions; customer behavior

1 Introduction

Recently, many researchers have paid increasing attention to customer purchase behavior in response to different marketing stimuli, which include short-term promotions and long-term programs[1–3]. Short-term promotions usually take the form of coupons, different shipping policies, and instant discounts. Long-term programs, often called customer reward programs, are designed to maintain and enhance customer loyalty. As one of the most popular marketing strategies, customer reward programs have attracted considerable interest among researchers[4–7]. The goal of reward programs is to encourage frequent purchase behavior, which is potentially beneficial to the firm, by offering rewards in the form of points, frequent flyer mileage, free gifts, and so on[3]. In practice, a loyalty card, rewards card, point card, advantage card, or club card is a plastic or paper card, visually similar to a credit card or debit card, which identifies the card holder as a member of a customer reward program[3]. In addition, such programs can be viewed as a dynamic incentive based on the cumulative purchase amount in a certain period, which shifts customer purchase behavior from myopic, single-period decision making to dynamic, multiple-period decision making[8].

This research focuses on how customer reward programs stimulate customer purchase behavior. We adopt a dynamic programming (DP) algorithm to analyze the influence of such reward programs, since the customer makes decisions based not only on the current rewards from a purchase but also on the future rewards received from the customer reward program[3]. The approach is based on the idea that a customer’s observed sequence of decisions may be interpreted as the solution to a dynamic optimization problem. The DP method is ideal for analyzing individual choices that are based on both current and future expected benefits[8]. A reward program that rewards customers according to their level of purchasing over a specified period is a prime example of such a decision problem. Another advantage of the dynamic programming method is that the estimated coefficients can be used to conduct simulations that replicate the consumer’s dynamic decision process[9].

2 Theoretical Background

2.1 Customer Reward Programs

The importance of effective customer relationships as a key to promoting customer value and shareholder value is widely emphasized. Lately, many retail companies have introduced customer reward programs, which are an important customer relationship management tool[10], to enhance customer loyalty. Such programs are currently popular in many industries, such as gasoline stations, supermarkets, airlines, and clothing stores, and have achieved high participation rates among consumers[11, 12]. By 2012, approximately 2.65 billion customer reward program memberships were held by U.S. consumers[13], and 42% indicated that they used such customer reward programs much more for their purchases than in 2008[14]. This increasing popularity is largely a function of improving information technology and a philosophical trend towards customer-focused marketing. A customer reward program is an integrated system of marketing actions that aims to reward and encourage customers’ loyal behaviors through incentives[15]. These programs typically allow customers to accumulate their purchase amounts and redeem free rewards by purchasing repeatedly from an enterprise or an allied enterprise, for example, “reach 30,000 points to redeem a round-trip ticket”. Prior empirical research has provided abundant evidence of the effectiveness of customer reward programs and has found positive effects of retail reward programs on customer purchase behavior. Such programs encourage repeat buying and thereby improve retention rates by providing incentives for customers to purchase more frequently and in larger volumes. Furthermore, by introducing customer reward programs, firms may gain additional customers, increase the share-of-wallet (SOW) from existing ones, or prevent loyal customers from switching to other sellers, among other things[16]. In other words, the ultimate goal of customer reward programs is to prompt loyal customers to purchase more frequently and in larger amounts.

A common characteristic shared by the industries where customer reward programs are popular is that the means exist to monitor customer transaction histories and to conduct individual-level marketing. Another special characteristic of loyalty programs is that their attractiveness may change dynamically with a customer’s decisions. As purchases are made, both the customer’s investment in the program and the customer’s likelihood of earning a reward increase. Conversely, when a customer decides not to purchase in a given period, the likelihood of earning a reward decreases, because the customer moves no closer to the reward threshold and the time left to earn rewards shrinks. These dynamic factors pose a challenge in modeling customer response to loyalty programs. For a frequency program to be effective in increasing loyalty, it must have a structure that motivates customers to view purchases as a sequence of related decisions rather than as independent transactions. That is, the structure must give customers an incentive to adopt a dynamic perspective[17]. For this reason, we adopt dynamic programming to simulate customer purchase behavior.

2.2 Dynamic Programming Model

Dynamic programming is a set of techniques concerned with the general problem of controlling a dynamic system whose evolution from state to state can be influenced by the application of controls and which yields a stream of state- and control-dependent payoffs[17]. Methods for modeling individual-level customer behavior have begun to appear in marketing research. For example, Romana Khan, Serdar Sayman, and Stephen J. Hoch (2014) examined buyers’ willingness to pay a price premium for a firm offering a loyalty program reward using an analytical model of dynamic consumer choice[14]. A perhaps somewhat less obvious application of dynamic programming to customer behavior is its use as a behavioral model of customer decisions. By assuming that consumers make decisions according to decision rules that consider both the immediate benefits and the future consequences of an action, a dynamic programming model may be used within a statistical inference procedure to estimate models of individual behavior.

This paper pays close attention to how a firm might use the DP method to model customer decisions, which is a fairly intuitive and straightforward application. This general type of approach to modeling decisions is referred to as discrete choice dynamic programming or estimable structural dynamic programming.

3 The Dynamic Formulation Procedure and Algorithm

3.1 Dynamic Customer Behavior in a Reward Program

This part develops a model of customer response to a reward program. In a reward program, customer purchase behavior in each period of the year can be formulated on the basis of the customer’s current cumulative purchase amount. The year is divided into 52 weeks, in line with the actual purchase data. At the end of the year, the merchant distributes the reward on the basis of the customer’s cumulative purchase amount over the whole year. After this period, the cumulative purchase amount restarts from zero. This means that even a customer with a very large cumulative purchase amount may still miss the reward simply by not reaching the reward level that the merchant sets. Therefore, the reward program dynamically stimulates customer purchase behavior through the reward level and reward scale that the merchant sets. According to the data, the customer has four choices in each period: no purchase, small purchase, medium purchase, and large purchase. The price in each period is supposed to change randomly in association with the last-period price and the purchase level that the customer chose in the last period. Owing to the lack of real price data, however, it is assumed that the price in each period follows a uniform distribution over three levels: high, medium, and low. As a result, the customer state in each period is defined in two dimensions: cumulative purchase amount and price.

Because we adopt part of the framework of an earlier paper, we define the customer state and the stochastic factor as in that paper. However, because the data and all the parameters that its author estimated are not accessible, we cannot consider all the factors influencing customer purchase behavior, model the probabilistic evolution of the next-period price, or write down an explicit expression of the reward function under the different actions. Instead, we give reasonable estimates of the parameters of the reward function using two pieces of common knowledge. First, the marginal reward decreases as the cumulative purchase amount increases. Second, for the same cumulative purchase amount, the higher the current price, the lower the reward the customer obtains. With known data, one would assume that customer purchasing is a multi-period decision process that can be replicated by a DP method, and then estimate the parameters from the data and the DP algorithm to see the influence of short-term promotions and reward programs. Here, we instead use known parameters from Lewis[3] and apply the DP algorithm to obtain the customer’s action in each period.

3.2 Formulation as a DP Algorithm

The state space is a vector of information about the environment that is relevant to the customer’s forward-looking optimization problem. The state space may consist of marketing-mix elements, such as the pricing environment in a given week, and customer-specific information, such as the cumulative purchase amount. The state x(t) denotes the cumulative purchase amount at period t, and p(t) denotes the price of the merchandise at period t, which takes one of three levels: high, medium, or low. The quantity decision is likely to be based on marketing factors such as price and on individual-level factors such as inventories. A myopic decision maker would select from the options 1, 2, ⋯, J. The actions are as follows:

j = 1 no purchase;

j = 2 purchase small [0, 50);

j = 3 purchase medium [50,75);

j = 4 purchase large [75, +∞).

The increase in the cumulative purchase amount associated with each action is an average comparative value. W is a stochastic factor, denoted as w1: price high; w2: price medium; w3: price low. We assume that the next-period price is high, medium, or low with uniform probability.

The evolution of the state is

$$[X_{t+1},\, P_{t+1}] = \Big[X_t + \sum_{j=1}^{4} 33(j-1)\,d_j(t),\; w(t)\Big],$$

where $\sum_{j=1}^{4} d_j(t) = 1$, which means that the customer can choose only one action in each period.
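The state evolution above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `next_state` and the string labels for the price levels are assumptions introduced here, while the step of 33 per action level and the uniform draw of the next price follow the text.

```python
import random

PRICE_LEVELS = ("high", "medium", "low")  # w1, w2, w3 in the text (labels are illustrative)
STEP = 33  # average increase in cumulative purchase per action level, from the state equation


def next_state(x, action, rng=random):
    """Return (x_next, p_next) after choosing action j in {1, 2, 3, 4}.

    Exactly one action is chosen per period, which mirrors the
    constraint that the indicators d_j(t) sum to one.
    """
    if action not in (1, 2, 3, 4):
        raise ValueError("action must be 1, 2, 3, or 4")
    x_next = x + STEP * (action - 1)      # no purchase leaves x unchanged
    p_next = rng.choice(PRICE_LEVELS)     # next price is uniform over the three levels
    return x_next, p_next
```

For instance, `next_state(100, 3)` moves the cumulative purchase amount from 100 to 166 and draws a new price level at random.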

The reward functions for the different actions in each period are as follows:

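$$
\begin{aligned}
R_{no}(t) &= b_{no};\\
R_{sm}(t) &= b_{sm} - b_{p,sm} P_t - \left[\frac{x(t) + 33(j_{sm}-1)}{100}\right] b_{sm}^{de};\\
R_{med}(t) &= b_{med} - b_{p,med} P_t - \left[\frac{x(t) + 33(j_{med}-1)}{100}\right] b_{med}^{de};\\
R_{lrg}(t) &= b_{lrg} - b_{p,lrg} P_t - \left[\frac{x(t) + 33(j_{lrg}-1)}{100}\right] b_{lrg}^{de}.
\end{aligned}
$$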

We set three reward trigger levels in the last period. Therefore, the rewards for the last period T are as follows:

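$$
\begin{aligned}
R_{no}(t) &= b_{no};\\
R_{sm}(T) &= b_{sm} + b_{p,sm} P_T + b_{c,sm} C_T + \sum_{h=1}^{H} b_{h,sm}\, SH(h)_T + \Big(\sum_{level=1}^{3} b_{level}\Big)(X_{T-1} + 33);\\
R_{med}(T) &= b_{med} + b_{p,med} P_T + b_{c,med} C_T + \sum_{h=1}^{H} b_{h,med}\, SH(h)_T + \Big(\sum_{level=1}^{3} b_{level}\Big)(X_{T-1} + 66);\\
R_{lrg}(T) &= b_{lrg} + b_{p,lrg} P_T + b_{c,lrg} C_T + \sum_{h=1}^{H} b_{h,lrg}\, SH(h)_T + \Big(\sum_{level=1}^{3} b_{level}\Big)(X_{T-1} + j).
\end{aligned}
$$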

Where we have

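$$
\begin{aligned}
&\text{If } X_{T-1} + 33(j-1) < 1000, &&\text{then } b_{level=1} = b_{level=2} = b_{level=3} = 0;\\
&\text{If } 1000 \le X_{T-1} + 33(j-1) < 2000, &&\text{then } b_{level=1} = 0.1,\; b_{level=2} = b_{level=3} = 0;\\
&\text{If } 2000 \le X_{T-1} + 33(j-1) < 3000, &&\text{then } b_{level=1} = 0,\; b_{level=2} = 0.5,\; b_{level=3} = 0;\\
&\text{If } 3000 < X_{T-1} + 33(j-1), &&\text{then } b_{level=1} = b_{level=2} = 0,\; b_{level=3} = 0.8.
\end{aligned}
$$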
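As an illustration of how the three trigger levels enter the last-period reward, the following Python sketch computes the level coefficients and the resulting loyalty term. The function names are hypothetical. Two assumptions are made and marked in the comments: the boundary value 3000, which the conditions above leave unassigned, is placed in the top level, and the loyalty term is written as the coefficient sum times X_{T−1} + 33(j − 1) for every action.

```python
STEP = 33  # increase in cumulative purchase per action level


def level_coefficients(x_final):
    """Return (b_level1, b_level2, b_level3) for a final cumulative purchase amount."""
    if x_final < 1000:
        return (0.0, 0.0, 0.0)
    if x_final < 2000:
        return (0.1, 0.0, 0.0)
    if x_final < 3000:
        return (0.0, 0.5, 0.0)
    # Assumption: the source states the top level as 3000 < X, which leaves
    # X == 3000 unassigned; this sketch places it in the top level.
    return (0.0, 0.0, 0.8)


def loyalty_term(x_prev, action):
    """Loyalty part of the last-period reward for action j in {1, 2, 3, 4}.

    Assumption: the final amount is x_prev + 33 * (j - 1) for every action.
    """
    x_final = x_prev + STEP * (action - 1)
    return sum(level_coefficients(x_final)) * x_final
```

With a cumulative amount of 2990 entering the last period, no purchase keeps the customer in the second level, whereas any purchase crosses 3000 and switches the coefficient from 0.5 to 0.8.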

The Bellman equation is

$$J(t) = \max_{j \in J^{*}} \left\{ R_j(t) + \alpha\, E_{p_w} J_j(t+1) \right\},$$

where α is the discount factor. When α equals 0, the customer does not care about the next-period reward. A finite-horizon DP algorithm is used in this research. In line with customer purchase habits, the time horizon is usually divided into 52 weeks. Here, to show the influence of reward programs while reducing the computational burden, we consider only the last 10 periods of the year.
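The finite-horizon recursion can be sketched in Python by backward induction with memoization. Every numeric value below (price levels, base rewards, price sensitivities, decay coefficient, discount factor) is a hypothetical placeholder rather than an estimate from this paper or from Lewis[3], and the per-period reward is simplified to a base term, a price term, and a decreasing term in the cumulative purchase amount. The sketch also assumes that 3000 itself belongs to the top reward level and that the bracket in the reward function is a plain division.

```python
from functools import lru_cache

# All numeric values are hypothetical placeholders, not estimated parameters.
PRICES = (1.2, 1.0, 0.8)            # high, medium, low price levels
ACTIONS = (1, 2, 3, 4)              # no purchase, small, medium, large
STEP = 33                           # increase in cumulative purchase per action level
T = 10                              # number of periods considered
ALPHA = 0.95                        # discount factor
BASE = {1: 0.0, 2: 1.0, 3: 1.2, 4: 1.3}        # stand-ins for b_no, b_sm, b_med, b_lrg
PRICE_SENS = {1: 0.0, 2: 0.5, 3: 0.8, 4: 1.0}  # stand-ins for b_p,sm etc.
DECAY = 0.01                        # stand-in for the decreasing-reward coefficient
TIERS = ((3000, 0.8), (2000, 0.5), (1000, 0.1))  # (trigger level, coefficient)


def flow_reward(x, p, j):
    """Simplified per-period reward R_j(t) for state (x, p) and action j."""
    if j == 1:
        return BASE[1]
    return BASE[j] - PRICE_SENS[j] * p - DECAY * (x + STEP * (j - 1)) / 100


def terminal_bonus(x_final):
    """Loyalty reward paid in the last period (3000 assumed to be in the top level)."""
    for threshold, coeff in TIERS:
        if x_final >= threshold:
            return coeff * x_final
    return 0.0


@lru_cache(maxsize=None)
def value(t, x, p):
    """Return (J_t(x, p), optimal action) by backward induction."""
    best_value, best_action = float("-inf"), None
    for j in ACTIONS:
        x_next = x + STEP * (j - 1)
        total = flow_reward(x, p, j)
        if t == T:
            total += terminal_bonus(x_next)
        else:
            # Expectation over the next price, uniform across the three levels
            expected = sum(value(t + 1, x_next, q)[0] for q in PRICES) / len(PRICES)
            total += ALPHA * expected
        if total > best_value:
            best_value, best_action = total, j
    return best_value, best_action
```

With these placeholder values, `value(T, 2990, 1.0)` selects the large purchase because it crosses the 3000 trigger level, whereas `value(T, 0, 1.0)` selects the small purchase because no reward level is within reach.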

3.3 Results Analysis

We plot the policies chosen by the customer one period before T, both with and without the reward program.

Figure 1

The policies chosen by the customer under the reward program

Here the X axis denotes the cumulative purchase amount and the different prices under the same cumulative purchase amount, whereas the Y axis denotes the purchase level that the customer chooses: 4 means buy large and 1 means buy nothing.

Figure 2

The policies chosen by the customer without reward program

These two figures show that there is a jump in the first one. The jump occurs where the cumulative purchase amount approaches 3000. In this region, just below 3000, the customer is willing to buy large in order to cross the reward level of 3000 and obtain the reward in the final period. However, in the long term, the stimulating effect of the reward program moves forward, as shown in Figure 3.

Figure 3

Ten periods to the T with reward in T

Figure 4

Ten periods to T without reward in T

In the long term, the reward program influences all customer purchase actions around the reward trigger level through the cumulative effect.

4 Conclusions

This article presents a DP method for modeling longitudinal customer purchase behavior. The model measures the influence of reward programs by treating a customer’s sequence of purchases as the solution to a dynamic optimization problem. A primary strength of the approach is that the dynamic programming method is well suited to modeling dynamic customer behavior. The research shows that this model is a useful tool for monitoring or tracking the development of customer purchase behavior. The method includes both the influence of previous behavior, in terms of cumulative purchase amount, and expectations of future prices and loyalty rewards. Modeling forward-looking behavior is an intuitively appealing way to represent customer response to a reward program[18]. Given the popularity of reward programs, this paper provides a modeling method for evaluating the effectiveness of marketing strategy.

The model estimates the effects of the reward program. The results show that the reward program effectively increases the customer repeat-purchase rate. Specifically, customers tend to increase their purchases in order to cross the reward level and obtain the reward at the end of a reward period. However, in the long term, the stimulating effect of the reward program moves forward. In short, the reward program increases the customer’s purchase level and stimulates repeat purchase behavior. In addition, the model provides a platform for conducting simulation studies, because the estimated coefficients can be used to conduct simulations that replicate the consumer’s dynamic decision process.

5 Limitations and Future Research

The model is developed using data from internal transaction databases. When modeling customer behavior using this type of data, it is usually necessary to make reasonable assumptions about the customer purchase decision. In general, the DP method adopted in this research assumes the presence of a relatively constant outside alternative. In both the newspaper subscription and the Internet grocery purchase decisions, the customer has a no-buy option in each period. Neither dataset includes data or covariates that indicate how the relative attractiveness of the no-buy or outside alternative changes in each period. To compensate for this deficiency, prices are adjusted using relevant CPI data for comparable products. More specific variations in the outside alternatives are captured by the stochastic or error component.

In terms of future research, this study is intended as part of a series of studies on the dynamics of customer behavior and customer management. The current work focuses mostly on the development of methodologies for customer analysis and management. The next step is to begin to examine both databases in an effort to test and add to the extant body of knowledge related to customer loyalty.

Supported by the National Natural Science Foundation of China (71272059, 71432002) and Beijing Colleges and Universities Young Talent Program (YETP1189)

References

[1] Rust R T, Chung T S. Marketing models of service and relationships. Marketing Science, 2006, 25(6): 560–580. doi:10.1287/mksc.1050.0139

[2] Sun B H, Neslin S A, Srinivasan K. Measuring the impact of promotions on brand switching when consumers are forward looking. Journal of Marketing Research, 2004, 40(9): 389–405. doi:10.1509/jmkr.40.4.389.19392

[3] Lewis M. The influence of loyalty programs and short-term promotions on customer retention. Journal of Marketing Research, 2004, 41(8): 281–292. doi:10.1509/jmkr.41.3.281.35986

[4] Liu Y P, Yang R. Competing loyalty programs: Impact of market saturation, market share, and category expandability. Journal of Marketing, 2009, 73(1): 93–108. doi:10.1509/jmkg.73.1.093

[5] Yi Y J, Jeon H S. Effects of loyalty programs on value perception, program loyalty and brand loyalty. Journal of the Academy of Marketing Science, 2003, 31(3): 229–241. doi:10.1177/0092070303031003002

[6] Valeria S, Bradlow E T, Fader P S. Stockpiling points in linear loyalty programs. Journal of Marketing Research, 2015, 2(52): 253–267.

[7] Dorotic M, Verhoef P C, Fok D, et al. Reward redemption effects in a loyalty program when customers choose how much and when to redeem. International Journal of Research in Marketing, 2014, 4(31): 339–355. doi:10.1016/j.ijresmar.2014.06.001

[8] Lewis M. Incorporating strategic consumer behavior into customer valuation. Journal of Marketing, 2005, 69(4): 230–238. doi:10.1509/jmkg.2005.69.4.230

[9] Sharp B, Sharp A. Loyalty programs and their impact on repeat purchase loyalty patterns. International Journal of Research in Marketing, 1997, 14: 473–486. doi:10.1016/S0167-8116(97)00022-0

[10] Kang J, Alejandro T B, Groza M D. Customer-company identification and the effectiveness of loyalty programs. Journal of Business Research, 2015, 2(68): 464–471. doi:10.1016/j.jbusres.2014.06.002

[11] Melnyk V, Bijmolt T. The effects of introducing and terminating loyalty programs. European Journal of Marketing, 2015, 3(49): 54–76. doi:10.1108/EJM-12-2012-0694

[12] Zhang J. The impact of an item-based loyalty program on consumer purchase behavior. Journal of Marketing Research, 2012, 1(49): 50–65. doi:10.1509/jmr.09.0211

[13] Leenheera J, van Heerdeb H J, Bijmoltc T H A, et al. Do loyalty programs really enhance behavioral loyalty? An empirical analysis accounting for self-selecting members. International Journal of Research in Marketing, 2007, 24(1): 31–47. doi:10.1016/j.ijresmar.2006.10.005

[14] Sayman S, Hoch S T. Dynamics of price premiums in loyalty programs. European Journal of Marketing, 2014, 48(314). doi:10.1108/EJM-11-2011-0650

[15] Berry J. Bulking up: The 2013 COLLOQUY loyalty census: Growth and trends in U.S. loyalty program activity. http://www.colloquy.com/files/2013-COLLOQUY-Census-Talk-White-Paper.pdf.

[16] Mintel. American lifestyle 2013: Five years later. Available at http://academic.mintel.com.

[17] Lewis M V. Applications of dynamic programming to customer management. Northwestern University, Chicago, 2001.

Received: 2015-7-1

Accepted: 2016-1-7

Published Online: 2016-4-25
