Towards optimal personalization: synthesisizing machine learning and operations research

Tricks of the optimization trade¶

That was fun, but now I’d like to complicate things (and make them more realistic) by assuming that there are pricing plans for each of these message types that change with number of messages sent:

Media <500 500-999 1,000-1,499 >1,500
Email 0.25 0.20 0.15 0.10
Push notification 0.30 0.25 0.225 0.20
Text message 0.85 0.65 0.50 0.30
Phone Call 5.00 5.00 5.00 5.00

You employ your own employees for the phones, so this cost does not benefit from economies of scale.

The budget constraint will now have to be rewritten, and it will be much more difficult. Seriously, it’s pretty tricky.

With the tiered pricing, we introduce step functions to the cost. What follows is the best method I could think up to solve this problem, but I would love to know if anybody has better ideas.

To start, we’ll consider $<$500 messages to be our base tier. We will then call each subsequent tier Tier 1, Tier 2, and Tier 3 (500-1,000, 1,000-1,499, and $>$1,500 messages, respectively). We must create two new types of variables.

The first will be indicator variables which indicate whether or not we have “activated” a particular tier for a particular message type. Mathematically, let us define $a_{jk} \in {0, 1}$ to indicate whether or not we have sold at least the minimum amount of messages of type $j$ in Tier $k$.

The second type of variable will be integer variables $t_{jk} \in \mathbb{Z}_{\geq 0}$ which tell us how many messages of type $j$ that we have sent above the minimum amount of messages for tier $k$. The way that we will calculate the price for sending $N$ messages of type $j$ will be to calculate the base cost and then subtract “discounts” from the total cost as we move up in tiers.

Okay, to summarize, we need to define both $a_{jk}$, our tier activation indicators, and $t_{jk}$, our tier message counters. Let’s let $l_{k}$ be the minimum (lower) amount of messages needed to be sent in order to move up in a Tier. With the following two constraints, the behavior of $a$ will be fully defined

where $X$ is a sufficiently large number (like the maximum of $\sum\limits_{i}m_{ij}$). This technique of using this large number is called the Big M method from the common nomenclature of using an $M$ instead of my choice of $X$ (I already used an $M$ earlier!). The two constraints above are quite confusing. I know this much is true because it took me a long time to make sure that I had them right. I would suggest walking through an example to make sure that you understand what’s going on. You can pick a single $k$, like the first Tier where $l_{1} = 500$. Now, imagine if 499 messages of some type $j$ had been sent such that $\sum\limits_{i}m_{ij} = 499$. What do the constraints say that $a_{j1}$ must be? Now, perform the same test for 500 and 501 messages and assure yourself that the constraints never disagree with each other and that $a_{j1}$ flips from 0 to 1 when it is supposed to.

With $a$ defined, it is one more set of mental gymnastics in order to define $t$:

The first constraint ensures that if $a$ is 0 then $t$ must be 0, as well. The other two constraints guarantee that when $a$ is 1, then $t_{jk} = \sum\limits_{i}m_{ij} - l_{jk}$ which is exactly the number of messages greater than the minimum threshold for Tier $k$.

With all of these god forsaken constraints defined, we can finally write our budget constraint. We will take $c_{jk}$ to be the cost for sending a single message from Tier $k$ where $k=0$ is the base cost. For $k>0$, we will take $c_{jk}$ to be the extra discount obtained by reaching that tier (it is positive, and we subtract it from the base cost).

Phew!

Math $\Rightarrow$ Code Redux¶

Finally, we get back to the code. We can reuse some of the old variables, but we’ll need to rewrite our cost matrix and add some tier information.