Saads Perfektewelt
🐞

Systematically visualize the impact of linear constraints in a MIP

Introduction

Mixed integer linear programming (MIP) is easy to comprehend on the surface, you just plug in a linear constraint and voilà! You now have transformed an LP into a MIP. Learning how to systematically write a model with its constraints and component is, sadly, a matter of training. Watching all these professors and these seniors formulate constraints at ease makes it look easy, which may or may not evoke an air of imposter syndrome.

The goal is to shape the perspective around the world of linear optimization, that's the passive effect of doing exercises. Some of these constraints seem intuitive but once you get into the deeper side of linear programming (I.E. Big M, logical constraints...etc), it becomes a bit hard to visualize the effect of a constraint with respect to the optimization problem that you are facing. If you are a beginner in operational research and having skipped mathematics, chances are, that every time these constraints are brought up, you might feel as if you are missing something elementary. It becomes much more frustrating if you try to research any specific since you can't tell where to start, as the anything revolving the constraint is not so centralized and accessible.

Shape your perception

This article is aimed at introducing the method that I use to understand any linear constraint. This method is sadly not directly communicated, as it is to be concluded through intuition and can not be easily googled. The average Operational research guide or blog expects the person to have some mathematical reflexes. This method is meant to be fool-proof, I believe that no one may face difficulties with it. This article serves also as a way to centralize most of the modelling tricks that I have learned, you may use it as a reference to come back to later on.

Unless relaxed, binary variables express the execution of a decision; if the variable equals 1, the decision to execute what the variable expresses is enabled. Do you produce X, do you use X, do you allocate X...etc...

\[ x=\begin{cases} 1 & \text{Decision is to be executed}\\ 0 & \text{Otherwise} \end{cases} \]

When writing mathematical models, it is imperative to express linear functions (hence the term Linear in the model's property). Many modelling tricks were invented to cope with constraints that can not be directly translated from a verbal statement. Allow me to present the following modelling trick:

\[ x \leq y \] \[ x, y \in \{0,1\} \]

This trick helps to formulate situations where the execution of a variable (in this case \(x\)) forces another variable (\(y\)) to be true, though we don't care about the latter, we want to force them both to be enabled if the former gets the value of 1. In other words:

Concrete example: If Product X is produced, I want Product Y to be produced alongside it. However, I still want the option to decide on producing Product Y independently of whether Product X gets produced.

How could I understand the desired outcome of this expression ? Before presenting the method, let me explain how this whole thing works:
Graphically, this is how the constraint looks like. Any point that exists outside the area is invalid/infeasible. The role of the constraint is to make sure that only the integer points within this area are chosen. With that said, the point (1,0) as seen in the picture below is infeasible, as the value 1 can never be lower or equal to 0.

Graphical depiction of the aforementioned constraint

That's right, by all logic, 1 can't be lower or equal than 0 for sure, so how do I perceive this ? Truth is, I don't draw a 2 dimensional axis and a feasible region ? There is an even faster way:

X Y Feasible ?
0 0 Feasible
1 0 Infeasible
0 1 Feasible
1 1 Feasible

Here is how I do it: I draw a table up to \(n\) number of variables on the columns, and then I list all of their possible values, luckily, what makes this method faster than graphically checks is the fact that the variables are binary and can't have anything other than 0s and 1s, which by extension means that you will have \(2^n\) rows. Finally, I check the feasibility with respect to the mechanics of the constraint. This is a method that I absorbed by watching a student do it few years ago, and as I mentioned, I have not seen this in any OR literature yet.

Although this method is meant to quickly help understand how a mathematical constraint works, I have seen that same student use it to write and design constraints. I believe this could help in this regard, especially if the variables are strictly binary. I would be happy to see if there is a better and quicker way, I will certainly update this guide if I ever find anything fruitful.

I have to admit that I still relay on my muscle memory to write optimization models, it takes some intelligence to write some cool constraints. It's an art!

Modeling If → NOT

We've seen the logical IF → Then true, the following becomes easier to understand after utilizing the method:

\[ x \leq 1 - y \] \[ x, y \in \{0,1\} \]
X Y Feasible ?
0 0 Feasible
1 0 Feasible
0 1 Feasible
1 1 Infeasible

Notice that \(x\) and \(y\) can never both contain a value of 1 at the same time, the constraint is feasible only when either \(x\) or \(y\) gets the value of 1 (or 0 optionally). Verbally, this expression says the following: If \(x\) gets 1 then \(y\) should NOT get 1 and vice versa. I believe this equation can be visually simplified by swapping its terms like this; this is how I remember it:

\[ x + y \leq 1 \]

Like this, you may easily remember it by considering that 2 can never be greater than 1. Next, I will display how to model the AND statement.

Modeling AND

Picture this: What if I want to either produce both \(x\) and \(y\) at the same time or produce neither. I can't simply plug+in the last inequality from the previous section because there would always be the off chance that either \(x\) or \(y\) get a value of 1. Let's review the nature of the desirable output, either:

One idea that I would use as a headstart is to use the following equation: \(x + y = 2\), but the issue with this one is that it does not allow the first part of the condition to function, as it will always result into producing both products. One workaround to deal with this is to introduce a new binary variable \(z\) that goes enabled only if \(x\) and \(y\) have a value of 1.

\[z \leq x\] \[z \leq y\] \[x + y - 1 \leq z \] \[x,y,z \in \{0,1\}\]

With this combination, only the two desired outcomes may happen. All of these constraints are essential. The third inequality is the very key component of the whole logic. As far as logic is concerned, with the boolean checks table in mind, for each variable, we check the feasibility not for just one constraint, but for them all:

X Y Z Feasible ?
0 0 0 Feasible
1 0 0 Infeasible
0 1 0 Infeasible
0 0 1 Infeasible
1 1 0 Infeasible
1 0 1 Infeasible
0 1 1 Infeasible
1 1 1 Feasible

Modelling precedence constraints

Precedence constraints usually concern problems where sequencing is a hard constraint that must occur, else the modeled problem may be infeasible. This is usually the case for complex problems like Vehicle routing problems (VRP), Traveling salesman problem (TSP) and scheduling where choosing the right permutations is crucial to controlling the objective function.
Naturally, when designing constraints for such problems, in VRPs for instance, you might be challenged by a capacity constraint, forcing you to choose the right sequence of customers such as to respect your limited capacity. In Flow/Job shop scheduling, you would want a sequence of jobs such as they don't overlap on each other and between each machine. Let's use the former as a use case!

In MIP, sequencing is often modeled with a decision variable that denotes if we execute action \(i\) before \(j\), do we visit customer \(i\) before \(j\), do we execute job \(i\) before \(j\)... A single index is not enough to tell a permutation (As of today, who knows if someone invents something crazy).

\[ x_{i,j}=\begin{cases} 1 & \text{Execute action \(i\) before action \(j\)}\\ 0 & \text{Otherwise} \end{cases} \]

In a TSP environment, you would want to have a Hamiltonian cycle across all the nodes. With that variable in mind, one way to achieve this is to introduce a constraint that expresses the following: we will leave each node once. If applied as intended, you should have a graph where each node has 1 successor and 1 predecessor. since we will enter a node and leaving it; in other words: Make sure we use the paths between each node only once:

\[ \sum\limits_{i \in V \setminus \{j\}}{x_{ij} = 1} \quad \forall i \in V \] \[ \sum\limits_{j \in V \setminus \{i\}}{x_{ij} = 1} \quad \forall j \in V \]

This would be the best time to talk a bit in-depth about the syntax of these constraints. You see, because of how generic these problems are, in academics, these equations are written with the above algebraic syntax to optimize readability. Using the table would have you decompose the constraint first. I usually start with the space domain first, we have a for statement here that defines the amount of constraints that we have to account for, in this case it would be for every \(i\) node in the graph, next we have a sum statement here that simply chains every variable. We will sum over all the paths that are connected to the node \(i\), the node \(i\) is excluded because that would bring up a variable that decides on a path that goes from the node \(i\) and leads back to it. As you can see, the key here is to always comprehend the syntax to decompose.

Let \(V\) be the set of nodes, the above constraints verbally say: For each pair of nodes, only one path would be chosen. This constraint is forced for every \(i\) and \(j\) couple. The objective function takes care of the optimization part in which it chooses the set of connections following the logic of the 2 constraints. For each node \(i\) and for each node \(j\) you will have \(V - 1\) constraints to check. In the following table I will present how the feasibility check looks like for every \(j\)th node in the graph. Note that the TSP assumes graph symmetry, going from 1 to 2 is the same as going from 2 to 1 \(x_{12} = x_{21} \), so if the program ever enables one of these paths, it implicitly enables it from the other sides.

Constraint 1: \( x_{12} + x_{13} = 1 \) Constraint 2: \( x_{21} + x_{23} = 1 \) Constraint 3: \( x_{31} + x_{32} = 1 \)
\( x_{12} \) \( x_{13} \) Feasible?
0 0 Infeasible
0 1 Feasible
1 0 Feasible
1 1 Infeasible
\( x_{21} \) \( x_{23} \) Feasible?
0 0 Infeasible
0 1 Feasible
1 0 Feasible
1 1 Infeasible
\( x_{31} \) \( x_{32} \) Feasible?
0 0 Infeasible
0 1 Feasible
1 0 Feasible
1 1 Infeasible
Constraint 1: \( x_{21} + x_{31} = 1 \) Constraint 2: \( x_{12} + x_{32} = 1 \) Constraint 3: \( x_{13} + x_{23} = 1 \)
\( x_{21} \) \( x_{31} \) Feasible?
0 0 Infeasible
0 1 Feasible
1 0 Feasible
1 1 Infeasible
\( x_{12} \) \( x_{32} \) Feasible?
0 0 Infeasible
0 1 Feasible
1 0 Feasible
1 1 Infeasible
\( x_{13} \) \( x_{23} \) Feasible?
0 0 Infeasible
0 1 Feasible
1 0 Feasible
1 1 Infeasible

Since we have a set of 3 nodes, you would expect your TSP solution to have exactly \(V\) number of arcs, so without even thinking about it, feasible are the inequalities that enable no more and no less than 3 edges over all the nodes.

As said earlier, with respect to the above constraints, the objective function will choose the edges that ensure the tour with the minimal costs. Funnily enough, these constraints are sufficient for a case where the graph contains 3 to 4 nodes, as no combination of edges can make an infeasible TSP solution, until you introduce a 5 nodes graph....

Nu uh! This is not a Hamiltonian tour!

So if you have noticed, we told the MIP to look for the cheapest connections, and so it executed what we told it. Sadly it does not come in terms with the desired output. If you think about it, and if we hard check all the edges, all the constraints are feasible. The issue lies then in the logic of the constraints then.

One idea to deal with this is to weight each node with an index to indicate its position in the permutation. In a TSP, the starting node matters not since we are going to visit it anyways. Using a continuous variable \(u_{i}\), each node will be assigned an index, we would then want to implement a logic that says the following: If we use the path \(x_{i,j}\), the index of the \(j\) should be higher than \(i\)'s index.

\[ x_{i,j} \Rightarrow u_{i} - u_{j} = 1 \] \[ u_{1} = 1 \] \[ 2 \leq u_{i} \leq n \]

So far so good, but then, how do you link these 2 constraints together ? Getting deeper in operational research would eventually expose you to a famous constraint that helps against permutation-based problems. The Miller-Tucker-Zemling (Miller et al., 1960) can be seen in various MIPs, it uses continuous variables to magically eliminate subtours and ensure correct permuting.

As of my limited knowledge, the non-linearity of the constraint lies within that if statement and that's how the MTZ constraint comes into play. Using a big constant \(M\) (ideally equal to the number of nodes) together with the \(x_{i,j}\) variable one can combine these constraints into one:

\[ u_i - u_j + 1 \leq n(x_{i,j}-1) \]

If \(x_{i,j}\) is 1, it cancels itself together with the constant and the inequality turns into \(u_{j} \geq u_{i} + 1 \), because \(u_{i}\) ranges between 2 and \(n\), and together with the other constraints, the final solution would be a graph containing no subtours. Let's take a look at how this would look like by the time we reach the 3rd iteration:

Let's pretend that \(x_{2,4} = 1\), this turns the inequality into \(u_{4} \geq u_{2} + 1\), as of reaching node \(2\), the algorithm decided that \(u_{2} = 3\):

\(u_{2} + 1\) \(u_{4}\) Feasible ?
4 2 Infeasible
4 3 Infeasible
4 4 Feasible
4 5 Feasible

The solver has no choice but to either assign either 4 or 5, and because of the subsequent node after taking that path, it ends up ultimatly choosing the value \(u_j = 4\). If for whatever reason the said path gets a 0, then the big constant \(n\) will outweight every possible choice, neglating the connection between the nodes, as no value for the two variables \(u_i\) and \(u_j\) can ensure feasibility.

For VRP and Scheduling problems, enumeration is not necessary as these problems already come with continuous variables that can be intelligently used as an engine for the inequalities.

Don't constraint yourself! Be creative!

One constraint setup that I saw years ago happened to be a fancy answer to the following question, which comes from a test that I had to pass:

"Now assume that if product 8 is produced, products 7 and 11 cannot be produced. Give the required linear constraint(s) on the variables \(Y_p\)! (2 points)"

Using some muscle memory, you may use the following constraints that help achieve what the question demands:

\[Y_8 + Y_7 \leq 1\] \[Y_8 + Y_7 \leq 1\]

Both of these constraints are correct, but there is an even fancier way; a formulation that I found so elegant that lives in my head rent free:

\[Y_8 + \frac{1}{2} (Y_7 + Y_{11}) \leq 1\]

Every time I tutor people in quantitative methods, the name of the said subject that contains that question. I always bring this up inequality up. The way it is written makes me often want to toy around with the arithmetics, doing my best to optimize the number of constraints that I have to formulate. My point is: Don't restrict yourself with all of those tricks, try every playing with the formulas, see what you can achieve and unlock!

Reference

Miller, C.E., Tucker, A.W. and Zemlin, R.A. (1960) Integer Programming Formulations and Traveling Salesman Problems. Journal of the Association for Computing Machinery, 7, 326-329 . https://doi.org/10.1145/321043.321046