Building the Layer Cake: A quick-value approach to AI and machine learning

For every truck in its fleet, UPS currently recalculates the best route after every single package is delivered. So, if you and your neighbor are both getting a package, in the time it takes to deliver them, UPS has determined the optimal route for all remaining packages on the truck … twice! This is a machine learning feat that required satellite launches, GPS and map experts, network theory, theories of optimization from mathematics, ten years, and hundreds of millions of dollars (see this Harvard Business School article). However, UPS didn’t start with this complex solution when they decided to optimize routes.

I heard the following story at the (virtual) Data Science Connect conference a few weeks ago. It contains a powerful lesson for thinking about complex projects.

As it turns out, the first thing UPS did was simply make sure that all the packages on a given, single truck were headed to the same neighborhood (or delivery area). They made a simple, common-sense change that immediately delivered value. Additionally, this change set them up for the eventual big-budget, time-consuming project.

If the packages on a single truck need to go to many different parts of a city, there is a limit to how much route optimization can improve things. It would be like asking, “what’s the best way to pay my bills by driving one penny at a time to each business I owe?” We can figure out an optimal route, but maybe there are some more basic problems to solve here first.

There is a general principle for AI or Big Data projects that we can extract from this story, and to illustrate it, let’s think about cake. I’ve taken this metaphor from a wonderful talk by Bill Franks, Chief Analytics Officer of the International Institute for Analytics (Franks 2020; full reference below).

Layer Cake. Image used with permission under a Creative Commons CC0 1.0 Universal Public Domain Dedication.

Consider a layer cake. When you decide to make one, you don’t go straight from having no cake to having four layers of cake (see the figure above). Rather, you put down the first layer and then build on top of it, with each layer setting up the conditions for the next. First, UPS had to group packages bound for the same delivery area onto the same truck. Then, they had to figure out how to optimize the truck’s route once. Finally, they figured out how to optimize routes in real time.

Notice that each “layer” of the UPS optimization cake needed the layer before it. Deciding which delivery area’s packages go on which truck set up the success of route optimization. Likewise, it doesn’t make sense to try optimizing a route hundreds of times a day before you have tried optimizing it once. So, the single-optimization layer sets up the success of the real-time optimization layer.

Additionally, the construction of each layer provided immediate value. Grouping packages by delivery area was better than not grouping them. Optimizing the routes once was better than no optimization. And finally, real-time adjustment was better than a single optimization at the start of the day.
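To make the three layers concrete, here is a toy sketch in Python. It is not UPS's actual system: the package data, the depot at (0, 0), the function names, and the greedy nearest-neighbor heuristic are all assumptions chosen for illustration. Each function is useful on its own, and each one builds on the one before it.

```python
from collections import defaultdict
from math import dist

# Hypothetical packages: (package_id, delivery_area, (x, y) stop coordinates).
PACKAGES = [
    ("p1", "north", (0.0, 10.0)),
    ("p2", "south", (1.0, -9.0)),
    ("p3", "north", (2.0, 11.0)),
    ("p4", "south", (0.0, -10.0)),
    ("p5", "north", (1.0, 13.0)),
]
DEPOT = (0.0, 0.0)  # assumed starting point for every truck


def group_by_area(packages):
    """Layer 1: put packages bound for the same delivery area on one truck."""
    trucks = defaultdict(list)
    for package_id, area, stop in packages:
        trucks[area].append((package_id, stop))
    return dict(trucks)


def nearest_neighbor_route(start, stops):
    """Layer 2: plan a route once, always driving to the closest remaining stop."""
    route, here, remaining = [], start, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda s: dist(here, s[1]))
        remaining.remove(nxt)
        route.append(nxt)
        here = nxt[1]
    return route


def deliver_with_replanning(start, stops):
    """Layer 3: re-plan the remaining stops after every single delivery."""
    delivered, here, remaining = [], start, list(stops)
    while remaining:
        plan = nearest_neighbor_route(here, remaining)  # reuses layer 2
        package_id, stop = plan[0]
        delivered.append(package_id)
        remaining.remove((package_id, stop))
        here = stop
    return delivered


trucks = group_by_area(PACKAGES)
north_route = [pid for pid, _ in nearest_neighbor_route(DEPOT, trucks["north"])]
north_live = deliver_with_replanning(DEPOT, trucks["north"])
```

In this static toy, re-planning after each delivery gives the same order as planning once, because nothing changes along the way. The real-time layer only starts to pay off when conditions shift mid-route, but notice that it could not even be written without the two layers beneath it.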

This story has informed my thinking. Now when researchers ask me about machine-learning tasks (e.g., predicting X a certain number of minutes before it happens), I think about what the simpler base layers of the cake might look like. I think about what deeper, basic questions we can answer that will provide quick value while setting us up for success on the big (whole layer cake) question.


Franks, Bill. (2020, October 7-9). Scaling Data Science & Analytics [Conference presentation]. DSC 2020 Conference. Virtual.
