Any software engineer or data scientist will understand the pressure that comes with shipping new products or services. There’s the push of having to deliver rapidly and the pull of having to deliver quality code. Each company is somewhere on the spectrum of speed vs quality.
Silicon Valley, with its mantra of “move fast and break things”, relies on speed and iteration, whereas companies in regulated industries go for quality over speed of execution.
To describe this push-and-pull, Ward Cunningham introduced the metaphor of technical debt in 1992. It captures the hidden cost that comes with speedy execution. Of course, just as with financial debt, there are often valid reasons to take on technical debt.
But it’s important to pay down technical debt as you progress with the product or service you’ve released. As researchers at Google AI have explained, the cost of technical debt tends to compound, up to the point where a new iteration costs far more money and time than it otherwise would have.
Technical debt is something that we at Widget Brain have struggled with as well, to the point where we came up with our own solution that still helps us to this day. But to make this a bit more concrete, let’s talk about three examples of technical debt, as well as how you can solve or even avoid the problems that come with it.
It might be surprising to know that any system which uses some type of algorithm or machine learning is at least 95% supporting code. The code that does the actual machine learning is at most 5% of the total amount.
All that supporting code is called glue code. It gathers data from all the separate packages that the machine learning system requires to function, and makes sure it arrives in the right place at the right time.
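To make this concrete, here is a minimal sketch of what glue code looks like in practice. The packages and field names are hypothetical: one source delivers data as a list of row dictionaries, while the consuming model expects column lists, so a small adapter has to sit in between.

```python
# Hypothetical glue code: adapt one package's output to another's input.
# A data source returns rows as dicts; the model expects parallel lists.

def rows_to_columns(rows):
    """Convert [{'temp': 21, 'rain': 0}, ...] into {'temp': [...], 'rain': [...]}."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns

sensor_rows = [{"temp": 21, "rain": 0}, {"temp": 19, "rain": 3}]
model_input = rows_to_columns(sensor_rows)
```

Each adapter like this is small on its own; the debt comes from accumulating dozens of them, one per combination of source and consumer.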
Although it’s inevitable that the majority of your code will be glue code, it’s still important to realise that too much glue code can quickly become expensive in terms of time as well as money. It can tie a system to specific data packages, which can make experimentation with other packages very time-consuming.
For example, here at Widget Brain we often had to rewrite glue code because a new data source was mostly similar to previous ones, but different enough that the existing code couldn’t handle it. Eventually, we realised that we could automate much of our glue code if we made sure the data followed a certain format.
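One way to automate that, sketched below under assumed field names (this is not Widget Brain’s actual schema), is to define a single canonical record format and map every incoming source onto it. New sources then only need a small alias table rather than fresh glue code.

```python
# A sketch of automating glue code via one agreed-on record format.
# The canonical fields and the sample records are illustrative only.

CANONICAL_FIELDS = {"employee_id": str, "shift_start": str, "hours": float}

def normalise(record, aliases):
    """Map a source record onto the canonical fields, coercing types."""
    out = {}
    for field, cast in CANONICAL_FIELDS.items():
        source_key = aliases.get(field, field)
        out[field] = cast(record[source_key])
    return out

# Two sources deliver the same data under different names:
source_a = {"employee_id": "E1", "shift_start": "09:00", "hours": "8"}
source_b = {"emp": "E2", "start": "10:00", "hrs": "6.5"}

clean_a = normalise(source_a, aliases={})
clean_b = normalise(
    source_b,
    aliases={"employee_id": "emp", "shift_start": "start", "hours": "hrs"},
)
```

The design choice here is that variation between sources is pushed into data (the alias table) instead of code, which is far cheaper to maintain.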
Pipeline jungles are a special kind of glue code. Preparing data for a machine learning system can often become a combination of code pulling data from different sources, linked through messy code, which is in itself linked to other modules too.
This introduces unnecessary complexity, and it means that testing such pipelines becomes expensive, because they require end-to-end integration tests before you can understand whether they’re compatible with your machine learning system.
By their nature, algorithmic systems consist of many different pieces of code within an overarching system. We quickly found that this makes it very difficult to change one thing, because you end up changing everything connected to it in the pipeline as well.
That’s one of the reasons why we developed a centralised internal platform that streamlines our code architecture and lets us push updates through faster: everything flows through one funnel instead of several.
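The “one funnel” idea can be sketched as follows; the step functions are made up, and the point is only the structure: every algorithm runs through the same ordered pipeline, so a step is defined, and updated, in exactly one place rather than copied across projects.

```python
# A sketch of the "one funnel" idea: all data flows through one central
# pipeline definition instead of per-project copies of each step.

def validate(data):
    """Drop rows with missing values."""
    return [row for row in data if row.get("value") is not None]

def scale(data):
    """Scale values into the range [0, 1]."""
    top = max(row["value"] for row in data)
    return [{**row, "value": row["value"] / top} for row in data]

PIPELINE = [validate, scale]  # one central definition

def run(data, steps=PIPELINE):
    for step in steps:
        data = step(data)
    return data

result = run([{"value": 4}, {"value": None}, {"value": 8}])
```

Because each step takes and returns the same shape of data, steps can also be unit-tested in isolation instead of only through end-to-end integration tests.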
A final problem that serves as an example of technical debt is experimental codepaths. Because of the rigid nature of glue code and pipeline jungles, it’s often tempting for software engineers to tack alternative algorithms onto the system as experimental branches, since none of the existing infrastructure needs to be reworked.
However, these codepaths tacked onto your infrastructure can interact with each other in unpredictable, often dangerous ways. Additionally, making sure these codepaths are compatible with everything you’ve built can once again become expensive very quickly.
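A toy illustration of why this gets expensive, with made-up flag and function names: each experimental branch is usually guarded by a flag, and the flags multiply. Two booleans already mean four behaviours that must all stay compatible.

```python
# Illustration of accumulating experimental codepaths: each flag adds a
# branch, and flag combinations multiply the paths you have to keep working.

def predict(features, use_new_scaler=False, use_alt_model=False):
    if use_new_scaler:   # experiment 1, tacked on at some point
        features = [f * 0.5 for f in features]
    if use_alt_model:    # experiment 2, silently interacts with experiment 1
        return sum(features) * 2
    return sum(features)

# Two flags already produce four distinct behaviours:
outcomes = {
    (s, m): predict([1.0, 3.0], use_new_scaler=s, use_alt_model=m)
    for s in (False, True)
    for m in (False, True)
}
```

Note that the two experiments combined can reproduce the original output by accident (here, scaling by 0.5 and doubling cancel out), which is exactly the kind of unpredictable interaction that makes these branches dangerous.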
When creating glue code, it’s often tempting, and seemingly time-efficient, to lean directly on general-purpose open-source packages and their APIs. In the long term, however, reimplementing the specific algorithm you need within your own system’s API is often the better option: you write less glue code, and you can be far more certain there’s no unfamiliar code introducing a vulnerability into your system.
Additionally, in order to avoid too much glue code and too many pipeline jungles, it’s important to first think about how you will collect data, where you will collect it from, and what format it should follow. Having a plan in place and sticking with it often helps in avoiding code complexity.
For experimental codepaths, it’s important to entirely isolate the code from the rest of your system. There should be no links to existing modules. It’s also vital to periodically check which experimental codepaths can be entirely removed.
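One way to keep an experiment isolated, sketched here with hypothetical function names, is to let it touch the rest of the system through a single narrow entry point. The experimental code imports nothing from production internals, so removing it later means deleting one block rather than untangling branches.

```python
# Sketch of isolating an experimental codepath: the experiment reaches the
# system through one narrow function, so it can be deleted in one step.

def stable_forecast(history):
    """Production path: a simple mean, standing in for the real model."""
    return sum(history) / len(history)

# --- experimental module: no links into production internals ---------------
def experimental_forecast(history):
    """Candidate replacement, weighting recent points more heavily."""
    weights = range(1, len(history) + 1)
    return sum(w * h for w, h in zip(weights, history)) / sum(weights)
# ---------------------------------------------------------------------------

def forecast(history, experiment=False):
    impl = experimental_forecast if experiment else stable_forecast
    return impl(history)
```

The periodic clean-up check then becomes simple: if no caller passes `experiment=True` any more, the experimental block can be removed wholesale.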
However, these are in-the-moment solutions. The root of many of the problems that come with technical debt lies in the divide between data science and data engineering. Because many organisations treat the two as separate disciplines, each group risks running into unexpected problems simply because it doesn’t understand the other group’s code.
This is why our data science and data engineering teams are combined, because we strongly believe it’s much better to either fully integrate the data science with the data engineering team, or at least have them work together closely. Understanding how the glue code works helps our data science team, and understanding how the machine learning works helps our data engineering team.
We at Widget Brain developed the Algorithm Factory to solve our own problems with technical debt at first. But we quickly realised that the Algorithm Factory could be valuable for other organisations too.
As it stands, we use the Algorithm Factory as the single platform through which we run all our algorithms. It keeps the number of pipeline jungles to a minimum, while also setting clear infrastructural data paths for our IT team. It allows us to leverage the power of algorithms while keeping our levels of technical debt as low as possible.
Want to know more about the Algorithm Factory and how we can help you manage your algorithms? Go to www.widgetbrain.com and book a demo today.