Spectre, Meltdown and the case of Technical Debt

Only a few days have passed since we celebrated the arrival of 2018, and yet the technology world has already been shaken hard by a once-in-a-lifetime event. A hardware-based security vulnerability disclosed yesterday afternoon revealed that a flaw is present in many of the processor chips produced in the last decade. As a result of this vulnerability, a program can access data that belongs to other programs in memory. This data could be anything: passwords, emails, messages and so on. Basically, any business or personal file could potentially be accessed.

Technology companies have responded quickly with patches and updates. Most notably, the public cloud providers (specifically Microsoft Azure, Amazon Web Services (AWS) and Google Cloud) have responded in an exceptional way, though not without difficult decisions along the way. Google Project Zero disclosed the vulnerability in a report that included its details, two attacks that exploit it (Spectre and Meltdown) and a Proof of Concept (PoC) that validated it. Within hours, all three cloud providers accelerated and executed their planned maintenance to ensure that none of their customers would be exposed to one of these exploits. It is expected that all the infrastructure running in the Azure, AWS or Google public clouds will be protected entirely with little to no impact. These providers are also monitoring activity in their ecosystems for any attempt to run these exploits on their platforms.

The nature of this event has surfaced one of the most forbidden and forgotten topics in technology: Technical Debt.

Technical Debt is a value that reflects the implied cost of any deferred work that needs to be done as a result of taking the minimum viable alternative or even doing nothing at all. Like financial debt, for example a loan, it accumulates “interest”, and in some cases “compound interest”. This means that the longer it takes for an organization to solve the problem, the higher the complexity (i.e., the higher the cost) of solving it. In the technology world, a typical example that makes technical debt more evident is hardware and software lifecycles (upgrades, patching, etc.). It is more complex and expensive to refresh a system that has been in operation for 10 years untouched than a system that was deployed last month. The impact is palpable beyond technology. It also affects the processes and people in the organization.
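To make the “compound interest” analogy concrete, here is a minimal sketch in Python. It assumes that the cost of a deferred fix grows by a fixed percentage every period it is postponed. The function name, the growth rate and the figures are hypothetical and only illustrate the shape of the curve; real technical debt rarely grows this neatly.

```python
def deferred_cost(base_cost: float, rate: float, periods: int) -> float:
    """Estimate the cost of a fix that has been postponed for `periods` periods.

    Assumption (illustrative only): the cost compounds at a fixed `rate`
    per period, like interest on a loan.
    """
    if periods < 0:
        raise ValueError("periods must be zero or positive")
    return base_cost * (1.0 + rate) ** periods


if __name__ == "__main__":
    # Hypothetical numbers: a fix that costs 100 units today,
    # growing 10% for every quarter it is deferred.
    for quarters in (0, 4, 20, 40):
        cost = deferred_cost(100.0, 0.10, quarters)
        print(f"deferred {quarters:>2} quarters: {cost:,.0f} units")
```

Under these assumed numbers, the system left untouched for 10 years (40 quarters) is far more expensive to fix than the one addressed right away, which is the point of the loan analogy.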

What really fascinates me about the Spectre and Meltdown exploits is that they will show us the entire spectrum of technical debt. The three public cloud vendors mentioned above represent the upper side of the spectrum, or what I will call the “Close to Zero Technical Debt” approach, while the lower side of the spectrum represents the no-action approach, or what I will call the “Stay vulnerable” approach. What we will see in the next few months across the enterprise is a variety of post-vulnerability states ranging from no action to immediate action. As a result, we will be hearing about successful executions of these exploits and about loss of data, which translates into financial, reputational and customer risk (e.g., millions or billions of dollars).

In the past, a “Close to Zero Technical Debt” approach would have been considered an impossibility, not even an option to be put on the table. Suggesting such an approach might have been synonymous with a career-ending move. And that is just suggesting it, not even executing it. Why is that? I believe there are two reasons. First, technical debt is difficult to quantify. Second, something that is not easily quantifiable falls into subjective conversations that, with little or no education on the matter, can end up being resolved by “accepting or deferring the risk”.

In this particular case, even at a subjective level, these exploits are proving that we have been managing Technical Debt backwards. But public cloud providers can’t follow this old bad habit. For them, the cost of staying vulnerable is a death sentence. Imagine deploying a Virtual Machine or subscribing to any cloud service with the disclaimer: “There is a possibility that your data could be compromised due to a vulnerability that will be addressed in the future. By accepting these terms and conditions you are also accepting this risk, fingers crossed ;)”.

Unacceptable, and yet I predict that risk acceptance or deferral will be an option that is considered and discussed in a lot of meeting rooms across the globe. I would like to believe that the data we have on this vulnerability is enough to dismiss the alternative of doing nothing, especially with the example of leadership shown by these cloud providers. But only time will tell.

I would like to leave you with a call to action. If you are a decision maker or a participant in these types of conversations in your organization, ask yourself the following questions:

  1. What is the true cost of staying vulnerable?
  2. Is my organization making the best decisions and taking the right actions for the benefit of our customers, even if they are difficult?
  3. How would I feel if my providers decided to do the same thing that I’m doing for my customers?
