There is a new security vulnerability that should be triggering a massive fleet-wide upgrade and patch for data center operators everywhere.  This one undermines fundamental encryption features embedded in servers’ trusted platform modules (TPMs).  According to Sophos.com, “this one’s a biggie.”

Yet, it’s unlikely anyone will actually patch their firmware to fix this serious issue.

Why?  A lack of automation.  Even if you agree with the urgency of this issue:

  1. It’s unlikely that you can perform a system-wide software patch or system re-image without significant manual effort or operational risk
  2. It’s unlikely that you are actually using TPMs because they are tricky to set up and maintain
  3. It’s unlikely that you have any tooling that automates firmware updates across your fleet
  4. It’s unlikely that you have automation to gracefully roll out an update that can coordinate BIOS and operating system updates
  5. Even if you can do the above (IF YOU CAN, PLEASE CALL ME), it’s unlikely that you can coordinate both patching the BIOS and re-encrypting/rotating the data signed by the keys in the TPM

Being able to perform these actions should be foundational; however, I know from talking to many operators that there are serious automation and process gaps at this layer.  These gaps weaken the whole system because we neither turn on the security features embedded in our infrastructure nor automate ways to systematically maintain them.

This type of work is hard to do.  So we don’t do it, we don’t demand it and we don’t budget for it.

Our systems are way too complex to expect issues like this to be improved away by the next wave of technology.  In fact, we see the exact opposite.  The faster we move, the more flaws are injected into the system.  This is not a security problem alone.  Bugs, patches and dependencies cause even more system churn and risk.

I have not given up hoping that our industry will prioritize infrastructure automation so that we can improve our posture.  I’ve seen that fixing the bottom layers of the stack makes a meaningful difference in the layers above.  If you’ve been following our work, then you already know that is the core of our mission at RackN.

It’s up to each of us individually to start fixing the problem.  It won’t be easy but you don’t have to do it alone.  We have to do this together.
