By rethinking physical ops in cloud-native terms, RackN has managed to distill a powerful provisioning tool for DevOps and SRE-minded operators who need a robust API/CLI, Day 2 Ops, security and control as primary design requirements. By bootstrapping foundational automation with Digital Rebar Provision, DevOps teams lay a foundation for data center operations that improves collaboration between operators and SRE teams: operators enjoy additional control and reuse, and SREs get a doorway into building a fully automated process.
CaaPuccino: A frothy mix of containers and platforms. Check out Krish Subramanian’s (@krishnan) Modern Enterprise podcast today for a surprisingly deep and thoughtful discussion about how frothy new technologies are impacting Modern Enterprise IT. Of course, we also take some time to throw some fire bombs at the
Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content, please contact us at email@example.com or tweet Rob (@zehicle) or RackN (@rackngo). SRE Items of the Week: DigitalRebar Provision deploy Docker’s LinuxKit
Welcome to the first post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content, please contact us at firstname.lastname@example.org. SRE Items of the Week: Things I Learned Managing Site Reliability for Some of the World’s Busiest
This week, we added Install Wizard templates to the DC/OS install automation we built in collaboration with Mesosphere last year. That makes it even easier to run DC/OS on physical infrastructure. Like our Kubernetes work, the Digital Rebar automation uses the same community dcos_install.sh that’s used in the community documentation. The difference is that we’re also driving all the underlay prep and configuration automatically.
Operators should be able to buy infrastructure (physical and cloud) from any vendor and run it in a consistent way. Instead of days or weeks to get infrastructure running, it should take hours and be fully automated from power-on. We should be able to rehearse on cloud and transfer that automation directly to (and from) physical without modification. That practice and pace should be the norm instead of the exception.
Our focus on SRE series continues… At RackN, we see a coming infrastructure explosion in both complexity and scale. Unless our industry radically rethinks operational processes, current backlogs will escalate and stability, security and sharing will suffer. An entire chapter of the Google SRE book was dedicated to the benefits
This post is part of an SRE series grounded in the ideas inspired by the Google SRE book. Every Ops team I know is underwater and doesn’t have the time to catch their breath. Why does the load increase and leave Ops behind? It’s because IT is increasingly fragmented and