
Surgical Ansible & Script Injections before, during or after deployment

RackN CEO, Rob Hirschfeld, has been posting about our unique composable operations approach with Digital Rebar to enable hybrid infrastructure and mix-and-match underlay tooling.

This post shows some of the remarkable flexibility enabled by that approach, allowing operators to perform limited, secure operations against running systems.

via Surgical Ansible & Script Injections before, during or after deployment. — Rob Hirschfeld



Digital Rebar Training Videos

We’re excited to announce an updated set of Digital Rebar training videos.  In response to requests to go beyond the simple Quick Start guide, we created a dedicated training channel and have been producing 15-minute tutorials on a wide range of topics.

Want us to cover a topic?  Just ask us on Gitter!


In some cases, these videos contain information that has not made it into the documentation yet.  Our documentation is open source, and we’d love to incorporate your notes to help make the experience easier for the next user.



The rise of Site Reliability Engineers (SRE) — Rob Hirschfeld

Using infrastructure effectively is a competitive advantage for Google and their SREs carry tremendous authority and respect for executing on that mission.

I’ve been writing about Site Reliability Engineering (SRE) tasks for nearly 5 years under a lot of different names such as DevOps, Ready State, Open Operations and Underlay Operations. SRE is a term popularized by Google (there’s a book!) for the operators who build and automate their infrastructure. Their role is not administration; it is redefining how infrastructure is used and managed within Google.

SRE is about operational excellence and keeping up with the increasingly rapid pace of IT.  It’s a recognition that we cannot scale people as quickly as we add infrastructure.  And, critically, it is not infrastructure specific.

via Evolution or Rebellion? The rise of Site Reliability Engineers (SRE) — Rob Hirschfeld


Why RackN Is Joining Infrastructure Masons

A few months ago, Dean Nelson, who for many years ran eBay’s data center strategy, shared his vision of Infrastructure Masons ( http://www.infrastructuremasons.org ) with me and asked if we were interested in participating. He invited us to attend the very first Infrastructure Masons Leadership Summit, being held November 16th at Google HQ in Sunnyvale, CA. We were honored and are looking forward to it.

In short, the Infrastructure Masons organization is composed of technologists, executives and partners entrusted with building and managing the physical and logical structures of the Digital Age. They are dedicated to the advancement of the industry, the development of their fellow masons, and empowering business and personal use of infrastructure to better the economy, the environment, and society.

During our conversation, Dean explained his belief that, much like water, electricity, or transportation, the majority of internet users expect instant connectivity to their online services with little awareness of the complexity and scale of the physical infrastructure that makes those services possible (I try to explain this to my children and they don’t care); it is simply taken for granted. Dean wants the people and organizations that enable connectivity to receive more recognition for their contributions, to share their ideas, and to collaborate to advance infrastructure technology to the next level. With leaders from Facebook, Google and Microsoft (RackN too!) participating, he has the right players at the table, committed to helping deliver on his vision.

Managing multiple clouds, data centers, services, vendors and operational models should not be an impediment to progress for CIOs, CTOs, cloud operators and IT administrators, but an advantage. The overwhelming complexity of melding networking, containers and security should be made simple. In a software-defined infrastructure, utility-based computing needs to be just like turning on a light or running a water faucet. RackN believes that managing the ongoing lifecycle of automating, provisioning and managing hybrid clouds and data center infrastructure under one operational control plane is possible. At RackN, we work hard to make that vision a reality.

Looking forward into the future, we share Dean’s vision and look forward to helping drive the Infrastructure Masons mission.

Want to hear more?  Read an open ops take by RackN CEO, Rob Hirschfeld.

Author: Dan Choquette, Co-Founder/COO of RackN


Cloud Migrations & What We Can All Learn From NBA Legend Allen Iverson


NBA Hall-of-Famer and 2001 league MVP Allen Iverson averaged more than 26 points a game during his career. He is recognized as one of the greatest to play his position. He is also well known for his “We’re Talkin’ ‘Bout Practice” rant (it’s on YouTube. Pure comedy). For multiple reasons, Iverson found little value in practicing with his teammates, as he felt he was “The Answer” (his actual nickname) to the team’s championship hopes. Even though it would have made the team more well-rounded, he didn’t feel the need to offload some of his offensive responsibilities to other players. Needless to say, Iverson never won an NBA title during his 17-year career.

I am seeing similar parallels with organizations that have “lifted and shifted” their legacy workloads to the cloud. Many have hit the traditional Day 1 problems such as buggy containers and the usual networking, hypervisor compatibility, security, compliance and performance issues. Many are also experiencing buyer’s remorse once they get to the cloud: they have not fully addressed the risk of not making their workload stack portable from AWS to GCE, or even back to private IaaS, and they have overlooked the role of their existing IT engineers in this journey.

In my opinion, cloud portability is a nebulous term. Products such as RightScale, Scalr, Morpheus and other cloud managers are excellent at moving workloads from one cloud to another, but they are actually only doing half the job. When advanced technologies such as SDN, container orchestration and microservices are added, along with the constant churn of configuration updates and changes that impact the entire stack, the result is a lifecycle management nightmare.  Manually updating cookbooks, manifests and automation runbooks, and composing these operations in a monolithic format, is an arduous chore and a huge time sink. Additionally, if the workload has been refactored only for AWS/CloudFormation and a couple of months later consumption costs become unbearable, the customer is locked in and any hoped-for OPEX advantage has evaporated.

We are also seeing the traditional IT engineer squeezed out of the cloud movement, and they shouldn’t be. While it is sensible for data centers to be consolidated (or shut down entirely) over a 3-5 year period, security, data locality, compliance and licensing are all major factors to consider when keeping some on-prem IaaS and physical gear. When a workload needs to be moved back to a VM cluster, a private cloud or bare metal, who is there to ensure that these important functions are addressed? While they are not coders or full-stack DevOps engineers by trade, with collaboration and intelligent automation tools that make the right provisioning and configuration decisions under a unified CloudOps model, IT engineers can help DevOps teams get the cloud to where it needs to be so the expected CAPEX and OPEX benefits are realized.

At RackN, we automatically abstract away the unknown complexities of cloud migration-to-production use cases and allow CIOs to continue to innovate, modernize and make their workloads and operational models portable. We believe the cloud and infrastructure underlay that enable platform portability, together with the traditional IT engineer, are critical teammates in a winning strategy, even if Allen Iverson doesn’t think so.


About the Author:

Dan Choquette is Co-Founder/COO of RackN. With Allen Iverson as inspiration, Dan will continue to work on his jump shot (which is an effort in futility).



Bugs Bunny, Prince and Enabling True Hybrid Infrastructure Consumption

OK- Stay with me on this. I’m drawing parallels again.  🙂

Like many from my generation, my initial exposure to classical music and opera was derived from Bugs Bunny on Saturday mornings (culturally deprived, I know). One of the cartoons I remember well was Bugs trying to get even with the heavy-set opera singer who disrupts Bugs’ banjo playing. In order to exact his revenge, Bugs infiltrates the opera singer’s concert by impersonating the famous long-haired (hare…get it?) conductor, Leopold Stokowski. He proceeds to force the tenor to hit octaves that structurally compromise the amphitheater, and as it crumbles, it leaves the tenor bruised and battered. Bugs is, as always, victorious.


In examining Bugs’ strategy (let’s assume he actually had one), Bugs took over operations of the orchestra’s musical program to achieve his goal of getting the tenor “in line,” so to speak. As I prepare to head down to the OpenStack Conference in Austin, TX next week, I’m seeing similar patterns develop in the cloud and data center infrastructure space which are very “Bugs/Leopold-like.” As organizations decide how to consolidate data centers, containerize apps and move to the cloud, vendors and open source technologies offer value; however, true operational, infrastructure and platform independence are not what they appear to be. For example, once you move your apps off the data center to AWS or VMware and later determine that you are paying too much or the workload is no longer appropriate for the infrastructure, good luck replicating the configuration work done in CloudFormation on another cloud or back in the data center. The same rationale applies to other technologies such as converged infrastructure and proprietary private cloud platforms. As the customer, to achieve scale and remove operational pain you must fall in line. That in itself is a big commitment to make in a still-evolving and maturing technology industry and a dynamic business climate.

On an unrelated topic, I was saddened to learn of the passing of Prince this past week. While not a die-hard fan, I liked his music. He was a great composer of songs and had a style all his own. Beyond his music and sheer talent, I admired his business beliefs and deep desire to maintain creative ownership and control of his music and his brand.

Despite his fortune and fame, there was a period in the middle of Prince’s career in which he felt creatively and financially locked in by the big record companies. Once Prince (and the unpronounceable symbol) broke away from Warner Music, he was able to produce music under his own label. This enabled him to create music without a major record label dictating when he needed to produce a new album and what it needed to sound like. In addition, he was now able to market his new recordings to the distribution platform that supported his artistic and financial goals. While still having ties to Warner Music, he was no longer bound by their business practices. Along with starting his own music subscription service, Prince cut deals with Arista, Columbia, iTunes and Sony. Prince’s music production had operational portability, business agility and choice (seven Grammy awards and 100 million record sales also help create that kind of leverage).

While open APIs and containers offer some portability, at RackN we believe they do not offer a completely free market experience to the cloud and infrastructure consumer. If the business decides it is paying too much for AWS, it should not let the operational underlay and configuration complexity lock it to the infrastructure provider. It should be able to transfer its business to Google, Azure, Rackspace or Dreamhost with ease. We believe technologies that create portable, composable operational workflows drive true infrastructure and platform independence and, as a benefit, reduce business risk. Choosing a platform and being forced to use it are two very different things.

In conclusion, when considering moving workloads to the cloud, adopting converged infrastructure platforms or using DevOps automation tools, consider how you can achieve programmable operational portability and agility. Think about how you can best absorb new technologies without causing operational disruption in your infrastructure. Furthermore, ensure you can accomplish this in a repeatable, automated fashion. Analyze how you can abstract away complex configurations for security, networking and container orchestration technologies and make them adaptable from one infrastructure platform to another. Attempt to eliminate configuration versioning as much as possible, and make upgrades simple and automated so your DevOps staff do not have to be experts (they are stressed out enough).

If you are attending the OpenStack Conference this week, look me up. While I am far from a music expert, I’ll be happy to share with you my insights on how to spot a technology vendor that likes to play a purple guitar as opposed to one that eats carrots and plays the banjo.

-Dan Choquette: Co-Founder, RackN




Is Hybrid DevOps Like The Tokyo Metro?

By Dan Choquette

Is DevOps at scale like a major city’s subway system? Both require strict processes and operational excellence to move a lot of different parts at once. How else?

If you have had the pleasure of riding the Tokyo Metro, you might agree that it’s an interesting (and confusing) experience, especially if you need to change lines! All told, there are 9 lines and more than 180 stations, with a daily ridership of almost 7 million people!


A few days ago, I had a conversation with a potential user deploying Kubernetes with Contrail networking on Google Cloud repeatedly in a build/test/dev scenario. Another conversation was about the need to provision thousands of x86 bare metal servers once or twice a week with different configurations and networking, with the ultimate goal of controlling their metal as they would a cloud instance in AWS. Cool stuff!

Since we here at RackN believe Hybrid DevOps is a MUST for Hybrid IT (after all, we are a start-up and have bet our very lives on it, so we REALLY believe it!), I thought about how Hybrid DevOps compares to the Tokyo Metro (earlier that day I had read about Tokyo in the news and my mind wandered). In my attempt to draw the parallel, below is an SDLC DevOps framework that you have seen 233 times before in blogs like this one.


In terms of process, I’m sure you can notice how similar it is to the Metro, right?


<more crickets>

When both operate as they should, they are the epitome of automation, control, repeatability and reliability. Disciplined, automated at-scale DevOps environments do have some similarity to the Ginza or Tozai line: you have different people (think apps) from all walks of life boarding a train, needing to get somewhere, and needing to follow steps in a process (maybe the “Pusher” is the scrum or DevOps governance tool, but we’ll leave that determination for the end). However, as I compare it to Hybrid DevOps, the Tokyo Metro is not hybrid-tolerant. With subways, if a new subway car is added, tracks are changed, or a new station is added on the fly to better handle congestion, everything stops or turns into a logistical disaster. In addition, there is no way to test how it will all flow beforehand. There will be operational glitches, and millions of angry customers will not reach their destination in a timely fashion, or at all.

The same is metaphorically true for Hybrid DevOps in Hybrid IT. In theory, the Hybrid DevOps pipeline includes build/test/dev and continuous integration/deployment for all platforms, business models, governance models, security policies and software stacks that depend on the physical/IaaS/container underlay. Developers and operators need to test against multiple platforms (cloud, VM and metal) and, in order to realize value, move into production rapidly while frequently adjusting to changes of all kinds. They also need to layer multiple technologies and security policies into an operational pipeline with hundreds of moving parts, which requires precise configuration settings made in a sequenced, orchestrated manner.

At RackN, we believe the ability to continuously test, integrate, deploy and manage complex technologies in a Hybrid IT scenario is critical to successful adoption in production. The most effective way to accomplish that is a central platform that can govern Hybrid DevOps at scale: one that can automate, orchestrate and compose all the necessary configurations and components in a sequenced fashion. Without one, haphazard assembly and lack of governance erode the overall process and lead to failure. Just like the “Pusher” on the platform, without governance both the Tokyo Metro and a Hybrid DevOps model at scale used for a Hybrid IT use case lead to massive delays, dissatisfied customers and chaos.





Kubernetes 18+ ways – yes, you can have it your way

By Rob Hirschfeld

Lately, I’ve been talking about the general concept of hybrid DevOps adding composability, orchestration and services to traditional configuration. It’s time to add a concrete example, because the RackN team is delivering it with Digital Rebar and Kubernetes.

So far, we enabled a single open platform to install over 18 different configurations of Kubernetes simply by changing command line flags [videos below].

By taking advantage of the Digital Rebar underlay abstractions and orchestration, we are able to use open community installation playbooks for a wide range of configurations.

So far, we’re testing against:

  • Three different clouds (AWS, Google and Packet) not including the option of using bare metal.
  • Two different operating systems (Ubuntu and CentOS)
  • Three different software defined networking systems (Flannel, Calico and OpenContrail)

Those 18 are just the tip of the iceberg that we are actively testing. The actual matrix is much deeper.
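To make the arithmetic concrete, the tested matrix above can be enumerated as a short sketch. The option names here are illustrative only, not actual Digital Rebar flags:

```python
from itertools import product

# Enumerate the tested combinations from the list above.
# The option names are illustrative, not actual Digital Rebar flags.
clouds = ["aws", "google", "packet"]
operating_systems = ["ubuntu", "centos"]
networks = ["flannel", "calico", "opencontrail"]

configs = [
    {"cloud": c, "os": o, "network": n}
    for c, o, n in product(clouds, operating_systems, networks)
]

print(len(configs))  # 3 clouds x 2 operating systems x 3 networks = 18
```

Adding a fourth dimension (say, a logging option) multiplies the matrix again, which is why isolating each choice matters.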


The composable architecture of Digital Rebar means that all of these variations are isolated. We are not creating 18 distinct variations; instead, the system chains options together and abstracts the differences between steps.

That means that we could add different logging options, test sequences or configuration choices into the deployment with minimal coupling to previous steps. This enables operator choice and vendor injection in a way that allows collaboration around common components. By design, we’ve eliminated fragile installation monoliths.

All it takes is a Packet, AWS or Google account to try this out for yourself!

DevOps workers, your mother was right: always bring a clean Underlay.

Why did your mom care about underwear? She wanted you to have good hygiene. What is good Ops hygiene? It’s not as simple as keeping up with the laundry, but the idea is similar. It means that we’re not going to get surprised by something in our environment that we’d taken for granted. It means that we have a fundamental level of control to keep clean. Let’s explore this in context.

I’ve struggled with the term “underlay” for infrastructure for a long time. At RackN, we generally prefer the term “ready state” to describe getting systems prepared for install; however, underlay fits very well when we consider it as the foundation for building up a platform like Kubernetes, Docker Swarm, Ceph or OpenStack. Even more than single-operator applications, these community-built platforms require carefully tuned and configured environments. In my experience, getting the underlay right dramatically reduces installation challenges for the platform.

What goes into a clean underlay? All your infrastructure and most of your configuration.

Just buying servers (or cloud instances) does not make a platform. A cloud underlay is nearly as complex, but let’s assume metal here. To turn nodes into a cluster, you need to set up their RAID and BIOS. Generally, you’ll also need to configure out-of-band management IPs and security. Those RAID and BIOS settings are specific to the function of each node, so you’d better get them right. Then install the operating system. That will need access keys, IP addresses, names, NTP, DNS and proxy configuration just as a start. Before you connect to the wider network, make sure to point updates at a local mirror and apply site-specific requirements. Installing Docker or an SDN layer? You may have to patch your kernel. It’s already overwhelming, and we have not even gotten to the platform-specific details!
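A few of those prerequisites can be sanity-checked before any install begins. This is a minimal sketch, assuming a local package mirror and an optional site proxy; the hostnames and environment variables are hypothetical, not part of any real Digital Rebar workflow:

```python
import os
import socket

# Illustrative pre-install underlay checks (hostnames are hypothetical).
def mirror_resolves(hostname="mirror.example.com"):
    """The OS install needs DNS working before it can reach a mirror."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False

def proxy_configured():
    """Sites behind a proxy must export it before any package pulls."""
    return bool(os.environ.get("http_proxy") or os.environ.get("https_proxy"))

report = {
    "dns": mirror_resolves("localhost"),  # localhost resolves without a network
    "proxy": proxy_configured(),
}
print(report)
```

Catching a missing mirror or proxy here is far cheaper than discovering it halfway through a platform install.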

Buried in this long sequence of configurations are critical details about your network, storage and environment.

Any mistake here and your install goes off the rails. Imagine that you’re building a house: it’s very expensive to change the plumbing lines once the foundation is poured. Thankfully, software configuration is not concrete, but the cost of dealing with a bad setup is just as frustrating.

The underlay is the foundation of your install. It needs to be automated and robust.

The challenge compounds once an installation is already in progress because adding the application changes the underlay. When (not if) you make a deploy mistake, you’ll have to either reset the environment or make your deployment idempotent (meaning, able to run the same script multiple times safely). Really, you need to do both.

Why do you need both fast resets and component idempotency? They each help you troubleshoot issues but in different ways. Fast resets ensure that you understand the environment your application requires. Post install tweaks can mask systemic problems that will only be exposed under load. Idempotent action allows you to quickly iterate over individual steps to optimize and isolate components. Together they create resilient automation and good hygiene.
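A minimal sketch of what component idempotency means in practice (the function and configuration keys here are invented for illustration): a step that checks state before acting can be re-run as many times as you like.

```python
# An idempotent step converges state instead of blindly applying changes,
# so re-running the whole deployment is safe. Keys are illustrative.
def ensure_setting(config, key, value):
    """Apply a setting only if needed; report whether anything changed."""
    if config.get(key) == value:
        return False  # already converged: re-running is a safe no-op
    config[key] = value
    return True

state = {}
first = ensure_setting(state, "ntp_server", "0.pool.ntp.org")
second = ensure_setting(state, "ntp_server", "0.pool.ntp.org")
print(first, second)  # the first run changes state, the second is a no-op
```

The changed/unchanged return value is also what lets you iterate quickly on a single step while leaving the rest of the environment alone.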

In my experience, the best deployments involved a non-recoverable/destructive performance test followed by a completely fresh install to reset the environment. The Ops equivalent of a full dress rehearsal to flush out issues. I’ve seen similar concepts promoted around the Netflix Chaos Monkey pattern.

If your deployment is too fragile to risk breaking in development and test then you’re signing up for an on-going life of fire fighting. In that case, you’ll definitely need all the “clean underwear” you can find.


We need DevOps without Borders! Is that “Hybrid DevOps?”

The RackN team has been working on making DevOps more portable for over five years.  Being portable between vendors, sites, tools and operating systems means that our automation needs to be hybrid in multiple dimensions by design.

Why drive for hybrid?  It’s about giving users control.

I believe that applications should drive the infrastructure, not the reverse.  I’ve heard many times that the “infrastructure should be invisible to the user.”  Unfortunately, lack of abstraction and composability makes it difficult to code across platforms.  I like the term “fidelity gap” to describe the cost of these differences.

What keeps DevOps from going hybrid?  Shortcuts related to platform entangled configuration management.

Everyone wants to get stuff done quickly; however, we make the same hard-coded ops choices over and over again.  Big bang configuration automation that embeds sequence assumptions into the script is not just technical debt; it’s fragile and difficult to upgrade or maintain.  The problem is not configuration management (that’s a critical component!), it’s the lack of system level tooling that forces us to overload the configuration tools.

What is system level tooling?  It’s integrating automation that expands beyond configuration into managing sequence (aka orchestration), service orientation, script modularity (aka composability) and multi-platform abstraction (aka hybrid).
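As a toy sketch of how those factors interlock (all function, step and platform names here are invented for illustration, not Digital Rebar APIs): small composable steps run in an explicit order against a pluggable platform.

```python
# Composability: each step is a small, swappable unit.
def provision(ctx):
    ctx["nodes"] = [f"{ctx['platform']}-node-{i}" for i in range(3)]

def configure(ctx):
    ctx["configured"] = list(ctx["nodes"])  # depends on provision running first

# Orchestration: sequence matters, later steps build on earlier state.
def pipeline(platform, steps):
    ctx = {"platform": platform}
    for step in steps:
        step(ctx)
    return ctx

# Hybrid: the same pipeline runs unchanged on different underlays.
for platform in ("metal", "aws"):
    print(pipeline(platform, [provision, configure])["configured"])
```

Swapping `configure` for a different module, or inserting a test step between the two, changes nothing else in the pipeline, which is the point.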

My ops automation experience says that these four factors must be solved together because they are interconnected.

What would a platform that embraced all these ideas look like?  Here is what we’ve been working towards with Digital Rebar at RackN:

Mono-Infrastructure IT vs. “Hybrid DevOps”:

  • Locked into a single platform → Portable between sites and infrastructures with layered ops abstractions.
  • Limited interop between tools → Adaptive to mix and match best-for-job tools.  Use the right scripting for the job at hand and never force migrate working automation.
  • Ad hoc security based on site specifics → Secure using repeatable automated processes.  We fail at security when things get too complex to change and adapt.
  • Difficult to reuse ops tools → Composable modules enable ops pipelines.  We have to be able to interchange parts of our deployments for collaboration and upgrades.
  • Fragile configuration management → Service orientation simplifies API integration.  The number of APIs and services is increasing; configuration management alone is not sufficient.
  • Big bang (configure then deploy) scripting → Orchestrated action is critical because sequence matters.  Building a cluster requires sequential (often iterative) operations between nodes in the system.  We cannot build robust deployments without ongoing control over the order of operations.

Should we call this “Hybrid Devops?”  That sounds so buzz-wordy!

I’ve come to believe that Hybrid DevOps is the right name.  More technical descriptions like “composable ops” or “service oriented devops” or “cross-platform orchestration” just don’t capture the real value.  All these names fail to capture the portability and multi-system flavor that drives the need for user control of hybrid in multiple dimensions.

Simply put, we need devops without borders!

What do you think?  Do you have a better term?
