0 comments on “Podcast – A Nice Mix of Ansible and Digital Rebar”

Podcast – A Nice Mix of Ansible and Digital Rebar

Rob Hirschfeld, CEO and Co-Founder, RackN talks with Stephen Spector, HPE Cloud Evangelist about the recent uptake in Ansible news as well as how Digital Rebar Provision assists Ansible users.

Listen to the 9 minute podcast here:

As this is the launch of L8ist Sh9y Podcast from RackN we encourage you to visit our site at https://soundcloud.com/user-410091210 or subscribe to the RSS Feed. We will also be publishing on iTunes as well shortly.

0 comments on “Coal or Diamonds? Configuration Management is Under Pressure”

Coal or Diamonds? Configuration Management is Under Pressure

Cloud Native thinking is thankfully changing the way we approach traditional IT infrastructure.  These profound changes in how we build applications with 12-factor design and containers has deep implications on how we manage configuration and the tools we use to do it.  These are not cloud only impacts – the changes impact every corner of IT data centers.

“You still have to do configuration management but… we’re getting to a point we can do a lot less” (8:30)

Configuration Management is both necessary and very hard. I’ve written and spoken about the developer rebellion against Infrastructure (and will again at DOD Dallas!).  The TL;DR on that lightning talk is “infrastructure sucks.”

In this podcast, Eric and I have time to stretch out and really discuss what’s going on with in both broad and specific terms.  At the 15 minute mark, we start talking about how “radical simplicity” is coming to provisioning and deployment automation.  We break down how the business needs for repeatable and robust automation are driving IT to rethinking huge swaths of their infrastructures.  That transitions into making a whole data center into a CI/CD pipeline.

Podcast: Episode 15: The Death of Configuration Management with Rob Hirschfeld 

“If we have radically better control of the physical infrastructure, then I don’t need anything else to install Kubernetes.” (22:00)

Like always, Eric and I are not shy about taking on IT hot topics.  Dig deep, enjoy and let us know what YOU think about these topics.  We want to hear from you.

0 comments on “August 11 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

August 11 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

Report: DevOps is still considered a new phenomenon
http://sdtimes.com/report-devops-still-considered-new-phenomenon  

While companies have grasped that DevOps leads to an increase in innovation, DevOps adoption and implementation still remains a challenge for many. Logz.io, an AI-powered log analytics company, released its DevOps Pulse 2017 survey in time for today’s SysAdmin Day, highlighting some of the challenges and benefits to DevOps.

The DevOps Pulse report this year was based on data from a survey of 700 companies, with an additional section on DevOps culture because, according to Logz.io, it’s one topic that wasn’t researched enough. READ MORE

Immutable Infrastructure Deployment Challenges for DevOps
http://bit.ly/2vFAWq1

Rob Hirschfeld and Gareth Rushgrove (@garethr) discuss the issues.

DevOps vs SRE vs Cloud Native Talk at DevOps Summit
http://news.sys-con.com/node/4134816 

In his session at @DevOpsSummit at 21st Cloud Expo, Rob Hirschfeld, CEO and co-founder of RackN, will explore this trend and discuss concrete ways to cope with the coming changes. He’ll look at the reasons why SRE is attractive and get specific about ways that teams can bootstrap their efforts and keep their DevOps Fu strong.

Meet the Digital Rebar Mascot
http://bit.ly/2fvnrT7

The Digital Rebar project is pleased to announce our new mascot; however, she doesn’t have a name. We are looking for ideas and you can reach us at @digitalrebar, @zehicle, or comment on this blog. READ MORE
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

OTHER NEWSLETTERS

3 comments on “Immutable Deployment Challenges for DevOps”

Immutable Deployment Challenges for DevOps

Last week, Gareth Rushgrove (@garethr) and I hashed out our viewpoints on the intersection of DevOps and Immutable Infrastructure.  We recorded the call because we want to expand the discussion to include a broader audience and we’d love to hear your opinions!

The gist of the call is that DevOps processes are moving faster and faster as teams embrace the create-destroy-repeat pattern of cloud automation.  This pattern favors immutable images driven by cloudinit style bootstrapping.  This changes our configuration management practice because configuration is front loaded.  It also means that we destroy rather than patch.

We both felt that this immutable pattern will become dominate overtime.

However, there was significant nuance in our position about this change and the challenges that it will pose to operators.  If you care about how immutable infrastructure is going to impact your DevOps plans then you’ll enjoy listening to our short discussion.

If you’re still hungry for the how’s and why’s of Immutable infrastructure, I suggest listening to the excellent panel discussion RackN hosted last May.

0 comments on “July 7 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

July 7 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

Presidential Campaigns & Immutable Infrastructure by @danielbryantuk
https://www.infoq.com/news/2017/06/presidential-infrastructure

At QCon New York 2017 Michael Fisher presented “Presidential Campaigns & Immutable Infrastructure” and discussed the implementation and challenges of provisioning infrastructure for the Hillary for America (HFA) campaign that ran during the 2015-2016 US regional and national elections. Immutable infrastructure was key to the technical success of the campaign – the team moved quickly, but were resilient against failure for the majority of the time. It can take more effort to apply the principle of immutability to everything being deployed, but it is beneficial and developers “like the handshake between SRE and dev”. READ MORE

So you want to be a SRE? by Ingo Averdunk‏ @ingoa
https://hackernoon.com/so-you-want-to-be-an-sre-34e832357a8c

About 9 months ago I set out to leave my teaching career of six years to pursue a career as a Software Engineer. I attended a 3 month Programming Bootcamp called Hackbright Academy during which I not only learned the fundamentals of programming, but more importantly, the fundamentals of what type of work excites me. I realized that I loved design. I loved data-model design, user experience design, architectural design, system design… The list goes on, I love design. Because of this, I thought the best place for me would be as a Front End Engineer, boy was I wrong. READ MORE

LinkedIn Releases Open Source Tools
https://www.martechadvisor.com/news/search-social-ads/linkedin-releases-opensource-tools/

The social networking service for professionals, LinkedIn, has announced that it will be releasing a couple of key tools that will be available as open source projects. These have been primarily created to help businesses deal with issues regarding website outages. The new tools will also be enabling organizations to automatically connect with engineers whenever their applications fail. READ MORE
___________

newsletter

Subscribe to our new daily DevOps, SRE, & Operations Newsletter https://paper.li/e-1498071701#/
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

  • 2017 New York Venture Summit – LINK

OTHER NEWSLETTERS

0 comments on “What makes ops hard? SRE/DevOps challenge & imperative [from Cloudcast 301]”

What makes ops hard? SRE/DevOps challenge & imperative [from Cloudcast 301]

TL;DR: Operators (DevOps & SREs) have a hard job, we need to make time and room for them to redefine their jobs in a much more productive way.

Cloudcast-Logo-2015-Banner-BlueThe Cloudcast.net by Brian Gracely and Aaron Delp brings deep experience and perspective into their discussions based on their impressive technology careers and understanding of the subject matter.  Their podcasts go deep quickly with substantial questions that get to the heart of the issue.  This was my third time on the show (previous notes).

In episode 301, we go deeply into the meaning and challenges for Site Reliability Engineering (SRE) functions.  We also cover some popular technologies that are of general interest.

Author’s Note; For further information about SREs, listen to my discussion about “SRE vs DevOps vs Cloud Native” on the Datanauts podcast #89.  (transcript pending)

Here are my notes from Cloudcast 301. with bold added for emphasis:

  • 2:00 Rob defines SRE (more resources on RackN.com site).
    • 2:30 Google’s SRE book gave a name, even changed the definition, to what I’ve been doing my whole career. Evolved name from being just about sites to a full system perspective.  
    • 3:30 SRE and DevOps are aligned at the core.  While DevOps is about process and culture, SRE is more about the function and “factory.”
    • 4:30 Developers don’t want to be shoving coal into the engine, but someone, SREs, have to make sure that everything keeps running
  • 5:15 Brian asks about impedance mismatch between Dev and Ops.  How do we fix that?
    • 6:30 Rob talks about the crisis brewing for operations innovation gap (link).  Digital Rebar is designed to create site-to-site automation so Operators can share repeatable best practices.
    • 7:30 OpenStack ran aground because Operators because we never created a the practices that could be repeated.  “Managed service as the required pattern is a failure of building good operational software.”
    • 8:00 RackN decomposes operations into isolated units so that individual changes don’t break the software on top

  • 9:20 Brian talks about the increasing rate of releases means that operations doesn’t have the skills to keep up with patching.
    • 10:10 That’s “underlay automation” and even scarier because software is composited with all sorts of parts that have their own release cycles that are not synchronized.
    • 11:30 We need to get system level patch/security.update hygiene to be automatic
    • 12:20 This is really hard!

  • 13:00 Brian asks what are the baby steps?
    • 13:20 We have to find baby steps where there are nice clean boundaries at every layer from the very most basic.  For RackN, that’s DHCP and PXE and then upto Kubernetes.
    • 15:15 Rob rants that renaming Ops teams as SRE is a failure because SRE has objectives like job equity that need to be included.
    • 16:00 Org silos get in the way of automation that have antibodies that make it difficult for SREs and DevOps to succeed.
    • 17:10 Those people have to be empowered to make change
    • 17:40 The existing tools must be pluggable or you are hurting operators.  There’s really no true greenfield, so we help people by making things work in existing data centers.
    • 19:00 Scripts may have technical debt but that does not mean they should just be disposed.
    • 19:20 New and shiney does not equal better.  For example, Container Linux (aka CoreOS) does not solve all problems.  
    • 20:10 We need to do better creating bridges between existing and new.
    • 20:40 How do we make Day 2 compelling?

  • 21:15 Brian asks about running OpenStack on Kubernetes.
    • 22:00 Rob is a fan of Kubernetes on Metal, but really, we don’t want metal and vms to be different.  That means that Kubernetes can be a universal underlay which is threatening to OpenStack.
    • 23:00 This is no longer a JOKE: “Joint OpenStack Kubernetes Environments”
    • 23:30 Running things on Kubernetes (or OpenStack) is great because the abstractions hide complexity of infrastructure; however, at the physical layer you need something that exposes that complexity (which is what RackN does).

  • 25:00 Brian asks at what point do you need to get past the easy abstractions
    • 25:30 You want to never care ever.  But sometimes you need the information for special cases.
    • 26:20 We don’t want to make the core APIs complex just to handle the special cases.
    • 27:00 There’s still a class of people who need to care about hardware.  These needs should not be embedded into the Kubernetes (or OpenStack) API.

  • 28:00 Brian summarizes that we should not turn 1% use cases into complexity for everyone.  We need to foster the skill of coding for operators
    • 28:45 For SREs, turning Operators into coding & automation is essential.  That’s a key point in the 50% programming statement for SREs.
    • In the closing, Rob suggested checking out Digital Rebar Provision as a Cobbler replacement.

We’re very invested in talking about SRE and want to hear from you! How is your company transforming operations work to make it more sustainable, robust and human?We want to hear your stories and questions.

1 comment on “Cybercrime for Profit!? Five reasons why we need to start driving much more dynamic IT Operations”

Cybercrime for Profit!? Five reasons why we need to start driving much more dynamic IT Operations

Author’s call to action: if you think you already know this is a problem, then why do we keep reliving it?  We’re doing our part open with Digital Rebar and we need more help to secure infrastructure using foundational automation.

There’s a frustrating cyberattack driven security awareness cycle in IT Operations.  Exploits and vulnerabilities are neither new nor unexpected; however, there is a new element taking shape that should raise additional alarm.pexels-photo-169617.jpeg

Cyberattacks are increasingly profit generating and automated.

The fundamental fact of the latest attacks is that patches were available.  The extensive impact we are seeing is caused by IT Operations that relies on end-of-life components and cannot absorb incremental changes.  These practices are based on dangerous obsolete assumptions about perimeter defense and long delivery cycles.

It’s not just new products using CI/CD pipelines and dynamic delivery: we must retrofit all IT infrastructure to be constantly refreshed.

We simply cannot wait because the cybersecurity challenges are accelerating.  What’s changed in the industry?  There is a combination of factors driving these trends:

  1. Profit motive – attacks are not simply about getting information, they are profit centers made simpler with hard to trace cryptocurrency.
  2. Shortening windows – we’re doing better at finding, publishing and fixing issues than ever in the open.  That cycle assumes that downstream users are also applying the fixes quickly.  Without downstream adoption, the process fails to realize key benefit.
  3. Automation and machine learning – the attackers are using more and more sophisticated automation to find and exploit vulnerabilities.  Expect them to use machine learning to make it even more effective.
  4. No perimeter – our highly interconnected and mobile IT environments eliminate the illusion of a perimeter defense.  This not just a networking statement: our code bases and service catalogs are built from many outside sources that often have deep access.
  5. Expanding surface area – finally, we’re embedding and connected more devices every second into our infrastructure.  Costs are decreasing while capability increases.  There’s no turning back from that, we we should expect an ongoing list of vulnerabilities.

No company has all the answers for cybersecurity; however, it’s clear that we cannot solve this cybersecurity at the perimeter and allowing the interior to remain static.

The only workable IT posture starts with a continuously deployed and updated foundation.

Companies typically skip this work because it’s very difficult to automate in a cross-infrastructure and reliable way.  I’ve been working in this space for nearly two decades and we’re just delivering deep automation that can be applied in generalized ways as part of larger processes.  The good news is that means that we can finally start discussing real shared industry best practices.

Thankfully, with shared practices and tooling, we can get ahead of the attackers.

RackN focuses exclusively on addressing infrastructure automation in an open way.  We are solving this problem from the data center foundations upward.  That allows us to establish security practice that is both completely trusted and constantly refreshed.  It’s definitely not the only thing companies need to do, but that foundation and posture helps drive a better defense.

I don’t pretend to have complete answers to the cyberattacks we are seeing, but I hope they inspire us to more security discipline.  We are on the cusp of a new wave of automated and fast exploits.

Let us know if you are interested in working with RackN to build a more dynamic infrastructure.

0 comments on “If Private Cloud is dead. Where did it go? How did it get there? [JOINT POST]”

If Private Cloud is dead. Where did it go? How did it get there? [JOINT POST]

TL;DR: Hybrid killed IT.

I’m a regular participant on BWG Roundtable calls and often extend those discussions 1×1.  This post collects questions from one of those follow-up meetings where we explored how data center markets are changing based on new capacity and also the impact of cloud.  

We both believe in the simple answer, “it’s going to be hybrid.” We both feel that this answer does not capture the real challenges that customers are facing.

pexels-photo-325229So who are we?  Haynes Strader, Jr. comes at this from a real estate perspective via CBRE Data Center Solutions.  Rob Hirschfeld comes at this from an ops and automation perspective via RackN.  We are in very different aspects of the data center market.    

Rob: I know that we’re building a lot of data center capacity.  So far, it’s been really hard to move operations to new infrastructure and mobility is a challenge.  Do you see this too?

Haynes: Yes.  Creating a data center network that is both efficient and affordable is challenging. A couple of key data center interconnection providers offer this model, but few companies are in a position to truly leverage the node-cloud-node model, where a company leverages many small data center locations (colo) that all connect to a cloud option for the bulk of their computing requirements. This works well for smaller companies with a spread-out workforce, or brand new companies with no legacy infrastructure, but the Fortune 2000 still have the majority of their compute sitting in-house in owned facilities that weren’t originally designed to serve as data centers. Moving these legacy systems is nearly impossible.

Rob: I see many companies feeling trapped by these facilities and looking to the cloud as an alternative.  You are describing a lot of inertia in that migration.  Is there something that can help improve mobility?

Haynes: Data centers are physical presences to hold virtual environments. The physical aspect can only be optimized when a company truly understands its virtual footprint. IT capacity planning is key to this. System monitoring and usage analytics are critical to make growth and consolidation decisions. Why isn’t this being adopted more quickly? Is it cost? Is it difficulty to implement in complex IT environments? Is it the fear of the unknown?

Rob: I think that it’s technical debt that makes it hard (and scary) to change.  These systems were built manually or assuming that IT could maintain complete control.  That’s really not how cloud-focused operations work.  Is there a middle step between full cloud and legacy?

Haynes: Creating an environment where a company maximizes the use for its owned assets (leveraging sale leasebacks and forward-thinking financing) vs. waiting until end of life and attempting to dispose leads to opportunities to get capital injections early on and move to an OPEX model. This makes the transition to colo much easier, and avoids a large write-down that comes along with most IT transformations. Colocation is an excellent tool if it is properly negotiated because it can provide a flexible environment that can grow or shrink based on your utilization of other services. Sophisticated colo users know when it makes sense to pay top dollar for an environment that requires hyperconnectivity and when to save money for storage and day-to-day compute. They know when to leverage providers for services and when to manage IT tasks in-house. It is a daunting process, but the initial approach is key to getting to that place in the long term.

Rob:  So I’m back to thinking that the challenge for accessing all these colo opportunities is that it’s still way too hard to move operations between facilities and also between facilities and the cloud.  Until we improve mobility, choosing a provider can be a high stakes decision.  What factors do you recommend reviewing?

Haynes: There is an overwhelming number of factors in picking new colos:

  1. Location
  2. Connectivity/Latency
  3. Cloud Connectivity Options
  4. Pricing
  5. Quality of Services
  6. Security
  7. Hazard Risk Mitigation
  8. Comfort with services/provider
  9. Growth potential
  10. Flexibility of spend/portability (this is becoming ever-more important)

Rob: Yikes!  Are there minor operational differences between colos that are causing breaking changes in operations?

Haynes:  We run into this with our clients occasionally, but it is usually because they created two very different environments with different providers. This is a big reason to use a broker. Creating identical terms, pricing models, SLAs and work flows allow for clients to have a lot of leverage when they go to market. A select few of the top cloud providers do a really good job of this. They dominate the markets that they enter because they have a consistent, reliable process that is replicated globally. They also achieve some of the most attractive pricing and terms in the marketplace on a regular basis.

pexels-photo-119661.jpegRob: That makes sense.  Process matters for the operators and consistent practices make it easier to work with a partner.  Even so, moving can save a lot of money.  Is that savings justified against the risk and interruption?

Haynes: This is the biggest hurdle that our enterprise clients face. The risk of moving is risking an IT leader’s job. How do we do this with minimal risk and maximum upside? Long-term strategic planning is one answer, but in today’s world, IT leadership changes often and strategies go along with that. We don’t have a silver bullet for this one – but are always looking to partner with IT leaders that want to give it a shot and hopefully save a lot of money.

Rob: So is migration practical?

Haynes: Migration makes our clients cringe, but the ones that really try to take it on and make it happen strategically (not once it is too late) regularly reap the benefits of saving their company money and making them heroes to the organization.

Rob: I guess that brings us back to mixing infrastructures.  I know that public clouds have interconnect with colos that make it possible to avoid picking a single vendor.  Are you seeing this too?

Haynes: Hybrid, hybrid, hybrid. No one is the best one-stop shop. We all love 7-11 and it provides a lot of great solutions on the run, but I’m not grocery shopping there. Same reason I don’t run into a Kroger every time I need a bottle of water. Pick the right solution for the right application and workload.

Rob: That makes sense to me, but I see something different in practice.  Teams are too busy keeping the lights on to take advantage of longer-term thinking.  They seem so busy fighting fires that it’s hard to improve.

Haynes:  I TOTALLY agree. I don’t know how to change this. I get it, though. The CEO says, “We need to be in the cloud, yesterday,” and the CIO jumps. Suddenly everyone’s strategic planning is out the window and it is off to the races to find a quick-fix. Like most things, time and planning often reap more productive results.

Thanks for sharing our discussion!  

We’d love to hear your opinions about it.  We both agree that creating multi-site management abstractions could make life easier on IT and relatable to real estate and finance. With all of these organizations working in sync the world would be a better place. The challenge is figuring out how to get there!

%d bloggers like this: