0 comments on “June 23 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

June 23 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rack ngo)

SRE Items of the Week

Datanauts 089: SRE vs Cloud Native vs DevOps
http://bit.ly/2txPXWV

Rob Hirschfeld joins the Datanauts to talk about the term Site Reliability Engineer (SRE) and what it means for IT operations.

Rob explores how the SRE designation is an effort to put operations teams on a more equal footing with developers within an organization. Rob and the Datanauts also discuss how SREs line up with other industry trends such as the cloud native and DevOps movements. LISTEN HERE

Why Does DevOps Require a New Operating Model? By Mustafa Kapadia @MKapadiaTweets
https://devops.com/why-should-cios-redesign-their-organizations/

For many, redesigning the operating model is table stakes for a successful DevOps transformation. But have you ever wondered why? Popular wisdom will have you believe that the main reason for operating model redesign are to…

“Improve collaboration between business and IT”
“Realign metrics”
“Take full advantage of the new tools”
“And even jump start culture change”

While these are all good reasons, frankly they miss the point. Experience suggests there is a more practical reason – match ownership with desired output.

What do we mean by that? Well first, let’s look at how the current model works. READ MORE

What can developers learn from being on call? By Julia Evans @b0rk http://jvns.ca/blog/2017/06/18/operate-your-software/

We often talk about being on call as being a bad thing. For example, the night before I wrote this my phone woke me up in the middle of the night because something went wrong on a computer. That’s no fun! I was grumpy.

In this post, though, we’re going to talk about what you can learn from being on call and how it can make you a better software engineer!. And to learn from being on call you don’t necessarily need to get woken up in the middle of the night. By “being on call”, here, I mean “being responsible for your code when it breaks”. It could mean waking up to issues that happened overnight and needing to fix them during your workday! READ MORE

Kargo Ansible Playbooks foster Collaborative Kubernetes Ops
http://bit.ly/2qENw3I   

Why Kargo?
Making Kubernetes operationally strong is a widely held priority and I track many deployment efforts around the project. The incubated Kargo project is of particular interest for me because it uses the popular Ansible toolset to build robust, upgradable clusters on both cloud and physical targets. I believe using tools familiar to operators grows our community.

We’re excited to see the breadth of platforms enabled by Kargo and how well it handles a wide range of options like integrating Ceph for StatefulSet persistence and Helm for easier application uploads. Those additions have allowed us to fully integrate the OpenStack Helm charts (demo video). READ MORE

newsletter

Subscribe to our new daily DevOps, SRE, & Operations Newsletter https://paper.li/e-1498071701#/

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

  • 2017 New York Venture Summit – LINK

OTHER NEWSLETTERS

0 comments on “June 2 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

June 2 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

RackN and our Co-Founder and CEO Rob Hirschfeld openly called for a significant change to the OpenStack and Kubernetes communities in his VMBlog.com post, How is OpenStack so dead AND yet so very alive to SREs?

“We’re going to keep solving problems in and around the OpenStack community.  I’m excited to see the Foundation embracing that mission.  There are still many hard decisions to make.  For example, I believe that Kubernetes as an underlay is compelling for operators and will drive the OpenStack code base into a more limited role as a Kubernetes workload (check out my presentation about that at Boston).  While that may refocus the coding efforts, I believe it expands the relevance of the open infrastructure community we’ve been building.

Building infrastructure software is hard and complex.  It’s better to do it with friends so please join me in helping keep these open operations priorities very much alive.”

To provide more information on this idea, Rob posted a new blog, OpenStack’s Big Pivot: our suggestion to drop everything and focus on being a Kubernetes VM management workload.

“Sometimes paradigm changes demand a rapid response and I believe unifying OpenStack services under Kubernetes has become an such an urgent priority that we must freeze all other work until this effort has been completed.”

This proposal has caused significant readership for a typical RackN blog as well as on social media so Rob has posted a 2nd post to further the proposal. (re)Finding an Open Infrastructure Plan: Bridging OpenStack & Kubernetes.

It’s essential to solve these problems in an open way so that we can work together as a community of operators.”

As you would expect, RackN is very interested in your thoughts on this proposal and its impact not only on the OpenStack and Kubernetes communities but also how it can transform the ability of IT infrastructure teams to deploy complex technologies in a reliable and scalable manner.

Please contact @zehicle and @rackngo to join the conversation.
_____________

Using Containers and Kubernetes to Enable the Development of Object-Oriented Infrastructure: Brendan Burns GlueCon Presentation

Is SRE a Good Term?
Interview with Rob Hirschfeld (RackN) and Charity Majors (Honeycomb) at Gluecon 2017


_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

Velocity : June 19 – 20 in San Jose, CA

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #74

0 comments on “May 26 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

May 26 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

booth.PNG
Co-Founders of RackN Rob Hirschfeld and Greg Althaus at GlueCon

Reuven Cohen and Rob Hirschfeld Chat at GlueCon17
Reuven Cohen (@ruv) and Rob Hirschfeld discuss data center infrastructure trends concerning provisioning, automation and challenges. Rob highlights his company RackN and the open source project Digital Rebar sponsored by RackN.


_____________

Is SRE a Good Term?
Interview with Rob Hirschfeld (RackN) and Charity Majors (Honeycomb) at Gluecon 2017


_____________

How Google Runs its Production Systems – Get the Book
http://www.techrepublic.com/article/want-to-understand-how-google-runs-its-production-systems-read-this-free-book/

The book Site Reliability Engineering helps readers understand how some Googlers think: It contains the ideas of more than 125 authors. The four editors, Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy, managed to weave all of the different perspectives into a unified work that conveys a coherent approach to managing distributed production systems.

Site Reliability Engineering delivers 34 chapters—totaling more than 500 printed pages from O’Reilly Media—that encompass the principles and practices that keep Google’s production systems working. The entire book is available online at https://landing.google.com/sre/book.html, along with links to other talks, interviews, publications, and events.

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

Velocity : June 19 – 20 in San Jose, CA

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #73

0 comments on “May 19 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

May 19 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week


Kargo Ansible Playbooks foster Collaborative Kubernetes Ops
http://blog.kubernetes.io/2017/05/kargo-ansible-collaborative-kubernetes-ops.html

kubernetes

Making Kubernetes operationally strong is a widely held priority and I track many deployment efforts around the project. The incubated Kargo project is of particular interest for me because it uses the popular Ansible toolset to build robust, upgradable clusters on both cloud and physical targets. I believe using tools familiar to operators grows our community.

We’re excited to see the breadth of platforms enabled by Kargo and how well it handles a wide range of options like integrating Ceph for StatefulSet persistence and Helm for easier application uploads. Those additions have allowed us to fully integrate the OpenStack Helm charts (demo video). READ MORE
___________

Cybercrime for Profit? Five reasons why we need to start driving much more dynamic IT Operations
https://rackn.com/2017/05/16/cybercrime-for-profit-five-reasons-why-we-need-to-starting-driving-much-more-dynamic-it-operations/
pexels-photo-169617

There’s a frustrating cyberattack driven security awareness cycle in IT Operations. Exploits and vulnerabilities are neither new nor unexpected; however, there is a new element taking shape that should raise additional alarm.

Cyberattacks are increasingly profit generating and automated. READ MORE
_____________

Building the SRE Culture at LinkedIn
https://engineering.linkedin.com/blog/2017/05/building-the-sre-culture-at-linkedin

Being a Site Reliability Engineer (SRE) means having to talk about hard problems. Site outages, complex failure scenarios, and other technical emergencies are the things we have to be prepared to deal with every day. When we’re not dealing with problems, we’re discussing them. We regularly perform post-mortems and root cause analyses, and we generally dig into complex technical problems in an unflinching way. READ MORE
_____________
Virtual Panel: OpenStack Summit Boston 2017 Debriefing


_____________

SRE vs. DevOps — a False Distinction?
https://devops.com/sre-vs-devops-false-distinction/

Just a few days before he died at the beginning of the 1990s, a wise man taught us that “the show must go on.” Freddie Mercury’s parting words have long provided the guiding light for many, if not all, ops teams. In their eyes, the production environment should be exposed to minimum risk, even at the expense of new features and problem resolution.

About 10 years ago, Google decided to change its approach to production management. It took the company only a few years to realize that while R&D focused on creating new features and pushing them to production, the Operations group was trying to keep production as stable as possible—the two teams were pulling in opposite directions. This tension arose due to the groups’ different backgrounds, skill sets, incentives and metrics by which they were measured. READ MORE
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

Gluecon : May 24 – 25, 2017 in Denver, CO

  • Surviving Day 2 in Open Source Hybrid Automation – May 23, 2017 : Rob Hirschfeld and Greg Althaus

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #72

0 comments on “May 12 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

May 12 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

RobatOpenStack

OpenStack on Kubernetes: Will it blend? (OpenStack Summit Session) w/ Rob Hirschfeld

OpenStack and Kubernetes: Combining the Best of Both Worlds (OpenStack Summit Session) w/ Rob Hirschfeld

OpenStack Summit Boston Day 1 Notes by Rob Hirschfeld
https://robhirschfeld.com/2017/05/09/openstack-boston-day-1-notes/

Contrary to pundit expectations, OpenStack did not roll over and die during the keynotes yesterday.

In fact, I saw the signs of a maturing project seeing real use and adoption. More critically, OpenStack leadership started the event with an acknowledgement of being part of, not owning, the vibrant open infrastructure community. READ MORE

_______
Immutable Infrastructure Webinar

Attendees:

  • Greg Althaus, Co-Founder and CTO, RackN
  • Erica Windisch, Founder and CEO, Piston 
  • Christopher MacGown, Advisor, IOpipe
  • Riyaz Faizullabhoy,  Security Engineer, Docker
  • Sheng Liang, Founder and CEO Rancher Labs
  • Moderated by Stephen Spector, HPE, Cloud Evangelist

_______
SREies Part1: Configuration Management by Krishelle Hardson-Hurley

SREies is a series on topics related to my job as a Site Reliability Engineer (SRE). About a month ago, I wrote an article about what it means to be an SRE which included a compatibility quiz and resource list to those who were intrigued by the role. If you are unfamiliar with SRE, I would suggest starting there before moving on.

In this series, I will extend my description to include more specific summaries of concepts that I have learned during my first six months at Dropbox. In this edition, I will be discussing Configuration Management. READ MORE

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

Interop ITX : May 15 – 19, 2017 in Las Vegas, NV

Gluecon : May 24 – 25, 2017 in Denver, CO

  • Surviving Day 2 in Open Source Hybrid Automation – May 23, 2017 : Rob Hirschfeld and Greg Althaus

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #71

0 comments on “May 5 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

May 5 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

RackN Announcement
[PRESS RELEASE] RackN Ends DevOps Gridlock in Data Center  

Today we announced the availability of Digital Rebar Provision, the industry’s first cloud-native physical provisioning utility.  We’ve had this in the Digital Rebar community for a few weeks before offering support and response has been great! READ MORE
_
______

Cloud Native PHYSICAL PROVISIONING? Come on! Really?!
 By Rob Hirschfeld

Today, RackN announce very low entry level support for Digital Rebar Provisioning – the RESTful Cobbler PXE/DHCP replacement.  Having a company actually standing behind this core data center function with support is a big deal; however…

We’re making two BIG claims with Provision: breaking DevOps bottlenecks and cloud native physical provisioning.  We think both points are critical to SRE and Ops success because our current approaches are not keeping pace with developer productivity and hardware complexity. READ MORE

RackN @ DevOpsDays Austin

IMG_0810

Slides from Rob Hirschfeld’s talk – The Server Cage Match

SRE vs DevOps vs Cloud Native: The Server Cage Match by Rob Hirschfeld

I don’t believe in DevOps shaming. Our community seems compelled to correct use of DevOps as an adjective for tools, teams and teapots. The frustration is reasonable: DevOps clearly taps into head space for both devs and operators who see a brighter automated future together. For example, check out this excellent DevOps discourse by Cindy Sridharan.

As an industry, we crave artificial conflict so it’s natural to try and distill site reliability engineering (SRE), DevOps and cloud native into warring factions when they are not. They all share a focus on Lean process. READ MORE

SRE News

What is DevOps? By Cindy Sridharan @copyconstruct  
https://medium.com/@cindysridharan/what-is-devops-5b0181fdb953

It happened again this week.

At this Wednesday’s Prometheus meetup I was hosting, I asked one of the attendees what he did for work.

He looked at me briefly before he barked one word in reply — DevOps — and then promptly made a beeline for the pizza at the back of the room. READ MORE
________

An Influx of Kubernetes Installers Raises Questions Around Conformance
https://thenewstack.io/kubernetes-installer-explosion-natural-enthusiasm/

For the Kubecon Europe last month, industry observer Joseph Jacks pulled together a list of over SIXTY (yes, 60) Kubernetes installers and services. This wealth of variation that made itself known as the conference, happily, kicked off a conformance effort to ensure that users get a consistent experience. I’m a strong believer that clear conformance builds ecosystems and have deep experience working on that from my OpenStack DefCore efforts.

In short, conformance is not a vendor issue: it’s a user experience and ecosystem issue.  READ MORE

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

OpenStack Summit : May 8 – 11, 2017 in Boston, MA  

  • OpenStack and Kubernetes. Combining the best of both worlds – Kubernetes Day

Interop ITX : May 15 – 19, 2017 in Las Vegas, NV during    Open Source IT Summit – Tuesday, May 16, 9:00 – 5:00pm  

  • 3:15 – 4:05pm OpenStack and Kubernetes
  • 4:05 – 5:00pm Kubernetes for All

Gluecon : May 24 – 25, 2017 in Denver, CO

  • Surviving Day 2 in Open Source Hybrid Automation – May 23, 2017 : Rob Hirschfeld and Greg Althaus

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #70

0 comments on “RackN Ends DevOps Gridlock in Data Center [Press Release]”

RackN Ends DevOps Gridlock in Data Center [Press Release]

Today we announced the availability of Digital Rebar Provision, the industry’s first cloud-native physical provisioning utility.  We’ve had this in the Digital Rebar community for a few weeks before offering support and response has been great!

DR ProvisionBy releasing their API-driven provisioning tool as a stand-alone component of the larger Digital Rebar suite, RackN helps DevOps teams break automation bottlenecks in their legacy data centers without disrupting current operations. The stand-alone open utility can be deployed in under 5 minutes and fits into any data center design. RackN also announced a $1,000 starter support and consulting package to further accelerate transition from tools like Cobbler, MaaS or Stacki to the new Golang utility.

“We were seeing SREs suffering from high job turnover,” said Rob Hirschfeld, RackN founder and CEO. “When their integration plans get gridlocked by legacy tooling they quickly either lose patience or political capital. Digital Rebar Provision replaces the legacy tools without process disruption so that everyone can find shared wins early in large SRE initiatives.”

The first cloud-native physical provisioning utility

Data center provisioning is surprisingly complex because it’s caught between cutting edge hardware and arcane protocols and firmware requirements that are difficult to disrupt.  The heart of the system is a fickle combination of specific DHCP options, a firmware bootstrap environment (known as PXE), a very lightweight file transfer protocol (TFTP) and operating system specific templating tools like preseed and kickstart.  Getting all these pieces to work together with updated APIs without breaking legacy support has been elusive.

By rethinking physical ops in cloud-native terms, RackN has managed to distill out a powerful provisioning tool for DevOps and SRE minded operators who need robust API/CLI, Day 2 Ops, security and control as primary design requirements. By bootstrapping foundational automation with Digital Rebar Provision, DevOps teams lay a foundation for data center operations that improves collaboration between operators and SRE teams: operators enjoy additional control and reuse and SREs get a doorway into building a fully automated process.

A pragmatic path without burning downing the data center

“I’m excited to see RackN providing a pragmatic path from physical boot to provisioning without having to start over and rebuild my data center to get there.” said Dave McCrory, an early cloud and data gravity innovator.  “It’s time for the industry to stop splitting physical and cloud IT processes because snowflaked, manual processes slow everyone down.  I can’t imagine an easier on-ramp than Digital Rebar Provision”

The RackN Digital Rebar is making it easy for Cobbler, Stacki, MaaS and Forman users to evaluate our RESTful, Golang, Template-based PXE Provisioning utility.  Interested users can evaluate the service in minutes on a laptop or engage with RackN for a more comprehensive trail with expert support.  The open Provision service works both independently and as part of Digital Rebar’s full life-cycle hybrid control.

Scontactee specific features at http://rackn.com/provision/drsa.

Want help starting on this journey?  Contact us and we can help.

0 comments on “How about a CaaPuccino? Krish and Rob discuss containers, platforms, hybrid issues around Kubernetes and OpenStack.”

How about a CaaPuccino? Krish and Rob discuss containers, platforms, hybrid issues around Kubernetes and OpenStack.

CaaPuccino: A frothy mix of containers and platforms.

Check out Krish Subramanian’s (@krishnan) Modern Enterprise podcast (audio here) today for a surprisingly deep and thoughtful discussion about how frothy new technologies are impacting Modern Enterprise IT. Of course, we also take some time to throw some fire bombs at the end. You can use my notes below to jump to your favorite topics.

The key takeaways are that portability is hard and we’re still working out the impact of container architecture.

The benefit of the longer interview is that we really dig into the reasons why portability is hard and discuss ways to improve it. My personal SRE posts and those on the RackN blog describe operational processes that improve portability. These are real concerns for all IT organizations because mixed and hybrid models are a fact of life.

If you are not actively making automation that works against multiple infrastructures then you are building technical debt.

Of course, if you just want the snark, then jump forward to 24:00 minutes in where we talk future of Kubernetes, OpenStack and the inverted intersection of the projects.

Krish, thanks for the great discussion!

Rob’s Podcast Notes (39 minutes)

2:37: Rob intros about Digital Rebar & RackN

4:50: Why our Kubernetes is JUST UPSTREAM

5:35: Where are we going in 5 years > why Rob believes in Hybrid

  • Should not be 1 vendor who owns everything
  • That’s why we work for portability
  • Public cloud vision: you should stop caring about infrastructure
  • Coming to an age when infrastructure can be completely automated
  • Developer rebellion against infrastructure

8:36: Krish believes that Public cloud will be more decentralized

  • Public cloud should be part of everyone’s IT plan
  • It should not be the ONLY thig

9:25: Docker helps create portability, what else creates portability? Will there be a standard

  • Containers are a huge change, but it’s not just packaging
  • Smaller units of work is important for portability
  • Container schedulers & PaaS are very opinionated, that’s what creates portability
  • Deeper into infrastructure loses portability (RackN helps)
  • Rob predicts that Lambda and Serverless creates portability too

11:38: Are new standards emerging?

  • Some APIs become dominate and create de facto APIs
  • Embedded assumptions break portability – that’s what makes automation fragile
  • Rob explains why we inject configuration to abstract infrastructure
  • RackN works to inject attributes instead of allowing scripts to assume settings
  • For example, networking assumptions break portability
  • Platforms force people to give up configuration in ways that break portability

14:50: Why did Platform as a Service not take off?

  • Rob defends PaaS – thinks that it has accomplished a lot
  • Challenge of PaaS is that it’s very restrictive by design
  • Calls out Andrew Clay Shafer’s “don’t call it a PaaS” position
  • Containers provide a less restrictive approach with more options.

17:00: What’s the impact on Enterprise? How are developers being impacted?

  • Service Orientation is a very important thing to consider
  • Encapsulation from services is very valuable
  • Companies don’t own all their IT services any more – it’s not monolithic
  • IT Service Orientation aligns with Business Processes
    Rob says the API economy is a big deal
  • In machine learning, a business’ data may be more valuable than their product

19:30: Services impact?

  • Service’s have a business imperative
  • We’re not ready for all the impacts of a service orientation
  • Challenge is to mix configuration and services
  • Magic of Digital Rebar is that it can mix orchestration of both

22:00: We are having issues with simple, how are we going to scale up?

  • Barriers are very low right now

22:30: Will Kubernetes help us solve governance issues?

  • Kubernetes is doing a go building an ecosystem
  • Smart to focus on just being Kubernetes
  • It will be chaotic as the core is worked out

24:00: Do you think Kubernetes is going in the right direction?

  • Rob is bullish for Kubernetes to be the dominant platform because it’s narrow and specific
  • Google has the right balance of control
  • Kubernetes really is not that complex for what it does
  • Mesos is also good but harder to understand for users
  • Swarm is simple but harder to extend for an ecosystem
  • Kubernetes is a threat to Amazon because it creates portability and ecosystem outside of their platform
  • Rob thinking that Kubernetes could create platform services that compete with AWS services like RDS.
  • It’s likely to level the field, not create a Google advantage

27:00: How does Kubernetes fit into the Digital Rebar picture?

  • We think of Kubernetes as a great infrastructure abstraction that creates portability
  • We believe there’s a missing underlay that cannot abstract the infrastructure – that’s what we do.
  • OpenStack deployments broken because every data center is custom and different – vendors create a lot of consulting without solving the problem
  • RackN is creating composability UNDER Kubernetes so that those infrastructure differences do not break operation automation
  • Kubernetes does not have the constructs in the abstraction to solve the infrastructure problem, that’s a different problem that should not be added into the APIs
  • Digital Rebar can also then use the Kubernetes abstractions?

30:20: Can OpenStack really be managed/run on top of Kubernetes? That seems complex!

  • There is a MESS in the message of Kubernetes under OpenStack because it sends the message that Kubernetes is better at managing application than OpenStack
  • Since OpenStack is just an application and Kubernetes is a good way to manage applications
  • When OpenStack is already in containers, we can use Kubernetes to do that in a logical way
  • “I’m super impressed with how it’s working” using OpenStack Helm Packs (still needs work)
  • Physical environment still has to be injected into the OpenStack on Kubernetes environment

35:05 Does OpenStack have a future?

  • Yes! But it’s not the big “data center operating system” future that we expected in 2010. Rob thinks it a good VM management platform.
  • Rob provides the same caution for Kubernetes. It will work where the abstractions add value but data centers are complex hybrid beasts
  • Don’t “square peg a data center round hole” – find the best fit
  • OpenStack should have focused on the things it does well – it has a huge appetite for solving too many problems.
0 comments on “April 21 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

April 21 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

DigitalRebar Provision deploy Docker’s LinuxKit Kubernetes


_____________

Install Digital Rebar PXE Provision on a Mac OSX System and Test Boot using Virtual Box


_____________

Packet Pushers 333 Automation & Orchestration in Networking
http://packetpushers.net/podcast/podcasts/show-333-orchestration-vs-automation/

While the discussion is all about NETWORK DevOps, they do a good job of decrying WHY current state of system orchestration is so sad – in a word: heterogeneity.  It’s not going away because the alternative is lock-in.  They also do a good job of describing the difference between automation and orchestration; however, I think there’s a middle tier  of resource “scheduling” that better describes OpenStack and Kubernetes.

Around 5:00 minutes into the podcast, they effectively describe the composable design of Digital Rebar and the rationale for the way that we’ve abstracted interfaces for automation.  If you guys really do want to cash in by consulting with it (at 10 minutes), just contact Rob H.
_____________

Digital Magazine Launch: Increment On-Call
https://increment.com/on-call/

Increment is dedicated to covering how teams build and operate software systems at scale, one issue at a time. In this, our inaugural issue, we focus on industry best practices around on-call and incident response.
_____________

Need PXW? Try out this Cobbler Replacement
https://robhirschfeld.com/2017/04/11/provision-preview/ 

INTRO
We wanted to make open basic provisioning API-driven, secure, scalable and fast.  So we carved out the Provision & DHCP services as a stand alone unit from the larger open Digital Rebar project.  While this Golang service lacks orchestration, this complete service is part of Digital Rebar infrastructure and supports the discovery boot process, templating, security and extensive image library (Linux, ESX, Windows, … ) from the main project.

TL;DR: FIVE MINUTES TO REPLACE COBBLER?  YES.

The project APIs and CLIs are complete for all provisioning functions with good Swagger definitions and docs.  After all, it’s third generation capability from the Digital Rebar project.  The integrated UX is still evolving.
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

DevOpsDays Austin : May 4-5, 2017 in Austin TX  

OpenStack Summit : May 8 – 11, 2017 in Boston, MA  

  • OpenStack and Kubernetes. Combining the best of both worlds – Kubernetes Day  

Interop ITX : May 15 – 19, 2017 in Las Vegas, NV

Gluecon : May 24 – 25, 2017 in Denver, CO

  • Surviving Day 2 in Open Source Hybrid Automation – May 23, 2017 : Rob Hirschfeld and Greg Althaus

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #68

0 comments on “April 14 – Weekly Recap of All Things Site Reliability Engineering (SRE)”

April 14 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo). 

SRE Items of the Week

Continuous Discussions (#c9d9) Episode 66: Scaling Agile and DevOps in the Enterprise Watch Rob Hirschfeld in this Electric Cloud Podcast held on 4/11.

On the Continuous Discussions (#c9d9) podcast the discussion was on Scaling Agile and DevOps in the Enterprise.

  • What’s between scaling Agile and scaling DevOps?
  • What are some learnings and patterns for scaling Agile, that can be applied for starting and scaling a DevOps transformation in the enterprise?

Podcast Video Link: https://www.youtube.com/watch?v=uffUoX-O3g8
_____________

Rob Hirschfeld on Containers, Private Clouds, GIFEE, and the Remaining “Underlay Problem”
Rob Hirschfeld Q&A with Gene Kim on ITRevolution

INTRO
Back in October of 2016, I was at OpenStack Conference in Barcelona and ran into a long-time friend, Rob Hirschfeld. He surprised me by talking about a problem domain that we have had discussions about for years, reframing it as “the data center underlay problem.”

His provocative statement was that while OpenStack solves many problems, it didn’t address the fundamental challenges of how to run things like OpenStack on actual physical infrastructure. This is a problem domain that is being radically redefined by the container ecosystem.

This is a problem that Rob has been tirelessly working on for nearly a decade, and it was interesting to get his perspective on the emerging ecosystem, including OpenStack, Kubernetes, Mesos, containers, private clouds in general (which include Azure Stack), etc.  I thought it would be useful to share this with everyone.
_____________

Need PXE? Try out this Cobbler Replacement
Rob Hirschfeld Blog (https://robhirschfeld.com)

INTRO
We wanted to make open basic provisioning API-driven, secure, scalable and fast.  So we carved out the Provision & DHCP services as a stand alone unit from the larger open Digital Rebar project.  While this Golang service lacks orchestration, this complete service is part of Digital Rebar infrastructure and supports the discovery boot process, templating, security and extensive image library (Linux, ESX, Windows, … ) from the main project.

TL;DR: FIVE MINUTES TO REPLACE COBBLER?  YES.

The project APIs and CLIs are complete for all provisioning functions with good Swagger definitions and docs.  After all, it’s third generation capability from the Digital Rebar project.  The integrated UX is still evolving.
_____________

Open Source Collaboration: The Power of No & Interoperability
Christopher Ferris, IBM OpenTech

INTRO
It’s a common misconception that open source collaboration means saying YES to all ideas; however, the reality of successful projects is the opposite.

Permissive open source licenses drive a delicate balance for projects. On one hand, projects that adopt permissive licenses should be accepting of contributions to build community and user base. On the other, maintainers need to adopt a narrow focus to ensure project utility and simplicity. If the project’s maintainers are too permissive, the project bloats and wanders without a clear purpose. If they are too restrictive then the project fails to build community.

It is human nature to say yes to all collaborators, but that can frustrate core developers and users.

For that reason, stronger open source projects have a clear, focused, shared vision.  Historically, that vision was enforced by a benevolent dictator for life (BDFL); however, recent large projects have used a consensus of project elders to make the task more sustainable.  These roles serve a critical need: they say “no” to work that does not align with the project’s mission and vision.  The challenge of defining that vision can be a big one, but without a clear vision, it’s impossible for the community to sustain growth because new contributors can dilute the utility of projects.  [author’s note: This is especially true of celebrity projects like OpenStack or Kubernetes that attract “shared glory” contributors]
_____________

UPCOMING EVENTS
Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

DockerCon 2017 : April 17 – 20, 2017 in Austin, TX
DevOpsDays Austin : May 4-5, 2017 in Austin TX
OpenStack Summit : May 8 – 11, 2017 in Boston, MA  

  • OpenStack and Kubernetes. Combining the best of both worlds – Kubernetes Day  

Interop ITX : May 15 – 19, 2017 in Las Vegas, NV

Gluecon : May 24 – 25, 2017 in Denver, CO

  • Surviving Day 2 in Open Source Hybrid Automation – May 23, 2017 : Rob Hirschfeld and Greg Althaus

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #67