SRE Lessons from the Trenches
April 15, 2020

from The Cloudcast

Emil Stolarsky (@emilstolarsky) and Jaime Woo (@jaimewoo), co-founders of @IncidentLabsInc, talk about experiences running web applications at scale, evolving into SRE roles, communicating SRE concepts across teams, and tips for initial success.

SHOW: 446

SHOW SPONSOR LINKS:

  • MongoDB Homepage - The most popular database for modern applications
  • MongoDB Atlas - MongoDB-as-a-Service on AWS, Azure and GCP
  • Datadog Homepage - Modern Monitoring and Analytics
  • Try Datadog yourself by starting a free, 14-day trial today. Listeners of this podcast will also receive a free Datadog T-shirt
  • DivvyCloud - Achieve continuous security & compliance. Request a free …

SHOW NOTES:

  • Incident Labs (homepage)
  • Ovvy (tool) - On-Call Management
  • Google SRE Book

    Topic 1 - Welcome to the show. Tell us a little bit about your backgrounds, and some of the experiences that led you to focus on SRE.

    Topic 2 - SRE is still an evolving concept, and people are still learning about it. How do you frame a conversation with people about how SRE works? How much is technology-centric and how much is culture/process-centric?

    Topic 3 - We’re all living in an unusual time, given the current COVID-19 pandemic. How do you see SRE changing as work environments change (e.g. WFH) or as volume or change rate is dramatically impacted?

    Topic 4 - What have you found are successful communication and collaboration models between SREs and their associated teams (or other stakeholders)?

    Topic 5 - How well do you find different groups understand the concepts around error budgets and SLOs?

    Topic 6 - If people are just now getting started with SRE, what are some early tips (or tools) that you recommend for them to have initial success (or avoid failures)?

    FEEDBACK?

  • Email: show at thecloudcast dot net

  • Twitter: @thecloudcastnet