Course
Digicomp Code SRE101
Developing a Google SRE Culture («SRE101»)
Course facts
- Discussing Google’s view on DevOps philosophy and the relationship between DevOps and SRE
- Discussing the value SRE can provide to your IT operations
- Articulating Google’s technical and cultural fundamentals of SRE
- Assessing your organization’s maturity level in adopting SRE
- Identifying what skills to look for in a site reliability engineer, and how to train your existing workforce
- Discussing how Google can help you jumpstart SRE in your organization
Many IT organizations experience a disconnect between developers, who focus on agility, and operators, who focus on stability. Site Reliability Engineering (SRE) is how Google bridges the gap between development and operations while also providing mission-critical production support.
In this course, you'll learn the fundamentals and best practices of SRE, the importance of adopting an SRE culture, and how SRE can improve collaboration between IT and business leaders and help the entire organization succeed.
1 Welcome to Developing a Google SRE Culture
- Define Site Reliability Engineering
2 DevOps, SRE, and Why They Exist
- Distinguish between DevOps and SRE
- Articulate the pillars of DevOps
- Explain how SRE practices align to DevOps pillars
3 SLOs with Consequences
- Explain the value SRE can provide to an organization
- Describe the technical fundamentals of SRE (SLOs, error budgets, and blameless postmortems)
- Describe the cultural fundamentals of SRE (psychological safety, blamelessness, unified vision, collaboration, and knowledge sharing)
4 Make Tomorrow Better Than Today
- Describe the technical fundamentals of SRE (continuous integration/continuous delivery, canarying, and toil automation)
- Describe the cultural fundamentals of SRE (design thinking, prototyping, psychology of change, and resistance to change)
5 Regulate Workload
- Describe the technical fundamentals of SRE (measuring toil and reliability, and monitoring)
- Describe the cultural fundamentals of SRE (goal-setting, transparency, and data-driven decision making)
6 Apply SRE in Your Organization
- Assess your organization’s SRE maturity level
- Identify where SRE can be applied within your business
- Recognize the skills an SRE needs
- Articulate the different types of SRE team implementations
- Advocate for SRE culture adoption across your organization
7 Final Assessment
- Assess your knowledge of SRE technical and cultural fundamentals
- IT leaders and business leaders who are interested in embracing the SRE philosophy. Roles include, but are not limited to: CTO, IT director/manager, engineering VP/director/manager.
- Other product and IT roles, such as operations managers or engineers, software engineers, service managers, or product managers, may also find this content useful as an introduction to SRE.
Recommended pre-reading: Site Reliability Engineering: How Google Runs Production Systems, Chapter 1 (Introduction)
Not covered
This course will not cover in-depth examples of SRE technical practices.