Today’s organizations deal with a higher volume of change in a more complex technology environment, which raises the risk of outages and incidents. IT teams must therefore improve service reliability and system resiliency. With automation and observability becoming key factors in faster, more efficient deployments, SRE has emerged as both one of the fastest-growing enterprise roles and a set of operational practices for managing services at scale.
The SRE (Site Reliability Engineering) Practitioner workshop introduces ways to scale services economically and reliably in an organization. It explores strategies to improve agility, cross-functional collaboration, and transparency into service health, with the aim of building resiliency by design, automation, and closed-loop remediation.
The workshop aims to equip participants with the practices, methods, and tools to engage people across the organization who are involved in reliability, using real-life scenarios and case stories. Upon completion, participants will have tangible takeaways to apply back in the office, such as implementing SRE models that fit their organizational context, building advanced observability in distributed systems, building resiliency by design, and responding to incidents effectively using SRE practices.
The workshop was developed by drawing on key SRE workshops, engaging with thought leaders in the SRE space, and working with organizations embracing SRE to extract real-life best practices. It is designed to teach the key principles and practices necessary for starting SRE adoption.