of 24/7 SRE staff managing day-to-day operations and monitoring, incident engagement, and disaster recovery activities. The... in and manage on-call rotation for the SRE Team. Design, write, and deliver software to improve the availability, scalability...
Leadership & Collaboration: with C-staff, product management, engineering, and design partners. Communication: Create detailed... architecture diagrams, documents, and presentations. Focus on the User Experience (K8s users, Infrastructure Admin, MLOps staff...
excellence, site reliability engineering (SRE) and/or service monitoring, and who can partner effectively with Product, SRE... in service management, SRE, or DevOps. This includes monitoring, logging, post-incident analysis, SRE tooling, and/or incident...