The Center for Educational Technology (CET) is an independent organization dedicated to advancing education in Israel and beyond.
Our goal is to provide every student the opportunity to study in a learning environment that combines rich, quality content; state-of-the-art technology; and innovative pedagogy, which together will place students at the forefront of knowledge and prepare them for the challenges of the future.
At any given moment we serve 200,000+ teachers and over 1,000,000 students.
To handle this scale, we practice modern production operations with a complete self-serve approach and develop our platform as a product. We design, develop, and deploy complex systems and run them at scale.
Day 0, Day 1, and Day 2 operations are critical components of a successful SaaS production system, ensuring its deployment, stability, and ongoing improvement.
The effective management of these operations requires highly skilled and dedicated teams, with a deep understanding of both the technology and the business needs it serves.
We are searching for an experienced and dynamic Team Leader to manage our SRE teams. As a SaaS provider in the education sector, we believe in creating a seamless environment for learning and growth. This role is pivotal to ensuring our infrastructure remains robust, agile, and cutting-edge, meeting the demands of educators and students alike.
Key Responsibilities:
Leadership & Collaboration:
Lead, mentor, and grow the SRE teams to ensure optimal performance
Foster a collaborative environment with the R&D department, ensuring seamless CI/CD processes
Ensure smooth cross-functional communications between teams
Infrastructure Management & Monitoring:
Oversee 24/7 production monitoring, ensuring maximum uptime and swift issue resolution
Collaborate with the R&D teams to optimize CI/CD pipelines for both legacy and modern applications
Ensure robust and secure deployment practices in both Azure and AWS environments
Cloud Platform Expertise:
Oversee applications primarily running in Azure, while managing data lake operations in AWS.
Actively manage and optimize both IaaS resources (VMs, MS SQL, IIS, and similar) and more modern resources such as Kubernetes (K8s).
Tooling & Automation:
Provide expertise in tools including Azure DevOps, Git, and Jenkins.
Drive automation initiatives to minimize manual overhead and enhance reliability
Advocate for best practices in version control, CI/CD, and infrastructure-as-code
Continuous Improvement:
Evaluate emerging DevOps and SRE practices and tools, recommending adoption where appropriate
Identify areas of improvement in our infrastructure and deployment practices and implement solutions
Ensure all systems meet security and compliance requirements
Legacy System Management:
Provide strategies for the transition of legacy code and infrastructure to modern platforms
Ensure the consistent performance and reliability of legacy systems during this transitional period.
Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field
3+ years of professional experience in SRE or related roles
Minimum of 2 years in a leadership role within DevOps or SRE
Experience managing robust and scalable production SaaS environments
Expertise in designing and implementing solutions on top of Kubernetes, including a deep understanding of Kubernetes internals and patterns
Mastery of infrastructure-as-code concepts, implementation, and lifecycle
Strong coding skills in Golang and Python
Extensive experience with CI/CD pipelines using tools such as Jenkins and GitLab, including setting up automated builds, testing, and deployment processes