Senior Site Reliability Engineer

Remote: 
Full Remote

Offer summary

Qualifications:

  • Extensive expertise in building and deploying web applications in AWS, Azure, and GCP.
  • Strong knowledge of Kubernetes, Docker, and Terraform for deployment systems.
  • Experience with reliability engineering practices including SLOs/SLIs and disaster resilience.
  • Proficiency in Python and version control systems like Git.

Key responsibilities:

  • Develop and manage CiviForm products, ensuring robust and scalable production environments.
  • Handle staging and production environments, being on call for outages.
  • Collaborate with governments to address service-related issues and improve deployment systems.
  • Implement monitoring strategies and security measures for deployments.

Exygy TPE https://www.exygy.com/
11 - 50 Employees

Job description

About Exygy

Exygy is a digital innovation studio on a mission to build resilient and healthy communities. We enable impact-focused organizations to rethink experiences and create digital products that solve their problems and delight users. Our diverse team brings a breadth of technical expertise, user-centric perspectives, and product strategy to every engagement. As a certified B-Corporation, we are driven by our fierce commitment to the betterment of humanity. Our clients include CARE International, QURE Healthcare, San Francisco Mayor's Office of Housing, and Hopelab.
Exygy has embraced a work-from-home philosophy and is now a remote-first company. This role can be located anywhere in the U.S. and may occasionally require travel for meetings or team-building events. Exygy stays connected and engaged through virtual team events, weekly all-hands meetings, and regular Zoom workshops and trainings.

Summary

Exygy seeks an enthusiastic and experienced Senior SRE who is passionate about making a difference in the world with technology. Join our growing team to build and support a wide variety of high-impact projects across all of Exygy. This is a full-time remote position. As an SRE, you’ll spend most of your time working with the CiviForm team to support product development, manage staging and production environments, and develop deployment systems and infrastructure so that the platform is secure and dependable.

Who Does This Role Report To: 
Principal Engineer on CiviForm

Supervisory Responsibilities:
This role does not have supervisory responsibility.
This role is at the P3 level, an internal grade level that correlates with our salary bands and compensation philosophy.


Responsibilities
  • Participate in the development of CiviForm products as a service, building upon our existing deployment system and building out a new Kubernetes-based prototype to ensure robust, secure, and scalable production instances.  
  • Manage staging and production environments, being on call to address outages
  • Work with governments on issues related to the service
  • Own and evolve the deployment systems
  • Participate in the development of a new CiviForm SaaS (Software as a Service). 
  • Own development of this deployment system utilizing Kubernetes from prototyping through to delivery.
  • CiviForm’s existing infrastructure is currently defined with Python and Terraform and deployed into AWS and Azure. Improve the flexibility and features of the system to meet the needs of governments deploying CiviForm to their own cloud providers.
  • Define, implement, gather, and analyze metrics from deployments to identify areas for improvement related to cloud configuration
  • Partner with the engineering team to improve services through rigorous testing and release procedures, as well as resolving scaling issues and improving resilience
  • Draft Service Level Objectives and define Service Level Indicators, and implement them
  • Develop playbooks for deployments, including a strategy for monitoring and alerting and guidance on how to address issues
  • Identify and mitigate security risks in deployments
  • Contribute to CI/CD implementation and best practices 

Required Skills
  • Extensive expertise building and deploying web apps in AWS, Azure, and GCP
  • Networking
  • Distributed systems
  • Public cloud and container security (RBAC, process isolation, network security, firewalls, certificate management, etc.)
  • Reliability engineering (disaster resilience, multi-zonal deployments, logging practices, SLOs/SLIs, monitoring, deployment strategy, etc.)
  • Kubernetes
  • Docker/containers
  • Terraform
  • Python
  • Version control systems (we use Git/GitHub)
  • Linux
  • DevOps concepts and best practices
  • Authentication technologies such as OIDC, SAML

Bonus Skills
  • Java
  • TypeScript
  • Apps utilizing the Play framework or similar server-side MVC frameworks
Benefits & Perks
     
    Our Values
    • Learning and Growth
    • Pursuit of Excellence
    • Leaning into Fear
    • Spirit of Generosity 
    • Embrace the Whole Person

    Mission Statement
    To ensure all communities, especially marginalized communities, have access to basic social needs. We use our expertise in strategic product development, thoughtful design, and tailored technical solutions to identify the greatest barriers and build solutions to overcome them.

    Vision Statement
    Everyone has equitable access to the basic social needs that encourage them to thrive. These social needs include but are not limited to affordable housing, physical and mental health care, food security, quality education, stable employment opportunities, and public benefits.

    Employee Enablement Support: 
    Laptop provided
    $2000 annual (per calendar year) remote environment setup budget, which can be used to outfit your home office, work from co-working spaces or coffee shops, or meet up and collaborate with your teammates.

    Wellness Budget
    $100 monthly to pay for your wellness item of choice (gym membership, classes, massages etc.)

    Professional Development:
    $1000 annual (per calendar year) stipend towards professional development

    Retirement & 401k Plans:
    Employees are eligible for a 100% employer match of up to 4% of employee contribution

    Medical:
    Full benefits package with options up to 100% coverage toward select medical, dental, and vision plans.

    Remote First Working Environment:
    • Exygy employees may work remotely across the US
    • Exygy employees' main residence must be within the US

    Work Life Balance 
    Exygy proudly embraces work/life balance and has adopted a four-day work week (4DWW) policy beginning in 2025, recognizing Monday through Thursday as business days. Full-time employees are expected to work 32-40 hours in a typical week. Fridays are flexible, allowing employees to take the day off when their workload allows.

    Collaborative working hours: 
    We aim to hold all internal meetings between 10 AM and 3 PM PT. We expect all Exygy staff to be available during these set working hours.

    Time Off: 
    Flexible paid time off, a minimum of 14 paid holidays, and an org-wide closure from Christmas Day through New Year's Day
    Competitive paid parental and family leave

    EEO & Commitment to Equity, Diversity, and Inclusion
    We are actively seeking to create a diverse and equitable work environment because we believe that creates a stronger team.

    Exygy values a diverse workplace and strongly encourages women, people of color, LGBTQIA individuals, people with disabilities, members of ethnic minorities, foreign-born residents, older members of society, and others from minority groups and diverse backgrounds to apply. Exygy is an equal opportunity employer. We will not discriminate against applicants because of race, color, sex (including pregnancy), sexual orientation, gender identity or expression, age, religion, national origin, disability, ancestry, marital status, veteran status, medical condition, or any protected category prohibited by local, state, or federal laws. All employees and contractors of Exygy are responsible for maintaining a work atmosphere free from discrimination and harassment by treating others with dignity and respect.

    Required profile

    Spoken language(s):
    English
