Senior Platform Reliability Engineer

Remote:

Full Remote

Experience:

Senior (5-10 years)

Work from:

United Kingdom

Offer summary

Qualifications:

Bachelor's degree in Computer Science or related field
Proven experience as a Site Reliability Engineer
Extensive experience with AWS services
Strong proficiency in Terraform
Proficient in Python or Bash scripting

Key responsibilities:

Collaborate to design, build, and maintain cloud infrastructure
Develop monitoring solutions and manage incidents
Optimize system performance, reliability, and costs
Implement automation tools and CI/CD pipelines
Ensure security compliance and documentation

Luupli

11 - 50 Employees

Job description

Job Title: Site Reliability Platform Engineer

About Luupli

Luupli is a social media app that has equity, diversity, and equality at its heart. We believe that social media can be a force for good, and we are committed to creating a platform that maximizes the value that creators and businesses can gain from it, while making a positive impact on society and the planet. Our team is made up of passionate and dedicated individuals who are committed to making Luupli a success.

Role Description

We are seeking a talented and experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure and services, primarily hosted on AWS. If you have a passion for problem-solving, a deep understanding of AWS services, hands-on experience with Terraform, and proficiency in scripting with Python or Bash, we invite you to apply for this exciting opportunity.

Role And Responsibilities

Infrastructure Design and Automation:
Collaborate with software engineering and operations teams to design, build, and maintain cloud-based infrastructure using AWS and Terraform.
Implement and enhance infrastructure-as-code (IaC) practices using Terraform to ensure reproducibility and scalability of infrastructure components.
Monitoring and Incident Management:
Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues.
Participate in incident response and root cause analysis efforts to drive continuous improvement and prevent future incidents.
Reliability and Performance Optimization:
Optimise system performance, reliability, and cost efficiency through continuous monitoring, performance tuning, and capacity planning.
Identify opportunities to automate manual processes and improve system resilience.
Scripting and Automation:
Utilise Python or Bash scripting to create and maintain automation tools for various operational tasks and deployments.
Implement and improve continuous integration and continuous deployment (CI/CD) pipelines.
Security and Compliance:
Collaborate with security teams to implement best practices for securing cloud infrastructure and services.
Ensure compliance with relevant industry standards and regulations.
Deployment and Release Management:
Support CI/CD pipelines for application deployments and updates.
Contribute to the design and implementation of deployment strategies that promote zero-downtime releases.
Documentation and Knowledge Sharing:
Maintain clear and up-to-date documentation for infrastructure configurations, processes, and incident resolution procedures.
Participate in knowledge sharing with team members to enhance overall expertise and skill sets.

Requirements

Education and Experience:
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
Proven experience as a Site Reliability Engineer or similar role.
Technical Skills:
Extensive experience with Amazon Web Services (AWS) and its core services (EC2, S3, RDS, IAM, etc.).
Strong proficiency in infrastructure-as-code (IaC) tools, with a focus on Terraform.
Proficient in scripting with Python or Bash for automation and operational tasks.
Solid understanding of networking principles and protocols.
Knowledge of CI/CD pipelines and related tools.
Problem-Solving and Analytical Abilities:
Ability to diagnose and resolve complex technical issues in a fast-paced environment.
Analytical mindset to proactively identify potential system weaknesses and performance bottlenecks.
Collaboration and Communication:
Strong teamwork and collaboration skills to work effectively with cross-functional teams.
Excellent verbal and written communication skills.

Compensation

This is an equity-only position, offering a unique opportunity to gain a stake in a rapidly growing company and contribute directly to its success.