Site Reliability Engineer Job at Xsolla, Remote

  • Xsolla
  • Remote

Job Description

Requirements
  • Proven experience as a Site Reliability Engineer or in a similar software engineering role in a large-scale production environment (5 to 10 years overall in IT, as Ops or Developer).
  • Proficiency in scripting languages such as Python and Bash. A strong understanding of Go and PHP is a plus.
  • Deep knowledge of monitoring systems such as Datadog, Prometheus, Grafana.
  • Good understanding of continuous integration/continuous delivery processes and platforms (GitLab preferred). Experience with Helm.
  • Experience with Docker, Kubernetes, or other container orchestration systems.
  • Familiarity with infrastructure automation tools like Terraform.
  • Experience with automation, system administration, and system hardening.
  • Experience with Linux-based infrastructures, Linux/Unix administration.
  • Demonstrated problem-solving skills, particularly debugging and troubleshooting complex software systems. Ability to work under pressure.
  • Excellent communication skills with a capacity to articulate and solve complex technical problems.
  • Xsolla Technology Stack: Ubuntu, Kubernetes, GitLab, Terraform, Terragrunt, Puppet, Nginx, Google Cloud Platform, Datadog, Prometheus, Grafana, ELK, Zabbix, and Harbor.
Responsibilities
  • Ensure high reliability and availability, and meet SLAs and SLOs as measured by SLIs.
  • Monitor the system for issues and respond to incidents, ensuring quick resolution to maintain high system availability.
  • Drive incident resolution and process improvements to minimize downtime and increase operational transparency.
  • Ensure all key services are measured and monitored, and that they raise alerts when needed.
  • Develop comprehensive monitoring solutions that provide full visibility into the different platform components, using tools and services such as Kubernetes, Datadog, Prometheus, Grafana, and others.
  • Support services before they go live through activities such as capacity planning, monitoring setup, logging, and production readiness reviews.
  • Engage in service capacity planning and demand forecasting, performance analysis, and system tuning.
  • Collaborate with the development teams to enhance the product's operational stability.
  • Build and drive the automation systems that maintain system health.
Education
  • IT professional certifications are not required but are a plus:
  • Certified Kubernetes Administrator or Developer
  • HashiCorp Certifications
  • GCP Certifications

Job Tags

Remote job
