DevOps – Car Cloud System Observability 

Posted 5 hours ago by Anthon Byberg
Gothenburg

Job Description

In the Connectivity and Cloud Platform team, we build products that make owning a car better.
We work on platforms such as Connected Car Cloud Platform, Telematics Platform and Mobile Network Access to connect our cars all over the world.
With these platforms we enable other product teams to efficiently build customer applications on top.

We are now looking to reinforce our team with a passionate Observability DevOps Developer.
You will be part of a team with colleagues in the same role, with whom you can share and discuss experiences and problems. If you are curious, open to change, and self-driven, we would like to see an application from you.

You and your skills
We believe you have 2-10 years of experience with cloud infrastructure, implementing monitoring such as log management, metrics, and traces, along with an understanding of technical architecture, development, design, and implementation of projects in various domains.
You probably have a BSc degree or higher in computer science engineering or equivalent.

Key Responsibilities
Design and Implement Observability Solutions: Build and maintain observability tools (monitoring, logging, tracing) to ensure the health and performance of microservices running on AWS. 
Monitoring & Logging: Set up and optimize monitoring using tools like Prometheus, Grafana, CloudWatch, OTEL and Splunk stacks for real-time insights into the AWS infrastructure.
Distributed Tracing: Implement distributed tracing solutions (e.g., OpenTelemetry, Jaeger) to trace and debug service interactions across multiple microservices. 
Proactive Alerting: Establish alerting mechanisms to detect performance anomalies and potential failures in real time.
Dashboards & Reporting: Create dashboards and reports to monitor service-level objectives (SLOs), key performance indicators (KPIs), and overall system health. 
Incident Management: Investigate and troubleshoot issues, identify root causes, and provide insights to reduce mean time to detection (MTTD) and mean time to resolution (MTTR).
Collaboration with Teams: Collaborate with DevOps and development teams to ensure observability best practices are embedded into CI/CD pipelines and infrastructure as code (IaC) practices. 
Automation & Optimization: Automate manual monitoring and incident management processes to reduce operational overhead. 

Tool chain used
Frameworks: Docker, Kubernetes 
Infrastructure: AWS 
Development & GitOps tools: GitLab, ArgoCD, Harbor, SonarQube, Dependency Tracker, Git
Observability support tools: OTEL, Splunk, PagerDuty, Apica, Grafana, Slack, Confluence, Jira
SW Languages: Python, Java, JavaScript, TypeScript, Terraform, Ansible