Ritesh Sonawane

SRE | CKA | Kubernetes | Golang | Backend Engineering

About Me

Hi there! 👋 I'm Ritesh, a 23-year-old SRE, currently contributing remotely at CloudRaft ⛵. My expertise lies in Kubernetes and Golang, where I focus on driving efficiency and scalability. I take great pride in sharing my insights through blogs — feel free to explore them!

Beyond blogging, I curate a POC and Learnings section, showcasing innovative approaches to streamlining distributed systems like Kubernetes.

My current focus is on transforming SRE practices through unified toolsets, advancing toward Platform Engineering. I'm also exploring AI agents and their integration with SRE tools to unlock new possibilities.

If you're passionate about this field, currently exploring similar areas, or eager to learn, I’d love to connect and exchange ideas.

Don't forget to connect!

Experience

Associate SRE

CloudRaft (Remote) | April 2024 - Present

✦ Worked on Kubernetes, observability, and Golang. Managed production Kubernetes clusters across AWS, Azure, and bare-metal environments.

✦ Contributed to the internal AI-cloud platform, enabling the deployment of GPU-based VMs and vLLM workloads.

✦ Worked with open-source monitoring tools like Thanos, Mimir, Cortex, and VictoriaMetrics.

✦ Created GitOps-based CI/CD pipelines using ArgoCD, ArgoCD Image Updater, and related tools.

I have shared several blog posts based on my work at CloudRaft. Feel free to check them out!


DevOps Engineer

Makerble (Remote) | October 2023 - April 2024

I started my career here as a Kubernetes administrator, responsible for the entire infrastructure.

✦ Led cloud migration from AWS EKS to Azure AKS across Staging, Pre-Production, and Production environments using Terraform Infrastructure as Code (IaC), achieving 40% cloud cost reduction and improved scalability.

✦ Reduced AWS infrastructure costs by 17% through Kubernetes resource optimization: right-sizing CPU/memory requests and limits, and tuning pod scheduling with Node Affinity and Taints/Tolerations.

✦ Implemented NGINX Ingress Controller with custom error pages to enhance user experience, maintain brand consistency, and improve SEO through proper HTTP status code handling.

✦ Designed and deployed automated CI/CD pipelines using Tekton, reducing deployment time and human error across all environments.

✦ Automated Ruby on Rails Rake task execution across Staging, Pre-Production, and Production environments, streamlining developer workflows and database maintenance operations.

✦ Integrated automated Testsigma test execution post-Production deployment with Slack notifications for real-time QA visibility and faster feedback loops.

✦ Resolved AWS VPC IP address exhaustion by architecting and implementing secondary CIDR blocks and subnet expansion for EC2 instance scaling.

✦ Deployed self-hosted GitHub Actions runners on Oracle Cloud Infrastructure (OCI) to accelerate CI/CD pipeline execution speed and reduce GitHub Actions costs.

✦ Configured Rollbar error tracking integration with ArgoCD deployment hooks for comprehensive deployment monitoring and error logging across all environments.

✦ Implemented Robusta for proactive Kubernetes cluster monitoring, automated remediation, and intelligent alerting to prevent production incidents.

✦ Deployed Uptime Kuma for 24/7 website availability monitoring with instant Slack alerts on downtime.

✦ Configured Redis Insight dashboard with Kubernetes Ingress for real-time Redis performance monitoring, memory usage analysis, and query optimization.

✦ Architected secure VPN infrastructure using WireGuard and deployed Passbolt self-hosted password manager with VPN-only access for enhanced security posture.

✦ Created comprehensive Azure Kubernetes Service (AKS) architecture documentation and CI/CD pipeline diagrams for knowledge transfer, maintainability, and scalability planning.

✦ Integrated DeepSource for automated static code analysis, code quality monitoring, and technical debt reduction in continuous integration workflows.

✦ Automated Azure Virtual Machine startup scheduling using Azure Logic Apps to optimize cloud resource utilization and reduce unnecessary compute costs.

✦ Built comprehensive observability stack with Prometheus metrics collection and Grafana dashboards for infrastructure monitoring, application performance monitoring (APM), and SLA tracking on Azure.

✦ Deployed PerfectScale for AI-powered Azure Kubernetes cost optimization and resource rightsizing recommendations.

✦ Implemented Snyk security scanning for vulnerability detection and remediation across application dependencies, container images, and Infrastructure as Code (IaC).

✦ Achieved zero downtime on Staging and Pre-Production environments by implementing Kubernetes health checks (Liveness, Readiness, and Startup Probes) for applications serving on port 3000.

✦ Deployed Botkube for real-time Kubernetes event monitoring and automated Slack notifications for cluster alerts, pod failures, and deployment status updates.

✦ Implemented an EFK stack (Elasticsearch, Fluent Bit, Kibana) for centralized log aggregation, analysis, and visualization across distributed Kubernetes workloads.

✦ Cut the Ruby on Rails RuboCop CI pipeline runtime from 45 minutes to 5 minutes (an 89% reduction) by implementing parallel job execution and custom caching strategies.