Hi, my name is

Victor Bogo.

I am a continuously learning Site Reliability Engineer.

I once read that “Hope is not a strategy”, and since then I have tried to use tools and processes to improve reliability outcomes.

About Me

Working as a Software Engineer in the first years of my career put me in touch with interesting tools and platforms such as AWS, Docker, Kubernetes and Prometheus. After some time, I realized that Site Reliability Engineering responsibilities and practices were something I would really enjoy working with. Here are a few technologies I've been working with recently:
  • Terraform
  • Amazon Web Services
  • Golang
  • Kubernetes
  • Prometheus
  • Thanos
  • Docker
  • OpenTelemetry
  • EFK logs stack
  • PostgreSQL


Site Reliability Engineer
37signals (Basecamp & Hey)

37signals is the company behind the well-known Basecamp and, more recently, the Hey e-mail service. It has been at the forefront of software development and remote work culture for decades and has chosen to be profitable since the 90s. As the creator of Ruby on Rails and other open-source tools, 37signals has shared a lot with the community and continues to be one of the organizations that provoke change in the industry.


  • Scale Prometheus monitoring in the cloud by running Thanos and its components
  • Revamp Prometheus on-prem monitoring by recreating the whole setup
Senior Site Reliability Engineer
Intricately Data

Intricately is a US-based data company that provides go-to-market technical information for anyone who sells products to tech companies. A few weeks after I joined, Intricately was sold to HG Insights, a company in the same market.


  • Build Terraform modules for AWS Backup and Vault monitoring
  • Evaluate possible solutions for migrating an on-VM MongoDB to a managed solution
  • Tackle security tasks reported by a third-party organization
Senior Site Reliability Engineer
QuintoAndar Real Estate

QuintoAndar is an online real estate services company, similar to Airbnb but for long-term rentals. It is the second largest startup in LatAm, just behind Nubank. It has around 300 microservices and more than 700 Kubernetes nodes in total. It has a big SRE team (approximately 70 people), and I was part of the observability-focused team.


  • Maintain all Prometheus instances used as the primary monitoring system. Each instance ran on top of Kubernetes and had more than 4TB of data on disk.
  • Install and configure Thanos to scale Prometheus monitoring across many different Kubernetes clusters and store monitoring data more efficiently for the long term
Site Reliability Engineer
Pier Insurtech

Pier is the first Brazilian Insurtech, trying to solve the bureaucracy and the lack of fine-grained, data-driven decision making in the insurance world. At Pier, we needed to start structuring the DevOps/SRE culture from scratch, bringing useful practices and tools to the table such as post mortems, incident response, and key engineering metrics like MTTD, MTTR, success rate and latency.


  • Start monitoring apps using New Relic APM
  • Build the logging stack using Elasticsearch, Fluentd and Kibana
  • Define the logging pattern
  • Organize PostgreSQL replicas, users and overall maintenance
  • Secure the AWS account and IAM-related configuration
2018 - 2020
Site Reliability Engineer
ContaAzul Software

ContaAzul’s platform is cloud based, all our customers access the same production environment, and the number of customers is always growing. Because of that, we need to understand and apply some important concepts and practices in order to improve our overall service reliability. Besides being responsible for our cloud infrastructure, the SRE team is always looking to keep the DevOps culture alive.

Some of these practices are:

  • Post Mortems, to keep learning from failure
  • Automation, to scale and keep things consistent
  • SLIs, SLOs and SLAs, to build a contract between our product and development teams seeking balance
  • Release Engineering, to guarantee that changes to the production environment have a controlled impact
  • Monitoring, to enable true understanding of what is happening in production

We currently own a self-managed Kubernetes cluster deployed to AWS, responsible for running different kinds of application containers (microservices, jobs, etc.). Our data is mainly stored on PostgreSQL instances and S3.

Tech: Kubernetes, Docker, Prometheus, AWS, Grafana, OpenTracing, Jaeger, Java

2015 - 2018
Software Engineer
ContaAzul Software

Team: Billing & Payments

Our goal is to allow any customer using ContaAzul to exchange money through our platform. At ContaAzul, we have a lot of specific strategies related to payments. As no available platform met our requirements, we opted to build and maintain our own payments infrastructure, from the UIs that allow our customers to choose and pay using their preferred payment methods to the back-end that securely calculates things like subscriptions, recharge and discounts.

We have some core values respected and followed by everyone in the team:

  • Quality
  • Reliability
  • Testability
  • Software engineering principles

Our back-end microservices are mainly in Java (Spring Boot) and our front-end uses AngularJS with Material Design.

2013 - 2015
Software Engineer
Softexpert Software
Specific artifact development using web technologies (HTML, JavaScript, Ext, Ajax) with Object Oriented PHP and Java back-end connected to MS SQL Server, Oracle and Postgres. Java development environment configuration with Apache Ant, Maven and Tomcat. iReport report creation and maintenance.
2012 - 2013
Infrastructure Intern
Opentech Risk Management
Solved day-to-day hardware and software problems. General computer maintenance (computer parts replacement, cleaning and saving). Installation and configuration of new work environments (formatting, program installation and initial configuration). Monitoring of automated IT processes.

Get in Touch

Feel free to reach out whenever you have any questions or topics to discuss :)