Articles about Data Engineering

Running Airflow on Google Kubernetes Engine without Helm

Google Cloud Platform (GCP) can be a very good option for running Airflow. Although GCP offers its own managed Airflow deployment, Cloud Composer, managing our own deployment gives us more granular control over the underlying infrastructure, including choices such as which Python version to run and when to upgrade Airflow itself.

The Airflow community maintains a Helm chart for deploying Airflow on a Kubernetes cluster. The chart comes with a lot of resources, because it describes a full Airflow deployment with all of its capabilities. We didn’t need all of that, and we wanted granular control over the infrastructure. Therefore, we chose not to use Helm, although the chart is a very good starting point for the configuration.
