This project demonstrates a simple geospatial data pipeline orchestrated with Apache Airflow, designed to update weather data for around 30 cities in India every 5 minutes. It serves as a practical introduction to orchestrating geospatial data pipelines, covering essential concepts in Docker, Docker Compose, Airflow, and microservices.
- Data Acquisition: The pipeline fetches weather data from an external API, processes it, and exports the results to a CSV file. This is implemented as a Directed Acyclic Graph (DAG) in Apache Airflow (see the DAG sketch after this list).
- Scheduled Updates: The DAG is configured to run every 5 minutes, ensuring that the weather data remains up-to-date and reflects the latest conditions.
- Interactive Map Visualization: A Streamlit application displays an interactive map of the weather data for locations across India. The map is built with the Folium library, with each city represented by a marker showing its current temperature (see the Streamlit sketch after this list).
- Real-Time Updates: The Streamlit app refreshes every 5 minutes, so users always see the most current weather information without manually reloading the page.
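The following is a minimal sketch of what such a DAG could look like, assuming the OpenWeatherMap current-weather endpoint and a pandas CSV export. The city list, file paths, and task names are illustrative only and are not taken from the repository's `dags/dag1.py` or `dag2.py`:

```python
# Illustrative sketch of a weather-update DAG (the real DAGs live in dags/).
# CITIES, paths, and the placeholder API key are hypothetical.
from datetime import datetime, timedelta

import pandas as pd
import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

CITIES = {"Delhi": (28.61, 77.21), "Mumbai": (19.08, 72.88)}  # ~30 cities in the real pipeline
API_URL = "https://api.openweathermap.org/data/2.5/weather"


def fetch_weather():
    rows = []
    for city, (lat, lon) in CITIES.items():
        # In the real project the API key would come from config/config.cfg.
        resp = requests.get(
            API_URL,
            params={"lat": lat, "lon": lon, "appid": "YOUR_API_KEY", "units": "metric"},
            timeout=10,
        )
        data = resp.json()
        rows.append({"city": city, "lat": lat, "lon": lon, "temp_c": data["main"]["temp"]})
    # Export the processed results to a CSV file the Streamlit app can read.
    pd.DataFrame(rows).to_csv("/opt/airflow/data/weather.csv", index=False)


with DAG(
    dag_id="weather_update",
    start_date=datetime(2024, 1, 1),
    schedule_interval=timedelta(minutes=5),  # run every 5 minutes
    catchup=False,
) as dag:
    PythonOperator(task_id="fetch_and_export", python_callable=fetch_weather)
```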
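Likewise, a minimal sketch of the Streamlit map app, assuming the third-party `streamlit-folium` and `streamlit-autorefresh` helper packages; the CSV path and column names are hypothetical and may differ from `app/app.py`:

```python
# Illustrative sketch of the Streamlit map app (the real app is app/app.py).
import folium
import pandas as pd
import streamlit as st
from streamlit_autorefresh import st_autorefresh
from streamlit_folium import st_folium

st.title("India Weather Map")

# Re-run the script every 5 minutes so the map reflects the latest CSV export.
st_autorefresh(interval=5 * 60 * 1000, key="weather_refresh")

df = pd.read_csv("data/weather.csv")  # produced by the Airflow DAG

# Centre the map roughly on India and add one marker per city.
m = folium.Map(location=[22.0, 79.0], zoom_start=5)
for _, row in df.iterrows():
    folium.Marker(
        location=[row["lat"], row["lon"]],
        tooltip=f"{row['city']}: {row['temp_c']} °C",
    ).add_to(m)

st_folium(m, width=900, height=550)
```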
Folder Structure:
```
app/
├── app.py
├── Dockerfile
├── requirements.txt
config/
dags/
├── dag1.py
├── dag2.py
plugins/
docker-compose.yaml
```
The folder structure organizes the project into distinct directories:
- `app/`: Contains the main application files, including the Streamlit app and Docker configuration.
- `config/`: Holds configuration files.
- `dags/`: Contains the Airflow Directed Acyclic Graphs (DAGs) for task orchestration.
- `plugins/`: For any custom Airflow plugins.
- `docker-compose.yaml`: Orchestrates the entire application using Docker.
- Docker: Before setting up, make sure Docker is installed on your system. If it is not, refer to Docker Installation.
- Clone the Repository:

  ```bash
  git clone https://github.com/kavyajeetbora/airflow_streamlit_orchestration.git
  cd <repository-directory>
  ```
- Setting Up the Weather API Key:
  - Sign Up: Create an account on the OpenWeatherMap website to obtain your API key.
  - Generate API Key: After logging in, navigate to the API section and generate a new API key for your application.
  - Configure Your Application: Refer to the `config/config_example.txt` file for the expected format, then create your own configuration file containing your API key in the `config` folder (`config/config.cfg`), as in the sketch below.
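As a rough illustration only (the authoritative format is whatever `config/config_example.txt` shows), the application could read the key with Python's built-in `configparser`; the section and option names below are hypothetical:

```python
# Illustrative sketch of reading the API key from config/config.cfg.
from configparser import ConfigParser

parser = ConfigParser()
parser.read("config/config.cfg")

# "openweathermap" / "api_key" are placeholder names, not the repo's own.
api_key = parser.get("openweathermap", "api_key")
```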
- Build the Docker Images and Initialize Airflow:

  ```bash
  docker-compose up airflow-init
  ```
- Run the Application in Detached Mode:

  ```bash
  docker-compose up -d
  ```
- Airflow Web UI: http://localhost:8080
- Streamlit App: http://localhost:7751
To add additional Python packages, modify the `Dockerfile` in the `app` directory and rebuild the image.
Resources:
- Learn Docker in 1 hour
- Airflow Tutorial for Beginners - Full Course in 2 Hours
- Airflow Documentation
- Deploying Streamlit using Docker
- Introduction to docker compose?
- Why Docker Compose?
This project is licensed under the Apache License, Version 2.0.