A job scheduler is a system that runs tasks, or jobs, at specified times or at recurring intervals. Typical jobs include batch processing, data pipelines, report generation, and recurring background work such as sending reminders.
Examples include cron, Apache Airflow, and Kubernetes CronJobs.
In this article, we will walk through the design of a scalable distributed job scheduling service that can handle millions of tasks while remaining highly available.
Before diving into the design, let’s clarify the functional and non-functional requirements.