From APScheduler to Cloud Scheduler: A Smarter Approach to Scheduling
Consider Django. Why do we use it? Django is a powerful back-end framework that receives requests from the client, processes them, and returns responses. But what happens when a request takes a long time to process? Slow responses degrade the client-side experience, leading to frustration and a web application that feels sluggish. This is where scheduling tools come into play. A scheduler offloads time-consuming work by running it periodically or asynchronously, which improves the user experience and reduces the load on the server. Django doesn’t ship with built-in scheduling, but you can rely on external tools to manage and schedule tasks: libraries such as Celery Beat and APScheduler, or a managed service such as Cloud Scheduler.

Key Considerations:
Complexity of Tasks: For simple tasks, Django Q or APScheduler might suffice. For complex workflows and distributed processing, Celery Beat or Cloud Scheduler is often preferred.
Scalability: If you need to distribute tasks across multiple workers, Celery Beat is a strong contender.
Integration: Choose a scheduler that integrates well with your existing Django project and infrastructure.
Maintainability: Consider how easy the chosen scheduler is to configure, monitor, and maintain.

When to use schedulers:
Schedulers are an essential tool for automating repetitive tasks and tasks that must run at specific times.

Automate repetitive tasks: Suppose we need to generate reports at predetermined intervals. A scheduler handles this with little setup and is simple to manage.

Executing tasks at specific times and specific intervals: Schedulers save time and money by running jobs automatically rather than by hand.
For instance, if you need to add data to a table every ten minutes, you can use an interval scheduler such as APScheduler and set a 10-minute interval. After that, there is no need to run the script manually because the scheduler inserts the data every ten minutes. This helps the task run on time and consistently while also freeing up your time.

Triggering Tasks Based on Events: Imagine a system that processes uploaded images. When a user uploads a photo, it’s saved to a specific folder. A background task, triggered for example by a file system watcher, could start as soon as a new file arrives in that folder. The task would then resize the image, create thumbnails, and store the processed versions, ready for display on the website.

When NOT to use schedulers:
Schedulers are a powerful tool, but they are not always the best solution.

Real-time tasks: If a task needs to respond to the user immediately, a scheduler is not a good choice. Schedulers are designed for background or delayed processing. For example, when a user clicks a button, the response should be immediate, not scheduled.

One-off tasks: If you need to run a simple operation only once, running it manually is easier than scheduling it. Creating a scheduled task adds overhead, so it’s not worth it for very simple, one-time operations.

Overly complex scheduling requirements: If your process depends on the success or failure of other tasks, a scheduler is not the best choice. In such cases, a dedicated workflow management system might be more appropriate.

Let’s start with APScheduler and then move on to Cloud Scheduler.

APScheduler: APScheduler (Advanced Python Scheduler) is a package that lets you schedule Python functions to run at particular times or intervals.
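As an illustration of running a function at a fixed interval, the ten-minute insert described earlier could be sketched roughly as follows. This is a minimal sketch, assuming APScheduler 3.x is installed; the SQLite database, the `readings` table, and the function names `add_reading` and `start_interval_job` are hypothetical and exist only for this example.

```python
import sqlite3
from datetime import datetime, timezone


def add_reading(db_path):
    """Insert one timestamped row; this is the work the scheduler repeats."""
    # Open a fresh connection on every run: the job executes in a worker
    # thread, and sqlite3 connections should not be shared across threads.
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            # Hypothetical table used only for this illustration.
            conn.execute("CREATE TABLE IF NOT EXISTS readings (created_at TEXT)")
            conn.execute(
                "INSERT INTO readings (created_at) VALUES (?)",
                (datetime.now(timezone.utc).isoformat(),),
            )
    finally:
        conn.close()


def start_interval_job(db_path):
    """Schedule add_reading to run every 10 minutes in a background thread."""
    # Imported lazily so this module still loads where apscheduler is absent.
    from apscheduler.schedulers.background import BackgroundScheduler

    scheduler = BackgroundScheduler()
    scheduler.add_job(add_reading, "interval", minutes=10, args=[db_path])
    scheduler.start()
    return scheduler
```

Calling `start_interval_job("data.db")` once at startup would be enough; the returned scheduler can later be stopped with `scheduler.shutdown()`.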
APScheduler has two main scheduler types: BackgroundScheduler, which runs tasks in a background thread without interfering with the main program, and BlockingScheduler, whose start() call blocks the current thread and runs the scheduler in the foreground until it is shut down. Because it keeps the web server responsive while scheduled tasks run, BackgroundScheduler is especially well suited to Django applications.

Using APScheduler in Django for Firestore Data Management:
Let’s take an example that combines Django, APScheduler, and Firestore. In an application that handles a large volume of incident data, it is hard to keep the database optimised and clutter-free. If we store incident data in a Firestore collection named Incidents but only need to keep records for the last 30 days, we need a mechanism that archives older incidents. A practical solution is to use APScheduler to schedule a background job that moves older data to an Archived_Incidents collection every day at a specific time. This keeps the Incidents collection lightweight, which improves performance and query efficiency.

Project setup (if not already done): Ensure you have a Django project set up.

Configure your database settings: Make sure Google Firestore is set up in your Django project. Add the Firestore credentials in settings.py.

Create a .py file in your app (if not already created): Create a new file named scheduler.py in your Django app.

Install apscheduler: Install the apscheduler library using pip (pip install apscheduler).

Write the APScheduler job: In scheduler.py, define the scheduler and its job. The job is a function, archive_old_incidents, that moves incidents older than 30 days to the Archived_Incidents collection.
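A scheduler.py along these lines might look like the sketch below. It is not the original code, only a minimal illustration under stated assumptions: APScheduler 3.x and the google-cloud-firestore client are installed, each incident document carries a `created_at` timestamp field (a hypothetical field name), and the job runs daily at midnight. The `db` and `now` parameters are also assumptions, added so the archiving logic can be exercised without a live Firestore project.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30


def archive_old_incidents(db, now=None):
    """Move incidents older than RETENTION_DAYS into Archived_Incidents.

    Returns the number of documents moved.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RETENTION_DAYS)

    # Assumption: every incident has a "created_at" timestamp field.
    # Newer Firestore client versions prefer the filter= keyword here.
    query = db.collection("Incidents").where("created_at", "<", cutoff)

    # Materialise the results first so we do not delete while streaming.
    old_docs = list(query.stream())
    for doc in old_docs:
        # Copy under the same document ID, then delete the original.
        db.collection("Archived_Incidents").document(doc.id).set(doc.to_dict())
        doc.reference.delete()
    return len(old_docs)


def start_scheduler():
    """Start a background scheduler that archives incidents daily at 00:00."""
    # Imported lazily so the archiving logic stays importable on its own.
    from apscheduler.schedulers.background import BackgroundScheduler
    from google.cloud import firestore

    db = firestore.Client()  # picks up the credentials configured for the project
    scheduler = BackgroundScheduler()
    scheduler.add_job(archive_old_incidents, "cron", hour=0, minute=0, args=[db])
    scheduler.start()
    return scheduler
```

Passing the client into the job keeps the function easy to test with a stand-in object, and copying before deleting means a failure partway through leaves a duplicate rather than a lost record.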
Modify apps.py to start APScheduler: In your Django app, open apps.py and update the app’s AppConfig so that its ready() method calls start_scheduler() from scheduler.py.

How this task operates: In this solution, incident data older than 30 days is automatically moved to an archive collection in Firestore using apscheduler in Django. When Django starts, the ready() method in apps.py calls the start_scheduler() function from scheduler.py, which initialises a background scheduler. The scheduler then registers a cron job that runs daily at midnight (00:00). At the appointed time, the archive_old_incidents() function runs and retrieves all incidents older than 30 days from the Incidents collection. These incidents are copied to the Archived_Incidents collection and then removed from the original collection.

Cloud Scheduler: With Cloud Scheduler, Google Cloud’s managed cron job service, you can schedule tasks by setting up scheduled triggers for HTTP/S endpoints, Pub/Sub topics, or App Engine services. Using









