Design for scalability
The job pipeline maintains one queue for every job type.
When a job is successfully validated, the pipeline enqueues it in the queue for its job type. The job remains in the status enqueued until it can be dispatched to the real-time engine for processing. See Job states for more information about the lifecycle of a job.
The quota control system enforces a Job dispatch rate for every job type queue of your job pipeline; see Job dispatch rate for more details. As soon as a job type queue has available quota, a number of jobs in that queue are dispatched to the real-time engine.
Under normal conditions, the processing of a job inside the real-time engine may take between 5 seconds and 1 minute depending on a variety of factors.
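To get a feel for how a backlog interacts with the dispatch rate, the following back-of-the-envelope sketch estimates the turnaround of the last job in a queue. It is a simplified model, not the actual quota algorithm: it assumes a constant, hypothetical rate expressed in jobs per minute, dispatched in one batch at the start of each minute, and it uses the upper end of the processing time quoted above (1 minute). The function name and parameters are illustrative only.

```python
import math


def worst_case_turnaround_seconds(
    backlog: int,
    dispatch_rate_per_minute: int,
    max_processing_seconds: float = 60.0,
) -> float:
    """Rough upper bound on the turnaround of the last job in a backlog.

    Assumption (not taken from the product documentation): the queue
    releases up to `dispatch_rate_per_minute` jobs at the start of every
    minute, and processing takes at most `max_processing_seconds`.
    """
    if dispatch_rate_per_minute <= 0:
        raise ValueError("dispatch_rate_per_minute must be positive")
    if backlog <= 0:
        return 0.0
    # The last job leaves the queue in batch number ceil(backlog / rate),
    # so it waits for all earlier batches (one minute each).
    minutes_in_queue = math.ceil(backlog / dispatch_rate_per_minute) - 1
    return minutes_in_queue * 60.0 + max_processing_seconds


# A bulk of 6000 jobs at a hypothetical rate of 100 jobs per minute:
print(worst_case_turnaround_seconds(6000, 100))  # 3600.0
# The same rate with only 50 jobs waiting:
print(worst_case_turnaround_seconds(50, 100))  # 60.0
```

Under these assumptions, the last job of a large bulk waits roughly backlog divided by rate minutes before it is even dispatched, which is why spreading requests over time tends to shorten turnaround.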
Your application can influence which jobs in the queue are dispatched preferentially and how long they can stay in the queue:
- Publish a job request as soon as possible. This allows our system to optimize the utilization of your Job dispatch rate and the available capacity for the job type in demand. If your application accumulates workload over hours or days and then requests it in large bulks within a short period of time, it is more likely to experience longer turnaround times and lower throughput than is technically possible.
- Publish a job request with a high priority value if you need it to be dispatched before other jobs of the same job type, and use a lower priority value for jobs that are not time sensitive. Setting the job's priority field can be useful if your application uses the same repository and job type for multiple use cases where one use case is more time sensitive than the others.
- Publish a job request with the Pub/Sub attribute modigieJobExpireAfter set to the period of time within which your application can tolerate the dispatching. Most users use their Pub/Sub repository to run background validation, verification, or enrichment where turnaround time is of no concern. Use the expiration option in scenarios where a job that spends excessive time in the queue would be dispatched long after its result could benefit your business.
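As an illustration of the last two points, here is a minimal sketch of how an application might assemble a job request that carries both a priority and an expiration. Only the priority field and the modigieJobExpireAfter attribute come from this page. The JSON payload shape, the other field names, the helper name, and the 0 to 100 range check are assumptions made for the example, so check them against the job type's actual schema.

```python
import json
from typing import Dict, Optional, Tuple


def build_job_message(
    job: dict,
    priority: Optional[int] = None,
    expire_after: Optional[str] = None,
) -> Tuple[bytes, Dict[str, str]]:
    """Return (data, attributes) for a Pub/Sub message.

    Hypothetical helper: the payload layout is an assumption, not the
    documented request schema.
    """
    payload = dict(job)  # copy so the caller's dict is not modified
    if priority is not None:
        # Assumed range, inferred from the values used on this page.
        if not 0 <= priority <= 100:
            raise ValueError("priority is expected to be between 0 and 100")
        payload["priority"] = priority

    attributes: Dict[str, str] = {}
    if expire_after is not None:
        # ISO 8601 duration such as "PT1H" (1 hour) or "P7D" (7 days).
        attributes["modigieJobExpireAfter"] = expire_after

    return json.dumps(payload).encode("utf-8"), attributes


# Time-sensitive request: high priority, give up after one hour in the queue.
data, attributes = build_job_message(
    {"id": "job-1"}, priority=90, expire_after="PT1H"
)
print(data, attributes)
```

If the repository is published to with the `google-cloud-pubsub` client, the result could be passed on as `publisher.publish(topic_path, data, **attributes)`, since that client accepts message attributes as keyword arguments.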
Common Scenarios
Here are some common scenarios that illustrate how to effectively use job priority and expiration:
| Use Case | Priority | Expiration (modigieJobExpireAfter) | Benefit |
|---|---|---|---|
| Time-sensitive enrichment | 75 - 100 (High) | PT1H (1 hour) | Ensures that jobs critical for immediate business operations are processed quickly and are not delayed by less important jobs. If the job cannot be processed within the hour, it is canceled, preventing wasted resources on outdated tasks. |
| Standard background processing | 50 (Default) | Not set | Ideal for routine tasks like daily data validation or enrichment where immediate processing is not critical. These jobs will be processed in the order they are received after all high-priority jobs are dispatched. |
| Low-priority data cleansing | 0 - 25 (Low) | P7D (7 days) | Use for bulk data cleansing or other non-urgent tasks that can be performed during off-peak hours. The long expiration time ensures that these jobs will eventually be processed without interfering with more critical workloads. |
By strategically combining priority and modigieJobExpireAfter, you can build a robust and scalable system that can handle a wide variety of workloads and business requirements.
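The interplay of the two settings can be pictured with a small toy model of one job type queue. This is only a sketch of the behavior described on this page, not the pipeline's implementation. It assumes that higher priority values are dispatched first, that jobs with equal priority go out in arrival order, that jobs without a priority default to 50, and that the expiration period is measured from the time a job was enqueued. The dictionary keys and the function name are hypothetical.

```python
import heapq
from typing import List, Tuple


def dispatch_batch(
    jobs: List[dict], now: float, quota: int
) -> Tuple[List[str], List[str]]:
    """Pick the next jobs to dispatch and report the ones that expired.

    Each job is a dict with "id" and "enqueued_at" (seconds), plus an
    optional "priority" (default 50) and "expire_after_s" (seconds).
    """
    heap: list = []
    expired: List[str] = []
    for seq, job in enumerate(jobs):
        ttl = job.get("expire_after_s")
        if ttl is not None and now - job["enqueued_at"] > ttl:
            expired.append(job["id"])  # waited too long, never dispatched
            continue
        # Negate priority so that the highest priority pops first; ties
        # are broken by arrival time, then by input order.
        key = (-job.get("priority", 50), job["enqueued_at"], seq, job["id"])
        heapq.heappush(heap, key)

    count = min(quota, len(heap))
    dispatched = [heapq.heappop(heap)[3] for _ in range(count)]
    return dispatched, expired


jobs = [
    {"id": "bulk", "priority": 10, "enqueued_at": 0.0, "expire_after_s": 7 * 24 * 3600},
    {"id": "routine", "enqueued_at": 10.0},
    {"id": "urgent", "priority": 90, "enqueued_at": 1000.0, "expire_after_s": 3600},
    {"id": "stale", "priority": 90, "enqueued_at": 0.0, "expire_after_s": 3600},
]
print(dispatch_batch(jobs, now=4000.0, quota=2))
# (['urgent', 'routine'], ['stale'])
```

In this run the urgent job overtakes the older routine and bulk jobs, while the stale job is dropped because its one-hour window has already passed.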