Quotas and limitations
When you use the Modigie Integration for Cloud Pub/Sub API, your usage is subject to quotas and limits from both Google Cloud Pub/Sub and the Modigie API. Use this guide to design your application for robust throughput at the scale your business needs.
Summary of limits
As a user of the Modigie API, you should be aware of three types of limits:
- Google Pub/Sub Quotas: These govern how quickly you can publish job requests to us and how quickly you can consume job responses. For most applications, these limits are very high.
- Modigie Job Dispatch Quotas: This is the rate at which Modigie processes your jobs from the queue. This is the most common limit your application will encounter.
- Modigie Job Queue Depth: This is the maximum number of jobs that can be waiting for processing at any given time (e.g., 100,000,000 per job type).
Best practices
- Design your application with exponential backoff for job submission.
- Anticipate that jobs may be queued during bursts, not processed instantly.
- Contact us if your application requires higher throughput than the standard quotas allow.
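The first recommendation can be sketched as follows. This is a minimal illustration, not Modigie's client code: `submit` is a hypothetical callable that publishes one job request and raises an exception on failure, and the base delay, cap, and attempt count are assumed values that you would tune for your own workload.

```python
import random
import time


def backoff_delays(base=1.0, cap=60.0, attempts=6, rng=random.random):
    """Yield one delay per attempt: exponential growth, capped, with full jitter."""
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng() * ceiling


def submit_with_backoff(submit, job, sleep=time.sleep, **backoff):
    """Try submit(job) once per delay, sleeping after each failed attempt.

    `submit` is a hypothetical callable that raises when submission fails.
    """
    last_error = None
    for delay in backoff_delays(**backoff):
        try:
            return submit(job)
        except Exception as error:  # narrow this to transient errors in real code
            last_error = error
            sleep(delay)
    raise RuntimeError("job submission failed after retries") from last_error
```

The jitter spreads retries from many workers over time, so a burst of failures does not turn into a synchronized burst of retries.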
Google Cloud Pub/Sub quotas and limits
As described in the page Architecture of the Pub/Sub Integration, your application's repository provides tenant-specific Cloud Pub/Sub resources that are relevant for calculating throughput:
- Per job type: 1 Cloud Pub/Sub topic in a large region (us-central1)
- Per job type:
  - 1 Cloud Pub/Sub pull subscription in a large region (us-central1)
  - and/or 1 Cloud Pub/Sub push subscription in a large region (us-central1)
By default, exactly-once delivery is disabled on all Pub/Sub subscriptions, which affects the Cloud Pub/Sub quota limits that apply.
Cloud Pub/Sub quotas
Note
The following calculations are based on these assumptions:
- The size of any Pub/Sub message (job request or response) will not exceed 100 kB.
- For each job, your application may receive up to ten response messages, but a typical lifecycle involves six responses (one per status). For a list of all states, please refer to Life cycle of a job.
The following Google Cloud quotas are the most relevant for your application:
- pubsub.googleapis.com/regionalpublisher: Publisher throughput per region: 240,000 job requests per minute (4 GB/s) per job type
- pubsub.googleapis.com/regionalpushsubscriber: Push subscriptions throughput per region: 26,400 job responses per minute (440 MB/s) per job type. This typically translates to all job responses for around 4,400 unique jobs per minute per job type.
- pubsub.googleapis.com/regionalsubscriber: Pull subscriber throughput per region: 240,000 job responses per minute (4 GB/s) per job type. This typically translates to all job responses for around 40,000 unique jobs per minute per job type.
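The "unique jobs per minute" figures follow from dividing the response-message rate by the number of responses per job. The sketch below reproduces that arithmetic; the function name is illustrative, and the last line shows the assumed worst case of ten responses per job.

```python
def jobs_per_minute(messages_per_minute, responses_per_job=6):
    """Convert a response-message rate into a rate of fully answered jobs."""
    return messages_per_minute // responses_per_job


push = jobs_per_minute(26_400)    # push subscription quota
pull = jobs_per_minute(240_000)   # pull subscription quota
worst_case_push = jobs_per_minute(26_400, responses_per_job=10)
print(push, pull, worst_case_push)  # prints: 4400 40000 2640
```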
Tip
For details about all applicable quotas please refer to Pub/Sub quotas and limits.
Cloud Pub/Sub limits
Important
Google Cloud limits can't be increased.
The following Google Cloud limits are the most relevant for your application:
| Resource | Limits |
|---|---|
| Publish requests | 10MB (total size) and 1,000 messages |
| Message size (the data field) | 10MB |
| Attributes per message | 100 |
| Attribute key size | 256 bytes |
| Attribute value size | 1024 bytes |
Tip
For details about all applicable limits, please refer to Pub/Sub quotas and limits.
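A client-side check against the limits in the table can catch oversized messages before they are published. The following is a minimal sketch using only the values listed above; it assumes "10MB" means decimal megabytes, and the function names are illustrative rather than part of any Pub/Sub client library.

```python
MAX_DATA_BYTES = 10 * 1000 * 1000   # "10MB"; decimal megabytes assumed
MAX_ATTRIBUTES = 100
MAX_KEY_BYTES = 256
MAX_VALUE_BYTES = 1024
MAX_MESSAGES_PER_REQUEST = 1000


def check_message(data: bytes, attributes: dict[str, str]) -> list[str]:
    """Return a list of limit violations; an empty list means the message fits."""
    problems = []
    if len(data) > MAX_DATA_BYTES:
        problems.append("data field exceeds 10 MB")
    if len(attributes) > MAX_ATTRIBUTES:
        problems.append("more than 100 attributes")
    for key, value in attributes.items():
        if len(key.encode("utf-8")) > MAX_KEY_BYTES:
            problems.append(f"attribute key too long: {key[:20]}")
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            problems.append(f"attribute value too long for key: {key[:20]}")
    return problems


def batches(messages, size=MAX_MESSAGES_PER_REQUEST):
    """Split messages into chunks of at most `size` items.

    This only enforces the message count; the 10 MB total size per
    publish request still has to be checked separately.
    """
    return [messages[i:i + size] for i in range(0, len(messages), size)]
```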
Modigie API quotas
With an active subscription, your application's Pub/Sub repository maintains a job pipeline for enqueuing and processing all jobs.
Job dispatch rate
To ensure fair use and system stability, we limit the rate at which jobs are processed. We protect our users from the “noisy neighbors” problem by reserving job processing capacity based on expected monthly usage, while still allowing for bursts in your workload.
Job quotas are enforced individually for each job type in your repository's pipeline, and they are exclusive to each repository.
Time windows
To accommodate bursts in usage, we use a token bucket algorithm with multiple rolling time windows to determine the effective job dispatch rate for your application. We maintain a separate quota for each job type and window size:
- past 30 days
- past 7 days
- past 1 day
- past 1 hour
- past 10 minutes
- past 1 minute
Each quota meters the number of enqueued jobs that the pipeline dispatched to the real-time engine in each time window and it limits how many can be dispatched at any given time. This system allows for workload peaks in smaller time windows, meaning the 1-minute rate is higher than simply the 30-day rate divided by the number of minutes.
Depending on the quota limits configured for your repository and how many jobs have been dispatched in each time window, the system calculates how much quota is available per time window. The limiting factor for a job type is the time window with the smallest available quota.
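The rule that the smallest available quota wins can be written as a one-line minimum. This is a sketch under assumptions: the window keys borrow the ISO 8601 style suffixes of the metric names, and the limit and consumption values are illustrative, not your repository's configuration.

```python
def available_quota(limits, consumed):
    """Return how many jobs may be dispatched right now.

    Both arguments map a window name to a job count. The result is the
    smallest remaining quota across all windows, never below zero.
    """
    return max(0, min(limits[w] - consumed.get(w, 0) for w in limits))


limits = {"PT1M": 8, "PT10M": 43, "PT1H": 149}      # illustrative limits
consumed = {"PT1M": 5, "PT10M": 12, "PT1H": 30}     # illustrative usage
print(available_quota(limits, consumed))  # prints: 3
```

Here the one-minute window is the bottleneck: it has 3 jobs left, while the longer windows still have 31 and 119.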
Example
Let's walk through an example to see how burst limiting works in practice. Along with your repository configuration, Modigie will share your specific quota limits. They may look like this:
| Job type | Metric | Limit |
|---|---|---|
| verifyEmployment | past 30 days | 100,000 |
| verifyEmployment | past 7 days | 40,833 |
| verifyEmployment | past 1 day | 10,208 |
| verifyEmployment | past 1 hour | 744 |
| verifyEmployment | past 10 minutes | 217 |
| verifyEmployment | past 1 minute | 38 |
| enrichMobile | past 30 days | 20,000 |
| enrichMobile | past 7 days | 8,167 |
| enrichMobile | past 1 day | 2,042 |
| enrichMobile | past 1 hour | 149 |
| enrichMobile | past 10 minutes | 43 |
| enrichMobile | past 1 minute | 8 |
The example means that in any given minute no more than eight enrichMobile jobs will be dispatched, and excess jobs will remain in the queue. If five jobs have been dispatched in the past minute, three more jobs can be dispatched from the queue, unless this number would exceed the quota in any other time window.
Assume your application hasn't used any enrichMobile jobs and now requests 145 jobs at once. Within the first minute, the pipeline will dispatch eight of them immediately, leaving 137 jobs in the queue because the past 1 minute quota is exhausted.
11:03 (8 jobs dispatched) - Jobs in queue: 137
| Job type | Metric | Limit | Consumed | Available |
|---|---|---|---|---|
| enrichMobile | past 1 hour | 149 | 8 | 141 |
| enrichMobile | past 10 minutes | 43 | 8 | 35 |
| enrichMobile | past 1 minute | 8 | 8 | 0 |
When a quota's token bucket is empty (like the past 1 minute bucket here), no more jobs of that type are dispatched. After 60 seconds, the tokens for that window are replenished.
As noted earlier, each rate is calculated separately and the pipeline could still dispatch 38 verifyEmployment jobs because there were none dispatched recently.
11:04 (Tokens replenished) - Jobs in queue: 137
| Job type | Metric | Limit | Consumed | Available |
|---|---|---|---|---|
| enrichMobile | past 1 hour | 149 | 8 | 141 |
| enrichMobile | past 10 minutes | 43 | 8 | 35 |
| enrichMobile | past 1 minute | 8 | 0 | 8 |
Another eight jobs are dispatched, leaving 129 in the queue.
11:04 (8 more jobs dispatched) - Jobs in queue: 129
| Job type | Metric | Limit | Consumed | Available |
|---|---|---|---|---|
| enrichMobile | past 1 hour | 149 | 16 | 133 |
| enrichMobile | past 10 minutes | 43 | 16 | 27 |
| enrichMobile | past 1 minute | 8 | 8 | 0 |
This repeats. After 5 minutes, 40 jobs have been dispatched. The past 10 minutes bucket is now becoming the limiting factor.
11:08 (Tokens replenished) - Jobs in queue: 105
| Job type | Metric | Limit | Consumed | Available |
|---|---|---|---|---|
| enrichMobile | past 1 hour | 149 | 40 | 109 |
| enrichMobile | past 10 minutes | 43 | 40 | 3 |
| enrichMobile | past 1 minute | 8 | 0 | 8 |
The pipeline will now dispatch only three jobs. For the next few minutes, the past 10 minutes quota will be the bottleneck. Once the earliest dispatches are more than 10 minutes old, the available rate will increase again.
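The walkthrough can be reproduced with a small simulation. This is a simplified model, not the pipeline's actual implementation: it assumes discrete one-minute ticks, that all 145 jobs are enqueued at tick 0, and that a dispatch leaves a window of W minutes exactly W ticks after it happened. The real pipeline may replenish quota on a finer schedule.

```python
from collections import deque


def simulate_dispatch(queued, limits, minutes):
    """Return the number of jobs dispatched at each one-minute tick.

    `limits` maps a window length in minutes to the maximum number of
    dispatches allowed inside that rolling window.
    """
    history = deque()  # (tick, count) pairs of past dispatches
    longest = max(limits)
    schedule = []
    for tick in range(minutes):
        # Forget dispatches that have left even the longest window.
        while history and history[0][0] <= tick - longest:
            history.popleft()
        available = queued
        for window, limit in limits.items():
            consumed = sum(n for t, n in history if t > tick - window)
            available = min(available, limit - consumed)
        dispatched = max(available, 0)
        if dispatched:
            history.append((tick, dispatched))
        queued -= dispatched
        schedule.append(dispatched)
    return schedule


# enrichMobile limits from the example: 1 minute, 10 minutes, 1 hour.
limits = {1: 8, 10: 43, 60: 149}
schedule = simulate_dispatch(145, limits, 40)
print(schedule[:11])  # prints: [8, 8, 8, 8, 8, 3, 0, 0, 0, 0, 8]
```

Under these assumptions, five ticks dispatch eight jobs each, the sixth tick dispatches only three, and dispatching pauses until the first batch is more than 10 minutes old.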
Metrics of the job dispatch rate
These are the quota metrics enforced by the job pipeline.
| Job Type | Time Window | Metric Name |
|---|---|---|
| enrichLinkedIn | Past 30 days | wox.modigie.io/enrichLinkedInDispatchedJobsP30D |
| enrichLinkedIn | Past 7 days | wox.modigie.io/enrichLinkedInDispatchedJobsP7D |
| enrichLinkedIn | Past 1 day | wox.modigie.io/enrichLinkedInDispatchedJobsP1D |
| enrichLinkedIn | Past 1 hour | wox.modigie.io/enrichLinkedInDispatchedJobsPT1H |
| enrichLinkedIn | Past 10 minutes | wox.modigie.io/enrichLinkedInDispatchedJobsPT10M |
| enrichLinkedIn | Past 1 minute | wox.modigie.io/enrichLinkedInDispatchedJobsPT1M |
| enrichMobile | Past 30 days | wox.modigie.io/enrichMobileDispatchedJobsP30D |
| enrichMobile | Past 7 days | wox.modigie.io/enrichMobileDispatchedJobsP7D |
| enrichMobile | Past 1 day | wox.modigie.io/enrichMobileDispatchedJobsP1D |
| enrichMobile | Past 1 hour | wox.modigie.io/enrichMobileDispatchedJobsPT1H |
| enrichMobile | Past 10 minutes | wox.modigie.io/enrichMobileDispatchedJobsPT10M |
| enrichMobile | Past 1 minute | wox.modigie.io/enrichMobileDispatchedJobsPT1M |
| verifyEmployment | Past 30 days | wox.modigie.io/verifyEmploymentDispatchedJobsP30D |
| verifyEmployment | Past 7 days | wox.modigie.io/verifyEmploymentDispatchedJobsP7D |
| verifyEmployment | Past 1 day | wox.modigie.io/verifyEmploymentDispatchedJobsP1D |
| verifyEmployment | Past 1 hour | wox.modigie.io/verifyEmploymentDispatchedJobsPT1H |
| verifyEmployment | Past 10 minutes | wox.modigie.io/verifyEmploymentDispatchedJobsPT10M |
| verifyEmployment | Past 1 minute | wox.modigie.io/verifyEmploymentDispatchedJobsPT1M |
| verifyMobile | Past 30 days | wox.modigie.io/verifyMobileDispatchedJobsP30D |
| verifyMobile | Past 7 days | wox.modigie.io/verifyMobileDispatchedJobsP7D |
| verifyMobile | Past 1 day | wox.modigie.io/verifyMobileDispatchedJobsP1D |
| verifyMobile | Past 1 hour | wox.modigie.io/verifyMobileDispatchedJobsPT1H |
| verifyMobile | Past 10 minutes | wox.modigie.io/verifyMobileDispatchedJobsPT10M |
| verifyMobile | Past 1 minute | wox.modigie.io/verifyMobileDispatchedJobsPT1M |
Job queue depth
This is the maximum number of jobs of a certain type that can be in the pipeline in a created, validated, or enqueued status at any given time. For a diagram of all states, see Life cycle of a job.
If your application publishes job requests at a higher rate than jobs are dispatched, the queue depth will increase. As described earlier, the rate at which jobs are dispatched into the real-time engine for processing is limited by your Job dispatch rate.
Exceeding Queue Depth
When the job queue depth limit is reached, the system will not reject new jobs. However, you will experience significantly increased processing latency until the queue size is reduced. We strongly recommend monitoring your queue depth and pausing job submission if the limit is approached.
- wox.modigie.io/enrichLinkedInJobQueueDepth: 100,000,000 enqueued enrichLinkedIn jobs per pipeline
- wox.modigie.io/enrichMobileJobQueueDepth: 100,000,000 enqueued enrichMobile jobs per pipeline
- wox.modigie.io/verifyEmploymentJobQueueDepth: 100,000,000 enqueued verifyEmployment jobs per pipeline
- wox.modigie.io/verifyMobileJobQueueDepth: 100,000,000 enqueued verifyMobile jobs per pipeline
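To act on the recommendation to monitor queue depth, an application can project how quickly the queue grows and pause submission before the limit is reached. The helpers below are a hypothetical sketch: how you obtain the current depth is not covered here, and the 90% threshold is an assumed safety margin rather than a Modigie recommendation.

```python
QUEUE_DEPTH_LIMIT = 100_000_000  # per job type and pipeline, from the list above


def minutes_until_limit(depth, publish_per_min, dispatch_per_min,
                        limit=QUEUE_DEPTH_LIMIT):
    """Estimate minutes until the queue reaches its limit.

    Returns None when the queue is not growing.
    """
    growth = publish_per_min - dispatch_per_min
    if growth <= 0:
        return None
    return max(0, limit - depth) / growth


def should_pause(depth, limit=QUEUE_DEPTH_LIMIT, threshold=0.9):
    """Suggest pausing submission once depth reaches a fraction of the limit."""
    return depth >= threshold * limit
```

For example, publishing 1,008 jobs per minute against a dispatch rate of 8 per minute grows the queue by 1,000 jobs per minute, so an empty queue would reach the limit after about 100,000 minutes.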