Rate Limiting Best Practices

Rate limiting is a fundamental aspect of working with any API. Understanding why rate limits exist, how they work, and how to work effectively within them is crucial for building reliable integrations with monday.com.

Why Rate Limiting Exists

Rate limiting serves two primary purposes:

1. Protecting the API Infrastructure

Without rate limits, a single client could overwhelm the API with requests, degrading performance for all users. Rate limits ensure fair distribution of resources and prevent accidental (or intentional) abuse.

Think of it like a highway: without speed limits and traffic control, congestion would make the road unusable for everyone. Rate limits are the "traffic control" of APIs.

2. Encouraging Efficient API Usage

Rate limits incentivize developers to write efficient queries. Instead of making 100 requests for individual items, you're encouraged to batch them into a single request. This benefits both you (fewer network round-trips) and the API (fewer requests to process).
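
For example, a sketch of the batching idea, using the same client calls as this guide's later examples:

ruby
# Inefficient: one request per item (100 round-trips)
item_ids.each do |id|
  client.item.query(ids: [id], select: ['id', 'name'])
end

# Efficient: a single batched request for all items
client.item.query(ids: item_ids, select: ['id', 'name'])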

Monday.com's Rate Limiting Strategy

Unlike many APIs that use simple requests-per-second limits, monday.com uses a complexity budget system. This approach is both more sophisticated and fairer.

Complexity Budget Model

Each monday.com account gets a complexity budget that regenerates over time:

  • Budget regenerates at a fixed rate (e.g., 100,000 points per minute)
  • Each query consumes points based on its complexity
  • Simple queries cost few points; complex queries cost many
  • When the budget is depleted, requests are rate limited until it regenerates

Why complexity-based? It's fairer. A simple query that fetches one field shouldn't cost the same as a complex query joining multiple resources. Complexity budgets reward efficient queries.

Example:

ruby
# Low complexity (~10 points)
client.board.query(
  ids: [12345],
  select: ['id', 'name']
)

# High complexity (~500 points)
client.board.query(
  ids: [12345],
  select: [
    'id', 'name', 'description',
    { 'groups' => ['id', 'title', { 'items' => ['id', 'name', 'column_values'] }] }
  ]
)

The second query traverses multiple relationships and returns much more data, so it costs more.

Complexity Calculation and Query Cost

Understanding query cost helps you stay within your complexity budget.

Factors That Increase Complexity:

  1. Nested relationships: Each level of nesting adds cost

    ruby
    # Low cost
    select: ['id', 'name']
    
    # Medium cost
    select: ['id', 'name', { 'items' => ['id', 'name'] }]
    
    # High cost
    select: ['id', { 'groups' => ['id', { 'items' => ['id', { 'column_values' => ['id', 'text'] }] }] }]
  2. Number of fields: More fields = higher cost

    ruby
    # Lower cost
    select: ['id', 'name']
    
    # Higher cost
    select: ['id', 'name', 'description', 'state', 'board_kind', 'permissions', 'created_at', 'updated_at']
  3. Number of results: Fetching 100 items costs more than fetching 10

    ruby
    # Lower cost
    client.item.query(ids: [123], limit: 10)
    
    # Higher cost
    client.item.query(ids: [123], limit: 100)
  4. Computed fields: Fields that require calculation are more expensive

Estimating Query Cost

monday.com's GraphQL API can include complexity information in responses when you request it. You can log this to understand your queries:

ruby
response = client.board.query(ids: [board_id])
# Complexity appears in the response only when the query requests the
# complexity field; adjust the dig path to your client's response shape.
complexity = response.dig('data', 'complexity')
logger.info("Query complexity: #{complexity}")
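
If your client exposes raw GraphQL, you can also ask the API to report a query's cost directly via the complexity field (a sketch; field names follow monday.com's public GraphQL schema):

ruby
# Requests the query's cost alongside the data
query = <<~GRAPHQL
  query {
    complexity { before after query }
    boards(ids: 12345) { id name }
  }
GRAPHQL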

Rule of thumb: Start simple and add fields incrementally. Monitor which queries cause rate limiting and optimize those.

Rate Limiting Strategies: Proactive vs Reactive

There are two fundamental approaches to handling rate limits:

Reactive Strategy (Handle Errors)

Wait until you hit the rate limit, then back off:

ruby
def fetch_with_reactive_limiting
  begin
    client.board.query(ids: [board_id])
  rescue Monday::ComplexityError => e
    # Rate limited - wait and retry
    sleep(60)
    retry
  end
end

Pros:

  • Simple to implement
  • No complexity tracking needed
  • Maximizes throughput when under limit

Cons:

  • Requests fail and must be retried
  • Unpredictable latency (sudden delays when limit hit)
  • Can create a thundering herd if multiple processes retry simultaneously

Proactive Strategy (Track and Throttle)

Track your complexity usage and throttle before hitting the limit:

ruby
class ComplexityTracker
  def initialize(budget_per_minute: 100_000)
    @budget = budget_per_minute
    @used = 0
    @window_start = Time.now
  end

  def track_request(estimated_cost)
    reset_if_new_window

    if @used + estimated_cost > @budget
      wait_time = 60 - (Time.now - @window_start)
      sleep(wait_time) if wait_time > 0
      reset_window
    end

    @used += estimated_cost
  end

  private

  def reset_if_new_window
    if Time.now - @window_start >= 60
      reset_window
    end
  end

  def reset_window
    @used = 0
    @window_start = Time.now
  end
end

Pros:

  • Predictable latency (no sudden rate limit errors)
  • Better for user experience (no failed requests)
  • More efficient (no wasted retry attempts)

Cons:

  • Complex to implement
  • Requires tracking state
  • May be overly conservative (wasting budget)

Hybrid Strategy (Best of Both)

Use proactive throttling with reactive fallback:

ruby
def fetch_with_hybrid_limiting
  tracker.track_request(100)  # Estimated complexity for this query

  begin
    client.board.query(ids: [board_id])
  rescue Monday::ComplexityError => e
    # Reactive fallback if estimation was wrong
    logger.warn("Hit rate limit despite throttling")
    sleep(60)
    retry
  end
end

This combines the predictability of proactive throttling with the safety net of reactive handling.

Exponential Backoff: Why It Works

When you do hit a rate limit, exponential backoff is the gold standard retry strategy.

Linear Backoff (Don't Use)

ruby
# Bad: Linear backoff
attempts = 0
begin
  client.board.query(ids: [board_id])
rescue Monday::ComplexityError
  attempts += 1
  raise if attempts >= retry_count
  sleep(attempts * 5)  # 5s, 10s, 15s, 20s
  retry
end

Problem: If many clients are rate-limited simultaneously (common during outages), they all retry at similar intervals, creating synchronized "waves" of requests that re-trigger the rate limit.

Exponential Backoff (Use This)

ruby
# Good: Exponential backoff
attempts = 0
begin
  client.board.query(ids: [board_id])
rescue Monday::ComplexityError
  attempts += 1
  raise if attempts >= retry_count
  sleep(2 ** (attempts - 1))  # 1s, 2s, 4s, 8s, 16s, 32s
  retry
end

Why it works:

  1. Backs off aggressively: Delays grow quickly, giving the API (and your budget) time to recover
  2. Disperses retries: Different processes retry at different times
  3. Self-limiting: Long delays naturally limit retry attempts

Adding Jitter (Even Better)

ruby
# Best: Exponential backoff with jitter
attempts = 0
begin
  client.board.query(ids: [board_id])
rescue Monday::ComplexityError
  attempts += 1
  raise if attempts >= retry_count
  base_delay = 2 ** (attempts - 1)
  jitter = rand(0..base_delay * 0.1)  # Add 0-10% randomness
  sleep(base_delay + jitter)
  retry
end

Jitter adds randomness to prevent synchronized retries. If 100 clients are rate-limited at the same moment, jitter ensures they don't all retry at exactly the same time.

Queuing Requests: Benefits and Trade-offs

For high-volume integrations, queuing requests can smooth out traffic and prevent rate limiting.

Basic Queue Pattern

ruby
class MondayRequestQueue
  def initialize(requests_per_minute: 60)
    @queue = Queue.new
    @rate = requests_per_minute
    start_worker
  end

  def enqueue(request)
    @queue.push(request)
  end

  private

  def start_worker
    Thread.new do
      loop do
        request = @queue.pop
        execute_request(request)
        sleep(60.0 / @rate)  # Throttle to stay under limit
      end
    end
  end

  def execute_request(request)
    request.call
  rescue Monday::Error => e
    handle_error(e, request)  # App-specific: log, re-enqueue, or drop
  end
end

Benefits:

  1. Smooth traffic: Requests sent at steady rate, not bursts
  2. Automatic throttling: Queue ensures you never exceed rate limit
  3. Resilience: Failed requests can be re-queued
  4. Prioritization: Implement priority queues for urgent requests (see the sketch below)
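
A minimal priority sketch, assuming two tiers and the same worker loop as MondayRequestQueue above (the :high/:low tier names are illustrative):

ruby
class PriorityRequestQueue
  def initialize
    @high = Queue.new
    @low = Queue.new
  end

  def enqueue(request, priority: :low)
    (priority == :high ? @high : @low).push(request)
  end

  # Worker calls this: urgent requests jump ahead of bulk work.
  # Note: if both queues are empty, this blocks on @low; fine for a sketch.
  def next_request
    @high.pop(true)  # Non-blocking pop raises ThreadError when empty
  rescue ThreadError
    @low.pop
  end
end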

Trade-offs:

  1. Added latency: Requests wait in queue before execution
  2. Complexity: Requires queue management, monitoring, error handling
  3. Memory usage: Large queues consume memory
  4. Lost requests: Queue contents are lost if the process crashes (use a persistent queue like Sidekiq/Redis)

When to Use:

  • High volume: Processing hundreds/thousands of requests per hour
  • Background jobs: Non-interactive operations where latency is acceptable
  • Batch operations: Syncing large datasets

When Not to Use:

  • Interactive requests: Users waiting for immediate responses
  • Low volume: Simple applications with occasional API calls
  • Real-time needs: Time-sensitive operations where queuing delay is unacceptable

Caching Responses: When It Helps

Caching API responses can dramatically reduce your rate limit consumption.

What to Cache

Good candidates:

  • Reference data (board schemas, column definitions)
  • Slow-changing data (board names, user lists)
  • Frequently accessed data (current user info)

Poor candidates:

  • Real-time data (item status updates)
  • User-specific data (for multi-user apps)
  • Large datasets (memory constraints)

Cache Implementation

ruby
def get_board_schema(board_id)
  cache_key = "monday_board_schema_#{board_id}"

  Rails.cache.fetch(cache_key, expires_in: 1.hour) do
    client.board.query(
      ids: [board_id],
      select: ['id', 'name', { 'columns' => ['id', 'title', 'type'] }]
    )
  end
end

TTL Considerations

Choosing the right Time-To-Live (TTL) is an art:

Short TTL (minutes):

  • Use for: Moderately dynamic data (item counts, recent updates)
  • Pro: More accurate data
  • Con: More API calls, less rate limit savings

Medium TTL (hours):

  • Use for: Slowly changing data (board configuration, user lists)
  • Pro: Balance between freshness and efficiency
  • Con: Data can be stale for parts of the day

Long TTL (days):

  • Use for: Static reference data (workspace structure, column types)
  • Pro: Maximum rate limit savings
  • Con: Stale data if structure changes

Indefinite (manual invalidation):

  • Use for: Truly static data
  • Pro: Zero API calls for cached data
  • Con: Must invalidate on changes (complex)
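
As a sketch, these tiers can be encoded as a simple lookup table (the category names and durations are illustrative, using Rails.cache as in the example above):

ruby
# Hypothetical TTL tiers; tune the durations to your freshness requirements
CACHE_TTLS = {
  item_counts: 5.minutes,    # short: moderately dynamic
  board_schema: 1.hour,      # medium: slowly changing
  column_types: 1.day        # long: static reference data
}.freeze

def cached_fetch(kind, key, &block)
  Rails.cache.fetch("monday_#{kind}_#{key}", expires_in: CACHE_TTLS.fetch(kind), &block)
end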

Cache Invalidation

The hard part of caching is knowing when to invalidate:

ruby
def update_board(board_id, attributes)
  result = client.board.update(board_id: board_id, **attributes)

  # Invalidate cache after update
  Rails.cache.delete("monday_board_schema_#{board_id}")
  Rails.cache.delete("monday_board_items_#{board_id}")

  result
end

Cache invalidation strategies:

  1. Time-based (TTL): Simplest, works for most use cases
  2. Event-based: Invalidate when data changes (requires tracking)
  3. Versioned keys: Include version in cache key, bump on change (sketched after this list)
  4. Background refresh: Refresh cache before expiry (always fresh, no cache misses)
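
A minimal sketch of the versioned-key strategy, using a timestamp as the version so the cache store needs no increment support:

ruby
# Reading the version; a new timestamp is minted the first time
def board_cache_version(board_id)
  Rails.cache.fetch("monday_board_version_#{board_id}") { Time.now.to_i }
end

# Bumping the version orphans all old keys, which expire via their TTLs
def bump_board_cache_version(board_id)
  Rails.cache.write("monday_board_version_#{board_id}", Time.now.to_i)
end

def cached_board_schema(board_id)
  version = board_cache_version(board_id)
  Rails.cache.fetch("monday_board_schema_v#{version}_#{board_id}", expires_in: 1.hour) do
    client.board.query(ids: [board_id], select: ['id', 'name'])
  end
end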

Optimizing Query Complexity

The best way to avoid rate limiting is to reduce query complexity.

Technique 1: Request Only Needed Fields

ruby
# Bad: Fetching everything (high complexity)
client.board.query(
  ids: [board_id],
  select: ['id', 'name', 'description', 'state', 'board_kind',
           'permissions', { 'groups' => ['id', 'title', { 'items' => ['id', 'name'] }] }]
)

# Good: Fetching only what's needed (low complexity)
client.board.query(
  ids: [board_id],
  select: ['id', 'name', { 'groups' => ['id'] }]
)

Every field has a cost. Only request fields you actually use.

Technique 2: Pagination Over Large Queries

ruby
# Bad: Fetch all items at once (very high complexity)
client.item.query_by_board(
  board_id: board_id,
  limit: 1000
)

# Good: Paginate in smaller chunks (distributed complexity)
page = 1
loop do
  items = client.item.query_by_board(
    board_id: board_id,
    limit: 25,
    page: page
  )

  break if items.empty?
  process_items(items)
  page += 1
  sleep(0.5)  # Small delay between pages
end

Pagination spreads complexity over time, staying within your budget.

Technique 3: Batch Requests

ruby
# Bad: Multiple queries (high total complexity)
boards.each do |board_id|
  client.board.query(ids: [board_id])
end

# Good: Single batched query (lower total complexity)
client.board.query(ids: board_ids)

monday.com's API supports fetching multiple resources in one query. Use it.

Technique 4: Denormalize When Possible

If you frequently need the same data, consider storing it locally:

ruby
# Instead of querying monday.com every time
def get_item_status(item_id)
  client.item.query(ids: [item_id], select: ['status'])
end

# Store status locally and sync periodically
class Item < ApplicationRecord
  def self.sync_statuses
    items = client.item.query(ids: Item.pluck(:monday_id), select: ['id', 'status'])
    items.each do |item_data|
      Item.find_by(monday_id: item_data['id'])&.update(status: item_data['status'])
    end
  end
end

Trade-off: Data staleness vs. API efficiency. Choose based on your freshness requirements.

Monitoring Rate Limit Usage

You can't optimize what you don't measure.

What to Monitor

  1. Rate limit errors: How often are you hitting the limit?
  2. Complexity per query: Which queries are most expensive?
  3. Total complexity: How much of your budget are you using?
  4. Retry frequency: How many retries are needed?

Monitoring Implementation

ruby
class MondayApiMonitor
  def self.track_request(query, complexity, duration)
    StatsD.increment('monday.api.requests')
    StatsD.gauge('monday.api.complexity', complexity)
    StatsD.timing('monday.api.duration', duration)
  end

  def self.track_rate_limit_error
    StatsD.increment('monday.api.rate_limit_errors')
    alert_if_threshold_exceeded
  end

  def self.alert_if_threshold_exceeded
    error_rate = get_error_rate  # App-specific: read from your metrics store
    if error_rate > 0.05  # Alert if >5% of requests rate limited
      notify_team("Monday API rate limit errors elevated: #{error_rate}")
    end
  end
  private_class_method :alert_if_threshold_exceeded
end

Setting Alerts

Configure alerts for:

  • High error rate: >5% of requests rate limited
  • Approaching budget: Using >80% of complexity budget
  • Sudden spikes: Complexity usage increases >50% hour-over-hour

Early warning allows you to optimize before users are impacted.
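
As a sketch of the budget alert, assuming you record complexity per window as with ComplexityTracker above (the 100,000-point budget and notify_team helper are illustrative):

ruby
# Hypothetical check: warn when a window consumes more than 80% of the budget
def check_budget_usage(used_this_window, budget: 100_000)
  usage = used_this_window.to_f / budget
  notify_team("Monday API complexity budget at #{(usage * 100).round}%") if usage > 0.8
end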

Distributed Rate Limiting Challenges

In distributed systems (multiple servers/processes), rate limiting becomes complex.

The Problem

Each process doesn't know what the others are doing:

ruby
# Process 1
client.board.query(...)  # Uses 1000 complexity points

# Process 2 (simultaneously)
client.board.query(...)  # Also uses 1000 complexity points

# Combined: 2000 points consumed, but neither process knows

If each process thinks it has the full budget, they'll collectively exceed the limit.

Solution 1: Centralized Rate Limiter

Use Redis to track shared complexity budget:

ruby
class DistributedRateLimiter
  def initialize(redis, budget_per_minute: 100_000)
    @redis = redis
    @budget = budget_per_minute
  end

  def acquire(cost)
    key = "monday_complexity:#{Time.now.to_i / 60}"  # Per-minute key

    @redis.watch(key)
    used = @redis.get(key).to_i

    if used + cost > @budget
      @redis.unwatch
      return false  # Budget exceeded
    end

    # MULTI returns nil if the watched key changed (another process won the race)
    result = @redis.multi do |tx|
      tx.incrby(key, cost)
      tx.expire(key, 120)  # Expire after 2 minutes
    end

    !result.nil?
  end
end

# Usage
until rate_limiter.acquire(estimated_cost)
  sleep(60)  # Wait for the next budget window
end

client.board.query(...)

Solution 2: Partition Budget

Divide complexity budget among processes:

ruby
# If you have 4 worker processes
process_budget = TOTAL_BUDGET / 4

# Each process tracks its own portion
tracker = ComplexityTracker.new(budget_per_minute: process_budget)

Trade-off: May underutilize budget if some processes are idle while others are busy.

Solution 3: Centralized Queue

Use a centralized queue (Sidekiq, etc.) with a single worker:

ruby
# All processes enqueue requests
MondayRequestJob.perform_async(board_id: board_id)

# Single worker processes queue at controlled rate
class MondayRequestJob
  include Sidekiq::Job
  # Note: throttle options require a throttling extension such as the
  # sidekiq-throttler gem; plain Sidekiq ignores this option
  sidekiq_options throttle: { threshold: 60, period: 1.minute }

  def perform(board_id:)
    client.board.query(ids: [board_id])
  end
end

This naturally serializes requests and prevents distributed rate limiting issues.

Key Takeaways

  1. Understand complexity budgets: monday.com uses complexity, not simple request counts
  2. Be proactive: Track usage and throttle before hitting limits
  3. Use exponential backoff: When rate limited, back off exponentially with jitter
  4. Cache strategically: Cache slow-changing data with appropriate TTLs
  5. Optimize queries: Request only needed fields, paginate large datasets
  6. Monitor actively: Track complexity usage, error rates, and set alerts
  7. Queue for scale: Use queues for high-volume, distributed systems
  8. Test your limits: Understand your actual complexity budget through monitoring

Rate limiting isn't an obstacle; it's a design constraint that encourages efficient, scalable API usage. Work with it, not against it.