Rate limiting is the technique of capping the number of requests a single identifier (IP address, API key, user account) can make against a service within a time window. Common algorithms are fixed window, sliding window, token bucket, and leaky bucket; once the cap is hit, services typically return HTTP 429 Too Many Requests with a Retry-After header telling the client when to try again. Rate limits protect backends from accidental abuse (a runaway client retry loop), deliberate abuse (credential stuffing, scraping, denial of service), and noisy-neighbor effects on shared infrastructure.
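To make the token bucket concrete, here is a minimal sketch in Python (class and parameter names are illustrative, not taken from any particular tool): each request spends one token, and tokens refill at a steady rate up to a burst capacity.

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full so initial bursts succeed
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller would respond with 429 Too Many Requests
```

A bucket with `rate=1, capacity=2` admits two back-to-back requests, rejects the third, and admits another after about a second of refill.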
In a self-hosting context
Most self-hostable applications expose at least a basic per-IP rate limit on auth endpoints; Mattermost, Nextcloud, and n8n all do. For finer-grained control, push rate limiting up into the reverse proxy: nginx ships the limit_req module, Traefik provides a RateLimit middleware, and Caddy supports rate limiting via a plugin. A self-hosted CDN or edge layer standing in for Cloudflare is the right place to absorb attack-scale traffic before it ever reaches the app. See Reverse proxy for proxy-level rate limiting.
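As one proxy-level example, a sketch of per-IP limiting on a login endpoint with nginx's limit_req module (the zone name, path, and rates here are placeholders to adapt, not recommended values):

```nginx
# 10 MB shared zone keyed by client IP, refilling at 5 requests/minute.
limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;

server {
    location /login {
        # Queue up to 3 excess requests, reject the rest immediately.
        limit_req zone=login burst=3 nodelay;
        # Return 429 instead of nginx's default 503 for rejected requests.
        limit_req_status 429;
        proxy_pass http://app_backend;
    }
}
```

nginx's limit_req implements the leaky-bucket algorithm, so this also smooths bursts rather than only capping totals per window.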