Use `tokenBucket()` / `token_bucket()` for AI endpoints. The `requested` parameter can be set proportional to actual model token usage, which links rate limiting directly to cost. A token bucket also allows short bursts while enforcing an average rate.
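
To make the mechanism concrete, here is a minimal in-memory sketch of a token bucket in Python. It is not the library's implementation: the `TokenBucket` class, its `allow()` method, and the `capacity`, `refill_rate`, `interval`, and `clock` parameters are hypothetical names chosen for illustration. Only the idea of a `requested` amount comes from the text above.

```python
import time


class TokenBucket:
    """Illustrative token bucket (hypothetical; not a real library's API)."""

    def __init__(self, capacity, refill_rate, interval, clock=time.monotonic):
        self.capacity = capacity        # most tokens the bucket can hold (burst size)
        self.refill_rate = refill_rate  # tokens added per interval (average rate)
        self.interval = interval        # interval length in seconds
        self.clock = clock              # injectable clock, handy for testing
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self.last_refill = clock()

    def _refill(self):
        # Add tokens in proportion to elapsed time, never exceeding capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        gained = elapsed * self.refill_rate / self.interval
        self.tokens = min(self.capacity, self.tokens + gained)
        self.last_refill = now

    def allow(self, requested=1):
        # Deduct `requested` tokens if available; otherwise deny and deduct nothing.
        self._refill()
        if requested <= self.tokens:
            self.tokens -= requested
            return True
        return False


if __name__ == "__main__":
    # Assumed budget: bursts of up to 5000 model tokens, 1000 refilled per minute.
    bucket = TokenBucket(capacity=5000, refill_rate=1000, interval=60)
    # Charge each call by the model tokens it consumed, not by request count.
    print(bucket.allow(requested=3200))  # fits within the initial burst
    print(bucket.allow(requested=3200))  # denied until enough tokens refill
```

Charging `requested` by model tokens rather than by request count means one expensive completion drains the bucket as much as many cheap ones, while `capacity` bounds the burst and `refill_rate / interval` sets the long-run average.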

Apr 16, 2026