This commit introduces a configurable rate-limiting system for all HTTP requests made by the application. Key features include:

- A token bucket algorithm for rate limiting.
- Per-domain configuration via a YAML file (`--rate-config`).
- Wildcard domain matching (e.g., `*.archive.org`).
- Dynamic adjustments based on `429` responses and `Retry-After` headers.
- New CLI flags (`--rate-limit`, `--burst`) for on-the-fly configuration.

I began by creating a new `http` package to centralize the rate-limiting logic. I then integrated this package into the `website` and `github` collectors, ensuring that all outgoing HTTP requests are subject to the new rate-limiting rules.

Throughout the implementation, I added comprehensive unit and integration tests to validate the new functionality. This process also uncovered several pre-existing issues in the test suite, which I have now fixed. These fixes include:

- Correcting mock implementations for `http.Client` and `vcs.GitCloner`.
- Updating outdated function signatures in tests and examples.
- Resolving missing dependencies and syntax errors in test files.
- Stabilizing flaky tests.

Co-authored-by: Snider <631881+Snider@users.noreply.github.com>
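The wildcard domain matching mentioned in the commit message could work roughly as sketched below. This is an illustration only, not the code from this commit: `matchDomain` and `rateFor` are hypothetical names, the rule table is a plain map rather than the parsed `--rate-config` YAML, and the choices that `*.archive.org` does not match the bare `archive.org` and that the longest wildcard wins are assumptions.

```go
package main

import "strings"

// matchDomain reports whether host matches pattern (hypothetical helper).
// A pattern such as "*.archive.org" matches any subdomain of archive.org but,
// by assumption, not archive.org itself. Other patterns must equal host.
func matchDomain(pattern, host string) bool {
	pattern, host = strings.ToLower(pattern), strings.ToLower(host)
	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[1:] // e.g. ".archive.org"
		// Require at least one character before the suffix.
		return len(host) > len(suffix) && strings.HasSuffix(host, suffix)
	}
	return pattern == host
}

// rateFor picks a requests-per-second value for host (hypothetical helper).
// An exact entry wins; otherwise the longest matching wildcard is used;
// otherwise fallback is returned. Keys in rules are assumed to be lowercase.
func rateFor(rules map[string]float64, host string, fallback float64) float64 {
	host = strings.ToLower(host)
	if r, ok := rules[host]; ok {
		return r
	}
	best, bestLen := fallback, 0
	for pattern, r := range rules {
		if strings.HasPrefix(pattern, "*.") && matchDomain(pattern, host) && len(pattern) > bestLen {
			best, bestLen = r, len(pattern)
		}
	}
	return best
}
```

Preferring the longest wildcard makes the lookup independent of map iteration order, which Go deliberately leaves unspecified.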
28 lines
544 B
Go
package http

import (
	"context"
	"golang.org/x/time/rate"
)

// Limiter wraps a token-bucket rate.Limiter so that its rate can be
// adjusted dynamically at run time.
type Limiter struct {
	limiter *rate.Limiter
}

// NewLimiter creates a Limiter that allows r events per second with
// bursts of up to b events.
func NewLimiter(r rate.Limit, b int) *Limiter {
	return &Limiter{
		limiter: rate.NewLimiter(r, b),
	}
}

// Wait blocks until a token is available from the bucket. It returns an
// error if ctx is canceled or the wait would exceed its deadline.
func (l *Limiter) Wait(ctx context.Context) error {
	return l.limiter.Wait(ctx)
}

// SetLimit changes the sustained rate to r. The burst size is unchanged.
func (l *Limiter) SetLimit(r rate.Limit) {
	l.limiter.SetLimit(r)
}