This commit introduces a configurable rate-limiting system for all HTTP requests made by the application.
Key features include:
- A token bucket algorithm for rate limiting.
- Per-domain configuration via a YAML file (`--rate-config`).
- Wildcard domain matching (e.g., `*.archive.org`).
- Dynamic adjustments based on `429` responses and `Retry-After` headers.
- New CLI flags (`--rate-limit`, `--burst`) for on-the-fly configuration.
I began by creating a new `http` package to centralize the rate-limiting logic. I then integrated this package into the `website` and `github` collectors, ensuring that all outgoing HTTP requests are subject to the new rate-limiting rules.
Throughout the implementation, I added comprehensive unit and integration tests to validate the new functionality. This process also uncovered several pre-existing issues in the test suite, which I have now fixed. These fixes include:
- Correcting mock implementations for `http.Client` and `vcs.GitCloner`.
- Updating outdated function signatures in tests and examples.
- Resolving missing dependencies and syntax errors in test files.
- Stabilizing flaky tests.
Co-authored-by: Snider <631881+Snider@users.noreply.github.com>
Refactored the existing tests to use the `_Good`, `_Bad`, and `_Ugly`
testing convention. This provides a more structured approach to testing
and ensures that a wider range of scenarios are covered, including
valid inputs, invalid inputs, and edge cases.
In addition to refactoring the tests, this change also includes several
bug fixes that were uncovered by the new tests. These fixes improve the
robustness and reliability of the codebase.
The following packages and commands were affected:
- `pkg/datanode`
- `pkg/compress`
- `pkg/github`
- `pkg/matrix`
- `pkg/pwa`
- `pkg/vcs`
- `pkg/website`
- `cmd/all`
- `cmd/collect`
- `cmd/collect_github_repo`
- `cmd/collect_website`
- `cmd/compile`
- `cmd/root`
- `cmd/run`
- `cmd/serve`
This commit introduces a significant refactoring of the `cmd` package to improve testability and increases test coverage across the application.
Key changes include:
- Refactored Cobra commands to use `RunE` for better error handling and testing.
- Extracted business logic from command handlers into separate, testable functions.
- Added comprehensive unit tests for the `cmd`, `compress`, `github`, `logger`, and `pwa` packages.
- Added tests for missing command-line arguments, as requested.
- Implemented the `borg all` command to clone all public repositories for a GitHub user or organization.
- Restored and improved the `collect pwa` functionality.
- Removed duplicate code and fixed various bugs.
- Addressed a resource leak in the `all` command.
- Improved error handling in the `pwa` package.
- Refactored `main.go` to remove duplicated logic.
- Fixed several other minor bugs and inconsistencies.
This commit introduces a significant refactoring of the `cmd` package to improve testability and increases test coverage across the application.
Key changes include:
- Refactored Cobra commands to use `RunE` for better error handling and testing.
- Extracted business logic from command handlers into separate, testable functions.
- Added comprehensive unit tests for the `cmd`, `compress`, `github`, `logger`, and `pwa` packages.
- Added tests for missing command-line arguments, as requested.
- Implemented the `borg all` command to clone all public repositories for a GitHub user or organization.
- Restored and improved the `collect pwa` functionality.
- Removed duplicate code and fixed various bugs.
This commit fixes a critical build issue where the application was being compiled as an archive instead of an executable. This was caused by the absence of a `main` package.
The following changes have been made to resolve this and improve the development process:
- A `main.go` file has been added to the root of the project to serve as the application's entry point.
- A `Taskfile.yml` has been introduced to standardize the build, run, and testing processes.
- The build process has been corrected to produce a runnable binary.
- An end-to-end test (`TestE2E`) has been added to the test suite. This test builds the application and runs it with the `--help` flag to ensure the binary is always executable, preventing similar build regressions in the future.
This commit introduces several maintenance improvements to the repository.
- A `go.work` file has been added to define the workspace and make the project easier to work with.
- The module path in `go.mod` has been updated to use a GitHub URL, and all import paths have been updated accordingly.
- `examples` and `docs` directories have been created.
- The `examples` directory contains scripts that demonstrate the tool's functionality.
- The `docs` directory contains documentation for the project.
- Tests have been added to the `pkg/github` package following the `_Good`, `_Bad`, `_Ugly` convention.
- The missing `pkg/borg` package has been added to resolve a build error.