This commit introduces two key improvements to the application:
1. **Authenticated GitHub API Access:** The GitHub client now uses a personal access token (PAT) from the `GITHUB_TOKEN` environment variable if it is available. This increases the rate limit for GitHub API requests, making the tool more robust for users who need to collect a large number of repositories.
2. **Structured Logging:** The application now uses the standard library's `slog` package for structured logging. A `--verbose` flag has been added to the root command to control the log level, allowing for more detailed output when needed. This makes the application's output more consistent and easier to parse.
This commit introduces a new `collect website` command that recursively downloads a website to a specified depth.
- A new `pkg/website` package contains the logic for the recursive download.
- A new `pkg/ui` package provides a progress bar for long-running operations, which is used by the website downloader.
- The `collect pwa` subcommand has been restored to be PWA-specific.
This commit refactors the repository collection functionality to use the new `DataNode` package instead of the old `trix` package.
The `collect` and `all` commands have been updated to use the new `vcs` package, which clones Git repositories and packages them into a `DataNode`. The `trix` package and its related commands (`cat`, `ingest`) have been removed.
This commit introduces the core functionality of the Borg Data Collector.
- Adds the `collect` command to clone a single Git repository and store it in a Trix cube.
- Adds the `collect all` command to clone all public repositories from a GitHub user or organization.
- Implements the Trix cube as a `tar` archive.
- Adds the `ingest` command to add files to a Trix cube.
- Adds the `cat` command to extract files from a Trix cube.
- Integrates Borg-themed status messages for a more engaging user experience.