This commit introduces two key improvements to the application:
1. **Authenticated GitHub API Access:** The GitHub client now uses a personal access token (PAT) from the `GITHUB_TOKEN` environment variable if it is available. This increases the rate limit for GitHub API requests, making the tool more robust for users who need to collect a large number of repositories.
2. **Structured Logging:** The application now uses the standard library's `slog` package for structured logging. A `--verbose` flag has been added to the root command to control the log level, allowing for more detailed output when needed. This makes the application's output more consistent and easier to parse.
This commit addresses several issues identified in a code review to improve the overall quality and robustness of the application.
Key changes include:
- Refactored `cmd.Execute()` to return an error instead of calling `os.Exit`, making the application more testable.
- Fixed critical issues in `cmd/main_test.go`, including renaming `TestMain` to avoid conflicts and removing the brittle E2E test.
- Improved the GitHub API client in `pkg/github/github.go` by:
- Fixing a resource leak where an HTTP response body was not being closed.
- Restoring a parameterized function to improve testability.
- Adding support for `context.Context` and API pagination for robustness.
- Updated the `.github/workflows/go.yml` CI workflow to use the `Taskfile.yml` for building and testing, ensuring consistency.
- Added a `test` task to `Taskfile.yml`.
- Ran `go mod tidy` and fixed several unused import errors.
This commit fixes a critical build issue where the application was being compiled as an archive instead of an executable. This was caused by the absence of a `main` package.
The following changes have been made to resolve this and improve the development process:
- A `main.go` file has been added to the root of the project to serve as the application's entry point.
- A `Taskfile.yml` has been introduced to standardize the build, run, and testing processes.
- The build process has been corrected to produce a runnable binary.
- An end-to-end test (`TestE2E`) has been added to the test suite. This test builds the application and runs it with the `--help` flag to ensure the binary is always executable, preventing similar build regressions in the future.
This commit resolves a merge conflict by restructuring the `collect` command and its subcommands. The `collect git` and `collect github-release` commands have been consolidated under the new `collect github` command to provide a more organized and user-friendly command structure.
This commit also fixes a failing test in the `vcs` package by adding the necessary git config to the test setup.
Finally, this commit introduces the `collect github release` command, which allows downloading assets from the latest GitHub release of a repository. The command supports version checking, specific file downloads, and packing all assets into a DataNode.
This commit introduces a new command `collect github-release` that allows downloading assets from the latest GitHub release of a repository.
The command supports the following features:
- Downloading a specific file from the release using the `--file` flag.
- Downloading all assets from the release and packing them into a DataNode using the `--pack` flag.
- Specifying an output directory for the downloaded files using the `--output` flag.
This commit also includes a project-wide refactoring of the Go module path to `github.com/Snider/Borg` to align with Go's module system best practices.
This commit introduces several maintenance improvements to the repository.
- A `go.work` file has been added to define the workspace and make the project easier to work with.
- The module path in `go.mod` has been updated to use a GitHub URL, and all import paths have been updated accordingly.
- `examples` and `docs` directories have been created.
- The `examples` directory contains scripts that demonstrate the tool's functionality.
- The `docs` directory contains documentation for the project.
- Tests have been added to the `pkg/github` package following the `_Good`, `_Bad`, `_Ugly` convention.
- The missing `pkg/borg` package has been added to resolve a build error.
This commit introduces a new `collect website` command that recursively downloads a website to a specified depth.
- A new `pkg/website` package contains the logic for the recursive download.
- A new `pkg/ui` package provides a progress bar for long-running operations, which is used by the website downloader.
- The `collect pwa` subcommand has been restored to be PWA-specific.
This commit refactors the repository collection functionality to use the new `DataNode` package instead of the old `trix` package.
The `collect` and `all` commands have been updated to use the new `vcs` package, which clones Git repositories and packages them into a `DataNode`. The `trix` package and its related commands (`cat`, `ingest`) have been removed.
This commit introduces a new `DataNode` package, which provides an in-memory, `fs.FS`-compatible filesystem with a `debme`-like interface. The `DataNode` can be serialized to and from a TAR archive, making it suitable for storing downloaded assets.
The `pwa` and `serve` commands have been refactored to use the `DataNode`. The `pwa` command now packages downloaded PWA assets into a `DataNode` and saves it as a `.dat` file. The `serve` command loads a `.dat` file into a `DataNode` and serves its contents.
This commit introduces two new commands: `pwa` and `serve`.
The `pwa` command downloads a Progressive Web Application (PWA) from a given URL. It discovers the PWA's manifest, downloads the assets referenced in the manifest (start URL and icons), and packages them into a single `.tar` file.
The `serve` command takes a `.tar` file created by the `pwa` command and serves its contents using a standard Go HTTP file server. It unpacks the tarball into an in-memory filesystem, making it a self-contained and efficient way to host the downloaded PWA.
This commit introduces the core functionality of the Borg Data Collector.
- Adds the `collect` command to clone a single Git repository and store it in a Trix cube.
- Adds the `collect all` command to clone all public repositories from a GitHub user or organization.
- Implements the Trix cube as a `tar` archive.
- Adds the `ingest` command to add files to a Trix cube.
- Adds the `cat` command to extract files from a Trix cube.
- Integrates Borg-themed status messages for a more engaging user experience.