Borg/pkg
google-labs-jules[bot] c7e3ba297f feat: PDF metadata extraction
This commit introduces a new feature to extract and index metadata from collected PDF files.

The following changes have been made:
- Added a new `pdf` command with a `metadata` subcommand to extract metadata from a single PDF file.
- Added a new `extract-metadata` command to extract metadata from all PDF files within a given archive and create an `INDEX.json` file.
- Added a `--extract-pdf-metadata` flag to the `collect website` command to extract metadata from downloaded PDF files.
- Created a new `pdf` package to encapsulate the PDF metadata extraction logic, which uses the `pdfinfo` command from the `poppler-utils` package.
- Added unit tests for the new `pdf` package, including mocking the `pdfinfo` command.
- Modified `Taskfile.yml` to install `poppler-utils` as a dependency.

Co-authored-by: Snider <631881+Snider@users.noreply.github.com>
2026-02-02 00:46:59 +00:00
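
The commit message above describes a `pdf` package that shells out to `pdfinfo` and whose unit tests mock that command. A minimal sketch of that shape follows. It is not the actual Borg code: `pdfinfoRunner`, `runPDFInfo`, `parsePDFInfo`, and `extractMetadata` are hypothetical names chosen for illustration. The sketch assumes `pdfinfo` prints one `Key: value` pair per line.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"strings"
)

// pdfinfoRunner is a hypothetical seam: anything that returns pdfinfo-style
// output for a path. Tests can pass a fake instead of running the real binary.
type pdfinfoRunner func(path string) ([]byte, error)

// runPDFInfo calls the real `pdfinfo` binary (from poppler-utils) on one file.
func runPDFInfo(path string) ([]byte, error) {
	return exec.Command("pdfinfo", path).Output()
}

// parsePDFInfo turns "Key: value" lines into a map. It splits on the first
// colon only, so values that contain colons (such as timestamps) stay intact.
// Lines without a colon or with an empty key are skipped.
func parsePDFInfo(output string) map[string]string {
	meta := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		key, value, found := strings.Cut(scanner.Text(), ":")
		if !found {
			continue
		}
		key = strings.TrimSpace(key)
		if key == "" {
			continue
		}
		meta[key] = strings.TrimSpace(value)
	}
	return meta
}

// extractMetadata runs the given runner and parses whatever it returns.
func extractMetadata(path string, run pdfinfoRunner) (map[string]string, error) {
	out, err := run(path)
	if err != nil {
		return nil, fmt.Errorf("pdfinfo %s: %w", path, err)
	}
	return parsePDFInfo(string(out)), nil
}

func main() {
	// A fake runner stands in for the real command, as a unit test might do.
	fake := func(string) ([]byte, error) {
		return []byte("Title:          Example Report\nPages:          12\n"), nil
	}
	meta, err := extractMetadata("example.pdf", fake)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(meta["Title"], meta["Pages"])
}
```

Passing `runPDFInfo` instead of the fake would invoke the installed `pdfinfo` binary. Injecting the runner as a function value is one common way to make an external command mockable; the real package may do this differently.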
| Name | Last commit | Date |
| --- | --- | --- |
| compress | feat: Add _Good, _Bad, and _Ugly tests | 2025-11-14 10:36:35 +00:00 |
| console | feat: Add Borg Console and release workflow | 2025-12-27 02:32:31 +00:00 |
| datanode | Improve test coverage for datanode and tim packages, and fix cmd tests | 2025-11-23 18:58:32 +00:00 |
| github | feat: Add _Good, _Bad, and _Ugly tests | 2025-11-14 10:36:35 +00:00 |
| logger | feat: Improve test coverage and refactor for testability | 2025-11-03 18:25:04 +00:00 |
| mocks | feat: Improve test coverage and refactor for testability | 2025-11-03 16:31:26 +00:00 |
| pdf | feat: PDF metadata extraction | 2026-02-02 00:46:59 +00:00 |
| player | feat: v3 streaming with LTHN rolling keys and configurable cadence | 2026-01-12 16:01:59 +00:00 |
| pwa | feat: Add ChaCha20-Poly1305 encryption and decryption for TIM files (.stim), enhance CLI for encryption format handling (stim), and include metadata inspection support | 2025-12-26 01:25:03 +00:00 |
| smsg | feat: adaptive bitrate streaming (ABR) for HLS-style encrypted video | 2026-01-13 15:40:15 +00:00 |
| stmf | feat: Add STMF form encryption and SMSG secure message packages | 2025-12-27 00:49:07 +00:00 |
| tarfs | feat: Add ChaCha20-Poly1305 encryption and decryption for TIM files (.stim), enhance CLI for encryption format handling (stim), and include metadata inspection support | 2025-12-26 01:25:03 +00:00 |
| tim | feat: Add ChaCha20-Poly1305 encryption and decryption for TIM files (.stim), enhance CLI for encryption format handling (stim), and include metadata inspection support | 2025-12-26 01:25:03 +00:00 |
| trix | feat: Add ChaCha20-Poly1305 encryption and decryption for TIM files (.stim), enhance CLI for encryption format handling (stim), and include metadata inspection support | 2025-12-26 01:25:03 +00:00 |
| ui | feat: Improve test coverage and refactor for testability | 2025-11-03 18:25:04 +00:00 |
| vcs | feat: Add _Good, _Bad, and _Ugly tests | 2025-11-14 10:36:35 +00:00 |
| wasm/stmf | feat: adaptive bitrate streaming (ABR) for HLS-style encrypted video | 2026-01-13 15:40:15 +00:00 |
| website | feat: Add _Good, _Bad, and _Ugly tests | 2025-11-14 10:36:35 +00:00 |