feat: Complete feature gap analysis audit

This commit adds a new file, AUDIT-FEATURES.md, which contains a thorough audit comparing dapp-fm's features against similar data collection tools.

The audit focuses on:
- Missing core features
- Competitive advantages
- Integration opportunities
- User workflow gaps

The comparison includes wget/curl, HTTrack, ArchiveBox, SingleFile, and rclone. This audit will help guide future development and strategic decisions.

Co-authored-by: Snider <631881+Snider@users.noreply.github.com>
google-labs-jules[bot] 2026-02-02 01:18:09 +00:00
parent cf2af53ed3
commit df6d841148

AUDIT-FEATURES.md Normal file

@@ -0,0 +1,60 @@
# Feature Audit: dapp-fm vs. Competitors
This audit compares the features of `dapp-fm` against popular data collection and archiving tools.
## Feature Comparison Matrix
| Feature | dapp-fm | wget/curl | HTTrack | ArchiveBox | SingleFile | rclone |
| ---------------------------- | ------- | --------- | ------- | ---------- | ---------- | ------ |
| **General** | | | | | | |
| Target | Websites, Git Repos, PWAs | Files, Websites | Websites | Websites | Webpages | Cloud Storage |
| Output Format | datanode, tim, trix, stim | Files | HTML | HTML, WARC, etc. | HTML | Files |
| **Website Archiving** | | | | | | |
| Recursive Download | Yes | Yes (wget) | Yes | Yes | No | N/A |
| Asset Capture (JS, CSS, etc.)| Yes | Yes | Yes | Yes | Yes | N/A |
| MHTML/WARC Output | No | Yes (wget `--warc-file`) | No | Yes | No | N/A |
| Single Page Archive | Yes | Yes | Yes | Yes | Yes | N/A |
| **Data Sources** | | | | | | |
| Git Repositories | Yes | No | No | Yes | No | No |
| GitHub Releases | Yes | No | No | No | No | No |
| Progressive Web Apps (PWAs) | Yes | No | No | No | No | No |
| **Storage & Backend** | | | | | | |
| Cloud Storage Sync | No | No | No | No | No | Yes |
| **Advanced Features** | | | | | | |
| Headless Browser | No | No | No | Yes | Yes | N/A |
| Authentication | No | Yes | Yes | Yes | No | Yes |
| Rate Limiting | No | Yes | Yes | Yes | No | Yes |
| Filtering (Include/Exclude) | No | Yes | Yes | Yes | No | Yes |
| Scheduling | No | No | No | Yes | No | No |
| **Usability** | | | | | | |
| CLI Interface | Yes | Yes | Yes | Yes | No | Yes |
| GUI Interface | No | No | Yes | Yes | Yes (Browser Ext) | No |
## Analysis
### Missing Core Features
* **Headless Browser Rendering:** `dapp-fm` doesn't render pages in a headless browser, which means it may not capture content from single-page applications (SPAs) or websites that rely heavily on JavaScript.
* **Standard Archive Formats:** The tool doesn't export to standard formats like WARC or MHTML, which are widely used in web archiving.
* **Authentication and Rate Limiting:** `dapp-fm` lacks built-in support for handling websites that require logins or have rate limits.
* **Cloud Storage Integration:** Unlike `rclone`, `dapp-fm` cannot sync archives to various cloud storage providers.
* **Scheduling:** There's no built-in mechanism for scheduling recurring captures.
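
To make the rate-limiting gap concrete, here is a minimal sketch of a minimum-interval limiter that a crawler could call before each request. It is illustrative only: `MinIntervalLimiter` and its parameters are hypothetical names and do not reflect `dapp-fm`'s actual code or API.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Enforce a minimum delay between consecutive requests.

    Hypothetical helper for illustration; not part of dapp-fm.
    """

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if min_interval < 0:
            raise ValueError("min_interval must be non-negative")
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        # Earliest time at which the next request may start.
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

A crawler would call `wait()` immediately before each fetch. The clock and sleep functions are injectable so the behaviour can be checked without real delays.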
### Competitive Advantages
* **Diverse Data Sources:** `dapp-fm`'s ability to collect not just websites but also Git repositories and Progressive Web Apps gives it a unique advantage.
* **Proprietary Archiving Formats:** The `.trix` and `.stim` formats, with their encryption and compression capabilities, offer a secure and efficient way to store and share archives.
* **Simplicity and Focus:** `dapp-fm` has a clear focus on collecting specific types of online resources and packaging them into a portable format.
### Integration Opportunities
* **Browser Extension:** A browser extension, similar to `SingleFile`, could streamline the process of capturing single pages.
* **Cloud Storage Providers:** Integrating with services like Amazon S3, Google Cloud Storage, or Dropbox would make it easier for users to store and manage their archives.
* **CI/CD Integration:** `dapp-fm` could be integrated into CI/CD pipelines to automatically archive websites or applications after deployment.
### User Workflow Gaps
* **No GUI:** The lack of a graphical interface makes `dapp-fm` less accessible to non-technical users.
* **Limited Configuration:** The tool has no include/exclude filtering (see the matrix above) and offers few options for things like setting user agents or handling cookies.
* **Post-Archival Management:** `dapp-fm` doesn't provide any tools for managing, searching, or viewing archives after they've been created.
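
As one possible shape for the missing include/exclude filtering, the sketch below decides whether a URL should be captured using shell-style glob patterns. The function name `should_capture` and its parameters are assumptions made for illustration, not an existing `dapp-fm` interface.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def should_capture(
    url: str,
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> bool:
    """Return True if `url` passes the include/exclude glob patterns.

    Hypothetical helper for illustration; not part of dapp-fm.
    Exclude patterns take precedence. An empty include list means
    "include everything that is not excluded".
    """
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    if not include:
        return True
    return any(fnmatchcase(url, pattern) for pattern in include)
```

Letting exclusions win over inclusions keeps the rules easy to reason about: a user can include a whole site section and then carve out file types such as `*.pdf` without worrying about pattern order.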