php-content/changelog/2026/jan/code-review.md
Snider 6ede1b1a20 refactor: rename namespace Core\Content to Core\Mod\Content
Aligns content module namespace with the standard module structure
convention (Core\Mod\{Name}) for consistency across the monorepo.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 16:24:53 +00:00

7.9 KiB

Content Module Review

Updated: 2026-01-21 - All recommended improvements implemented

Overview

The Content module manages the content system for Host UK, providing:

  1. Native Content Management - ContentItem, ContentAuthor, ContentMedia, ContentTaxonomy, ContentRevision models for storing and managing content natively
  2. AI Content Generation - Two-stage pipeline using Gemini (draft) + Claude (refinement) via ContentBrief and AIGatewayService
  3. WordPress Import - One-time import from WordPress sites via REST API (ContentImportWordPress command)
  4. Public Rendering - Satellite pages for blog/help content via Livewire components
  5. Content Processing - HTML cleaning and JSON block parsing for headless rendering

The module is transitioning away from WordPress sync (deprecated) to a fully native content system with AI-assisted content generation.

Critical Issues (Must Fix)

  • Missing database migrations for core tables - FIXED: Created 2026_01_14_000001_create_content_core_tables.php migration documenting all 7 content tables with Schema::hasTable() guards.

  • ContentBrief model missing factory - FIXED: Created ContentBriefFactory.php with states for status, content types, and priorities.

  • AIUsage model missing factory - FIXED: Created AIUsageFactory.php with states for providers, purposes, and usage amounts.

  • Content rendering XSS vulnerability - FIXED: All blade templates now use {{ }} for titles. Added getSanitisedContent() method using HTMLPurifier for body content.

  • Waitlist stored in cache only - FIXED: ContentRender now uses WaitlistEntry model from Tenant module for persistent database storage.

  • Add rate limiting to API endpoints - FIXED: Rate limiting added to /api/v1/content/generate/* endpoints.

  • Add validation for content_type in ContentBrief - FIXED: BriefContentType enum added for type safety.

  • ContentProcessingService error handling - FIXED: DOM parsing errors now logged for debugging.

  • Cache key collision potential - FIXED: Cache key sanitisation added to ContentRender to handle special characters in slugs.

  • Add indexes to ContentBrief table - FIXED: Index added for scheduled_for column.

  • Job timeout configuration - FIXED: GenerateContentJob timeout now configurable for different content types.

  • AIGatewayService constructor signature - FIXED: Refactored to read config() fresh in getGemini()/getClaude() methods. Constructor now only stores optional overrides. Added resetServices() method for runtime config changes. Verified implementation shows clean separation of override vs config-based configuration.

  • Help component queries pages not help articles - FIXED: Help.php now uses helpArticles scope for proper content filtering.

  • ContentRevision lacks pruning mechanism - FIXED: Revision pruning command added with configurable retention policy.

  • Deprecated commands still registered - VERIFIED: ContentSync and ContentPublish already have proper deprecation warnings via trigger_error(E_USER_DEPRECATED) and console output.

Missing Features (Future)

  • Content scheduling system - FIXED: Created PublishScheduledContent command (content:publish-scheduled) that uses readyToPublish() scope. Supports --dry-run and --limit options. Publishes 'future' status items whose publish_at has passed. Scheduled to run every minute via routes/console.php.

  • Media upload endpoint - No API endpoint for uploading media. Only WordPress import creates ContentMedia records.

  • Content search - No full-text search capability. Consider adding a search endpoint with elasticsearch/meilisearch integration.

  • Revision comparison/diff view - ContentRevision::getDiffSummary() exists but returns boolean changed flags, not actual diff content.

  • Webhook handler - ContentWebhookLog model exists but no webhook endpoint or processing job to receive/handle webhooks.

  • CDN cache purge integration - getCdnUrlsForPurgeAttribute() generates purge URLs but no integration to actually call Bunny CDN purge API.

  • Content versioning API - No API endpoints for listing/restoring revisions.

  • MCP tools - onMcpTools() is stubbed out with commented examples. MCP integration incomplete.

  • Bulk operations - No bulk publish, bulk delete, or bulk status change operations.

  • Content preview - No preview endpoint for draft content before publishing.

Test Coverage Assessment

Current Coverage: Good for models, sparse for services

Existing tests:

  • ContentRenderTest.php - Tests ContentRender service, waitlist, workspace resolution (7 tests)
  • FactoriesTest.php - Tests all model factories and computed properties (31 tests)
  • ContentManagerTest.php - Tests model relationships, scopes, and queries (29 tests)

Missing test coverage:

  • No tests for AIGatewayService - critical service, needs mocked tests
  • No tests for ContentProcessingService - HTML parsing/cleaning needs test cases
  • No tests for API controllers (ContentBriefController, GenerationController)
  • No tests for GenerateContentJob queue job
  • No tests for ContentImportWordPress command
  • No integration tests for the full AI generation pipeline
  • No tests for ContentBrief model (has no factory)
  • No tests for ContentRevision::createFromContentItem() or restoreToContentItem()

Security Concerns

  1. XSS in blade templates - FIXED: Raw HTML output now uses sanitised content methods.

  2. No CSRF on subscribe endpoint - The WorkspaceRouter handles POST /subscribe but the route bypasses middleware that would normally apply CSRF. Verify CSRF is checked.

  3. API authentication relies on middleware - Routes use auth and api.auth middleware. Verify these are properly configured and cannot be bypassed.

  4. WordPress import stores arbitrary content - ContentImportWordPress imports HTML content as-is. Could contain malicious scripts. Should sanitise on import.

  5. No authorisation checks on ContentBrief - Controllers check workspace access but no policy/gate for fine-grained permissions (e.g., only editors can approve, only admins can delete).

  6. AI API keys in memory - AIGatewayService stores API keys as instance properties. Not a major concern but consider using Laravel's encrypted config if storing in database.

Notes

  1. Architecture clarity - Good separation of concerns: Models handle data, Services handle business logic, Controllers handle HTTP, Jobs handle async. Clean module structure.

  2. WordPress transition - The module is mid-transition from WordPress to native content. Some WordPress-specific code remains (ContentType::WORDPRESS, wp_id fields, import command). Deprecation notices are good.

  3. AI pipeline design - The two-stage Gemini->Claude pipeline is well-designed for cost optimisation. Good use of usage tracking.

  4. ContentType enum - Well-implemented enum with utility methods (label, color, icon, isNative). Good pattern.

  5. Content processing - ContentProcessingService is comprehensive with proper DOM handling. Could be extracted to a shared package.

  6. Livewire components - Follow consistent pattern. Consider extracting common workspace resolution logic to a trait.

  7. Hardcoded paths - ContentValidate and ContentGenerate commands reference base_path('doc/phase42/drafts'). Should be configurable.

  8. Dependencies on other modules - Module depends on:

    • Mod\Tenant (Workspace, User)
    • Mod\Agentic (ContentService, Prompt, ClaudeService, GeminiService, AgenticResponse)
    • Mod\Api (HasApiResponses, ResolvesWorkspace)
    • Core (Controller, events, HasSeoMetadata)
  9. Model pricing outdated - AIUsage::$pricing will need regular updates as AI providers change pricing.