docs: update documentation from implemented plans

Add new pages: scheduled-actions, studio, plug, uptelligence.
Update: go-blockchain, go-devops, go-process, mcp, lint, docs engine.
Update nav and indexes.

Co-Authored-By: Virgil <virgil@lethean.io>

2026-03-14 08:09:17 +00:00

9.9 KiB


Studio Multimedia Pipeline

Studio is a CorePHP module that orchestrates video remixing, transcription, voice synthesis, and image generation by dispatching GPU work to remote services. It separates creative decisions (LEM/Ollama) from mechanical execution (ffmpeg, Whisper, TTS, ComfyUI).

Architecture

Studio is a job orchestrator, not a renderer. All GPU-intensive work runs on remote Docker services accessed over HTTP.

Studio Module (CorePHP)
  ├── Livewire UI (asset browser, remix form, voice, thumbnails)
  ├── Artisan Commands (CLI)
  └── API Routes (/api/studio/*)
        │
  Actions (CatalogueAsset, GenerateManifest, RenderManifest, etc.)
        │
  Redis Job Queue
        │
        ├── Ollama (LEM) ─────── Creative decisions, scripts, manifests
        ├── Whisper ───────────── Speech-to-text transcription
        ├── Kokoro TTS ────────── Voiceover generation
        ├── ffmpeg Worker ─────── Video rendering from manifests
        └── ComfyUI ──────────── Image generation, thumbnails

Smart/Dumb Separation

LEM produces JSON manifests (the creative layer). ffmpeg and GPU services consume them mechanically (the execution layer). Neither side knows about the other's internals — the manifest format is the contract.

Module Structure

The Studio module lives at app/Mod/Studio/ and follows standard CorePHP patterns:

app/Mod/Studio/
├── Boot.php                    # Lifecycle events (API, Console, Web)
├── Actions/
│   ├── CatalogueAsset.php      # Ingest files, extract metadata
│   ├── TranscribeAsset.php     # Send to Whisper, store transcript
│   ├── GenerateManifest.php    # Brief + library → LEM → manifest JSON
│   ├── RenderManifest.php      # Dispatch manifest to ffmpeg worker
│   ├── SynthesiseSpeech.php    # Text → TTS → audio file
│   ├── GenerateVoiceover.php   # Script → voiced audio for remix
│   ├── GenerateImage.php       # Prompt → ComfyUI → image
│   ├── GenerateThumbnail.php   # Asset → thumbnail image
│   └── BatchRemix.php          # Queue multiple remix jobs
├── Console/
│   ├── Catalogue.php           # studio:catalogue — batch ingest
│   ├── Transcribe.php          # studio:transcribe — batch transcription
│   ├── Remix.php               # studio:remix — brief in, video out
│   ├── Voice.php               # studio:voice — text-to-speech
│   ├── Thumbnail.php           # studio:thumbnail — generate thumbnails
│   └── BatchRemixCommand.php   # studio:batch-remix — queue batch jobs
├── Controllers/Api/
│   ├── AssetController.php     # GET/POST /api/studio/assets
│   ├── RemixController.php     # POST /api/studio/remix
│   ├── VoiceController.php     # POST /api/studio/voice
│   └── ImageController.php     # POST /api/studio/images/thumbnail
├── Models/
│   ├── StudioAsset.php         # Multimedia asset with metadata
│   └── StudioJob.php           # Job tracking (status, manifest, output)
├── Livewire/
│   ├── AssetBrowserPage.php    # Browse/search/tag assets
│   ├── RemixPage.php           # Remix form + job status
│   ├── VoicePage.php           # Voice synthesis interface
│   └── ThumbnailPage.php       # Thumbnail generator
└── Routes/
    ├── api.php                 # REST API endpoints
    └── web.php                 # Livewire page routes

Asset Cataloguing

Assets are multimedia files (video, image, audio) tracked in the studio_assets table with metadata including duration, resolution, tags, and transcripts.

Ingesting Assets

use Mod\Studio\Actions\CatalogueAsset;

// From an uploaded file
$asset = CatalogueAsset::run($uploadedFile, ['summer', 'beach']);

// From an existing storage path
$asset = CatalogueAsset::run('studio/raw/clip-001.mp4', ['interview']);

Only video/*, image/*, and audio/* MIME types are accepted.
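The MIME rule above can be pictured as a simple prefix check. The following is a minimal sketch only: `isAcceptedMime` is a hypothetical helper name, and how CatalogueAsset actually detects and rejects types is not shown in this document.

```php
<?php

/**
 * Hypothetical helper: true when a MIME type belongs to one of the three
 * families the catalogue accepts (video/*, image/*, audio/*).
 */
function isAcceptedMime(string $mime): bool
{
    $mime = strtolower(trim($mime));

    foreach (['video/', 'image/', 'audio/'] as $prefix) {
        // Require a subtype after the slash, so a bare "video/" is rejected.
        if (str_starts_with($mime, $prefix) && strlen($mime) > strlen($prefix)) {
            return true;
        }
    }

    return false;
}

var_dump(isAcceptedMime('video/mp4'));       // bool(true)
var_dump(isAcceptedMime('application/pdf')); // bool(false)
```

A check like this would typically run before any metadata extraction, so unsupported uploads are turned away early.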

CLI Batch Ingest

php artisan studio:catalogue /path/to/media --tags=summer,promo

Querying Assets

use Mod\Studio\Models\StudioAsset;

// By type
$videos = StudioAsset::videos()->get();
$images = StudioAsset::images()->get();
$audio = StudioAsset::audio()->get();

// By tag
$summer = StudioAsset::tagged('summer')->get();

Transcription

Transcription sends assets to a Whisper service and stores the returned text and detected language.

use Mod\Studio\Actions\TranscribeAsset;

$asset = TranscribeAsset::run($asset);

echo $asset->transcript;           // "Hello and welcome..."
echo $asset->transcript_language;  // "en"

If the file is missing or the Whisper request fails, the action does not throw: it returns the asset unchanged, so callers should check whether transcript was populated before relying on it.
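The "return unchanged on failure" behaviour can be illustrated without Laravel. This is a sketch under assumptions, not the module's code: `transcribeOrUnchanged` is a hypothetical function, the asset is a plain array standing in for a StudioAsset, and the file check and Whisper call are injected as callables. The `text` and `language` response keys match the faked response used in the Testing section.

```php
<?php

/**
 * Hypothetical sketch: attempt a transcription and fall back to the
 * original asset whenever the file is missing or the call fails.
 */
function transcribeOrUnchanged(array $asset, callable $fileExists, callable $callWhisper): array
{
    if (!$fileExists($asset['path'])) {
        return $asset; // missing file: nothing to send
    }

    try {
        $response = $callWhisper($asset['path']);
    } catch (Throwable $e) {
        return $asset; // API failure: swallow the error, return unchanged
    }

    if (!is_array($response) || !isset($response['text'])) {
        return $asset; // unusable response body
    }

    $asset['transcript'] = $response['text'];
    $asset['transcript_language'] = $response['language'] ?? null;

    return $asset;
}

// Usage: a failing Whisper call leaves the asset exactly as it was.
$asset = ['path' => 'studio/raw/clip-001.mp4', 'transcript' => null];
$failed = transcribeOrUnchanged(
    $asset,
    fn () => true,
    function () { throw new RuntimeException('Whisper unavailable'); }
);
var_dump($failed === $asset); // bool(true)
```

Because failure is silent, a caller that needs the transcript has to test for it explicitly, for example by checking that `transcript` is not null.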

CLI Batch Transcription

php artisan studio:transcribe

Manifest-Driven Remixing

The remix pipeline has two stages: manifest generation (creative) and rendering (mechanical).

Generating Manifests

use Mod\Studio\Actions\GenerateManifest;

$job = GenerateManifest::run(
    brief: 'Create a 15-second upbeat TikTok from the summer footage',
    template: 'tiktok-15s',
);

// $job->manifest contains the JSON manifest

The action collects all video assets from the library, sends them as context to Ollama along with the brief, and parses the returned JSON manifest.
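A rough sketch of the two ends of that exchange follows. It assumes the request goes to Ollama's `/api/generate` endpoint with `format` set to `json`; the prompt wording, the asset fields sent as context, and the function names `buildManifestRequest` and `parseManifest` are all hypothetical, since the document does not show how GenerateManifest builds its prompt.

```php
<?php

/**
 * Hypothetical: build a request body for Ollama's /api/generate endpoint.
 * The prompt text and the shape of $assets are illustrative assumptions.
 */
function buildManifestRequest(string $brief, array $assets, array $template, string $model = 'lem-4b'): array
{
    $prompt = "Brief: {$brief}\n"
        . 'Template: ' . json_encode($template) . "\n"
        . 'Available video assets: ' . json_encode($assets) . "\n"
        . 'Reply with a JSON manifest only.';

    return [
        'model'  => $model,
        'prompt' => $prompt,
        'format' => 'json', // ask Ollama to constrain the output to JSON
        'stream' => false,  // one response object instead of streamed chunks
    ];
}

/**
 * Hypothetical: decode the model's text output into a manifest array,
 * or return null when it is not valid JSON with a "clips" list.
 */
function parseManifest(string $raw): ?array
{
    $decoded = json_decode($raw, true);

    if (!is_array($decoded) || !isset($decoded['clips']) || !is_array($decoded['clips'])) {
        return null;
    }

    return $decoded;
}

// Usage with made-up asset metadata.
$request = buildManifestRequest(
    'Create a 15-second upbeat TikTok from the summer footage',
    [['id' => 42, 'duration_ms' => 12000, 'tags' => ['summer']]],
    ['duration' => 15, 'width' => 1080, 'height' => 1920, 'fps' => 30]
);
echo json_encode($request, JSON_PRETTY_PRINT), "\n";

var_dump(parseManifest('not json') === null); // bool(true)
```

Treating an unparseable reply as null rather than an exception keeps the decision about retries with the caller.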

Manifest Format

{
  "clips": [
    {"asset_id": 42, "start_ms": 3200, "end_ms": 8100, "effects": ["fade_in"]},
    {"asset_id": 17, "start_ms": 0, "end_ms": 5500, "effects": ["crossfade"]}
  ],
  "audio": {"track": "original"},
  "voiceover": {"script": "Summer vibes only", "voice": "default", "volume": 0.8},
  "overlays": [
    {"type": "image", "asset_id": 5, "at": 0.5, "duration": 3.0, "position": "bottom-right", "opacity": 0.8}
  ]
}
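Since the manifest is the contract between the creative and execution layers, it can be worth checking its shape before rendering. The sketch below is an assumption about what such a check might look like; `validateClips` and `totalClipMs` are hypothetical helpers, and the document does not state which validation rules the module applies.

```php
<?php

/**
 * Hypothetical validator for the "clips" list. Returns human-readable
 * problems; an empty list means the clips look structurally sound.
 */
function validateClips(array $manifest): array
{
    $clips = $manifest['clips'] ?? [];
    $errors = [];

    if ($clips === []) {
        $errors[] = 'manifest has no clips';
    }

    foreach ($clips as $i => $clip) {
        if (!isset($clip['asset_id'], $clip['start_ms'], $clip['end_ms'])) {
            $errors[] = "clip {$i}: missing asset_id, start_ms or end_ms";
            continue;
        }
        if ($clip['start_ms'] < 0 || $clip['end_ms'] <= $clip['start_ms']) {
            $errors[] = "clip {$i}: start_ms must be >= 0 and end_ms must exceed start_ms";
        }
    }

    return $errors;
}

/** Sum of clip lengths in milliseconds, ignoring any transition overlap. */
function totalClipMs(array $manifest): int
{
    $total = 0;
    foreach ($manifest['clips'] as $clip) {
        $total += $clip['end_ms'] - $clip['start_ms'];
    }

    return $total;
}

// Usage with the two clips from the example manifest.
$manifest = json_decode('{"clips": [
    {"asset_id": 42, "start_ms": 3200, "end_ms": 8100, "effects": ["fade_in"]},
    {"asset_id": 17, "start_ms": 0, "end_ms": 5500, "effects": ["crossfade"]}
]}', true);

echo count(validateClips($manifest)), "\n"; // 0
echo totalClipMs($manifest), "\n";          // 10400
```

The total of 10400 ms could then be compared against the chosen template's duration before the job is dispatched.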

Rendering

use Mod\Studio\Actions\RenderManifest;

$job = RenderManifest::run($job);

This dispatches the manifest to the ffmpeg worker service, which renders the video and reports completion through the callback endpoint (POST /api/studio/remix/{id}/callback).

CLI Remix

php artisan studio:remix "Create a relaxing travel montage" --template=tiktok-30s

Voice & TTS

use Mod\Studio\Actions\SynthesiseSpeech;

$audio = SynthesiseSpeech::run(
    text: 'Welcome to our channel',
    voice: 'default',
);

CLI

php artisan studio:voice "Welcome to our channel" --voice=default

Image Generation

Thumbnails and image overlays use ComfyUI:

use Mod\Studio\Actions\GenerateThumbnail;

$thumbnail = GenerateThumbnail::run($asset);

CLI

php artisan studio:thumbnail --asset=42

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/api/studio/assets` | List assets |
| `GET` | `/api/studio/assets/{id}` | Show asset details |
| `POST` | `/api/studio/assets` | Upload/catalogue asset |
| `POST` | `/api/studio/remix` | Submit remix brief |
| `GET` | `/api/studio/remix/{id}` | Poll job status |
| `POST` | `/api/studio/remix/{id}/callback` | Worker completion callback |
| `POST` | `/api/studio/voice` | Submit voice synthesis |
| `GET` | `/api/studio/voice/{id}` | Poll voice job status |
| `POST` | `/api/studio/images/thumbnail` | Generate thumbnail |
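A client that submits a remix and then polls for the result might follow a loop like the one below. This is a sketch only: the response shape is not documented here, so the `status` field and its `completed` and `failed` values are assumptions, `waitForRemix` is a hypothetical name, and the HTTP call is injected as a callable so the example runs without a server.

```php
<?php

/**
 * Hypothetical polling loop for GET /api/studio/remix/{id}.
 * $fetch receives the request path and returns the decoded JSON body.
 * The 'status' values checked here are assumptions.
 */
function waitForRemix(int $jobId, callable $fetch, int $maxAttempts = 30, int $sleepSeconds = 2): ?array
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $job = $fetch("/api/studio/remix/{$jobId}");

        if (in_array($job['status'] ?? null, ['completed', 'failed'], true)) {
            return $job; // terminal state reached
        }

        // Pause between polls, but not after the final attempt.
        if ($attempt < $maxAttempts && $sleepSeconds > 0) {
            sleep($sleepSeconds);
        }
    }

    return null; // gave up: the job never reached a terminal state
}

// Usage with a fake fetcher that reports completion on the third poll.
$calls = 0;
$fake = function (string $path) use (&$calls): array {
    $calls++;
    return ['status' => $calls < 3 ? 'processing' : 'completed'];
};

$job = waitForRemix(42, $fake, 5, 0);
echo $job['status'], ' after ', $calls, " polls\n"; // completed after 3 polls
```

Capping the number of attempts keeps a stuck render from blocking the caller indefinitely.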

GPU Services

All GPU services run as Docker containers and are accessed over HTTP. Their endpoints are configured in config/studio.php; the defaults are:

| Service | Default Endpoint | Purpose |
| --- | --- | --- |
| Ollama | `http://studio-ollama:11434` | Creative decisions via LEM |
| Whisper | `http://studio-whisper:9100` | Speech-to-text |
| Kokoro TTS | `http://studio-tts:9200` | Text-to-speech |
| ffmpeg Worker | `http://studio-worker:9300` | Video rendering |
| ComfyUI | `http://studio-comfyui:8188` | Image generation |

Configuration

// config/studio.php
return [
    'ollama' => [
        'url' => env('STUDIO_OLLAMA_URL', 'http://studio-ollama:11434'),
        'model' => env('STUDIO_OLLAMA_MODEL', 'lem-4b'),
        'timeout' => 60,
    ],
    'whisper' => [
        'url' => env('STUDIO_WHISPER_URL', 'http://studio-whisper:9100'),
        'model' => 'large-v3-turbo',
        'timeout' => 120,
    ],
    'worker' => [
        'url' => env('STUDIO_WORKER_URL', 'http://studio-worker:9300'),
        'timeout' => 300,
    ],
    'storage' => [
        'disk' => 'local',
        'assets_path' => 'studio/assets',
    ],
    'templates' => [
        'tiktok-15s' => ['duration' => 15, 'width' => 1080, 'height' => 1920, 'fps' => 30],
        'tiktok-30s' => ['duration' => 30, 'width' => 1080, 'height' => 1920, 'fps' => 30],
        'youtube-60s' => ['duration' => 60, 'width' => 1920, 'height' => 1080, 'fps' => 30],
    ],
];
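To show how a template entry translates into render parameters, here is a small sketch that derives resolution, orientation, and frame count from the values above. `describeTemplate` is a hypothetical helper, and the templates are copied into a local array so the example runs outside Laravel; inside the application they would presumably be read from the studio config.

```php
<?php

// Copied from the 'templates' section of config/studio.php.
$templates = [
    'tiktok-15s'  => ['duration' => 15, 'width' => 1080, 'height' => 1920, 'fps' => 30],
    'tiktok-30s'  => ['duration' => 30, 'width' => 1080, 'height' => 1920, 'fps' => 30],
    'youtube-60s' => ['duration' => 60, 'width' => 1920, 'height' => 1080, 'fps' => 30],
];

/**
 * Hypothetical helper: derive basic render facts from a template entry.
 * Square frames are reported as landscape in this sketch.
 */
function describeTemplate(array $templates, string $name): array
{
    if (!isset($templates[$name])) {
        throw new InvalidArgumentException("Unknown template: {$name}");
    }

    $t = $templates[$name];

    return [
        'resolution'   => "{$t['width']}x{$t['height']}",
        'orientation'  => $t['height'] > $t['width'] ? 'portrait' : 'landscape',
        'total_frames' => $t['duration'] * $t['fps'],
    ];
}

$spec = describeTemplate($templates, 'tiktok-15s');
echo $spec['resolution'], ' ', $spec['orientation'], ' ', $spec['total_frames'], "\n";
// 1080x1920 portrait 450
```

Rejecting unknown template names up front avoids dispatching a render job that the worker cannot size.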

Livewire UI

Studio provides four Livewire page components:

Asset Browser — browse, search, and tag multimedia assets
Remix Page — enter a creative brief, select template, view job progress
Voice Page — text-to-speech interface
Thumbnail Page — generate thumbnails from assets

Components are registered via the module's Boot class and available under mod.studio.livewire.*.

Testing

Because the GPU services are reached over HTTP, actions can be tested without them by stubbing responses with Laravel's Http::fake():

use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Storage;
use Mod\Studio\Actions\TranscribeAsset;
use Mod\Studio\Models\StudioAsset;

it('transcribes an asset via Whisper', function () {
    Storage::fake('local');
    Storage::disk('local')->put('studio/test.mp4', 'fake-video');

    Http::fake([
        '*/transcribe' => Http::response([
            'text' => 'Hello world',
            'language' => 'en',
        ]),
    ]);

    $asset = StudioAsset::factory()->create(['path' => 'studio/test.mp4']);
    $result = TranscribeAsset::run($asset);

    expect($result->transcript)->toBe('Hello world');
    expect($result->transcript_language)->toBe('en');
});
