Administrator/beta-real-debrid-downloader

Sucukdeluxe efa0909e11 feat: Download System v2 — complete rewrite of download pipeline

Replace monolithic download-manager.ts (9500 lines) with 7 focused modules:

- error-classifier.ts: 25+ typed DownloadErrorKind enum, classifier functions
  for network/HTTP/debrid/extraction errors — no more string matching
- retry-manager.ts: Declarative per-error-kind retry policies, exponential
  backoff, shelving after 15 failures, state export/import
- stream-writer.ts: HTTP stream → file with pre-resume validation, stall
  detection, NTFS-aligned buffered writing, Range-ignored detection
- pipeline.ts: Single download lifecycle (unrestrict → stream → verify),
  throws typed errors, caller decides retry strategy
- post-processor.ts: Extraction state machine with hard caps (3 attempts
  per archive, 5 rounds per package), no infinite loops
- scheduler.ts: Queue management with priority-based slot allocation,
  heartbeat stall detection, global watchdog, provider cooldowns
- download-manager.ts: Drop-in orchestrator (~1500 lines), same public API

Fixes:
1. Hanging downloads: heartbeat-based stall detection + global watchdog
2. Wrong error classification: typed enum at point of origin
3. Unreliable resume: file size vs tracker validation, Range-ignored detection
4. Extraction loops: bounded retries with state machine

215 new unit tests for error-classifier and retry-manager (all passing).
Build compiles cleanly. Same IPC interface — UI unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-08 18:14:17 +01:00


Download System v2 — Complete Redesign

Goal

Replace the 9500-line monolithic download-manager.ts with a clean, modular download system that fixes:

Downloads hanging without clean restart
Wrong error classification leading to wrong retry paths
Unreliable resume (corrupt files, unnecessary restarts)
Post-processing (extraction) breaking or looping

Constraints

Same IPC interface — drop-in replacement, no UI changes needed
Same external dependencies (debrid.ts, storage.ts, integrity.ts)
Same session/settings persistence format

Architecture

Module Structure

src/main/download/
├── download-manager.ts      # Orchestrator (~500 lines) — coordination only
├── scheduler.ts             # Queue management, slot allocation, priorities
├── pipeline.ts              # Single download flow: unrestrict → stream → verify
├── stream-writer.ts         # HTTP streaming, resume, buffered writing, NTFS
├── error-classifier.ts      # Typed error system (enums, not string matching)
├── retry-manager.ts         # Central retry logic, backoff, shelving, state
└── post-processor.ts        # Extraction queue, hybrid retry, cleanup

Module Responsibilities

1. download-manager.ts (Orchestrator)

Holds session state, packages, items
Exposes same IPC methods as current (startRun, stopRun, pauseItem, etc.)
Delegates to Scheduler for queue management
Delegates to Pipeline for individual downloads
Delegates to PostProcessor for extraction
Emits same events as current (progress, status changes)
Handles persistence (save/load session)

2. scheduler.ts

findNextItem(): priority-based queue with provider cooldown awareness
fillSlots(): start downloads up to maxParallel
Scheduler loop with generation guard (prevents stale schedulers)
Global stall watchdog
Provider cooldown tracking (circuit breaker)
AllDebrid paced-start / hoster-limit logic
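
A minimal sketch of how priority selection and provider cooldowns might combine in findNextItem(). The QueueItem shape, the ProviderCooldowns helper, the tie-break on queue time, and the explicit `now` parameter are assumptions made for illustration, not the module's actual API.

```typescript
// Hypothetical item shape; the real item type is not shown in this document.
interface QueueItem {
  id: string;
  provider: string;
  priority: number; // higher value = more urgent (assumption)
  queuedAt: number; // ms timestamp, used as a tie-breaker (assumption)
}

// Tracks when each provider becomes usable again (simple circuit breaker).
class ProviderCooldowns {
  private until = new Map<string, number>();

  trip(provider: string, cooldownMs: number, now: number): void {
    const current = this.until.get(provider) ?? 0;
    // Never shorten a cooldown that is already in effect.
    this.until.set(provider, Math.max(current, now + cooldownMs));
  }

  isCooling(provider: string, now: number): boolean {
    return (this.until.get(provider) ?? 0) > now;
  }
}

// Picks the highest-priority item whose provider is not cooling down.
// filter() returns a copy, so the caller's queue is not reordered.
function findNextItem(
  queue: QueueItem[],
  cooldowns: ProviderCooldowns,
  now: number,
): QueueItem | undefined {
  return queue
    .filter((item) => !cooldowns.isCooling(item.provider, now))
    .sort((a, b) => b.priority - a.priority || a.queuedAt - b.queuedAt)[0];
}
```

Passing `now` explicitly instead of calling Date.now() inside keeps the selection logic deterministic in unit tests.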

3. pipeline.ts

runDownload(item, context): single download lifecycle
Step 1: Unrestrict link via debrid service
Step 2: Stream file via StreamWriter
Step 3: Verify integrity (CRC if available)
Step 4: Signal completion
Each step returns typed result or throws typed DownloadError
No retry logic here — just reports what happened
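
The lifecycle above can be sketched roughly as follows. The PipelineDeps interface, the DownloadResult shape, and the string-union stand-in for DownloadErrorKind are assumptions chosen to keep the example self-contained; the real pipeline.ts may be structured differently.

```typescript
// Subset of error kinds, written as a string union so the sketch runs on its own.
type PipelineErrorKind = "UnrestrictFailed" | "Timeout" | "FileCorrupt";

class DownloadError extends Error {
  readonly kind: PipelineErrorKind;
  constructor(kind: PipelineErrorKind, message: string) {
    super(message);
    this.name = "DownloadError";
    this.kind = kind;
  }
}

// Hypothetical dependencies, injected so each step can be mocked in tests.
interface PipelineDeps {
  unrestrict(link: string): Promise<string>; // resolves to a direct URL
  streamToFile(url: string, targetPath: string): Promise<number>; // bytes written
  verify(targetPath: string): Promise<boolean>; // true if the integrity check passes
}

interface DownloadResult {
  targetPath: string;
  bytesWritten: number;
}

// Runs the steps in order. Failures surface as DownloadError;
// no retries happen here, the caller decides what to do next.
async function runDownload(
  item: { link: string; targetPath: string },
  deps: PipelineDeps,
): Promise<DownloadResult> {
  let url: string;
  try {
    url = await deps.unrestrict(item.link);
  } catch (err) {
    throw new DownloadError("UnrestrictFailed", String(err));
  }
  // streamToFile is assumed to throw an already-classified DownloadError itself.
  const bytesWritten = await deps.streamToFile(url, item.targetPath);
  if (!(await deps.verify(item.targetPath))) {
    throw new DownloadError("FileCorrupt", `integrity check failed: ${item.targetPath}`);
  }
  return { targetPath: item.targetPath, bytesWritten };
}
```

Because the function only reports what happened, the same code path serves first attempts and retries alike.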

4. stream-writer.ts

streamToFile(url, targetPath, options): HTTP streaming
Resume support with pre-validation:
- Check existing file size against tracked downloadedBytes
- Truncate if sparse file detected (pre-allocated > actual)
- Send Range header only after validation
HTTP 416 handling (complete vs incomplete)
Server-ignored-range detection (200 instead of 206)
Buffered writing with NTFS 4KB alignment
Sparse file pre-allocation (Windows)
Content-Disposition filename override
Stall detection (configurable timeout, default 10s)
Drain timeout for slow disks (default 5min)
Progress reporting via callback
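
One way the 4KB-aligned buffering could work is sketched below. AlignedWriteBuffer and its methods are hypothetical names, and the actual disk write is left to the caller; this only illustrates splitting incoming chunks into aligned writes.

```typescript
const ALIGN_BYTES = 4096; // alignment unit taken from the "NTFS 4KB alignment" item above

// Accumulates incoming chunks and hands back only whole multiples of 4 KB,
// keeping the remainder until more data arrives or flush() is called.
class AlignedWriteBuffer {
  private pending = new Uint8Array(0);

  // Returns the aligned prefix ready to be written, or null if less than 4 KB is buffered.
  push(chunk: Uint8Array): Uint8Array | null {
    const merged = new Uint8Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);
    const alignedLen = merged.length - (merged.length % ALIGN_BYTES);
    this.pending = merged.slice(alignedLen);
    return alignedLen > 0 ? merged.slice(0, alignedLen) : null;
  }

  // Returns whatever is left at end of stream (may be unaligned).
  flush(): Uint8Array {
    const rest = this.pending;
    this.pending = new Uint8Array(0);
    return rest;
  }
}
```

The sketch copies on every push for clarity; a production buffer would more likely reuse one preallocated block.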

5. error-classifier.ts

DownloadErrorKind enum with all error categories
DownloadError class extending Error with .kind property
classifyError(error, context): takes raw error + context, returns DownloadError
- Classifies at point of origin (HTTP layer, fetch layer, debrid layer)
- No post-hoc string matching needed
classifyHttpStatus(status, headers): HTTP-specific classification
classifyFetchError(error): network-level classification
classifyUnrestrictError(error): debrid-specific classification

enum DownloadErrorKind {
  // Network
  NetworkReset,        // ECONNRESET, socket hang up, EPIPE
  Timeout,             // No data received within stall timeout
  DnsFailure,          // ENOTFOUND

  // HTTP
  RangeNotSatisfied,   // 416 — file may be complete or need restart
  RangeIgnored,        // Server sent 200 instead of 206
  ServerError,         // 500, 502, 503
  RateLimited,         // 429
  Forbidden,           // 403 — link expired
  NotFound,            // 404 — file removed from CDN

  // Provider/Debrid
  UnrestrictFailed,    // Provider can't convert link
  ProviderBusy,        // Concurrent download limit
  ProviderDown,        // Provider service unavailable
  HosterUnavailable,   // Hoster down (not provider issue)
  LinkDead,            // Permanent: file deleted at source
  QuotaExceeded,       // Daily traffic limit

  // Filesystem
  DiskFull,            // ENOSPC
  PermissionDenied,    // EACCES, EPERM
  FileLocked,          // EBUSY (Windows)

  // Integrity
  FileCorrupt,         // CRC/size mismatch after download
  FileTruncated,       // Downloaded less than expected

  // Extraction
  WrongPassword,       // Archive password incorrect
  ArchiveCorrupt,      // Archive header/data damaged
  ExtractorCrash,      // 7-Zip/WinRAR process crashed
  ExtractionLoop,      // Same archive failed extraction 3+ times
}
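
A rough sketch of what classifyHttpStatus might look like. The `sentRange` parameter, the `retryAfterMs` field, the lowercase header keys, and the fallback of unmapped statuses to ServerError are assumptions; the kind names mirror the enum members above but are written as a string union so the snippet is self-contained.

```typescript
type HttpErrorKind =
  | "RangeNotSatisfied"
  | "RangeIgnored"
  | "ServerError"
  | "RateLimited"
  | "Forbidden"
  | "NotFound";

class DownloadError extends Error {
  readonly kind: HttpErrorKind;
  readonly retryAfterMs: number | undefined; // hypothetical extra field

  constructor(kind: HttpErrorKind, message: string, retryAfterMs?: number) {
    super(message);
    this.name = "DownloadError";
    this.kind = kind;
    this.retryAfterMs = retryAfterMs;
  }
}

// Returns null when the response is acceptable, otherwise a typed error.
// Header keys are assumed to be lowercased already.
function classifyHttpStatus(
  statusCode: number,
  headers: Record<string, string>,
  sentRange: boolean,
): DownloadError | null {
  switch (statusCode) {
    case 200:
      // A 200 is only a problem if we asked for a byte range.
      return sentRange
        ? new DownloadError("RangeIgnored", "server answered 200 to a Range request")
        : null;
    case 206:
      return null;
    case 403:
      return new DownloadError("Forbidden", "HTTP 403, link probably expired");
    case 404:
      return new DownloadError("NotFound", "HTTP 404, file removed");
    case 416:
      return new DownloadError("RangeNotSatisfied", "HTTP 416");
    case 429: {
      // Only the delay-in-seconds form of Retry-After is handled here.
      const seconds = Number(headers["retry-after"]);
      const retryAfterMs = Number.isFinite(seconds) ? seconds * 1000 : undefined;
      return new DownloadError("RateLimited", "HTTP 429", retryAfterMs);
    }
    default:
      // Covers 500/502/503; other unmapped statuses also land here (assumption).
      return new DownloadError("ServerError", `HTTP ${statusCode}`);
  }
}
```

Classifying where the status code is first seen means later layers can switch on `.kind` and never need to inspect message strings.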

6. retry-manager.ts

RetryManager class holds all retry state per item
Declarative retry policies per DownloadErrorKind:

interface RetryPolicy {
  maxRetries: number;          // 0 = no retry (permanent failure)
  backoff: "fixed" | "exponential" | "linear";
  baseDelayMs: number;
  maxDelayMs: number;
  resetFile: boolean;          // Delete partial file before retry
  switchProvider: boolean;     // Try different provider
  refreshLink: boolean;       // Get new direct link from debrid
  providerCooldownMs?: number; // Apply cooldown to current provider
}

shouldRetry(itemId, error): returns { retry: boolean, delayMs, actions[] }
recordFailure(itemId, error): tracks failure for shelving
Shelving: after N total failures (configurable, default 15), pause 90s + reset provider
State persists across stop/start (same format as current retryStateByItem)
resetItem(itemId): clear all retry state (manual reset)
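
The backoff and shelving rules can be sketched as below. This is a simplified illustration under stated assumptions: RetryTracker is a hypothetical name, it counts total failures per item rather than per error kind, the policy is passed in directly, and the "resetProvider" action label is invented for the example.

```typescript
// Subset of the RetryPolicy fields that affect delay computation.
interface RetryPolicy {
  maxRetries: number;
  backoff: "fixed" | "exponential" | "linear";
  baseDelayMs: number;
  maxDelayMs: number;
}

interface RetryDecision {
  retry: boolean;
  delayMs: number;
  actions: string[];
}

// attempt is 1-based: the first retry uses attempt = 1.
function computeDelayMs(policy: RetryPolicy, attempt: number): number {
  const { backoff, baseDelayMs, maxDelayMs } = policy;
  const raw =
    backoff === "fixed" ? baseDelayMs
    : backoff === "linear" ? baseDelayMs * attempt
    : baseDelayMs * 2 ** (attempt - 1);
  return Math.min(raw, maxDelayMs);
}

class RetryTracker {
  private failures = new Map<string, number>();
  private readonly shelveAfter: number;
  private readonly shelveMs: number;

  constructor(shelveAfter = 15, shelveMs = 90_000) {
    this.shelveAfter = shelveAfter;
    this.shelveMs = shelveMs;
  }

  recordFailure(itemId: string): number {
    const count = (this.failures.get(itemId) ?? 0) + 1;
    this.failures.set(itemId, count);
    return count;
  }

  shouldRetry(itemId: string, policy: RetryPolicy): RetryDecision {
    const count = this.failures.get(itemId) ?? 0;
    if (policy.maxRetries === 0 || count > policy.maxRetries) {
      return { retry: false, delayMs: 0, actions: [] };
    }
    // Shelve: every shelveAfter-th failure triggers a long pause and a provider reset.
    if (count > 0 && count % this.shelveAfter === 0) {
      return { retry: true, delayMs: this.shelveMs, actions: ["resetProvider"] };
    }
    return { retry: true, delayMs: computeDelayMs(policy, Math.max(count, 1)), actions: [] };
  }

  resetItem(itemId: string): void {
    this.failures.delete(itemId);
  }
}
```

With a base delay of 1000 ms and exponential backoff, the third retry waits 4000 ms, and the cap in maxDelayMs keeps later retries from growing without bound.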

7. post-processor.ts

PostProcessor class with extraction queue

State machine per package:

pending → extracting → done
              ↓
          retry (max 2) → failed

Tracks extraction attempts per archive (max 3 attempts: the initial try plus up to 2 retries)
No infinite loops: hard cap on retry count
Hybrid extract retry: if archive corrupt + redownload suggested, queue redownload (max 1 time)
Cleanup: remove partial extracts on failure
Empty folder cleanup after successful extraction

Data Flow

User clicks Start
    ↓
DownloadManager.startRun()
    ↓
Scheduler.start() — begins loop
    ↓
Scheduler.findNextItem() — picks highest priority queued item
    ↓
Pipeline.runDownload(item)
    ├── debridService.unrestrict(item.link)
    │   └── error? → ErrorClassifier.classify() → DownloadError
    ├── StreamWriter.streamToFile(url, path, opts)
    │   ├── Resume validation
    │   ├── HTTP streaming with stall detection
    │   └── error? → ErrorClassifier.classify() → DownloadError
    └── integrityCheck(file)
        └── error? → DownloadError(FileCorrupt)
    ↓
Success → mark completed → Scheduler fills next slot
Error → RetryManager.shouldRetry(item, error)
    ├── retry: true → Scheduler.queueRetry(item, delay, actions)
    └── retry: false → mark failed
    ↓
All items done → PostProcessor.run(package)
    ├── Extract archives
    ├── Verify extracted files
    └── Cleanup

Resume Validation (Key Improvement)

Current problem: Resume trusts file size blindly, leading to corrupt files.

New approach:

Before sending Range header, validate existing file:
- stat.size must match item.downloadedBytes (±1KB tolerance for flush timing)
- If the file is more than 1MB larger than downloadedBytes: likely sparse pre-allocation → truncate to downloadedBytes
- If mismatch < 1MB but > 1KB: suspicious → delete and restart fresh
After resume response, validate:
- 206 with correct Content-Range → continue
- 200 (range ignored) → classify as RangeIgnored, retry with fresh link
- 416 → check if file actually complete (existingBytes >= expectedTotal)
After download complete, validate:
- Final file size matches expected total
- CRC check if manifest available
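
The pre-resume and post-response checks can be expressed as two pure functions, sketched below. The function names, the decision labels, binary KB/MB units, resuming from the smaller of the two sizes, and restarting when the file is much smaller than the tracker are all assumptions layered on the rules above.

```typescript
const KB = 1024;
const MB = 1024 * KB;

type ResumeDecision =
  | { action: "resume"; offset: number }
  | { action: "truncate"; offset: number }
  | { action: "restart" };

// Decides what to do with an existing partial file before any Range header is sent.
function decideResume(statSize: number, downloadedBytes: number): ResumeDecision {
  const diff = Math.abs(statSize - downloadedBytes);
  if (diff <= KB) {
    // Within flush-timing tolerance: resume from the smaller value to be safe.
    return { action: "resume", offset: Math.min(statSize, downloadedBytes) };
  }
  if (statSize > downloadedBytes && diff > MB) {
    // File is much larger than tracked progress: treat as sparse pre-allocation.
    return { action: "truncate", offset: downloadedBytes };
  }
  return { action: "restart" };
}

// Extracts the start offset from a header such as "bytes 5000-9999/10000".
function parseContentRangeStart(header: string | undefined): number | null {
  const m = /^bytes (\d+)-\d+\/(?:\d+|\*)$/.exec(header ?? "");
  return m ? Number(m[1]) : null;
}

type ResponseDecision = "continue" | "range-ignored" | "complete" | "restart";

// Validates the server's answer to a Range request that started at `offset`.
function checkResumeResponse(
  httpStatus: number,
  contentRange: string | undefined,
  offset: number,
  expectedTotal: number,
): ResponseDecision {
  if (httpStatus === 206) {
    return parseContentRangeStart(contentRange) === offset ? "continue" : "restart";
  }
  if (httpStatus === 200) return "range-ignored";
  if (httpStatus === 416) return offset >= expectedTotal ? "complete" : "restart";
  return "restart"; // other statuses would go through the error classifier instead
}
```

Keeping these checks free of I/O makes the threshold logic easy to cover with unit tests that pass in sizes directly.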

Stall Detection (Key Improvement)

Current problem: Downloads hang and stall detection sometimes doesn't trigger properly.

New approach:

Per-download heartbeat: StreamWriter emits heartbeat every second with bytes received
Scheduler monitors heartbeats: if no heartbeat for stallTimeoutMs → abort + retry
Disk-write awareness: separate tracking for "blocked on disk write" vs "blocked on network"
Global watchdog: if ALL active downloads show zero progress for 60s (excluding disk-blocked), abort all and re-queue
Validating timeout: if unrestrict takes > 30s, abort and retry (prevents infinite hang in validation phase)
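
A simplified sketch of the heartbeat bookkeeping follows. StallMonitor and its method names are hypothetical, and one deliberate simplification is assumed: a heartbeat that carries no new bytes is treated the same as a missing heartbeat, so only the time of last progress is stored.

```typescript
interface DownloadPulse {
  totalBytes: number;
  lastProgressAt: number; // ms timestamp of the last heartbeat that carried new bytes
  diskBlocked: boolean;
}

class StallMonitor {
  private pulses = new Map<string, DownloadPulse>();
  private readonly stallTimeoutMs: number;
  private readonly watchdogMs: number;

  constructor(stallTimeoutMs = 10_000, watchdogMs = 60_000) {
    this.stallTimeoutMs = stallTimeoutMs;
    this.watchdogMs = watchdogMs;
  }

  // Called for every heartbeat a download emits.
  beat(id: string, totalBytes: number, diskBlocked: boolean, now: number): void {
    const prev = this.pulses.get(id);
    const lastProgressAt =
      prev !== undefined && totalBytes <= prev.totalBytes ? prev.lastProgressAt : now;
    this.pulses.set(id, { totalBytes, lastProgressAt, diskBlocked });
  }

  finish(id: string): void {
    this.pulses.delete(id);
  }

  // Downloads that made no network progress for stallTimeoutMs and are not waiting on disk.
  stalledIds(now: number): string[] {
    const ids: string[] = [];
    this.pulses.forEach((p, id) => {
      if (!p.diskBlocked && now - p.lastProgressAt >= this.stallTimeoutMs) ids.push(id);
    });
    return ids;
  }

  // True if every download that is not disk-blocked has been frozen for watchdogMs.
  watchdogTripped(now: number): boolean {
    let considered = 0;
    let frozen = 0;
    this.pulses.forEach((p) => {
      if (p.diskBlocked) return;
      considered += 1;
      if (now - p.lastProgressAt >= this.watchdogMs) frozen += 1;
    });
    return considered > 0 && frozen === considered;
  }
}
```

Excluding disk-blocked downloads from both checks keeps a slow disk from being misread as a network stall.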

Post-Processing State Machine (Key Improvement)

Current problem: Extraction can loop infinitely if archive keeps failing.

New approach:

ExtractionState per archive:
{
  archivePath: string;
  status: "pending" | "extracting" | "done" | "failed";
  attempts: number;        // max 3
  lastError?: string;
  redownloaded: boolean;   // max 1 redownload
}

Rules:

Max 3 extraction attempts per archive
If ArchiveCorrupt + redownloaded === false → queue redownload, set redownloaded = true
If ArchiveCorrupt + redownloaded === true → fail permanently
If WrongPassword → try next password from list, fail after all exhausted
If ExtractorCrash → retry once, fail on second crash
Package marked as "completed with errors" if any archive fails permanently
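
The rules above might translate into a transition function along these lines. Several details are assumptions rather than documented behavior: the `crashes` counter is a hypothetical extra field, password attempts are not counted toward the 3-attempt cap, and a redownload does not reset `attempts`.

```typescript
type ExtractionErrorKind = "WrongPassword" | "ArchiveCorrupt" | "ExtractorCrash";
type NextStep = "retry" | "redownload" | "next-password" | "fail";

interface ExtractionState {
  archivePath: string;
  status: "pending" | "extracting" | "done" | "failed";
  attempts: number;      // max 3
  lastError?: string;
  redownloaded: boolean; // max 1 redownload
  crashes: number;       // hypothetical: counts ExtractorCrash occurrences
}

const MAX_ATTEMPTS = 3;

function markFailed(state: ExtractionState): NextStep {
  state.status = "failed";
  return "fail";
}

// Applies one failed extraction to the state and returns what to do next.
function onExtractionFailure(
  state: ExtractionState,
  kind: ExtractionErrorKind,
  passwordsLeft: number,
): NextStep {
  state.lastError = kind;

  if (kind === "WrongPassword") {
    // Password tries are bounded by the list length, not by MAX_ATTEMPTS (assumption).
    return passwordsLeft > 0 ? "next-password" : markFailed(state);
  }

  state.attempts += 1;
  if (state.attempts >= MAX_ATTEMPTS) return markFailed(state); // hard cap

  if (kind === "ArchiveCorrupt") {
    if (state.redownloaded) return markFailed(state);
    state.redownloaded = true;
    state.status = "pending";
    return "redownload";
  }

  // ExtractorCrash: retry once, fail on the second crash.
  state.crashes += 1;
  if (state.crashes >= 2) return markFailed(state);
  state.status = "pending";
  return "retry";
}
```

Every branch either ends in "fail" or consumes a bounded counter, so repeated failures of one archive cannot loop forever.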

Migration Strategy

New code lives in src/main/download/ directory
Old src/main/download-manager.ts stays untouched as reference
New download-manager.ts in src/main/download/ implements same class interface
Switch import in main.ts from old to new
Test with real downloads
Delete old file when stable

Testing Strategy

Unit tests for ErrorClassifier (classify every known error string)
Unit tests for RetryManager (policy application, shelving threshold)
Unit tests for StreamWriter resume validation logic
Unit tests for PostProcessor state machine
Integration test: Scheduler + Pipeline with mocked debrid/HTTP
