🎙 LEHJA - Urdu voice-model training (live)

Updated: 2026-06-12 21:50:01 UTC · cron every 5 min · page auto-refreshes every 60 s

OVERALL PIPELINE: 55%
GPU: 0% busy · jobs: 1 · VRAM: 0.0 GB used / 93.1 GB free / 93.6 GB total

📋 Tasks (main + sub)

1. H100 Setup ✅ 100%
   ✅ SSH access + box verification - complete (took 2 min)
   ✅ Python deps + CosyVoice repo install - complete (took 6 min)
   ✅ CosyVoice2-0.5B model download (5.3 GB) - complete (took 5 min)
   ✅ Whisper CUDA-12 libs fix (a silent failure was caught) - complete (took 14 min)
2. Data Acquisition ✅ 100%
   ✅ Catalogued Urdu files VM101→H100 (2,667 files) - complete (took 3 min)
   ✅ Full pool transfer (93,166 recordings, 17 GB) - complete (took 9 min)
   ✅ Catalog language-map (45,185 entries) - complete (took 1 min)
   ✅ Public FLEURS Urdu+English, 3,637 clips - complete (took 26 min)
   ⊘ CommonVoice Urdu (HF-gated, not needed) - skipped
3. In-domain Urdu Mining (real voices) ✅ 100%
   ✅ Known-Urdu transcription - 2,667 files, 6 GPU workers - complete (took 3 min) · 2,668/2,668 files · 17,605 clips · 1,340.2 min of clean Urdu
   ✅ Language-scan - 53,054 unknown files, 16 GPU workers - complete (took 18 min) · 53,054/53,054 scanned · new Urdu candidates: 3,219
   ✅ Round-2: transcription of the newly found files - complete (took 14 min) · 3,219/3,219
4. Training-Data Prep ⬜ 0%
   ⬜ Manifests merge + dedup + train/dev split - pending
   ⬜ wav.scp / text / utt2spk / spk2utt - pending
   ⬜ Speaker embeddings (campplus) - pending
   ⬜ Speech tokens extraction - pending
   ⬜ Parquet packaging - pending
5. CosyVoice2 URDU TRAINING ⬜ 0%
   ⬜ LLM module fine-tune (Urdu phonetics) - pending · ~10 epochs planned
   ⬜ Checkpoint averaging (best-5) - pending
   ⬜ Inference smoke-test (does it speak Urdu?) - pending
6. Validation ⬜ 0%
   ⬜ Urdu test-set synthesis (50 sentences) - pending
   ⬜ Whisper-readback intelligibility gate - pending
   ⬜ Zero-shot voice-clone spot test (your voice) - pending
7. Delivery ⬜ 0%
   ⬜ Download all artifacts to the app server (model+data+logs) - pending
   ⬜ COMPLETION EMAIL → info@ifcondition.com → PLEASE SHUT DOWN THE H100 - pending
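
For reference, the four index files named in task 4 (wav.scp / text / utt2spk / spk2utt) follow the Kaldi-style layout of one space-separated record per line. The sketch below is only an illustration of that layout, not the script this pipeline runs: the function name `write_kaldi_files`, the utterance IDs, the paths, and the speaker labels are all hypothetical.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def write_kaldi_files(utts, out_dir):
    """Write wav.scp, text, utt2spk and spk2utt into out_dir.

    utts: iterable of (utt_id, wav_path, speaker_id, transcript) tuples.
    Returns the speaker -> [utt_id, ...] mapping that was written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    rows = sorted(utts, key=lambda r: r[0])  # keep files sorted by utt_id
    spk2utt = defaultdict(list)

    with open(out / "wav.scp", "w", encoding="utf-8") as wav_f, \
         open(out / "text", "w", encoding="utf-8") as text_f, \
         open(out / "utt2spk", "w", encoding="utf-8") as u2s_f:
        for utt_id, wav_path, spk, transcript in rows:
            wav_f.write(f"{utt_id} {wav_path}\n")
            # Collapse runs of whitespace so each record stays on one line.
            text_f.write(f"{utt_id} {' '.join(transcript.split())}\n")
            u2s_f.write(f"{utt_id} {spk}\n")
            spk2utt[spk].append(utt_id)

    with open(out / "spk2utt", "w", encoding="utf-8") as s2u_f:
        for spk in sorted(spk2utt):
            s2u_f.write(f"{spk} {' '.join(spk2utt[spk])}\n")

    return dict(spk2utt)


if __name__ == "__main__":
    # Hypothetical two-utterance example written to a temporary directory.
    demo = [
        ("utt0002", "/data/clips/utt0002.wav", "spk01", "yeh  doosra jumla hai"),
        ("utt0001", "/data/clips/utt0001.wav", "spk01", "yeh pehla jumla hai"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        write_kaldi_files(demo, tmp)
        print((Path(tmp) / "spk2utt").read_text(encoding="utf-8"), end="")
```

Sorting by utterance ID is a convention many Kaldi-style tools expect; whether the downstream embedding and token-extraction steps require it here is an assumption.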

📊 Harvest

Clean Urdu (training-ready): 1,340.2 min in-domain (17,605 clips) + 3,637 public clips
New Urdu-candidate files from the scan: 3,219 / 53,054 scanned · worker failures: 0
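
As a quick sanity check on the harvest figures, the snippet below derives a few ratios from the numbers reported on this page. It is only arithmetic on those numbers; the variable names are illustrative, and the public FLEURS clips are left out because no duration is reported for them.

```python
# Figures copied from the Harvest section of this page.
clean_minutes = 1340.2      # in-domain clean Urdu audio
clips = 17_605              # in-domain clips
scanned = 53_054            # unknown-language files scanned
candidates = 3_219          # new Urdu-candidate files found by the scan

clean_hours = clean_minutes / 60
avg_clip_seconds = clean_minutes * 60 / clips
candidate_rate = candidates / scanned

print(f"in-domain audio: {clean_hours:.1f} h")        # in-domain audio: 22.3 h
print(f"average clip:    {avg_clip_seconds:.2f} s")   # average clip:    4.57 s
print(f"scan hit rate:   {candidate_rate:.1%}")       # scan hit rate:   6.1%
```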

Work in progress. A completion email will be sent to: info@ifcondition.com