YouTube Video to Text

Video → Transcript (Local + YouTube)

Local files work entirely client-side (audio extraction in the browser, then OpenAI Whisper for transcription). YouTube links require a small backend; a sample Node.js backend is included below. Read the instructions, deploy the backend once, then paste its URL into the “Backend URL” field.


Notes / How it works

  • Local files: the page uses ffmpeg.wasm to extract audio in the browser and then sends the audio to OpenAI’s transcription endpoint. This runs fully client-side: your video data is not sent to any server other than OpenAI for transcription.
  • YouTube links: browsers cannot reliably fetch YouTube audio because of CORS restrictions and YouTube’s Terms of Service. For YouTube, deploy the sample Node.js backend below: it downloads the audio server-side with yt-dlp, sends it to OpenAI, and returns the transcript to this frontend.
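Before handing a link to the backend, the frontend can cheaply check that it actually points at a YouTube video. The helper below is a sketch, not part of the page’s code; the name `extractYouTubeId` is hypothetical, and it only covers the common URL shapes:

```javascript
// Hypothetical helper: extract the 11-character video ID from common
// YouTube URL shapes (watch?v=..., youtu.be/..., /shorts/..., /embed/...).
// Returns null when the input does not look like a YouTube video link.
function extractYouTubeId(url) {
  try {
    const u = new URL(url);
    if (u.hostname === 'youtu.be') {
      return u.pathname.slice(1).split('/')[0] || null;
    }
    if (u.hostname.endsWith('youtube.com')) {
      if (u.searchParams.get('v')) return u.searchParams.get('v');
      const m = u.pathname.match(/^\/(shorts|embed)\/([\w-]{11})/);
      if (m) return m[2];
    }
    return null;
  } catch (e) {
    return null; // not a parseable URL at all
  }
}
```

Rejecting bad input in the browser gives the user an immediate error instead of a round-trip to the backend and a failed yt-dlp run.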

Sample Node.js backend (optional — deploy once)

// Save as server.js
// Requires node-fetch v2 (CommonJS); yt-dlp and ffmpeg must be installed on the server.
const express = require('express');
const fetch = require('node-fetch');
const FormData = require('form-data');
const { execFile } = require('child_process');
const fs = require('fs');
const path = require('path');

const app = express();
app.use(express.json());

// POST /api/transcribe with JSON: { youtubeUrl, openaiKey, language? }
app.post('/api/transcribe', async (req, res) => {
  try {
    const { youtubeUrl, openaiKey, language } = req.body;
    if (!youtubeUrl || !openaiKey) {
      return res.status(400).json({ error: 'youtubeUrl and openaiKey required' });
    }

    // yt-dlp -o takes an output *template*; with -x --audio-format wav
    // the extracted file ends up at out_audio.wav
    const outTemplate = path.join(__dirname, 'out_audio.%(ext)s');
    const outFile = path.join(__dirname, 'out_audio.wav');
    await new Promise((resolve, reject) => {
      execFile('yt-dlp', ['-x', '--audio-format', 'wav', '-o', outTemplate, youtubeUrl],
        (err) => err ? reject(err) : resolve());
    });

    // Send the extracted audio to OpenAI's transcription endpoint;
    // node-fetch v2 sets the multipart Content-Type from the FormData body
    const form = new FormData();
    form.append('file', fs.createReadStream(outFile));
    form.append('model', 'whisper-1');
    if (language) form.append('language', language);

    const r = await fetch('https://api.openai.com/v1/audio/transcriptions', {
      method: 'POST',
      headers: { 'Authorization': 'Bearer ' + openaiKey },
      body: form
    });
    const j = await r.json();

    // Clean up the temporary audio file
    try { fs.unlinkSync(outFile); } catch (e) {}
    return res.json({ transcript: j.text || j });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: String(err) });
  }
});

app.listen(process.env.PORT || 3000, () => console.log('Server ready'));

Instructions for backend: install Node.js, install yt-dlp and ffmpeg on the server, run `npm install express node-fetch@2 form-data` (node-fetch is pinned to v2 because v3 is ESM-only and server.js uses `require`), then run `node server.js`. Deploy to a VPS or a serverless platform that supports running external binaries (a VPS is recommended).
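As a sketch, the setup steps above might look like this on a Debian/Ubuntu VPS (package names and install method are assumptions; adjust for your distro):

```shell
# System dependencies: ffmpeg for audio conversion, Node.js/npm for the server
sudo apt update && sudo apt install -y ffmpeg nodejs npm

# yt-dlp: install via pip, or drop the standalone binary on your PATH
python3 -m pip install -U yt-dlp

# Node dependencies for server.js (node-fetch pinned to the CommonJS v2 line)
npm install express node-fetch@2 form-data

# Start the server (listens on $PORT or 3000)
node server.js
```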

Now: the frontend code below works for local files out of the box. For YouTube, paste your deployed backend URL into the “Backend URL” field and select “YouTube” mode.
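In “YouTube” mode, the frontend simply POSTs the link and API key to the backend’s `/api/transcribe` route and reads back `{ transcript }`, matching the sample backend above. The function names here (`buildTranscribeRequest`, `transcribeYouTube`) are hypothetical; the request builder is split out so the payload shape is easy to inspect:

```javascript
// Hypothetical client sketch for "YouTube" mode. Assumes the sample backend's
// /api/transcribe route and its { transcript } response shape.
function buildTranscribeRequest(backendUrl, youtubeUrl, openaiKey, language) {
  return {
    url: backendUrl.replace(/\/$/, '') + '/api/transcribe', // tolerate trailing slash
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ youtubeUrl, openaiKey, language })
    }
  };
}

async function transcribeYouTube(backendUrl, youtubeUrl, openaiKey, language) {
  const { url, options } = buildTranscribeRequest(backendUrl, youtubeUrl, openaiKey, language);
  const res = await fetch(url, options); // fetch is built into modern browsers
  if (!res.ok) throw new Error('Backend error: ' + res.status);
  const data = await res.json();
  return data.transcript;
}
```

Note that the OpenAI key passes through the backend; only deploy this to a backend you control and serve it over HTTPS.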