Everything you need to know about using QueryTube
Click "Get Started" and enter your email. We'll send you a magic link, so no password is needed.
Paste any YouTube URL. We'll automatically extract the transcript, chunk it, and create searchable embeddings.
Type any question about the video. Get instant answers with exact timestamps.
QueryTube uses semantic search to find relevant sections of your video and answer questions accurately.
We extract the YouTube caption file or use Whisper AI if captions aren't available.
The transcript is split into chunks of 60 to 90 seconds, with overlap between neighboring chunks so context carries across boundaries.
Each chunk is converted into a 1024-dimension vector using Mistral AI embeddings.
When you ask a question, we find the top 20 most relevant chunks using cosine similarity in Pinecone.
Groq's Llama 3.3 70B model synthesizes the answer using only the retrieved chunks, which keeps answers grounded in the transcript and reduces the risk of hallucinations.
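The chunking and retrieval steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not QueryTube's actual code: the `Segment` type, the function names, the 75-second target, and the 10-second overlap are all hypothetical, and the similarity ranking is done in plain Python here rather than in Pinecone.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One caption line or merged chunk (hypothetical type for this sketch)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def chunk_segments(segments, target_seconds=75.0, overlap_seconds=10.0):
    """Group caption segments into chunks of about target_seconds each.

    The segments that start within the last overlap_seconds of a chunk are
    repeated at the start of the next chunk (assumed overlap strategy).
    """
    groups = []
    current = []
    for seg in segments:
        current.append(seg)
        if current[-1].end - current[0].start >= target_seconds:
            groups.append(current)
            cutoff = current[-1].end - overlap_seconds
            current = [s for s in current if s.start >= cutoff]
    # Emit the tail only if it holds segments beyond the carried-over overlap.
    if current and (not groups or current[-1] is not groups[-1][-1]):
        groups.append(current)
    return [
        Segment(g[0].start, g[-1].end, " ".join(s.text for s in g))
        for g in groups
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=20):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]
```

In this sketch each chunk keeps its own start and end time, which is what makes a time-range citation possible later. A vector database performs the same kind of ranking as `top_k`, but at scale and without comparing the query against every stored vector one by one.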
Every answer includes timestamp citations showing exactly where the information comes from.
Example answer:
At [00:04–00:49], Phelps explains his training philosophy...
Timestamps are formatted as time ranges (start–end) to show the full context, not just a single moment.
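For illustration, a time-range citation like the one in the example answer could be produced from a chunk's start and end times as shown here. This is a hedged sketch: the function names are hypothetical, and the `HH:MM:SS` form for videos longer than an hour is an assumption, since the example only shows the `MM:SS` form.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as MM:SS, or HH:MM:SS past one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        # Assumed format for long videos; not shown in the example answer.
        return f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def format_citation(start_seconds, end_seconds):
    """Build a bracketed start–end range for a chunk."""
    return f"[{format_timestamp(start_seconds)}–{format_timestamp(end_seconds)}]"
```

For a chunk that runs from 4 to 49 seconds, `format_citation(4, 49)` gives `[00:04–00:49]`, matching the range in the example answer.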
Free plan: up to 2 hours. Pro plan: up to 10 hours.
Currently supports YouTube videos only. Other platforms coming soon.
English, French, and Hindi fully supported. Other languages in beta.
First-time processing takes 2-5 minutes depending on video length. Subsequent queries are instant.
Answers are grounded in the transcript, so their accuracy depends on transcription quality. Poor audio leads to less accurate answers.
✅ "What does Sam Altman say about AGI timelines?"
❌ "Tell me about AI"
If you remember specific terms or phrases, use them in your question for better results.
QueryTube remembers your conversation. Ask follow-up questions to dig deeper.