Education Technology · 2025

A 60-minute lecture, turned into study materials in under four minutes.

intelliQ

~4 min: a 60-min lecture, from upload to notes + flashcards + quiz
5 products: web app · mobile · admin · microservices · AI toolchain

University students spend 3–5 hours per lecture on work that adds no understanding: transcribing recordings by hand, extracting key concepts, building flashcard decks, writing practice questions. intelliQ's founders wanted to automate the entire pipeline — but doing it right meant solving hard infrastructure problems, not just calling a transcription API.

Cloud speech-to-text services are slow, expensive at scale, and send student audio to third-party servers. The platform needed a transcription engine fast enough to feel instant, accurate enough to handle accented lecture speech, and private enough that student audio never left intelliQ's own infrastructure. That ruled out every major cloud ASR vendor.

Beyond transcription, the system had to be intelligent — not just produce a raw transcript, but structure it into hierarchical notes, generate flashcards, produce graded quizzes, and power a tutor that could answer follow-up questions grounded in the actual lecture content. And it all had to work reliably under concurrent load across a web app, a mobile app, and an admin control plane.

The AI pipeline
intelliQ AI processing pipeline: audio/video/PDF inputs → Parakeet GPU transcription → Claude structuring → notes, flashcards, quiz

We deployed NVIDIA Parakeet — one of the fastest open ASR models available — on a dedicated GPU EC2 instance, wrapping it in a FastAPI service that accepts any audio or video file and returns a clean transcript via a single HTTP call. Parakeet processes a 60-minute lecture in approximately 2.5 minutes, running entirely within intelliQ's own AWS infrastructure. Student audio never touches a third-party transcription API.
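As a rough sanity check on the figures above, the speed-up over real time can be worked out directly. The sketch below is illustrative only: the function names are hypothetical, and the estimate for other lecture lengths assumes transcription time scales roughly linearly with audio length, which has not been measured here.

```python
def real_time_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Minutes of audio transcribed per minute of compute."""
    if processing_minutes <= 0:
        raise ValueError("processing_minutes must be positive")
    return audio_minutes / processing_minutes


def estimate_processing_minutes(audio_minutes: float, rtf: float) -> float:
    """Estimate transcription time, ASSUMING linear scaling with audio length."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return audio_minutes / rtf


# Figures quoted above: a 60-minute lecture in approximately 2.5 minutes.
rtf = real_time_factor(60, 2.5)
print(f"real-time factor: about {rtf:.0f}x")

# Hypothetical 90-minute lecture under the linear-scaling assumption.
print(f"90-minute lecture: about {estimate_processing_minutes(90, rtf):.2f} min")
```

On the quoted figures this works out to roughly 24 times faster than real time.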

Once a transcript exists, a Claude Sonnet pipeline takes over. A dedicated notes engine uses Claude's tool-use API to structure the raw transcript into hierarchical sections, extract definitions and key concepts, seed the spaced repetition flashcard deck (SM-2 algorithm), and generate a graded quiz. A separate recall-grading prompt evaluates student answers with a correct / partial / wrong verdict. Embeddings are computed locally via Ollama (mxbai-embed-large) and indexed in Pinecone for the AI tutor's evidence retrieval — so every tutor response cites the student's own lecture, not general knowledge.
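To make the flashcard scheduling concrete, here is a minimal sketch of one common formulation of SM-2. It is not intelliQ's actual code. The `CardState` type, the `sm2_review` function, and in particular the mapping from the correct / partial / wrong verdicts onto SM-2's 0 to 5 quality scale are assumptions made for illustration.

```python
from dataclasses import dataclass

# ASSUMPTION: how the three recall verdicts map onto SM-2's 0-5 quality scale.
VERDICT_TO_QUALITY = {"correct": 5, "partial": 3, "wrong": 1}


@dataclass(frozen=True)
class CardState:
    repetitions: int = 0      # consecutive successful reviews
    interval_days: int = 0    # days until the next review
    ease_factor: float = 2.5  # SM-2 starting ease factor


def sm2_review(state: CardState, verdict: str) -> CardState:
    """Apply one graded review and return the card's new scheduling state."""
    quality = VERDICT_TO_QUALITY[verdict]

    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease factor.
        return CardState(0, 1, state.ease_factor)

    # SM-2 ease-factor update, floored at 1.3.
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease = max(1.3, state.ease_factor + delta)

    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * ease)

    return CardState(state.repetitions + 1, interval, ease)


card = CardState()
for verdict in ("correct", "correct", "correct"):
    card = sm2_review(card, verdict)
    print(verdict, card.interval_days, round(card.ease_factor, 2))
```

Under these assumptions, three consecutive correct answers space the card out to 1, 6 and then 17 days, while a wrong answer resets it to a one-day interval without changing its ease factor.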

The processing pipeline runs on BullMQ across six Redis-backed queues: audio merge, transcription, post-processing, notes indexing, preprocessing test, and a dead-letter queue for failed jobs. A worker process on EC2 picks up transcription jobs, calls the Parakeet service, and posts results back to the web API, authenticating with a shared secret. Students see live progress via Server-Sent Events (uploading → processing → transcribing → structuring → saving → complete), with partial transcripts appearing in the UI as chunks finish. We measured the end-to-end pipeline from production Redis logs: a real university lecture completed in 3 minutes 58 seconds from job start to completion.
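The live progress feed can be pictured as a stream of Server-Sent Events frames, one per stage. The sketch below shows only the wire format. The `progress` event name, the JSON field names, and the treatment of the stages as a strict sequence are assumptions, not details of the production system.

```python
import json
from typing import Optional

# Stage names as listed above; treating them as a strict order is an assumption.
STAGES = ["uploading", "processing", "transcribing", "structuring", "saving", "complete"]


def format_sse(event: str, data: dict) -> str:
    """Serialise one SSE frame: 'event' and 'data' fields, ended by a blank line."""
    # Compact JSON stays on one line, so a single 'data:' field is enough.
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


def progress_frame(job_id: str, stage: str,
                   partial_transcript: Optional[str] = None) -> str:
    """Build a hypothetical 'progress' frame for one pipeline stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    data = {
        "jobId": job_id,
        "stage": stage,
        "step": STAGES.index(stage) + 1,
        "of": len(STAGES),
    }
    if partial_transcript is not None:
        data["partialTranscript"] = partial_transcript
    return format_sse("progress", data)


print(progress_frame("job-123", "transcribing", "Today we cover entropy"), end="")
```

A browser client would typically consume frames like these with `EventSource` and update the progress indicator as each one arrives.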

Figure: BullMQ queue architecture · 6 queues · Redis-backed: audio-merge, audio-transcribe, transcript-postprocess, notes-index, preprocess-test, dead-letter; EC2 worker with SSE live progress feed
Figure: The output: smart structured notes, AI-generated from a lecture recording
~4 min · End-to-end pipeline: upload to notes, flashcards & quiz (measured in production)
2.5 min · GPU transcription: NVIDIA Parakeet, 60-minute lecture
5 products · Surfaces shipped: web · mobile · admin · image search · AI toolchain
44k · Lines of code: across the full intelliQ system
NVIDIA Parakeet (GPU ASR) · Anthropic API · BullMQ + Redis · Ollama / mxbai-embed-large · Pinecone · PostgreSQL (Neon) · AWS S3 + EC2 · Next.js 15 · React Native (Expo) · Prisma
