Transcription viewer
The transcription viewer is the auditor's workspace. It combines audio playback, speaker-segmented transcription, emotion analysis, in-text search, and a qualitative speaker profile for each interlocutor.

Opening the viewer
- From the Dashboard, click the quick-access card View your history.
- In the task list, filter by Completed and click a row or ID.
- The viewer opens on the Transcription tab.
Direct URL
If you know the task_uuid, go directly to /transcription/{task_uuid}.
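As a minimal sketch, the direct path can be built from a task identifier like this. The function name `viewer_path` is hypothetical, and the code assumes that `task_uuid` is a standard UUID string, which the source does not state explicitly.

```python
import uuid


def viewer_path(task_uuid: str) -> str:
    """Return the viewer path for a task (hypothetical helper).

    Assumption: task_uuid is a standard UUID. uuid.UUID raises ValueError
    on malformed input, and str() normalizes it to lowercase with hyphens.
    """
    normalized = str(uuid.UUID(task_uuid))
    return f"/transcription/{normalized}"
```

Validating before building the path avoids sending users to a link that can only fail.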
Anatomy
| Zone | Content |
|---|---|
| Header | Breadcrumb Dashboard > Transcription, Advanced mode toggle, avatar |
| Left column | Segments of the operator (agent) |
| Right column | Segments of the customer |
| Player | Play/pause, speed, scrubber, download |
| Side panel | Toggles for Word search and Speaker profile evaluation |
Two columns
Diarization splits the conversation into two columns so it reads like a chat. Each message includes a timestamp and an emotion icon.
Segments and diarization
Each segment shows:
| Element | Description |
|---|---|
| Speaker | Speaker 00 (operator) and Speaker 01 (customer) by convention — renameable |
| Timestamp | Segment start and end (e.g. 0m 34.62s - 0m 39.28s) |
| Transcribed text | Output from Whisper/Deepgram/OpenAI |
| Emotion indicator | Visual label reflecting the dominant sentiment |
| Copy | Copy control to copy text to clipboard |
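If you need to work with the displayed timestamps outside the viewer, the sketch below converts the label format shown in the table (for example 0m 34.62s - 0m 39.28s) to seconds and back. The helper names are hypothetical, and the two-decimal, minutes-plus-seconds layout is inferred only from that single example.

```python
import re

# Pattern inferred from the example label "0m 34.62s - 0m 39.28s".
_RANGE = re.compile(
    r"(\d+)m\s+(\d+(?:\.\d+)?)s\s*-\s*(\d+)m\s+(\d+(?:\.\d+)?)s"
)


def parse_range(label: str) -> tuple[float, float]:
    """Convert a segment label to (start_seconds, end_seconds)."""
    match = _RANGE.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp label: {label!r}")
    start = int(match.group(1)) * 60 + float(match.group(2))
    end = int(match.group(3)) * 60 + float(match.group(4))
    return start, end


def format_range(start: float, end: float) -> str:
    """Render (start, end) in seconds using the same label layout."""
    def one(t: float) -> str:
        # Round first so 59.999 s rolls over to the next minute.
        minutes, seconds = divmod(round(t, 2), 60)
        return f"{int(minutes)}m {seconds:.2f}s"

    return f"{one(start)} - {one(end)}"
```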
Rename speakers
Click Speaker 00 or Speaker 01 to assign descriptive names — "Agent Maria", "Customer", "Supervisor". Changes persist and apply to exports.
Clickable timestamps
Any timestamp plays the audio from that moment. The currently playing segment is highlighted with a glowing border so you can follow along.
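The highlighting behavior can be pictured as a lookup from the playback position to a segment. The following is only a sketch of that idea, not the viewer's actual code: `active_segment` is a hypothetical name, and it assumes segments are sorted by start time and do not overlap.

```python
from bisect import bisect_right
from typing import Optional, Sequence, Tuple


def active_segment(
    segments: Sequence[Tuple[float, float]], position: float
) -> Optional[int]:
    """Return the index of the segment that contains `position`.

    Each segment is (start_seconds, end_seconds). Returns None when the
    position falls in a gap between segments or before the first one.
    """
    starts = [start for start, _ in segments]
    # Last segment whose start is <= position.
    i = bisect_right(starts, position) - 1
    if i >= 0 and position < segments[i][1]:
        return i
    return None
```

Binary search keeps the lookup cheap even when it runs on every playback tick of a long recording.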
Audio player
Persistent at the bottom of the viewer:
| Control | Action |
|---|---|
| Play / Pause | Start or stop playback |
| -10s / +10s | Quick jumps |
| Scrubber | Visual navigation |
| Volume | 0–100 slider |
| Speed | 0.5× / 0.75× / 1× / 1.25× / 1.5× / 2× |
| Download | Downloads the source file (permission required) |
The source file is served from MinIO via a short-lived signed URL (valid for at most 5 minutes), which limits exposure if a link is leaked.
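To illustrate why an expiring signed link limits exposure, here is a generic HMAC-based sketch. It is not MinIO's presigning scheme (MinIO issues S3-style presigned URLs); the function names, the query parameter names, and the secret handling are all assumptions made for the example.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

MAX_TTL_SECONDS = 300  # the 5-minute ceiling described above


def sign_url(path: str, secret: bytes, ttl: int = MAX_TTL_SECONDS, now=None) -> str:
    """Append an expiry and an HMAC signature to a path (illustrative only)."""
    now = int(time.time()) if now is None else int(now)
    expires = now + min(ttl, MAX_TTL_SECONDS)  # never exceed the ceiling
    payload = f"{path}?expires={expires}".encode()
    sig = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"


def verify_url(url: str, secret: bytes, now=None) -> bool:
    """Accept the link only if the signature matches and it has not expired."""
    now = int(time.time()) if now is None else int(now)
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    payload = f"{parts.path}?expires={expires}".encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected) and now <= expires
```

Because the expiry is part of the signed payload, changing it in the URL invalidates the signature.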
Emotion analysis per segment
Each segment shows a visual indicator for the dominant emotion:
| Emotion | On-screen indicator | Typical reading |
|---|---|---|
| Happy | Positive state | Satisfaction, enthusiasm |
| Neutral | Neutral state | Informational tone |
| Surprise | Surprise state | Novelty, unexpectedness |
| Sad | Sad state | Disappointment, mild frustration |
| Angry | Angry state | Strong frustration |
| Fear | Concern state | Worry |
| Disgust | Rejection state | Rejection, discomfort |
Hover any indicator to see the probability breakdown. See Sentiment analysis.
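The relationship between the probability breakdown and the dominant emotion can be sketched as follows. The assumption here is that the dominant emotion is simply the label with the highest score; the lowercase keys and the `dominant_emotion` helper are hypothetical, and the real scoring model may differ.

```python
# Labels taken from the table above; lowercase keys are an assumption.
EMOTIONS = ("happy", "neutral", "surprise", "sad", "angry", "fear", "disgust")


def dominant_emotion(scores: dict[str, float]) -> tuple[str, dict[str, float]]:
    """Return (dominant_label, normalized_breakdown) for one segment."""
    total = sum(scores.get(name, 0.0) for name in EMOTIONS)
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")
    # Normalize so the breakdown sums to 1, as a probability display would.
    breakdown = {name: scores.get(name, 0.0) / total for name in EMOTIONS}
    label = max(breakdown, key=breakdown.get)
    return label, breakdown
```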
Word search within the audio
The Word search side panel lets you detect specific terms and locate them in the transcription.

Flow
- Type a term and confirm with Enter or ✓.
- Edit any term with the edit icon or remove it with the delete icon.
- Click Search.
- The system highlights matches directly on the segments.

Result legend
| Icon | Meaning |
|---|---|
| Yes | Word found |
| Required not found | Required word not found (e.g. in Compliance) |
| Optional not mentioned | Optional word not mentioned |
Compliance use
For regulated audits, load the list of required phrases (greeting, close, legal disclaimers) and the viewer flags any that are missing.
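The three result states in the legend can be reproduced with a short sketch. This is an illustration under assumptions, not the viewer's matching logic: `check_terms` is a hypothetical helper, and case-insensitive whole-word matching is a choice made for the example.

```python
import re


def check_terms(
    segments: list[str], required: list[str], optional: list[str]
) -> dict[str, str]:
    """Classify each term as found, required not found, or optional not mentioned."""
    # Joining segments means a phrase spanning two segments can still match.
    text = " ".join(segments)

    def found(term: str) -> bool:
        # Whole-word, case-insensitive match (assumed behavior).
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        return re.search(pattern, text, flags=re.IGNORECASE) is not None

    results: dict[str, str] = {}
    for term in required:
        results[term] = "found" if found(term) else "required not found"
    for term in optional:
        results[term] = "found" if found(term) else "optional not mentioned"
    return results
```

For a compliance check, any term reported as "required not found" marks a call that needs review.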
Speaker profile evaluation
Next to word search, the panel includes Evaluate speaker profile. It generates a qualitative summary of each interlocutor from all of their segments.

How it works
- Tabs: Speaker 00 / Speaker 01 (or your renamed labels).
- Click the tab for the interlocutor.
- The GPT model generates a behavior summary — style, tone, professionalism, empathy, clarity.
- Use the copy button to copy the evaluation to the clipboard.
Use cases
- Coaching: objective feedback for the agent.
- Compliance: evidence that the script was followed.
- Hiring: profile candidates in recorded interviews.
- Disputes: summarize each party's stance in seconds.
Task actions
Depending on your permissions:
| Action | Description | Permission |
|---|---|---|
| Approve | Mark the audit as ok | Supervisor, Quality |
| Reject | Mark the audit as failed | Supervisor, Quality |
| Edit transcription | Fix model errors | Quality, Admin |
| Export | Download in TXT/JSON/CSV | All |
| Re-run | Reprocess with another engine | Admin |
| Archive | Move to historical archive | Supervisor |
| Share | Generate signed URL (configurable TTL) | Supervisor, Admin |
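The permission table above maps naturally to a lookup from action to allowed roles. The sketch below mirrors the table as written; the action keys, the `ALL_ROLES` marker, and both helper names are hypothetical and do not describe the product's actual authorization code.

```python
ALL_ROLES = None  # marker for actions the table lists as available to "All"

# Transcribed from the Task actions table.
ACTION_ROLES = {
    "approve": {"Supervisor", "Quality"},
    "reject": {"Supervisor", "Quality"},
    "edit_transcription": {"Quality", "Admin"},
    "export": ALL_ROLES,
    "rerun": {"Admin"},
    "archive": {"Supervisor"},
    "share": {"Supervisor", "Admin"},
}


def can(roles: set[str], action: str) -> bool:
    """True if any of the user's roles permits the action."""
    allowed = ACTION_ROLES[action]  # KeyError for unknown actions
    return allowed is ALL_ROLES or bool(roles & allowed)


def allowed_actions(roles: set[str]) -> list[str]:
    """Actions a user with these roles would see, in table order."""
    return [action for action in ACTION_ROLES if can(roles, action)]
```

Note that, per the table, an Admin without the Supervisor or Quality role cannot approve or reject an audit.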
Audit trail
Every edit is logged (who, what, when). Editing does not alter the source audio.
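One way to picture a who/what/when record is shown below. The field names and the `apply_edit` helper are assumptions for illustration; the source does not describe the actual log schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class EditRecord:
    """One immutable audit-trail entry (hypothetical schema)."""
    who: str
    segment_index: int
    before: str
    after: str
    when: datetime


def apply_edit(
    texts: List[str],
    log: List[EditRecord],
    who: str,
    index: int,
    new_text: str,
    when: Optional[datetime] = None,
) -> None:
    """Record the change first, then update the transcription text.

    Only the text list is modified; the source audio is never touched.
    """
    when = when or datetime.now(timezone.utc)
    log.append(EditRecord(who, index, texts[index], new_text, when))
    texts[index] = new_text
```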
Export
| Format | Content |
|---|---|
| TXT | Plain transcription only |
| JSON | Transcription + full metadata (timestamps, emotions, tags, score) |
| CSV | One row per segment — great for Excel |
| | Formatted report with cover, transcription, analysis |
| SRT/VTT | Subtitles with timestamps |
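To show what the one-row-per-segment CSV and the SRT subtitle layouts look like, here is a sketch that produces both from a list of segments. The column names, the tuple layout, and the choice to prefix subtitle text with the speaker are assumptions; the actual export files may be structured differently.

```python
import csv
import io

# Each segment: (speaker, start_seconds, end_seconds, emotion, text)
Segment = tuple[str, float, float, str, str]


def to_csv(segments: list[Segment]) -> str:
    """One row per segment, with a header row (assumed column names)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["speaker", "start_s", "end_s", "emotion", "text"])
    writer.writerows(segments)
    return buf.getvalue()


def srt_time(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Numbered cues separated by blank lines, as SRT expects."""
    blocks = []
    for number, (speaker, start, end, _emotion, text) in enumerate(segments, 1):
        blocks.append(
            f"{number}\n{srt_time(start)} --> {srt_time(end)}\n{speaker}: {text}\n"
        )
    return "\n".join(blocks)
```

The `csv` module quotes fields that contain commas, so transcribed sentences open cleanly in Excel.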
Troubleshooting
| Problem | Diagnosis | Solution |
|---|---|---|
| Segments out of sync with audio | VBR or corrupted header | Re-encode with ffmpeg -i in.mp3 -b:a 192k out.mp3 |
| Diarization mixes speakers | Mono with lots of overlap | Try stereo transcription |
| Wrong proper nouns | Out-of-vocabulary terms | Add terms to tenant dictionary |
| Can't see the Approve button | Missing permission | Request Quality or Supervisor role |
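If you re-encode files regularly, the ffmpeg command from the first troubleshooting row can be wrapped in a small script. This is a sketch that assumes ffmpeg is installed and on the PATH; the function names are hypothetical, and the arguments are exactly those shown in the table.

```python
import shutil
import subprocess


def reencode_command(src: str, dst: str, bitrate: str = "192k") -> list[str]:
    """Build the argument list for: ffmpeg -i SRC -b:a BITRATE DST."""
    return ["ffmpeg", "-i", src, "-b:a", bitrate, dst]


def reencode(src: str, dst: str, bitrate: str = "192k") -> None:
    """Run the re-encode; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # Passing a list (no shell) avoids quoting problems with file names.
    subprocess.run(reencode_command(src, dst, bitrate), check=True)
```

After re-encoding, upload the new file as a fresh task so the segments are aligned against the corrected audio.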