Upload audio

Uploading an audio file is the first step of any audit in AuditorIA. This guide walks through the form field by field, when to use each option, and how to interpret the system response.

[Figure: AuditorIA upload form with Campaign ID, Operator ID, Call start, Direction, and drag-and-drop area]

Reaching the form

Three equivalent entry points:

  1. The Upload file item in the sidebar.
  2. The direct route /subir-archivo.
  3. The N key from any screen (if the shortcut is enabled in your tenant).

Transcription engine, language, and device are resolved automatically from the AI settings of the selected campaign (step 7 of the campaign wizard).


Form fields

Campaign ID *

Required selector with the campaigns available to your user. The campaign defines:

  • The applicable audit sheet template (wizard step 4).
  • Alert criteria (step 5).
  • Automatic processes to trigger on completion (step 6).
  • AI routing per functionality (step 7): transcription model, tag generation, speaker analysis, etc.

Tip: Visible campaigns

The list is filtered by permissions. If you don't see the expected campaign, ask your admin to grant the relevant path from Team management > Roles and Permissions.

Operator ID *

Required selector for the operator linked to the audio. If your user has a personal operator_id, it comes pre-selected; otherwise, pick one from the campaign's operator list. This field determines which representative the call is attributed to in rankings, Daily Sample reports, and Cross & Skills.

Call start *

Start date and time of the actual call (not the upload time). Format DD/MM/YYYY HH:MM. If unknown, leave the default (current date/time).
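
When call start values come from another system, it can help to normalize them to the DD/MM/YYYY HH:MM layout before entering them. A minimal Python sketch; the helper names `to_call_start` and `parse_call_start` are illustrative and not part of AuditorIA:

```python
from datetime import datetime

CALL_START_FORMAT = "%d/%m/%Y %H:%M"  # DD/MM/YYYY HH:MM, as the form expects


def to_call_start(moment: datetime) -> str:
    """Render a datetime in the DD/MM/YYYY HH:MM layout used by the form."""
    return moment.strftime(CALL_START_FORMAT)


def parse_call_start(text: str) -> datetime:
    """Parse a form value; raises ValueError if the layout does not match."""
    return datetime.strptime(text.strip(), CALL_START_FORMAT)


print(to_call_start(datetime(2026, 4, 16, 14, 23)))  # 16/04/2026 14:23
```

Parsing with an explicit format avoids the day/month swap that happens when a value such as 03/04/2026 is read with a month-first convention.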

Info: Why it matters

Dashboard date ranges and reports use this timestamp as their axis. A mis-entered date drops the call out of reporting windows.

Direction

| Value | When to use |
| --- | --- |
| Inbound | The customer initiates the call into the contact center |
| Outbound | The agent dials the customer (sales, collections, reminders) |

Direction influences diarization (which speaker is typically agent vs customer) and some sheet criteria.

Advanced options

Collapsible toggle with parameters that are only needed when overriding the campaign defaults:

| Field | Use |
| --- | --- |
| Language | es · en · pt · auto. Default = campaign setting. |
| Transcription model | WhisperX / OpenAI Whisper API / Deepgram |
| Device | CPU / CUDA (only for WhisperX) |
| Notes | Free text attached to the task |
| Whisper params | beam_size, vad_filter, compute_type (float16/int8) |
| Diarization threshold | Fine-tuning for the speaker splitter |

Tip: Leave Advanced options collapsed unless you have a concrete reason. The per-campaign configuration is already optimized by the admin for that use case.

Attach files

Drag-and-drop area with the caption "Drop files here — Multiple formats supported. Also supports ZIP and CSV for bulk upload."

Individual files

| Format | Extension | Note |
| --- | --- | --- |
| WAV | .wav, .x-wav | Uncompressed, maximum quality |
| MP3 | .mp3 | Most common in contact centers |
| MPEG | .mpeg | MP3 variant |
| AAC | .aac | Modern compression, common on mobile |
| OGG | .ogg | Free format |
| WebM | .webm | Web captures, browser recordings |
| FLAC | .flac, .x-flac | Lossless compression |

Bulk upload via ZIP

  • Zip a batch of audios into a single .zip and drop it on the zone.
  • The backend unpacks it, creates one task per valid audio, and applies the same Campaign / Operator / Direction values to all of them.
  • Corrupt files inside the ZIP are skipped and listed in the final notification.

Bulk upload via CSV + audios

  • The Download metadata CSV template button (above the form) generates a metadata.csv with the expected columns:
    • filename · campaign_id · operator_id · start_datetime · direction · notes
  • Build a .zip containing the audio files + the metadata.csv at the root.
  • Each CSV row overrides the global form values for that file.
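
The packaging step above can be scripted. A minimal Python sketch, assuming the column names listed above; the function name `build_bulk_zip` is illustrative, and the value format expected in each column should be taken from the downloaded template rather than from this example:

```python
import csv
import io
import zipfile
from pathlib import Path

# Column order as listed for the metadata template
COLUMNS = ["filename", "campaign_id", "operator_id",
           "start_datetime", "direction", "notes"]


def build_bulk_zip(zip_path, audio_files, rows):
    """Write the audio files plus a metadata.csv at the ZIP root.

    audio_files: paths of audio files on disk.
    rows: one dict per file, keyed by COLUMNS (missing keys become empty cells).
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for audio in audio_files:
            # arcname drops any directory prefix so each file sits at the root
            archive.write(audio, arcname=Path(audio).name)
        archive.writestr("metadata.csv", buffer.getvalue())
```

Writing every entry with a bare file name keeps metadata.csv and the audios at the top level of the archive, which is where the guide says the CSV must be.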

Warning: Default limits

  • Maximum size: 500 MB per individual file or ZIP.
  • Maximum recommended duration: 120 minutes per audio.
  • Larger files must be split, or contact the admin to raise the tenant limit.
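
A local pre-check can catch most rejections before the upload starts. The sketch below is illustrative only: `check_upload` is a hypothetical helper, the extension set mirrors the Individual files table plus .zip, and it assumes 1 MB = 1024 × 1024 bytes, which this guide does not specify.

```python
from pathlib import Path

MAX_BYTES = 500 * 1024 * 1024  # default 500 MB limit (binary megabytes assumed)
AUDIO_EXTENSIONS = {".wav", ".x-wav", ".mp3", ".mpeg", ".aac",
                    ".ogg", ".webm", ".flac", ".x-flac"}
BULK_EXTENSIONS = {".zip"}


def check_upload(path, size_bytes=None):
    """Return a list of problems found; an empty list means the file looks OK.

    size_bytes may be passed explicitly; otherwise it is read from disk.
    """
    file = Path(path)
    size = file.stat().st_size if size_bytes is None else size_bytes
    problems = []
    if file.suffix.lower() not in AUDIO_EXTENSIONS | BULK_EXTENSIONS:
        problems.append(f"unsupported extension: {file.suffix or '(none)'}")
    if size > MAX_BYTES:
        problems.append("file exceeds the 500 MB default limit")
    if size == 0:
        problems.append("file is empty")
    return problems
```

The check covers size and extension only. Duration and audio integrity still need a tool that actually decodes the file.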

Submitting

  1. Verify required fields (Campaign ID, Operator ID, Call start, at least one file).
  2. Click Start task.
  3. The backend returns one task_id (UUID) per file and enqueues tasks in Redis.
  4. The UI redirects to All tasks with the tasks in Pending state.

Example response

{
  "task_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "pending",
  "created_at": "2026-04-16T14:23:00-03:00",
  "campaign_id": 12,
  "operator_id": 2608,
  "direction": "inbound",
  "language": "es",
  "engine": "whisperx",
  "device": "cuda"
}
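
A client that consumes this response might validate the identifier and normalize the timestamp before storing it. A minimal sketch, assuming a single task object shaped like the example above (a multi-file upload may return a different envelope); `parse_task_response` is a hypothetical helper name:

```python
import json
import uuid
from datetime import datetime, timezone


def parse_task_response(raw):
    """Parse one task object and normalize the fields a client usually needs."""
    task = json.loads(raw)
    created = datetime.fromisoformat(task["created_at"])  # keeps the UTC offset
    return {
        "task_id": uuid.UUID(task["task_id"]),  # raises ValueError if not a UUID
        "status": task["status"],
        "created_utc": created.astimezone(timezone.utc),
    }


example = (
    '{"task_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", '
    '"status": "pending", "created_at": "2026-04-16T14:23:00-03:00"}'
)
print(parse_task_response(example)["created_utc"].isoformat())
# 2026-04-16T17:23:00+00:00
```

The created_at value carries a -03:00 offset, so converting to UTC makes timestamps from different tenants or regions comparable.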

What happens next (pipeline)

Stages and indicative times

| Stage | Typical duration (30 min of audio) |
| --- | --- |
| Upload to backend | 5-15 s (network dependent) |
| Queueing | <100 ms |
| Transcription with WhisperX GPU | 1-2 min |
| Transcription with OpenAI/Deepgram | 30-60 s |
| Diarization | 30-60 s (included in WhisperX) |
| GPT analysis (tags + sentiment) | 10-30 s |
| End-to-end total | ~2-5 min on GPU; 3-8 min on CPU |

Bulk upload (4 paths)

  1. Multiple selection on the drop zone — each file creates its own task with the same parameters.
  2. ZIP — same as 1 but packaged.
  3. ZIP + metadata.csv — each task picks specific values from a CSV row.
  4. Automatic integrations:
    • Net2Phone: webhook that creates tasks on call end.
    • Anura: webhook for cloud telephony recordings.
    • SFTP: worker that syncs a remote folder.
    • External API: POST /api/v1/transcribe with an API key, for custom systems.

Configure automatic sources from step 2 of the campaign wizard: Audio Sources.
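
For the External API path, a custom system would send the audio to POST /api/v1/transcribe. This guide does not document the request schema, so the sketch below is only a shape to adapt. The `X-API-Key` header name, the multipart field name `file`, the metadata field names, and the base URL are all assumptions; check the API reference for your tenant. The function builds the request without sending it.

```python
import uuid
import urllib.request


def build_transcribe_request(base_url, api_key, filename, audio_bytes, fields):
    """Build (but do not send) a multipart POST to /api/v1/transcribe."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        # One plain form field per metadata value (field names are assumed)
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The audio itself, under the assumed field name "file"
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream'
        f"\r\n\r\n".encode() + audio_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/transcribe",
        data=b"".join(parts),
        method="POST",
        headers={
            "X-API-Key": api_key,  # assumed header name
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the request easy to inspect or log while the real field names are being confirmed.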

Warning: Bulk uploads with large files can saturate the queue. Prefer off-peak hours, or use integrations to spread the load.


Troubleshooting

| Symptom | Diagnosis | Action |
| --- | --- | --- |
| "File too large" | Exceeds 500 MB | Split the file (ffmpeg) or ask admin to raise the limit |
| "Unsupported format" | Extension not listed | Convert to WAV: ffmpeg -i input.xxx output.wav |
| "No permission for campaign" | Your role lacks the campaign path | Request access via admin in Team management |
| Task stays Pending >10 min | No active workers or saturated queue | Notify admin; check logs in Settings > Logs |
| Task turns Error immediately | Corrupt audio or 0 duration | Verify the file locally with VLC or Audacity |
| Diarization mixes speakers | Mono channel with heavily overlapping parties | Try stereo transcription (see Stereo guide) |
| Transcription with hallucinations | Very noisy audio or long silences | Enable vad_filter in Advanced options |
| ZIP with CSV ignores rows | metadata.csv not at the ZIP root | Make sure metadata.csv is at the top level of the ZIP |
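
The two ffmpeg fixes in the table can be wrapped in small helpers when many files need the same treatment. A sketch under stated assumptions: the helper names are hypothetical, the split uses ffmpeg's segment muxer with stream copy (one possible approach, not necessarily the one the AuditorIA team recommends), and ffmpeg must be installed for the commands to run.

```python
import shlex


def convert_to_wav_cmd(source, target="output.wav"):
    """Command equivalent to: ffmpeg -i <source> <target>."""
    return ["ffmpeg", "-i", source, target]


def split_cmd(source, pattern, segment_seconds):
    """Command that cuts <source> into fixed-length pieces without re-encoding.

    pattern is an ffmpeg numbered output pattern such as part_%03d.mp3.
    """
    return [
        "ffmpeg", "-i", source,
        "-f", "segment", "-segment_time", str(segment_seconds),
        "-c", "copy", pattern,
    ]


# Print the commands for review; run them with subprocess.run(cmd, check=True)
print(shlex.join(convert_to_wav_cmd("input.ogg")))
print(shlex.join(split_cmd("long_call.mp3", "part_%03d.mp3", 3600)))
```

Because `-c copy` avoids re-encoding, cuts fall on packet boundaries and segment lengths are approximate. Choose a segment length comfortably below the 120-minute recommendation.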

Next steps