Verint Financial Compliance from version 9.8 has a native, built-in integration with Intelligent Voice. Intelligent Voice can be self-hosted or delivered as a service by a Verint partner or by Intelligent Voice Ltd.
(Intelligent Voice is also available for Verint Workforce Management)
Overview
In simple terms: once VFC is configured according to these instructions and a policy is created, selected recordings are sent over HTTPS to the Intelligent Voice API. Transcription and other analytics are returned over HTTPS to VFC and stored there. No data is retained on the IV server.
Low level architecture
The diagram below shows how the components within VFC and IV connect.
Verint Financial Compliance components
Speech Analytics Service
This is the component that sends recordings to IV and fetches results after processing.
The default configuration deletes all data from IV immediately after processing.
VFC database
When the Speech Analytics Service sends a recording to IV it creates a record in the speech_pending table, which is used to track the progress of the processing. When the processing is complete, the record is removed from the table.
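The tracking pattern described above can be sketched as follows. This is an illustrative model only; the column names and SQLite usage are assumptions for the sketch, not the actual VFC schema or database engine.

```python
import sqlite3

# Illustrative sketch of the speech_pending tracking pattern.
# Column names are assumptions, not the actual VFC schema.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE speech_pending (
    recording_id TEXT PRIMARY KEY,
    iv_item_id   TEXT,
    sent_at      TEXT
)""")

def track_sent(recording_id, iv_item_id):
    # A record is created when the recording is sent to IV
    db.execute("INSERT INTO speech_pending VALUES (?, ?, datetime('now'))",
               (recording_id, iv_item_id))

def track_complete(recording_id):
    # The record is removed once processing is complete
    db.execute("DELETE FROM speech_pending WHERE recording_id = ?",
               (recording_id,))

track_sent("rec-001", "item-42")
# ... IV-side processing happens here ...
track_complete("rec-001")
```

The table therefore always reflects only the items currently in flight, which is what makes it useful for tracking progress.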
Intelligent Voice components
API (vrx-servlet)
The Intelligent Voice application server (vrx-servlet container) provides the REST API which Verint Financial Compliance connects to. It also manages database records and queued jobs for items during processing.
IV database (mariadb)
Details of the item to be processed are stored in the items table.
Details of each job are stored in the queue table.
Job queue (gearman)
The IV application server creates jobs in the queue for each of the workers, allowing the work to be distributed across many worker instances on multiple servers as needed.
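The fan-out pattern a job queue enables can be sketched with Python's standard-library queue and threads. This is a minimal model of the idea, not the actual Gearman protocol or worker implementation.

```python
import queue
import threading

# Minimal sketch of queue-based work distribution; uses Python's
# stdlib queue as a stand-in for the Gearman job server.
jobs = queue.Queue()
results = []
lock = threading.Lock()

def worker(name):
    # Each worker instance drains the same shared queue
    while True:
        item_id = jobs.get()
        if item_id is None:          # sentinel: shut down
            jobs.task_done()
            break
        with lock:
            results.append((name, item_id))   # stand-in for real processing
        jobs.task_done()

# Two worker instances, as if running on separate servers
threads = [threading.Thread(target=worker, args=(f"asr-worker-{i}",))
           for i in (1, 2)]
for t in threads:
    t.start()
for item in ["item-1", "item-2", "item-3", "item-4"]:
    jobs.put(item)
for _ in threads:
    jobs.put(None)                   # one sentinel per worker
for t in threads:
    t.join()
```

Adding capacity is then just a matter of starting more worker instances against the same queue.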
Web application (JumpToWeb)
This is an optional component for IV which can be used by an administrator to view the current processing status.
Workers
Processing is done by different workers according to the options selected. The default workers installed are:
- Voice Activity Detection (vad-worker container) - identifies voice activity segments within a recording
- Automatic Speech Recognition (asr-worker container) - transcribes speech to text
- Diarization (diar-worker container) - separates speakers in a recording by their voice
- Tagger (tagger-worker container) - identifies and links topics within a transcript
- Sentiment (sentiment-worker container) - identifies positive and negative sentiment within a transcript
- Summariser (transcript-summariser-worker container) - creates a summary of a transcript
- Video OCR (ocr-worker) - extracts writing displayed in a video
- Language Model Builder (lmbuilder-worker) - adapts ASR models to add custom words and phrases
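The relationship between selected options and queued jobs can be sketched as a simple mapping. The worker names come from the list above, but the option keys and the mapping function are assumptions for illustration, not the real IV API.

```python
# Illustrative sketch: mapping selected processing options to the
# workers that will handle them. Option names are assumptions.
WORKER_FOR_OPTION = {
    "vad": "vad-worker",
    "asr": "asr-worker",
    "diarization": "diar-worker",
    "tagging": "tagger-worker",
    "sentiment": "sentiment-worker",
    "summary": "transcript-summariser-worker",
    "ocr": "ocr-worker",
}

def jobs_for_item(item_id, options):
    """Return one (item, worker) job per selected option, in order."""
    return [(item_id, WORKER_FOR_OPTION[opt])
            for opt in options if opt in WORKER_FOR_OPTION]

queued = jobs_for_item("item-42", ["vad", "asr", "sentiment"])
```

Only the jobs for the selected options are queued, so unused workers stay idle for that item.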
Data flow
Processes
1.0 Record audio/video with metadata (VFC)
Audio or video is captured by the VFC system and records created in the VFC database.
2.0 Send for transcription and analytics
The Speech Analytics Service makes a REST API call to the IV Create Item endpoint. The item records are created in the IV database, the recording is converted to a standard audio format and cached, and jobs are queued for the IV system to process. The Speech Analytics Service records the item details in the speech_pending table.
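The shape of the Create Item call can be sketched as below. The endpoint path, payload field names, and auth header here are all assumptions for illustration; consult the IV API reference for the documented request format.

```python
import json
from urllib import request

# Hypothetical sketch of the Create Item call. The base URL, path,
# payload fields, and auth scheme are assumptions, not the real IV API.
IV_BASE_URL = "https://iv.example.com/vrxServlet"   # placeholder host

def build_create_item_request(recording_path, metadata, api_key):
    payload = {
        "source": recording_path,   # recording to transcribe
        "metadata": metadata,       # call metadata from VFC
    }
    return request.Request(
        f"{IV_BASE_URL}/items",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_create_item_request("/recordings/call-001.wav",
                                {"agent": "a.smith"}, "secret-key")
# request.urlopen(req) would then submit the item for processing
```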
3.0 Processing
The IV system workers read their tasks from the queue and their data from the IV API, then send completed work back to the IV API, where it is stored in the IV database.
4.0 Fetch transcript and analytics
The VFC server polls the IV API for completed items. When an item is completed, the results are stored in the VFC database and a final API call is made to delete the item from the IV system. Finally, the record is removed from the speech_pending table.
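The poll, store, and delete cycle can be sketched as follows. The API client interface here is a stand-in invented for the sketch; the point is the ordering: results are stored in VFC first, then the item is deleted from IV, then the tracking record is cleared.

```python
# Illustrative sketch of the poll -> store -> delete cycle.
# The client interface (get_status/get_results/delete_item) is an
# assumption for illustration, not the documented IV API.
def poll_and_collect(api, pending, store):
    """Fetch results for completed items, store them, delete from IV."""
    for item_id in list(pending):
        if api.get_status(item_id) == "complete":
            store[item_id] = api.get_results(item_id)  # save to VFC database
            api.delete_item(item_id)                   # remove data from IV
            pending.remove(item_id)                    # clear speech_pending

class FakeIVApi:
    # Stand-in client used only for this sketch
    def __init__(self):
        self.deleted = []
    def get_status(self, item_id):
        return "complete"
    def get_results(self, item_id):
        return {"transcript": f"text for {item_id}"}
    def delete_item(self, item_id):
        self.deleted.append(item_id)

api = FakeIVApi()
pending = ["item-1", "item-2"]
store = {}
poll_and_collect(api, pending, store)
```

Deleting only after the results are safely stored is what lets the default configuration guarantee that no data is retained on the IV server.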