The Intelligent Voice system is a batch processing system designed to process hours of audio recordings as efficiently as possible. We also offer systems designed for real-time transcription, low-latency keyword spotting, IoT hub devices, embedded applications and more. For details on running these solutions on AWS, please contact us.
Single VM installation
For a basic installation of Intelligent Voice 6 with GPU acceleration on AWS, we recommend a single GPU instance type with 128GB RAM and 4 CPUs. The minimum storage requirement is 500GB. The recommended operating system is Red Hat Enterprise Linux 8. Other supported operating systems include RHEL 9, Ubuntu 20.04 and 22.04, and Oracle Linux 8.
At the time of writing (August 2023), the g4dn.8xlarge instance type is recommended for a single-VM installation.
A single VM is recommended for evaluation and lab use. Installing on multiple VMs is recommended for production use, to improve resilience and scalability, and to reduce costs.
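As a quick pre-installation check, the minimum figures above (128GB RAM, 4 CPUs, 500GB storage) can be compared against what a Linux host reports. The following is a minimal sketch, not part of the Intelligent Voice installer. The function names and the 5% tolerance are assumptions made for illustration; the tolerance is there because a host normally reports slightly less memory and disk than its nominal size.

```python
import os
import shutil

# Minimums taken from the recommendation above.
MIN_RAM_GB = 128
MIN_CPUS = 4
MIN_DISK_GB = 500

# Assumption: accept RAM and disk readings within 5% of the nominal figure,
# since the kernel and filesystem reserve part of the raw capacity.
TOLERANCE = 0.95


def find_shortfalls(ram_gb, cpus, disk_gb):
    """Return a list of problems; an empty list means the host qualifies."""
    problems = []
    if ram_gb < MIN_RAM_GB * TOLERANCE:
        problems.append(f"RAM: {ram_gb:.0f}GB is below {MIN_RAM_GB}GB")
    if cpus < MIN_CPUS:
        problems.append(f"CPUs: {cpus} is below {MIN_CPUS}")
    if disk_gb < MIN_DISK_GB * TOLERANCE:
        problems.append(f"Storage: {disk_gb:.0f}GB is below {MIN_DISK_GB}GB")
    return problems


def read_host_specs(path="/"):
    """Read (ram_gb, cpus, disk_gb) from a Linux host."""
    mem_kib = 0
    with open("/proc/meminfo") as meminfo:
        for line in meminfo:
            if line.startswith("MemTotal:"):
                mem_kib = int(line.split()[1])  # value is reported in kB
                break
    ram_gb = mem_kib / 1024 ** 2
    cpus = os.cpu_count() or 0
    disk_gb = shutil.disk_usage(path).total / 1024 ** 3
    return ram_gb, cpus, disk_gb
```

On the target VM, `find_shortfalls(*read_host_specs())` returns an empty list when the host meets the minimums. Pass a different `path` to `read_host_specs` if the data volume is mounted somewhere other than `/`.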
Production deployment example
The following is an example of a full production system deployment with autoscaling.
This system uses a single application server VM, with Auto Scaling groups deploying AMIs for all the compute instances. The current state of the IV job queue is sent to CloudWatch, and scaling rules are created based on the number of jobs.
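The queue-based scaling described above can be sketched in Python. This is an illustration under stated assumptions, not the shipped integration: the namespace `IntelligentVoice`, the metric name `QueuedJobs`, the `Queue` dimension and the `jobs_per_instance` figure are all hypothetical, and the caller is expected to supply the queue depth from wherever the IV job queue exposes it. The only AWS call used is the boto3 CloudWatch client's `put_metric_data`.

```python
import math

NAMESPACE = "IntelligentVoice"  # hypothetical CloudWatch namespace
METRIC_NAME = "QueuedJobs"      # hypothetical metric name


def build_metric_data(queued_jobs, queue_name="default"):
    """Build the MetricData payload for one queue-depth sample."""
    return [{
        "MetricName": METRIC_NAME,
        "Dimensions": [{"Name": "Queue", "Value": queue_name}],
        "Value": float(queued_jobs),
        "Unit": "Count",
    }]


def publish_queue_depth(queued_jobs, queue_name="default"):
    """Send the current queue depth to CloudWatch (needs AWS credentials)."""
    import boto3  # imported lazily so the pure helpers work without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace=NAMESPACE,
        MetricData=build_metric_data(queued_jobs, queue_name),
    )


def desired_instances(queued_jobs, jobs_per_instance, min_size=0, max_size=10):
    """Illustrate the arithmetic a job-count scaling rule performs.

    jobs_per_instance is an assumed tuning value, not a product figure.
    """
    if jobs_per_instance <= 0:
        raise ValueError("jobs_per_instance must be positive")
    wanted = math.ceil(queued_jobs / jobs_per_instance)
    return max(min_size, min(max_size, wanted))
```

In a real deployment the scaling rule itself would live in a CloudWatch alarm or an Auto Scaling policy attached to each group; `desired_instances` only shows how a job count maps to a clamped instance count.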
The database and file store can optionally use Amazon RDS for MariaDB and S3, to support high availability configurations and/or cross-region replication.
This solution is well suited to processing 10,000 to 100,000 hours of audio per day. To scale up to 1,000,000 audio hours per day or more, multiple application servers can be run with a load balancer, or traffic can be sharded across multiple IV systems.
This diagram shows how to configure the system with 9 Auto Scaling groups supporting all optional features:
Features you do not use do not need to be installed. Note the following dependencies:
ASR, Diarization and Voice Biometrics require VAD
Summarization and Sentiment require ASR
Tagger requires ASR or VideoOCR
Voice Biometrics requires Elasticsearch
JumpToWeb requires Sphinxsearch (on app server)
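The dependencies listed above can be expressed as a small lookup table, which makes it easy to work out everything that must be installed for a chosen set of features. The sketch below is illustrative only: the `REQUIRES` table and `resolve` function are hypothetical names, and choosing the first listed alternative (ASR) when Tagger is selected without ASR or VideoOCR is an assumption.

```python
# Each feature maps to a list of requirement groups. A group is a tuple of
# alternatives; at least one member of every group must be installed.
REQUIRES = {
    "ASR": [("VAD",)],
    "Diarization": [("VAD",)],
    "Voice Biometrics": [("VAD",), ("Elasticsearch",)],
    "Summarization": [("ASR",)],
    "Sentiment": [("ASR",)],
    "Tagger": [("ASR", "VideoOCR")],  # either one satisfies Tagger
    "JumpToWeb": [("Sphinxsearch",)],  # Sphinxsearch runs on the app server
}


def resolve(selected):
    """Return the selected features plus everything they depend on."""
    needed = set(selected)
    changed = True
    while changed:  # repeat until no new dependency is added
        changed = False
        for component in sorted(needed):
            for alternatives in REQUIRES.get(component, []):
                if not needed.intersection(alternatives):
                    # Assumption: default to the first listed alternative.
                    needed.add(alternatives[0])
                    changed = True
    return needed
```

For example, `resolve({"Summarization"})` adds ASR and then VAD, while `resolve({"Tagger", "VideoOCR"})` adds nothing because VideoOCR already satisfies Tagger.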