The Intelligent Voice system is a batch processing system designed to process hours of audio recordings as efficiently as possible. We also offer systems designed for real-time transcription, low-latency keyword spotting, IoT devices, embedded applications and more. For details of the requirements for these, please contact us.
Minimum Requirement: single Virtual Machine with no GPU acceleration
Intelligent Voice can be installed in a single virtual machine without GPU acceleration. This system is functionally identical to GPU-accelerated systems but delivers lower performance. Suitable applications include:
- Low volume batch processing:
  - up to 500 hours of audio recordings per day using older models (NASRv4 and earlier)
  - not recommended for newer models (NASRv5 and later)
- Functional system evaluation and compatibility testing
- Software development and integration testing
- Test and QA systems
The minimum requirements are:
- 4 or more x86 vCPUs (with AVX, Intel "Sandy Bridge" or later / AMD "Bulldozer" or later)
- at least 64GB RAM
- at least 500GB storage*
*If the disk is partitioned, it must have sufficient space mounted under /var/lib for container images (320GB if all features are installed) and under /opt for language models (40GB for general data and approximately 1.5GB for each model).
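The partition figures above can be turned into a quick sizing calculation. The following is a minimal sketch in Python, not part of the Intelligent Voice tooling: the constant and function names are illustrative assumptions, and the numbers are taken directly from the minimum requirements listed here.

```python
# Figures copied from the minimum requirements above.
# All names here are hypothetical and used for illustration only.
VAR_LIB_GB = 320         # container images under /var/lib (all features installed)
OPT_BASE_GB = 40         # general data under /opt
OPT_PER_MODEL_GB = 1.5   # approximate size of each language model under /opt
MIN_VCPUS = 4
MIN_RAM_GB = 64
MIN_STORAGE_GB = 500


def opt_storage_gb(model_count: int) -> float:
    """Approximate space needed under /opt for the given number of models."""
    if model_count < 0:
        raise ValueError("model_count must be non-negative")
    return OPT_BASE_GB + OPT_PER_MODEL_GB * model_count


def shortfalls(vcpus: int, ram_gb: float, storage_gb: float) -> list[str]:
    """Return a description of each unmet minimum; an empty list means all are met."""
    problems = []
    if vcpus < MIN_VCPUS:
        problems.append(f"vCPUs: have {vcpus}, need at least {MIN_VCPUS}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: have {ram_gb}GB, need at least {MIN_RAM_GB}GB")
    if storage_gb < MIN_STORAGE_GB:
        problems.append(
            f"storage: have {storage_gb}GB, need at least {MIN_STORAGE_GB}GB"
        )
    return problems
```

For example, `opt_storage_gb(10)` gives the approximate /opt space for ten language models, and `shortfalls(4, 64, 500)` returns an empty list for a machine that exactly meets the minimum.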
Operating systems supported:
- Red Hat Enterprise Linux 9 (recommended) or 8
- Ubuntu 20.04 LTS, 22.04 LTS
- Oracle Linux 8
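The AVX and operating system requirements above can be checked before installation. The sketch below is a hedged illustration rather than an official pre-flight tool: it assumes an x86 Linux host, that `/etc/os-release` reports `ID` values of `rhel`, `ubuntu` and `ol` for the three supported distributions, and that `/proc/cpuinfo` lists CPU features on a `flags` line. The function names are hypothetical.

```python
# Supported releases as listed above, keyed by the assumed os-release ID value.
SUPPORTED = {
    "rhel": ("8", "9"),
    "ubuntu": ("20.04", "22.04"),
    "ol": ("8",),
}


def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines from os-release content into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_supported_os(os_release_text: str) -> bool:
    """Return True if the os-release content matches a supported release."""
    fields = parse_os_release(os_release_text)
    os_id = fields.get("ID", "")
    version = fields.get("VERSION_ID", "")
    if os_id == "ubuntu":
        # Ubuntu releases are matched on the full version, e.g. "22.04".
        return version in SUPPORTED["ubuntu"]
    # RHEL and Oracle Linux report minor versions such as "9.3";
    # only the major version is compared.
    return version.split(".")[0] in SUPPORTED.get(os_id, ())


def has_avx(cpuinfo_text: str) -> bool:
    """Return True if the first 'flags' line in cpuinfo content includes avx."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return "avx" in line.partition(":")[2].split()
    return False
```

On a target host the functions could be called as `is_supported_os(open("/etc/os-release").read())` and `has_avx(open("/proc/cpuinfo").read())`. Taking the file content as a string keeps both checks easy to test away from the target machine.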
Additional vCPUs and storage are required to increase performance.
For help on sizing larger installations please see below or contact us.
High Performance GPU Single Server
To get the best performance from Intelligent Voice we recommend servers with NVIDIA GPU cards.
An example system specification suitable for production systems:
- 1 x AMD EPYC 9124 3.0GHz 16-core CPU
- 128 GB RAM
- 2 x 1TB SSD
- 2 x NVIDIA A2 GPU
An example server spec for larger installations:
- 2 x AMD EPYC 75F3 2.95GHz 32-core
- 512GB RAM
- 2 x 2TB NVMe SSD
- 4 x NVIDIA Tesla A100 80GB
Additional storage will be required for audio data and outputs. The amount depends on the audio / video file formats and the required retention period.
High Performance GPU Multiple Servers
Intelligent Voice can scale over any number of GPU servers. An example system for higher volume processing:
1 x application server:
- 2 x AMD EPYC 75F3 2.95GHz 32-core
- 512GB RAM
- 2 x 2TB NVMe SSD
5 x GPU processing node servers:
- 1 x AMD EPYC 9124 3.0GHz 16-core CPU
- 128 GB RAM
- 2 x 1TB SSD
- 8 x NVIDIA A2 GPU
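To see what the example cluster above adds up to, the per-server figures can be totalled. This is a small illustrative sketch in Python: the `ServerSpec` class and `cluster_totals` function are hypothetical names, and the numbers are copied from the application server and GPU processing node lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """One group of identical servers in the example cluster."""
    count: int          # number of servers of this type
    cpus: int           # CPU sockets per server
    cores_per_cpu: int
    ram_gb: int         # RAM per server
    gpus: int = 0       # GPUs per server


def cluster_totals(specs: list[ServerSpec]) -> dict[str, int]:
    """Sum CPU cores, RAM and GPUs across all server groups."""
    return {
        "cores": sum(s.count * s.cpus * s.cores_per_cpu for s in specs),
        "ram_gb": sum(s.count * s.ram_gb for s in specs),
        "gpus": sum(s.count * s.gpus for s in specs),
    }


# Figures from the example above: 1 application server and 5 GPU nodes.
EXAMPLE_CLUSTER = [
    ServerSpec(count=1, cpus=2, cores_per_cpu=32, ram_gb=512),
    ServerSpec(count=5, cpus=1, cores_per_cpu=16, ram_gb=128, gpus=8),
]

print(cluster_totals(EXAMPLE_CLUSTER))
# prints {'cores': 144, 'ram_gb': 1152, 'gpus': 40}
```

The same structure can be reused to compare alternative layouts by changing the `count` and per-server values.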
Larger Clusters & Geographical Distribution
Intelligent Voice can scale to millions of audio hours per day and can optionally synchronize across multiple geographic regions. Please contact us for more information.