The Intelligent Voice system is a batch processing system designed to process hours of audio recordings as efficiently as possible. We also offer systems designed for real-time transcription, low-latency keyword spotting, IoT hub devices, embedded applications and more. For details on running these solutions on AWS, please contact us.
A basic installation of Intelligent Voice with GPU acceleration on AWS requires an instance with at least 32GB RAM and 4 CPUs.
At the time of writing (December 2019), all P2, P3 and G4 instance types meet these requirements, but the preferred instance type is g4dn.2xlarge. Intelligent Voice also offers pre-built AMIs which will run on any P2, P3 or G4 instance. Please contact us for details.
The minimum storage requirement is a 60GB EBS volume. This will need to be increased if more than 100 hours of audio data are processed. The system can also be configured to use S3 for audio data.
A single GPU instance is suitable for evaluation of the Intelligent Voice system and performance benchmarks, and for batch processing of small-to-medium audio data sets.
Low volume batch processing
To provide the lowest cost per audio hour, we recommend that Intelligent Voice is installed as a minimal auto-scaling solution. This would comprise at least:
- 1 application server, online whenever the API or review platform is required
- 1 or more processing nodes (GPU optional), online only when processing new data
The minimum requirements for an application server are 16GB RAM and 2 CPUs. Low-cost instance types such as t2.xlarge or m5.xlarge are suitable for this. The minimum storage requirement is a 60GB EBS volume.
The minimum requirements for a processing node without a GPU are 16GB RAM and 4 CPUs. c5.2xlarge is a suitable instance type for this. The minimum storage requirement is a 60GB EBS volume.
The minimum requirement for a processing node with a GPU is a P2, P3 or G4 instance type. The minimum storage requirement is a 60GB EBS volume.
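The minimums above can be written down as data and checked before launching an instance. The following is only a sketch: the role names, the `Spec` class and the `meets_minimum` function are hypothetical and not part of Intelligent Voice, and the vCPU and memory figures for the three instance types are our reading of the AWS instance tables, which should be verified against current AWS documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    ram_gb: int
    cpus: int
    storage_gb: int = 60  # minimum EBS volume size quoted for every role


# Minimum requirements quoted in this section (role names are illustrative).
MINIMUMS = {
    "application_server": Spec(ram_gb=16, cpus=2),
    "cpu_processing_node": Spec(ram_gb=16, cpus=4),
}

# Assumed vCPU and memory figures for the instance types named in the text;
# check these against the AWS documentation before relying on them.
INSTANCES = {
    "t2.xlarge": Spec(ram_gb=16, cpus=4),
    "m5.xlarge": Spec(ram_gb=16, cpus=4),
    "c5.2xlarge": Spec(ram_gb=16, cpus=8),
}


def meets_minimum(role: str, instance_type: str, ebs_gb: int) -> bool:
    """Return True if the instance and its EBS volume satisfy the role's minimums."""
    need = MINIMUMS[role]
    have = INSTANCES[instance_type]
    return (
        have.ram_gb >= need.ram_gb
        and have.cpus >= need.cpus
        and ebs_gb >= need.storage_gb
    )
```

For example, `meets_minimum("application_server", "m5.xlarge", 60)` passes, while the same instance with a 40GB volume fails on storage alone.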
Typically, batches of audio under 4 hours in total duration are cheaper to process without a GPU. Batches over 4 hours are cheaper, and much faster, to process with a GPU.
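An auto-scaling controller could apply this rule of thumb when deciding which kind of processing node to start. The sketch below is illustrative only: `choose_node` and `GPU_BREAK_EVEN_HOURS` are hypothetical names, and treating a batch of exactly 4 hours as a CPU job is our assumption, since the guidance only covers batches under or over that size.

```python
# Break-even point quoted in the text: roughly 4 hours of audio per batch.
GPU_BREAK_EVEN_HOURS = 4.0


def choose_node(batch_audio_hours: float) -> str:
    """Pick the processing node type expected to be cheaper for a batch.

    Returns "gpu" for batches over the break-even size, otherwise "cpu".
    A batch of exactly 4 hours is treated as a CPU job (an assumption).
    """
    if batch_audio_hours < 0:
        raise ValueError("batch_audio_hours must not be negative")
    return "gpu" if batch_audio_hours > GPU_BREAK_EVEN_HOURS else "cpu"
```

The threshold is a typical figure rather than a guarantee, so it is worth keeping as a single tunable constant and adjusting it against your own instance pricing and workload.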
High volume batch processing
More details to follow. Please contact us for information.