Deployment
The EXA4MIND AI Inference Service uses the HEAppE middleware to manage inference server jobs. Once the HEAppE server is ready, upload the container image of the backend inference engine to your HPC project storage.
Deploy any of the supported inference engines using one of the guides:
Deploy the service API by following this guide.
Prerequisites
- Access to the HPC login node via SSH.
- Access to a computational project with a GPU allocation and large project storage. The storage is used for the inference engine container images and the AI model cache.
- Apptainer (formerly Singularity) installed on the cluster.
- Docker available locally (for the conversion step).
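
The tool-related prerequisites above can be checked with a short script before starting. The following is a minimal sketch, not part of the service itself: the helper `missing_tools` is hypothetical, and the executable names `ssh`, `docker`, and `apptainer` are assumptions about how these tools are installed. It only verifies that the executables are on `PATH`; it does not check project access, GPU allocation, or storage quota.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names; adjust them to match the actual setup.
LOCAL_TOOLS = ("ssh", "docker")    # expected on the local workstation
CLUSTER_TOOLS = ("apptainer",)     # expected on the HPC login node


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # Run locally with LOCAL_TOOLS, or on the login node with CLUSTER_TOOLS.
    missing = missing_tools(LOCAL_TOOLS)
    if missing:
        print("Missing local tools: " + ", ".join(missing))
    else:
        print("All local prerequisites found.")
```

Run the script once on the workstation with `LOCAL_TOOLS` and once on the login node with `CLUSTER_TOOLS`, since the two environments need different tools.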