THE ROLE

At Mach9, ML infrastructure engineers build and maintain the systems that power production AI models for civil engineering and surveying. Our ML pipeline spans 10,000+ miles of labeled survey data, image segmentation networks, and 3D prediction models serving real-time inference to surveyors and engineers in the field.

This role is ideal for mid-career ML infrastructure engineers with experience building for both training and inference.

You'll build pipelines for training deep transformer models on hundreds of terabytes of 3D point cloud and image data. You'll also architect our inference infrastructure, which serves both heavy offline detection algorithms and responsive real-time inference that integrates directly with our CAD software.

RESPONSIBILITIES

Design and build a centralized system for versioning training data, generated datasets, and model artifacts, with full lineage tracking from raw source data through to trained model outputs.
Develop and maintain reliable, reproducible ML training and data generation pipelines.
Refactor and harden existing training and data generation scripts into composable, testable, and maintainable components.
Create CI/CD workflows for validating data pipelines and model training runs, including automated correctness checks and regression detection.
Build tooling that enables ML engineers to launch, monitor, and debug training jobs with minimal friction.
Optimize and scale real-time model inference services to meet latency and throughput requirements in production, including profiling, batching strategies, and resource-efficient serving.
Own the deployment path from trained model artifact to production endpoint, ensuring reliable rollouts, rollbacks, and monitoring.

REQUIREMENTS

3+ years of relevant work experience.
Bachelor's or Master's degree in Computer Science, Engineering, or equivalent experience.
Strong communication skills and the ability to work closely with ML researchers and engineers to understand their workflows and translate them into robust systems.
Experience designing and building data versioning, artifact management, or dataset lineage systems (e.g., DVC, LakeFS, Weights & Biases, or custom solutions).
Hands-on experience with ML pipeline orchestration tools (e.g., Airflow, Prefect, Metaflow, or similar).
Experience with model serving and inference optimization — profiling latency, reducing memory footprint, or scaling serving infrastructure to meet real-time constraints.
Ability to read and refactor ML training code — you don't need to design model architectures, but you need to understand what training pipelines are doing well enough to make them reliable.
Proficiency in Python and PyTorch.

BONUS QUALIFICATIONS

Familiarity with AWS infrastructure services.
Experience with containerized ML workflows and GPU-accelerated training environments.
Experience with model optimization techniques (e.g., quantization, TensorRT, ONNX Runtime, distillation).
Knowledge of infrastructure-as-code tools (e.g., AWS CDK, Terraform).
Experience building or operating ML systems that handle large unstructured datasets (imagery, 3D data, sensor data).

ML Infrastructure Engineer
