We are hiring an LLM Ops Engineer to join our AI Research team, a highly technical group working on cutting-edge advancements in the AI industry. The team focuses on building scalable, production-grade LLM systems, fine-tuning strategies, evaluation frameworks, and next-generation deployment architectures.

This role requires hands-on experience operating LLMs beyond simple API integration. The ideal candidate understands the architectural, operational, and evaluation complexities that differentiate LLMOps from traditional MLOps.

Responsibilities:

Manage the end-to-end lifecycle of LLMs: registry, packaging, versioning, deployment, monitoring, and rollback.
Deploy and operate self-hosted / open-source LLMs (not limited to OpenAI API usage).
Design and manage scalable inference infrastructure, including GPU-aware deployments.
Implement CI/CD pipelines for LLM deployment and continuous evaluation.
Monitor system performance, including latency, throughput, token usage, cost, model and data drift, and hallucination rates.
Ensure secure, compliant, and resilient cloud-based model deployments.
Collaborate with research and engineering teams on model deployments.

Skills:

Strong hands-on experience with LLM handling, hosting, and operationalization.
Clear understanding of how LLMOps differs from traditional MLOps (prompt management, non-deterministic outputs, semantic evaluation, token economics, guardrails, etc.).
Experience with Kubernetes, Docker, and containerized deployments.
Cloud expertise (AWS / Azure / GCP) including compute, storage, IAM, networking, and monitoring.
Experience building scalable inference and model-serving architectures.
Familiarity with tools such as MLflow and Kubeflow (good to have).
Understanding of vector databases, RAG systems, and evaluation frameworks (preferred).
Knowledge of GenAI security considerations (prompt injection, data leakage prevention).

Qualifications:

Bachelor’s or master’s degree in Computer Science, Artificial Intelligence, Data Science, Engineering, or a related technical field.
DevOps certification (e.g., AWS DevOps Engineer, Azure DevOps, or equivalent).
3–5 years of experience in MLOps, LLMOps, ML Engineering, or related roles.
Demonstrated experience deploying ML/LLM systems in production environments.
