Fnz

AI Evaluation Engineer

Company: FNZ

Role: AI Evaluation Engineer

Location: India

Job type: Full-time

Found on Mokaru: Yesterday

Salary: Not disclosed by employer

Job description

Location: Gurugram, India

Seniority: Mid-level (3-6 years)

Purpose: Build and execute FNZ's AI evaluation framework and tooling, working under the guidance of the AI Evaluations Team Lead to assess quality, safety, robustness, and operational suitability of AI solutions before release.

Key Responsibilities: 

  • Build, maintain, and evolve core components of FNZ's AI evaluation framework, including test structures, scoring approaches, reusable evaluation patterns, and supporting documentation.

  • Develop and improve evaluation tooling, harnesses, automation, and CI/CD integrations used to run repeatable assessments across AI agents and workflows.

  • Execute evaluations across all six pillars of FNZ's framework: Task Performance, Safety & Compliance, Efficiency, Groundedness & Reasoning, Robustness, and Suitability.

  • Create and maintain test datasets, golden sets, rubrics, and scoring criteria that reflect expected agent behaviour and business requirements. 

  • Build and run test suites covering baseline behaviour, edge cases, failure modes, and adversarial scenarios for AI agents and workflows.

  • Assess multi-step agentic workflows, including planning, tool use, execution quality, recovery from errors, and adherence to controls.

  • Verify groundedness and output quality by checking responses against source content, expected reasoning patterns, and policy constraints. 

  • Document evaluation findings with clear evidence, communicate issues to AI solution teams, and support remediation and re-testing.

  • Develop deeper expertise in one or two evaluation domains while remaining effective as a generalist across the wider framework.

Skills and Experience: 

  • 3-6 years in software engineering, test engineering, AI/ML development, or data science.

  • Strong programming skills with hands-on experience building test automation, evaluation tooling, or developer productivity tooling. A Python or .NET background is required, with .NET preferred.

  • Practical experience evaluating or building LLM applications, RAG systems, or AI agents.

  • Understanding of prompt engineering, retrieval-augmented generation, agent architectures, and common failure modes in probabilistic systems.

  • Ability to design structured test approaches, rubrics, and repeatable evaluation workflows for complex AI behaviours.

  • Analytical mindset to decompose complex agent behaviours and identify weaknesses, edge cases, and opportunities to improve the framework.

  • Strong documentation and communication skills, with the ability to explain findings and propose practical improvements to evaluation methods and tooling.

About FNZ

FNZ is committed to opening up wealth so that everyone, everywhere can invest in their future on their terms. We know the foundation to do that already exists in the wealth management industry, but complexity holds firms back. 

We created wealth’s growth platform to help. We provide a global, end-to-end wealth management platform that integrates modern technology with business and investment operations. All in a regulated financial institution. 

We partner with the world’s leading financial institutions, with over US$2.4 trillion in assets on platform (AoP).

Together with our clients, we empower nearly 30 million people across all wealth segments to invest in their future.
