Adyen
Senior AI Research Scientist
Job description
This is Adyen
Adyen provides payments, data, and financial products in a single solution for customers like Meta, Uber, H&M, and Microsoft - making us the financial technology platform of choice. At Adyen, everything we do is engineered for ambition.
For our teams, we create an environment with opportunities for our people to succeed, backed by the culture and support to ensure they are enabled to truly own their careers. We are motivated individuals who tackle unique technical challenges at scale and solve them as a team. Together, we deliver innovative and ethical solutions that help businesses achieve their ambitions faster.
AI Research
Adyen is building a world-class AI team to redefine what intelligent systems can do in financial technology. As a Senior AI Research Scientist, you will take on some of the most technically demanding work in applied AI: designing agents that reason over complex, multi-step tasks; building the evaluation infrastructure that makes those systems trustworthy in production; and shaping how humans and AI collaborate at scale within a global payments company.
This is not a narrow research role. You will take full ownership of your work, from early research through deployed production systems, influence the team's technical direction, and act as a force multiplier for the broader AI organization. That includes contributing to custom model development for structured financial data and working toward our longer-term ambition of defining how humans and AI collaborate at scale across the company.
What You'll Do
- Design and Deploy AI Agents for Complex Tasks: Lead the research, design, and deployment of AI agents built for long-horizon, multi-step tasks in real-world financial contexts — including data analysis pipelines, operational workflows, and integrity risk scenarios. Architect robust agentic systems covering multi-agent orchestration, tool dispatch, context and memory management, and error recovery for long-running workflows. Design human-in-the-loop mechanisms that define when agents act autonomously, when they surface uncertainty, and when they escalate or defer to humans.
- Own Evaluation and Benchmarking: Define and lead the evaluation strategy for the agentic systems and LLMs your team builds and deploys. Design internal benchmarks grounded in real domain complexity — probing for genuine capabilities, edge cases, and failure modes that standard metrics miss. Build reusable evaluation infrastructure that is embedded in the development process, not bolted on after the fact.
- Provide AI Expertise Across the Organization: Serve as a technical resource for AI initiatives across Adyen — evaluating agentic frameworks, retrieval and search strategies, or agent tool-use approaches across partner teams. Surface connections across initiatives and help teams avoid duplicating work or converging on the wrong approach.
- Raise the Bar: Set engineering standards for the team and company. Provide mentorship through problem decomposition, research methodology, and code review. Champion reproducibility, documentation, and rigorous evaluation practices across the AI organization.
Who You Are
- You have 6+ years of hands-on experience in applied AI/ML research or engineering, with a clear track record of shipping AI systems, including agentic or LLM-powered systems, in production environments.
- You have deep expertise in language models and Generative AI, with hands-on depth across several of: architecture, post-training (fine-tuning, RLHF), inference optimization, context engineering, and failure modes at scale.
- You have proven experience designing and operating agentic systems at scale, including multi-agent orchestration, tool use, memory and context management, state handling for long-running workflows, and human-in-the-loop design. You understand what separates production agents from research prototypes.
- You are rigorous and systematic about evaluation. You have designed evaluation frameworks or internal benchmarks that go beyond standard metrics. You understand the failure modes of LLM-as-judge approaches and know how to measure what actually matters for a given system and use case.
- You have a strong foundation in classical machine learning: supervised learning, ensemble methods, optimization, probabilistic modeling, and statistics. You reach for the appropriate tool for the problem.
- You write clean, well-structured, production-ready code, primarily Python, and you hold research code to an engineering standard.
- You have hands-on experience with at least one production-grade agentic framework.
Nice to Have (tell us about them!)
- Experience with tabular deep learning architectures.
- Familiarity with financial data, payments, fraud detection, or risk systems.
- Track record of external visibility: publications, conference presentations, or open-source contributions.
- Experience with observability and evaluation tooling.
- Familiarity with MLOps and model deployment pipelines in large-scale environments.
Our Diversity, Equity and Inclusion commitments
Our unique approach is a product of our diverse perspectives. This diversity of backgrounds and cultures is essential in helping us maintain our momentum. Our business and technical challenges are unique, and we need as many different voices as possible to join us in solving them - voices like yours. No matter who you are or where you’re from, we welcome you to be your true self at Adyen.
Studies show that women and members of underrepresented communities apply for jobs only if they meet 100% of the qualifications. Does this sound like you? If so, Adyen encourages you to reconsider and apply. We look forward to your application!
What’s next?
Ensuring a smooth and enjoyable candidate experience is critical for us. We aim to get back to you regarding your application within 5 business days. Our interview process tends to take about 4 weeks to complete, but may fluctuate depending on the role. Learn more about our hiring process here. Don’t be afraid to let us know if you need more flexibility.
This role is based out of our Madrid office. We are an office-first company and value in-person collaboration; we do not offer remote-only roles. We'll cover your relocation if you want to live in our wonderful, sunny, and bright city.


