Thomson Reuters

Staff Software Engineer — Search Platform, API & Infrastructure

Job type

Full-time

Posted

4 days ago

Salary

$136k - $253k/YEAR

Job description

This posting is for proactive recruitment purposes and may be used to fill current openings or future vacancies within our organization.

Overview of the Role

Advanced Content Engineering (ACE) is seeking a Staff Software Engineer to lead the design and delivery of the search platform’s control-plane API and cloud infrastructure. The platform’s core promise is self-service: internal client teams must be able to create a search system, configure an ingestion topology, promote a new index to production, and monitor system health entirely through APIs, without requiring direct involvement from the platform team. Building, operating, and continuously improving that self-service experience is the heart of this role.

This is a high-ownership, high-leverage position at the intersection of platform engineering, API design, and cloud infrastructure. Staff Engineers on this team define, build, test, deploy, scale, and operate what they ship; full-stack ownership is the baseline, not a bonus. Delivery friction is treated as an urgent engineering problem: the team ships to production constantly, AI-assisted development is the norm, and removing obstacles to fast, safe delivery is everyone’s responsibility.

The successful candidate brings enterprise-grade security instincts, deep AWS expertise, and a product-minded approach to developer experience, treating the platform’s API as a product in its own right.

About the Role

In this position, you will focus on:

Platform Control-Plane API

  • Plan, design, develop, and own the platform’s management API — the self-service interface through which client teams create and configure search systems, manage ingestion topologies, register reusable components, promote index versions, and monitor system health — resolving problems of diverse scope with innovative thinking and little or no precedent to guide solutions
  • Architect the platform’s multi-tenant access model: implement strict data isolation between client tenants, integrate with enterprise identity providers, establish role-based access control across all API endpoints, and define the governance framework that ensures the platform can make credible security commitments to enterprise customers
  • Establish API strategy and cross-system integration patterns — designing versioned, backward-compatible interfaces with clear contracts, comprehensive documentation, and developer-experience patterns drawn from best-in-class search platform providers — and set governance standards that the team follows for all future API surface
  • Design and expose the API surface required to support the platform’s evaluation and experimentation workflows — including endpoints that enable the search grading tool to consume experiment run outputs, query/result pairs, and relevance judgments, and that allow client teams to configure and trigger A/B search experiments through self-service interfaces
  • Design the configuration data model and persistence layer (DynamoDB and related services) that stores search system definitions, component registry entries, index lifecycle state, and audit logs — applying architectural patterns that scale to the platform’s multi-tenant and multi-region ambitions
  • Break down complex business requirements into functional and technical requirements with consideration for security, ethical AI implementation, and operational efficiency; contribute to recommendations where technology transformation can spark business growth

Cloud Infrastructure & DevOps

  • Own the platform’s AWS infrastructure as code — defining, provisioning, and maintaining ECS services, MSK clusters, OpenSearch/Vespa deployments, DynamoDB tables, networking (VPC, security groups, NAT), and IAM roles using Terraform or AWS CDK — establishing infrastructure governance standards and a cloud strategy for multi-environment and eventual multi-region operation
  • Design and own the CI/CD pipeline for platform services — establishing DevOps culture and toolchain strategy for the team, with a clear mandate to eliminate delivery friction: the team ships to production constantly, and any obstacle to doing so safely is an engineering problem to be solved, not a process to be accepted
  • Drive adoption of AI-assisted development practices across the team’s infrastructure and API work — establishing the tooling, patterns, and norms that enable engineers to leverage AI to move faster while maintaining the quality and reliability bar the platform demands
  • Own infrastructure cost management: monitor AWS spend across platform components, evaluate architectural trade-offs at the system level, and implement an enterprise performance and optimization framework that keeps the platform’s economics sustainable as it scales — including compute cost governance for inference workloads as custom model serving is introduced
  • Implement and operate customer-managed key (CMK) support, applying security strategy, risk assessment frameworks, and security governance to give enterprise clients control over their encryption keys while preserving multi-tenant reliability

Reliability Engineering

  • Define and own platform-level SLOs covering API availability, query latency, ingestion throughput, and end-to-end document freshness — and build the monitoring infrastructure (CloudWatch, distributed tracing, alerting) that makes SLO compliance continuously visible to the team and to client teams
  • Design the observability infrastructure for agentic retrieval paths — where standard request/response logging is insufficient: implement trace-level instrumentation that captures tool invocation sequences, per-hop latency, and retrieval inputs, enabling reliable diagnosis of failures and quality regressions in non-deterministic agent workflows
  • Take full operational responsibility for platform API and infrastructure — you built it, you own it, you run it: triage and resolve incidents, write thorough post-mortems, and drive systematic improvements that prevent recurrence
  • Design enterprise performance strategy for the platform’s API layer: load testing, capacity planning, performance profiling, and system-level optimization — ensuring the platform can handle planned growth in tenants, content volumes, and query traffic
  • Embed security architecture throughout the platform’s infrastructure: least-privilege IAM, secrets management, encryption at rest and in transit, audit logging, and compliance implementation aligned with TR’s enterprise security requirements

Technical Leadership

  • Establish architectural principles and cross-system design patterns for the platform’s control plane and infrastructure — functioning as the technical authority that other engineers and teams turn to for API and infrastructure guidance
  • Lead significant projects and business initiatives that span multiple engineers and interact with partner teams; determine work priorities and make adjustments to short-term priorities while maintaining strategic focus; provide specialist advice to senior management on complex infrastructure and security issues
  • Mentor and develop Senior and mid-level engineers — providing coaching, technical direction, and educational opportunities in cloud infrastructure, platform API design, reliability engineering, and AI-assisted development practices
  • Engage with client teams as a technical partner — understanding their integration experience and pain points, feeding structured requirements back into the platform API roadmap, and proactively reducing time-to-value for new platform adopters
  • Deliver effective presentations on complex infrastructure and security concepts to technical and non-technical stakeholders; champion ethical AI practices and responsible technology deployment across the team’s work

About You

You’re an ideal fit if you have:

Required Experience

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
  • 8+ years of software engineering experience, with demonstrated progression to staff-level or equivalent technical leadership — including ownership of a functional area and leadership of significant cross-functional projects
  • Deep expertise in cloud-native platform and infrastructure engineering on AWS: VPC architecture, IAM, ECS, Lambda, DynamoDB, MSK, and related managed services — with hands-on infrastructure-as-code experience (Terraform and/or AWS CDK) and the ability to establish infrastructure governance frameworks
  • Production experience with OpenSearch, Vespa, or Elasticsearch at an operational level — cluster sizing, backup and restore, index lifecycle management, and multi-tenant access controls
  • Mastery of Python with strategic awareness of language selection and migration; strong software engineering fundamentals including testing architecture, security architecture, and system design
  • Demonstrated enterprise security practice: security strategy, risk assessment frameworks, least-privilege IAM, secrets management, encryption at rest and in transit, and compliance implementation in production cloud environments
  • Track record of establishing API governance frameworks, cross-system integration patterns, and documentation standards; experience designing multi-tenant SaaS-style platform APIs with versioning, access control, and first-class developer experience
  • Demonstrated reliability engineering ownership: SLO definition, observability implementation, on-call leadership, and a track record of improving platform reliability through data-driven retrospectives — with a clear philosophy that shipping frequently and operating reliably are complementary, not in tension
  • Comfort and fluency with AI-assisted development tools; you use them to move faster and produce higher-quality infrastructure and API code, and you actively help the team do the same

Preferred Experience

  • Experience operating Kafka (MSK) or other distributed messaging infrastructure in production, including partition management, consumer group monitoring, and schema registry governance
  • Background in Kubernetes or ECS container orchestration, including service mesh, autoscaling, and health check patterns
  • Experience building developer-facing internal platforms where API quality and documentation are treated as first-class product concerns
  • Knowledge of enterprise encryption patterns, including customer-managed keys (AWS KMS) and their architectural implications for multi-tenant systems
  • Familiarity with distributed tracing infrastructure for non-deterministic or agentic workflows — where trace design must capture tool call sequences and per-hop context, not just request/response pairs
  • Familiarity with AI service architecture: evaluating AI vendors, performing cost-benefit analysis, and integrating AI API services, with fallback strategies, into production platform infrastructure

What Success Looks Like

In the first 90 days

  • Build a thorough understanding of the platform’s current infrastructure, API surface, and operational posture — including known gaps in reliability, security, and developer experience
  • Establish relationships with key client teams to understand their integration experience and pain points with the current platform
  • Take on-call ownership for your functional area, then identify and begin delivering the highest-leverage near-term improvements to platform API or infrastructure reliability

In the first year

  • Deliver a materially improved self-service platform API — with strong multi-tenant isolation, documented governance standards, and measurably better developer experience for client teams
  • Establish end-to-end SLO coverage across platform services, with automated alerting, clear on-call runbooks, documented architectural decision records, and a track record of fast, high-quality incident resolution
  • Own and deliver a major infrastructure initiative — CMK support, multi-environment maturity, agentic observability infrastructure, or a comparable project — from architectural design through production, establishing the principles and patterns that guide the platform’s infrastructure evolution
  • Become the recognized technical authority for platform API and infrastructure: shaping team standards, influencing platform architecture, and providing specialist guidance to leadership on complex infrastructure and security challenges
