Senior Python Developer — Data Engineering & ML/AI Pipelines

Company

Secuvy

Role

Senior Python Developer — Data Engineering & ML/AI Pipelines

Location

California, US

Job type

Full-time

Found on Mokaru

2 months ago

Salary

Not disclosed by employer

Benefits

🏥 Health Insurance · 🦷 Dental Coverage

Job description

SecuvyAI · Remote (US) · Full-Time

> ⚠️ No recruiting agencies. Direct applicants only. Open to US Citizens and Green Card holders only. No visa sponsorship available.

About SecuvyAI

SecuvyAI is a cutting-edge Data Privacy and Security Intelligence platform trusted by enterprises to discover, classify, and govern sensitive data at scale. We combine AI-driven automation with deep compliance expertise to help organizations stay ahead of privacy regulations — GDPR, CCPA, HIPAA, and beyond. Our platform ingests and classifies petabytes of structured and unstructured data across cloud, on-prem, and hybrid environments, powered by a sophisticated ML/AI engine at its core.

The Opportunity

We're looking for a highly experienced Senior Python Developer with deep roots in data engineering and a strong track record building and operating ML/AI pipelines in production. You'll sit at the intersection of data infrastructure and applied machine learning — designing the systems that power SecuvyAI's intelligent data classification, PII detection, and privacy risk scoring capabilities.

This is a high-impact, hands-on engineering role. You'll work closely with data scientists, platform engineers, and product teams to take models from experimentation to production at scale — and keep them running reliably.

What You'll Do

Data Engineering

  • Design, build, and maintain large-scale data ingestion, transformation, and processing pipelines using Python-native and distributed frameworks
  • Architect reliable, scalable ETL/ELT workflows that handle structured, semi-structured, and unstructured data across cloud and on-prem sources
  • Optimize pipeline performance for throughput, latency, and cost at petabyte scale
  • Build and maintain data quality frameworks — validation, lineage tracking, anomaly detection, and alerting
  • Partner with platform engineers to ensure pipelines are observable, testable, and production-grade

ML/AI Pipeline Development

  • Build and operationalize end-to-end ML pipelines — data ingestion, feature engineering, model training, evaluation, deployment, and monitoring
  • Develop and maintain feature stores, training data pipelines, and model serving infrastructure
  • Collaborate with data scientists to productionize models for PII classification, sensitive data detection, entity recognition, and risk scoring
  • Implement MLOps best practices — experiment tracking (MLflow/W&B), model versioning, A/B testing, and automated retraining pipelines
  • Integrate LLM-based and NLP-based components into the SecuvyAI data intelligence engine
  • Monitor deployed models for drift, degradation, and data quality issues in production

Collaboration & Code Quality

  • Write clean, well-tested, production-grade Python with a focus on maintainability and performance
  • Participate actively in code reviews and contribute to engineering standards for data and ML code
  • Work cross-functionally with Data Science, Platform Engineering, and Product to align on data contracts and pipeline SLAs
  • Contribute to technical documentation, runbooks, and internal knowledge sharing

Required Qualifications

  • 8–10 years of professional software engineering experience with Python as your primary language
  • Deep, hands-on data engineering expertise (ETL/ELT, pipeline orchestration, data modeling, and distributed data processing at scale); this is a must
  • Proven experience building and maintaining ML/AI pipelines in production — not just experimentation, but reliable, monitored, production deployments
  • Strong experience with Apache Spark (PySpark) for large-scale batch and streaming data processing
  • Hands-on experience with workflow orchestration tools — Apache Airflow, Prefect, or Dagster
  • Solid understanding of stream processing (Kafka, Kinesis, or Flink) and real-time data architectures
  • Experience with ML frameworks and tooling: scikit-learn, PyTorch, TensorFlow, Hugging Face Transformers, or equivalent
  • Familiarity with MLOps platforms and practices — MLflow, Weights & Biases, Kubeflow, or SageMaker Pipelines
  • Proficiency with cloud data platforms: AWS (Glue, EMR, S3, SageMaker), GCP (Dataflow, BigQuery, Vertex AI), or Azure (ADF, Synapse, Azure ML)
  • Strong command of SQL and experience with both relational (PostgreSQL, MySQL) and analytical (Redshift, BigQuery, Snowflake) databases
  • Experience with data quality and observability tooling (Great Expectations, Monte Carlo, or similar)
  • Comfortable working in a fully remote, async-first engineering environment
  • US Citizen or Permanent Resident (Green Card holder) required — no visa sponsorship

Preferred Qualifications

  • Experience in data privacy, security, or compliance domains — PII detection, data classification, or sensitive data governance
  • Hands-on experience with NLP pipelines — named entity recognition (NER), text classification, or document understanding at scale
  • Experience working with LLMs in production — prompt engineering, fine-tuning, RAG architectures, or LLM-integrated data pipelines
  • Familiarity with feature stores (Feast, Tecton, or Hopsworks)
  • Experience with dbt for data transformation and data modeling in analytical pipelines
  • Knowledge of data catalog or metadata management tools (Apache Atlas, DataHub, Collibra, or similar)
  • Prior experience mentoring junior data engineers or working with distributed remote teams
  • Contributions to open-source data or ML projects, or public technical writing

Compensation & Benefits

  • Base salary: $150,000 – $175,000 depending on experience
  • Meaningful equity in a high-growth AI startup
  • Fully remote with flexible hours — async-first culture
  • Comprehensive health, dental, and vision insurance

No recruiting agencies. We will not respond to agency outreach.
