Data Scientist (LLMs/Quality/Evals) — Experiments & Tools

$70/hr

Published 2 days ago

Remote
Freelance

Plan and conduct experiments with LLMs, create evaluation datasets, automate Evals/RLHF/SFT pipelines, and generate adversarial tests to measure and improve model performance.

What you'll do

• Define metrics, hypotheses, and experimental design (A/B tests, holdouts, basic power analysis).
• Build pipelines for Evals and for dataset generation and curation.
• Develop tools for prompting, rationales, and rubrics.
• Instrument metrics collection and dashboards; version data and results.
• Collaborate with domain experts and the QA team to close the quality loop.
• Write technical documentation and executive reports.

Requirements

• Advanced degree (MSc/PhD/MBA in Data, Statistics, Computer Science, or a related field).
• Strong Python and SQL; fundamentals of applied statistics and causal inference.
• Practical experience with LLMs (API usage, evaluation, prompt design).

Tools/Platforms (desired)

• Pandas, NumPy, scikit-learn; PyTorch or TensorFlow.
• LangChain/LlamaIndex; Weights & Biases or MLflow.
• Orchestration (Airflow/Prefect), data versioning (DVC/Git).
• Spark/Databricks/BigQuery; Docker; GCP/AWS.

Nice to have

• Publications, repos with reproducible experiments, open-source contributions.

Selection process

Application → short analytical challenge → interview (30–45 min) → onboarding + NDA.