jobsearch v0.0.1

tailored_resume_v2 / art_Wk_LQdqx3Dg

role: togetherai / Forward Deployed Engineer (Inference & Post-Training)
model: anthropic/claude-sonnet-4.6
created: 2026-06-08T19:11

What changed for togetherai

change | why it matters
Summary rewritten to lead with 'Forward Deployed Engineer' identity and RL Workbench as primary proof point | The JD's role title and first hard requirement both center on post-training pipeline expertise (GRPO/DPO/RLHF); leading with this directly mirrors the ideal candidate profile
Projects section moved before Experience | The RL Workbench and aeval projects are the strongest direct proof of inference/post-training expertise, more relevant than any single work role for this JD
RL Workbench reordered to lead the projects section | It is the single most relevant credential: 12 RL algorithms, multi-framework benchmarking (TRL/VeRL/OpenRLHF/NeMo RL), GPU Docker passthrough, Apple Silicon MPS + CUDA; it maps to nearly every hard requirement
Intuit lead bullet reframed around 50K TPS / sub-25ms TP99 / 675M engagements as 'production inference infrastructure' | The JD requires hitting throughput and latency targets in production; this is the strongest enterprise-scale proof point on the resume
Splunk lead bullet reframed around 10x performance improvement and 'winning critical POCs and benchmarks' | The JD explicitly calls out winning critical POCs and benchmarks as a core responsibility
Fintellect lead bullet reframed to emphasize multi-provider LLM orchestration and model landscape awareness | The JD requires broad knowledge of open-source models and judgment on model selection; multi-provider orchestration with fallback routing demonstrates this
BRAIN project bullets consolidated to include NeurIPS publication inline | Space optimization; the NeurIPS credential is important for Together AI's research-driven culture but doesn't need a standalone entry
Kaiser Permanente condensed to 1 bullet | Low relevance to an inference/post-training role; retained for career continuity and Redis/scale signal
Bank of America retained as 1 bullet | Completeness and Monte Carlo / quantitative analysis signal; minimal space cost
Streamio OpenClaw and MCP SDK bullets led the Streamio role | LLM orchestration and production AI deployment are most relevant to Together AI's customer-facing AI platform context

JD analysis (20 key phrases)

Key phrases: inference engine optimization; post-training pipelines; KV cache tuning; speculative decoding; tensor parallelism; quantization strategy; LoRA, SFT, DPO, RLHF, GRPO; throughput and latency targets; forward deployed engineer; production AI teams; open-source LLM deployment; fine-tuning pipelines; strategic customer alignment; time-to-value; product feedback loop; benchmarking; GPU passthrough; Apple Silicon; framework benchmarking; hands-on RL training runs

Hard requirements:

Preferred qualifications:

Per-role mapping (10 roles scored)
role | score | reframe angle | JD phrases that map
RL Workbench - Post-Training RL Platform | 5/5 | Lead project: direct proof of GRPO/DPO/RLHF pipeline expertise and multi-framework inference benchmarking | GRPO, DPO, RLHF, post-training pipelines, benchmarking, throughput and latency targets, GPU passthrough, Apple Silicon, hands-on RL training runs, framework benchmarking
aeval - AI Model Evaluation Platform | 4/5 | Model evaluation infrastructure: maps to model landscape awareness and production quality gates | open-source LLM deployment, production environments, benchmarking, throughput and latency targets
Intuit - Staff PM Developer Frameworks & Platform Infrastructure | 4/5 | Platform infrastructure at scale: throughput/latency optimization, developer tooling, and strategic onboarding | throughput and latency targets, time-to-value, production environments, strategic customer alignment, product feedback loop, opinionated onboarding
Streamio AI - Founder & CEO | 3/5 | Production AI deployment and multi-agent orchestration: demonstrates hands-on LLM integration in production | production AI teams, open-source LLM deployment, hands-on
Fintellect AI - Founder & CEO | 3/5 | Multi-provider LLM orchestration and production AI deployment | open-source LLM deployment, production AI teams, model landscape awareness
BRAIN - Protein Structure Prediction ML Platform | 4/5 | Deep ML research credentials: NeurIPS publication, transformer architectures, production ML serving | post-training pipelines, open-source LLM deployment, production environments, hands-on
Splunk - Senior PM Search Orchestration | 2/5 | Performance optimization and distributed systems: condense to 2 bullets | throughput and latency targets, production environments
Kaiser Permanente - SOA Technical PM | 1/5 | Enterprise infrastructure scale: condense to 1 bullet |
IBM - Software Engineer | 1/5 | Keep 1 bullet for completeness |
AutoEval - Automated Visual Evaluation for Robot Model Training | 3/5 | Automated model evaluation: maps to model quality and production deployment validation | open-source LLM deployment, production AI teams, benchmarking

Tailored summary

Forward Deployed Engineer and Technical Product Leader with 12+ years in production AI systems, from hand-coding BPTT in C++ (2004) to building a full RLHF/DPO/GRPO post-training workbench that benchmarks TRL, VeRL, OpenRLHF, and NeMo RL across Apple Silicon (MPS) and CUDA today. Hands-on expertise in post-training pipelines (PPO, GRPO, DPO, RLHF, SFT), inference optimization, and open-source LLM deployment at scale. Scaled production inference infrastructure to 675M+ engagements and 50K TPS with sub-25ms TP99 at Intuit; NeurIPS-published researcher in neural architectures.