Entry Level AI Jobs in India
Accepting applications
Rex.zone · United States
Full-Time · Mid-Senior · AI
Posted
2 May
Category
Test
Experience
Mid-Senior
Country
United States
About The Role
Support human-in-the-loop AI training workflows for large language models and multimodal systems. You will improve training data quality by completing data labeling, RLHF preference ranking, prompt evaluation, and QA evaluation tasks using structured annotation tools and written guidelines.
Key Responsibilities
Create and review labeled datasets for NLP, computer vision, and multimodal use cases
Perform RLHF preference ranking and rubric-based scoring
Run prompt and response evaluations for helpfulness, factuality, and safety
Execute content safety labeling (policy, harassment, self-harm, sensitive categories)
Follow annotation guidelines and document rationales for edge cases
Conduct QA evaluation via sampling plans, audits, and error taxonomy tracking
Participate in calibration sessions to reduce rater variance and improve agreement
Track throughput and quality metrics that impact LLM training pipelines
Required Qualifications
Strong written English and structured reasoning
Ability to follow detailed annotation guidelines with consistent judgment
Comfort using spreadsheets, web tools, or labeling interfaces
Familiarity with model evaluation and prompt/response patterns
Attention to detail and an evidence-based approach to QA evaluation
Comfort handling sensitive content as part of content safety labeling
Work Details
Remote, full-time work delivered through online tooling and written annotation guidelines.
Quality is measured via audits, disagreement rates, rubric adherence, and training data quality outcomes.
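The audits mentioned above typically start from a sampling plan: a reproducible random subset of completed tasks is drawn for QA review. A minimal sketch, assuming a flat list of task IDs and a fixed sampling rate (both hypothetical; the posting does not describe the actual audit process):

```python
import random

def audit_sample(task_ids, sample_rate=0.1, seed=42):
    """Draw a reproducible random audit sample from completed tasks.
    A fixed seed lets auditors re-derive the same sample later."""
    rng = random.Random(seed)
    k = max(1, round(len(task_ids) * sample_rate))
    return sorted(rng.sample(task_ids, k))

tasks = list(range(1, 101))      # 100 completed annotation tasks
selected = audit_sample(tasks)   # 10 task IDs chosen for QA review
print(len(selected))
```

Errors found in the sampled items would then be categorized against an error taxonomy to compute disagreement and defect rates.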
Compensation
Competitive hourly rate: $30–$50/hr (USD).