Open Role: Remote AI/ML Data Annotation & RLHF Specialist (Brazil)
Title: Remote AI/ML Data Annotation & RLHF Specialist (Brazil)
Date: 25-02-2026
Company: Rexzone
Country: US
Remote Type: Remote
Employment Type: FULL_TIME
Experience Level: Mid-Senior
Industry: Technology
Job Function: Engineering
Skills: RLHF, data labeling, prompt evaluation, QA evaluation, LLM evaluation, named entity recognition, content safety labeling, annotation guidelines, training data quality
Salary Currency: USD
Salary Min: 63360
Salary Max: 126720
Pay Period: YEAR

You will annotate and evaluate AI training data used to improve large language model behavior, instruction following, and safety. This role blends data labeling, RLHF-style preference ranking, and model response evaluation to drive model performance improvements across real product workflows.

Responsibilities:
- Perform data labeling for LLM training datasets, including intent labeling, classification, and structured extraction
- Execute RLHF tasks such as preference ranking, rubric-based grading, and comparative evaluations
- Conduct prompt and response evaluation for helpfulness, honesty, and harmlessness
- Apply named entity recognition and entity linking guidelines for high-precision text annotation
- Perform QA evaluation, audit sampled work, and resolve disagreements through calibration sessions
- Follow annotation-guideline compliance standards and provide feedback to improve specs and edge-case handling
- Support content safety labeling across policy categories (self-harm, hate, harassment, sexual content, illicit behavior)
- Contribute to continuous improvement of training data quality, including error taxonomies and root-cause notes

Required Qualifications:
- 3+ years in data annotation, LLM evaluation, QA, trust & safety, or related AI data operations
- Strong English reading comprehension and the ability to apply detailed rubrics consistently
- Experience with training data quality processes: inter-annotator agreement, sampling plans, and audit workflows
- Comfort working with ambiguity, documenting decisions, and escalating guideline gaps

Preferred Qualifications:
- Experience with RLHF, prompt-evaluation workflows, or model evaluation frameworks
- Familiarity with NLP tasks (NER, sentiment, intent, summarization) and dataset formatting
- Exposure to computer vision annotation (bounding boxes, polygons) is a plus but not required

What You’ll Work On:
- Large language model evaluation across conversation, reasoning, and tool-use scenarios
- Policy-aligned content safety labeling and safety evaluation sets
- Training data operations that enable reliable model performance improvement

How to Apply:
- Apply through Rex.zone with a brief summary of relevant annotation, RLHF, and QA evaluation experience
- Be prepared for a paid qualification task focused on annotation-guideline compliance and edge cases
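For candidates unfamiliar with the inter-annotator agreement metrics mentioned under Required Qualifications, a common one is Cohen's kappa, which measures agreement between two annotators corrected for chance. The sketch below is purely illustrative (the function and labels are hypothetical, not part of any Rexzone tooling):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' labels on the same items.

    Assumes equal-length label lists and less-than-perfect expected
    agreement (the denominator is zero when chance agreement is 1).
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labeled the same.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Identical labels give kappa = 1.0; chance-level agreement gives 0.0.
print(cohens_kappa(["intent", "safety", "intent"], ["intent", "safety", "intent"]))
```

In practice, audit workflows like those described above flag batches whose kappa falls below an agreed threshold for recalibration.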



