Open Role: Remote AI/ML Data Annotation & RLHF Specialist (Brazil)
Title: Remote AI/ML Data Annotation & RLHF Specialist (Brazil)
Date: 25-02-2026
Company: Rexzone
Country: US
Remote Type: Remote
Employment Type: FULL_TIME
Experience Level: Mid-Senior
Industry: Technology
Job Function: Engineering
Skills: RLHF, data labeling, prompt evaluation, QA evaluation, LLM evaluation, named entity recognition, content safety labeling, annotation guidelines, training data quality
Salary Currency: USD
Salary Min: 63360
Salary Max: 126720
Pay Period: YEAR

You will annotate and evaluate AI training data used to improve large language model behavior, instruction following, and safety. This role blends data labeling, RLHF-style preference ranking, and model response evaluation to drive model performance improvements across real product workflows.

Responsibilities:
• Perform data labeling for LLM training datasets, including intent labeling, classification, and structured extraction
• Execute RLHF tasks such as preference ranking, rubric-based grading, and comparative evaluations
• Conduct prompt and response evaluation for helpfulness, honesty, and harmlessness
• Apply named entity recognition and entity linking guidelines for high-precision text annotation
• Complete QA evaluation, audit sampled work, and resolve disagreements through calibration sessions
• Follow annotation guideline compliance standards and provide feedback to improve specs and edge-case handling
• Support content safety labeling across policy categories (self-harm, hate, harassment, sexual content, illicit behavior)
• Contribute to continuous improvement of training data quality, including error taxonomies and root-cause notes

Required Qualifications:
• 3+ years in data annotation, LLM evaluation, QA, trust & safety, or related AI data operations
• Strong English reading comprehension and the ability to apply detailed rubrics consistently
• Experience with training data quality processes: inter-annotator agreement, sampling plans, and audit workflows
• Comfortable working with ambiguity, documenting decisions, and escalating guideline gaps

Preferred Qualifications:
• Experience with RLHF, prompt engineering evaluation, or model evaluation frameworks
• Familiarity with NLP tasks (NER, sentiment, intent, summarization) and dataset formatting
• Exposure to computer vision annotation (bounding boxes, polygons) is a plus but not required

What You’ll Work On:
• Large language model evaluation across conversation, reasoning, and tool-use scenarios
• Policy-aligned content safety labeling and safety evaluation sets
• Training data operations that enable reliable model performance improvements

How to Apply:
• Apply through Rex.zone with a brief summary of relevant annotation, RLHF, and QA evaluation experience
• Be prepared for a paid qualification task focused on annotation guideline compliance and edge cases



