Our research team publishes peer-reviewed work on annotation methodology, data quality metrics, and human-AI collaboration frameworks.
Peer-reviewed papers from top venues including NeurIPS, ICML, ACL, and CVPR.
We introduce a novel consensus mechanism that dynamically weights annotator contributions based on demonstrated expertise and agreement patterns, achieving a 12% improvement in label accuracy.
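As a rough illustration of the idea (not the paper's exact mechanism), the sketch below alternates between a weighted majority vote and re-estimating each annotator's weight from their agreement with the current consensus. All names, the update rule, and the number of rounds are illustrative assumptions.

```python
from collections import defaultdict

def weighted_consensus(labels, n_rounds=5):
    """Iteratively estimate consensus labels and annotator weights.

    labels: dict mapping item_id -> {annotator_id: label}.
    Illustrative sketch only; not the mechanism described in the paper.
    """
    annotators = {a for votes in labels.values() for a in votes}
    weights = {a: 1.0 for a in annotators}  # start with uniform trust
    consensus = {}

    for _ in range(n_rounds):
        # Weighted majority vote per item using current annotator weights.
        for item, votes in labels.items():
            scores = defaultdict(float)
            for annotator, label in votes.items():
                scores[label] += weights[annotator]
            consensus[item] = max(scores, key=scores.get)

        # Re-estimate each weight as agreement with the current consensus.
        for annotator in annotators:
            judged = [i for i, votes in labels.items() if annotator in votes]
            if judged:
                agree = sum(labels[i][annotator] == consensus[i] for i in judged)
                weights[annotator] = agree / len(judged)

    return consensus, weights

votes = {
    "ex1": {"a1": "cat", "a2": "cat", "a3": "dog"},
    "ex2": {"a1": "dog", "a2": "cat", "a3": "dog"},
}
print(weighted_consensus(votes))
```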
This paper presents a multi-axis preference framework that captures nuanced human judgments across helpfulness, accuracy, safety, and style dimensions simultaneously.
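A minimal sketch of what a multi-axis preference record could look like; the field names and rating scales below are assumptions for illustration, not the paper's actual schema.

```python
from dataclasses import dataclass

@dataclass
class PreferenceRating:
    """One annotator's judgment of a model response along several axes.

    Field names and scales are hypothetical, chosen only to illustrate
    capturing multiple dimensions in a single record.
    """
    response_id: str
    helpfulness: int        # e.g. 1-5 Likert
    accuracy: int
    safety: int
    style: int
    preferred_over_alt: bool  # preferred over the paired alternative?

rating = PreferenceRating("resp_42", helpfulness=4, accuracy=5,
                          safety=5, style=3, preferred_over_alt=True)
print(rating)
```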
We identify the conditions under which carefully validated synthetic data delivers better downstream performance than purely human-generated datasets across 8 NLP benchmarks.
We propose an active learning strategy specifically designed for LiDAR annotation that intelligently selects the most informative frames for human labeling.
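For intuition only: one simple acquisition heuristic is to rank frames by the mean entropy of the current detector's confidence scores and send the most uncertain frames to annotators. The sketch below uses that stand-in heuristic; the paper's actual selection strategy is not reproduced here.

```python
import numpy as np

def select_frames(frame_confidences, budget):
    """Pick the frames whose detections the model is least sure about.

    frame_confidences: list of arrays, one per frame, holding the model's
    per-object confidence scores. Mean binary entropy is used here as a
    simple stand-in acquisition function.
    """
    def mean_entropy(conf):
        conf = np.clip(np.asarray(conf, dtype=float), 1e-6, 1 - 1e-6)
        return float(np.mean(-conf * np.log(conf) - (1 - conf) * np.log(1 - conf)))

    scores = [mean_entropy(c) for c in frame_confidences]
    return np.argsort(scores)[::-1][:budget]  # most uncertain frames first

frames = [np.array([0.9, 0.95]), np.array([0.55, 0.6, 0.48]), np.array([0.99])]
print(select_frames(frames, budget=1))  # frame 1 is the least certain
```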
Rather than treating annotator disagreement as noise, we show how modeling disagreement distributions improves model calibration and uncertainty estimation.
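As a concrete example of one way to use disagreement rather than discard it: convert per-item annotator vote counts into soft label distributions and train against them with a soft cross-entropy loss. The sketch below assumes simple additive smoothing and is illustrative, not the paper's method.

```python
import numpy as np

def soft_targets(vote_counts, smoothing=1.0):
    """Turn per-item annotator vote counts into soft label distributions.

    vote_counts: (n_items, n_classes) array of raw counts. Additive
    smoothing keeps rare classes from collapsing to zero probability.
    """
    counts = np.asarray(vote_counts, dtype=float) + smoothing
    return counts / counts.sum(axis=1, keepdims=True)

def soft_cross_entropy(model_probs, targets):
    """Cross-entropy of model predictions against soft targets."""
    eps = 1e-12
    return float(-np.mean(np.sum(targets * np.log(model_probs + eps), axis=1)))

votes = [[8, 2, 0], [3, 3, 4]]           # 10 annotators per item, 3 classes
targets = soft_targets(votes)            # disagreement preserved as probabilities
preds = np.array([[0.7, 0.2, 0.1], [0.3, 0.3, 0.4]])
print(targets)
print(soft_cross_entropy(preds, targets))
```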
We present a comprehensive framework for identifying demographic biases introduced during the annotation process and propose mitigation strategies that preserve data utility.
Developing better frameworks for task design, annotator training, quality measurement, and consensus mechanisms that scale to millions of labels.
Advancing preference learning, reward modeling, and human feedback collection methods for safer, more helpful AI systems.
Understanding when and how synthetic data can augment or replace human-generated training data while maintaining quality and diversity.
Detecting, measuring, and mitigating biases that arise during data collection and annotation processes across different demographic groups.
Designing optimal workflows where AI assists human annotators, and studying the effects of AI pre-labeling on human judgment and efficiency.
Developing new metrics and statistical methods for measuring data quality, annotator reliability, and label confidence at scale.
We believe in open science. Our tools and datasets are available to the research community.
Open-source library for computing annotation quality metrics including inter-annotator agreement, consensus scores, and label confidence estimation.
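For reference, the kind of metric such a library covers: Cohen's kappa, the chance-corrected agreement between two annotators. The snippet below is a self-contained illustration of the metric itself, not the library's API.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum((freq_a[c] / n) * (freq_b[c] / n)
                   for c in set(labels_a) | set(labels_b))

    if p_chance == 1.0:   # both annotators used a single identical label
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)

print(cohens_kappa(["pos", "neg", "pos", "pos"],
                   ["pos", "neg", "neg", "pos"]))  # 0.5
```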
A standardized benchmark for evaluating RLHF preference data quality, including human agreement baselines and automated quality predictors.
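To illustrate what a human agreement baseline can look like on preference data, the sketch below computes how often annotators match the per-item majority choice between two responses. It is an illustrative calculation, not the benchmark's actual protocol.

```python
from collections import Counter

def human_agreement_baseline(preference_votes):
    """Average rate at which annotators match the per-item majority preference.

    preference_votes: list of per-item vote lists, each entry "A" or "B"
    (which response the annotator preferred). The result is the kind of
    ceiling a benchmark can report alongside automated quality predictors.
    """
    rates = []
    for votes in preference_votes:
        majority = Counter(votes).most_common(1)[0][1]
        rates.append(majority / len(votes))
    return sum(rates) / len(rates)

print(human_agreement_baseline([["A", "A", "B"], ["B", "B", "B"], ["A", "B", "A"]]))
```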
Toolkit for validating synthetic training data quality. Includes decontamination checks, diversity scoring, and automated quality filtering.
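As a sketch of two of the checks named above, here is a minimal n-gram overlap decontamination filter and a distinct-n diversity score. The 8-gram window, thresholds, and function names are illustrative assumptions, not the toolkit's API.

```python
def _ngrams(text, n):
    toks = text.lower().split()
    return [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]

def is_contaminated(sample, benchmark_texts, n=8):
    """Flag a synthetic sample that shares any long n-gram with the eval set."""
    sample_grams = set(_ngrams(sample, n))
    return any(sample_grams & set(_ngrams(ref, n)) for ref in benchmark_texts)

def distinct_n(samples, n=2):
    """Distinct-n diversity: unique n-grams divided by total n-grams."""
    grams = [g for s in samples for g in _ngrams(s, n)]
    return len(set(grams)) / max(len(grams), 1)

benchmark = ["the quick brown fox jumps over the lazy dog near the river bank"]
sample = "a story: the quick brown fox jumps over the lazy dog near the river bank"
print(is_contaminated(sample, benchmark))            # True -> drop this sample
print(distinct_n(["red cat", "red cat", "blue dog"]))  # low score = low diversity
```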
Framework for detecting and mitigating demographic biases in annotation data. Includes bias metrics, visualization tools, and mitigation strategies.
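One simple bias indicator in the spirit of such metrics: the gap in how often each demographic group's content receives a given label. The sketch below is illustrative and does not reflect the framework's API.

```python
from collections import defaultdict

def label_rate_by_group(records, target_label):
    """Rate at which each demographic group receives `target_label`.

    records: iterable of (group, label) pairs. The gap between the highest
    and lowest group rates is one simple bias indicator; real analyses
    would also account for sample size and confounders.
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for group, label in records:
        totals[group] += 1
        hits[group] += (label == target_label)
    rates = {g: hits[g] / totals[g] for g in totals}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap

data = [("A", "toxic"), ("A", "ok"), ("B", "toxic"), ("B", "toxic"), ("B", "ok")]
print(label_rate_by_group(data, "toxic"))  # per-group rates and the gap
```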
PhD researchers from top institutions working on the hardest problems in data quality and human-AI collaboration.
We partner with academic institutions and research labs. Reach out to discuss collaboration opportunities.
Contact research team