- Corpus Pairs Dataset
  Corpus pairs dataset for LABDet, a robust and language-agnostic bias probing method to quantify intrinsic bias in monolingual PLMs.
- Minimal Pairs Dataset
  Minimal pairs dataset for LABDet, a robust and language-agnostic bias probing method to quantify intrinsic bias in monolingual PLMs.
- Sentiment Training Dataset
  Sentiment training dataset for LABDet, a robust and language-agnostic bias probing method to quantify intrinsic bias in monolingual PLMs.
- GRASP: A Disagreement Analysis Framework to Assess Group Associations in Pers...
  Human annotation plays a core role in machine learning: annotations for supervised models, safety guardrails for generative models, and human feedback for reinforcement...
- ChatGPT: A conversational AI model
  The dataset used in the paper ChatGPT: A conversational AI model.
- Latent Distance Guided Alignment Training for Large Language Models
  Ensuring alignment with human preferences is a crucial characteristic of large language models (LLMs). Presently, the primary alignment methods, RLHF and DPO, require extensive...
- ParSEL: Parameterized Shape Editing with Language
  ParSEL: Parameterized Shape Editing with Language, a system that enables controllable editing of 3D assets with natural language.
- Temporal Sentence Grounding in Videos
  Temporal sentence grounding in videos (TSGV) is a task to retrieve a video segment that semantically corresponds to a query in natural language.
- GTA: A Benchmark for General Tool Agents
  GTA is a benchmark for General Tool Agents, featuring three main aspects: real user queries, real deployed tools, and real multimodal inputs.
- RateMyProfessor Dataset
  RateMyProfessor dataset, a dataset of student-written reviews for professors.
- Bias in Bios Dataset
  Bias in Bios dataset, a personal biography dataset with information extracted from Wikipedia.
- Language Agency Classification (LAC) Dataset
  Language Agency Classification (LAC) dataset for training accurate language agency classifiers.
- Reference Letter Dataset
  Reference letter dataset generated under the Context-Based Generation (CBG) setting.