Anthropic-HH-RLHF Dataset

The dataset used in the paper is the Anthropic-HH-RLHF dataset, which is used for reinforcement learning from human feedback.

BibTeX:

@dataset{Anthropic_2024,
  abstract = {The dataset used in the paper is the Anthropic-HH-RLHF dataset, which is used for reinforcement learning from human feedback.},
  author = {Anthropic},
  doi = {10.57702/vwhenrmb},
  institution = {No Organization},
  keyword = {'Human Feedback', 'RLHF', 'Reinforcement Learning'},
  month = {dec},
  publisher = {TIB},
  title = {Anthropic-HH-RLHF Dataset},
  url = {https://service.tib.eu/ldmservice/dataset/anthropic-hh-rlhf-dataset},
  year = {2024}
}