Human-computer interaction - Groups

Lunar Lander

The dataset used in this paper is a collection of data points from a lunar lander. It is used to test the proposed APG algorithm for task switching.
- Dataset
- JSON

Training a helpful and harmless assistant with reinforcement learning from hu...

The authors propose a novel approach that incorporates parameter-efficient tuning to better optimize control tokens, thus benefitting controllable generation.
- Dataset
- JSON
