Background & References
Interactive machine learning [IML; 7] studies algorithms that learn from feedback on their decisions, collected through interaction with a computational agent or human in a shared environment. In contrast to the common paradigm of supervised learning, IML does not assume access to pre-collected labeled data, thereby reducing data costs. Instead, it allows systems to improve over time and empowers non-expert users to shape system behavior by providing feedback. IML has seen wide success in areas such as video games [24] and recommendation systems [13].
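To make the paradigm concrete, the following is a minimal sketch of such an interaction loop in Python; it is an assumed setup, not a specific system from the literature, and the data stream and feedback callback are hypothetical placeholders. The model is updated online from feedback on its own decisions rather than trained once on a pre-collected labeled dataset.

```python
# A minimal sketch of the IML loop: predict, collect feedback on the
# decision, update immediately. All names here are hypothetical.
import numpy as np
from sklearn.linear_model import SGDClassifier

classes = np.array([0, 1])
model = SGDClassifier(loss="log_loss")  # supports incremental updates

def interaction_loop(stream, get_user_feedback, n_rounds=1000):
    """stream yields feature vectors; get_user_feedback is a hypothetical
    callback standing in for the human or agent in the loop."""
    for t, x in enumerate(stream):
        if t >= n_rounds:
            break
        x = np.asarray(x).reshape(1, -1)
        # Show the current decision to the user (default before any update).
        y_pred = model.predict(x)[0] if t > 0 else classes[0]
        # The user confirms or corrects the decision...
        y_true = get_user_feedback(x, y_pred)
        # ...and the model improves immediately from that single interaction.
        model.partial_fit(x, [y_true], classes=classes)
```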
Although most downstream applications of NLP involve interaction with humans (e.g., via labels, demonstrations, corrections, or evaluation), common NLP models are not built to learn from or adapt to users through interaction. A large research gap remains to be closed before NLP systems can adapt on the fly, through interaction, to the changing needs of humans and dynamic environments. While still understudied, there is growing interest in NLP models that learn through interaction, especially in light of recent developments in large language models [8]. Some systems learn from computational-agent or human feedback in the form of low-level labels, using techniques such as active learning [12], imitation learning [3, 23], pairwise feedback [23, 5], preference learning [25], contextual bandits [10, 9, 26], and reinforcement learning [17, 27, 22]. The most prominent recent approach is reinforcement learning from human feedback (RLHF) [21, 1, 2, 20, 17, 30, 27, 31], which has led to the development of ChatGPT [20]. In contrast to such low-level feedback, natural language feedback systems aim to leverage the full expressivity of natural language to handle nuanced and rich feedback [14]. Types of natural language feedback studied include explanations [16], advice [15, 18, 28], instructions [29, 11], descriptions [6, 19], and prompts [4].
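As one concrete instance of the low-level feedback signals listed above, the sketch below fits a reward model from pairwise preferences under the standard Bradley-Terry formulation used in RLHF-style preference learning: the probability that response a is preferred over response b is modeled as sigmoid(r(a) - r(b)), and r is fit by gradient ascent on the log-likelihood of the observed preferences. A linear reward over hypothetical feature vectors stands in for the neural reward models used over text in practice.

```python
# A minimal sketch of preference learning from pairwise feedback, the core
# ingredient of RLHF. Feature vectors and data are hypothetical.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_reward_model(pairs, dim, lr=0.1, epochs=100):
    """pairs: list of (phi_preferred, phi_rejected) feature-vector tuples."""
    w = np.zeros(dim)  # linear reward: r(x) = w @ phi(x)
    for _ in range(epochs):
        for phi_w, phi_l in pairs:
            diff = phi_w - phi_l
            p = sigmoid(w @ diff)       # P(preferred response wins)
            w += lr * (1.0 - p) * diff  # gradient of log sigmoid(w @ diff)
    return w

# Toy usage: the first feature dimension drives the annotator's preference.
rng = np.random.default_rng(0)
pairs = [(np.array([1.0, z]), np.array([0.0, z])) for z in rng.normal(size=50)]
w = fit_reward_model(pairs, dim=2)  # w[0] becomes large and positive
```

In full RLHF pipelines [e.g., 20], a reward model learned this way then supervises a reinforcement learning step over the language model's outputs.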