Reinforcement Learning from Human Feedback - A Review
DOI: https://doi.org/10.32628/IJSRSET2411211
Keywords: Reinforcement Learning, Human Feedback, Machine Learning, Reward Signals, Interactive Learning
Abstract
Reinforcement Learning from Human Feedback (RLHF) is a growing field at the intersection of artificial intelligence and human interaction. The approach trains models to make decisions in dynamic environments by iteratively incorporating feedback from human evaluators. An initial model interacts with the environment, human evaluators provide feedback on the model's actions, and the model is then updated based on that feedback, improving its decision-making over time. RLHF is particularly valuable in scenarios where predefined rules are inadequate, because it emphasizes adaptability and learning from real-world experience. This review explores the applications, advantages, and challenges of RLHF, highlighting promising results in domains such as robotics, gaming, and natural language processing. By combining machine learning algorithms with human intuition, RLHF can address complex problems more effectively than traditional methods that rely on fixed rules. As technology advances, RLHF is poised to unlock new possibilities and drive innovation across diverse fields.
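The feedback loop described in the abstract (the model acts, an evaluator responds, the model is updated) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the three-action setting, the simulated evaluator that stands in for a real human and returns +1 or -1, and the simple policy-gradient update used to adjust the model.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_human_feedback(action, preferred_action=2):
    """Hypothetical stand-in for a human evaluator.

    Returns +1.0 when the chosen action is the one the evaluator
    prefers and -1.0 otherwise. A real evaluator would be noisier.
    """
    return 1.0 if action == preferred_action else -1.0


def train_with_feedback(n_actions=3, rounds=2000, lr=0.1, seed=0):
    """Run the act -> feedback -> update loop and return the final policy."""
    rng = random.Random(seed)
    prefs = [0.0] * n_actions  # the "model": one preference per action

    for _ in range(rounds):
        probs = softmax(prefs)
        # 1. The model acts by sampling from its current policy.
        action = rng.choices(range(n_actions), weights=probs)[0]
        # 2. The (simulated) evaluator gives feedback on that action.
        feedback = simulated_human_feedback(action)
        # 3. The model is updated: a policy-gradient step that raises the
        #    probability of actions with positive feedback and lowers it
        #    for actions with negative feedback.
        for i in range(n_actions):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            prefs[i] += lr * feedback * grad_log_prob

    return softmax(prefs)
```

Calling `train_with_feedback()` returns the learned action probabilities; under these assumptions most of the probability mass should end up on the action the simulated evaluator prefers. Practical RLHF systems typically replace the direct feedback signal with a learned reward model, which this sketch omits.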