Reinforcement Learning from Human Feedback - A Review

Prof. (Dr) Satya Singh; Ratnesh Kumar Sharma

doi:10.32628/IJSRSET2411211

Authors

Prof. (Dr) Satya Singh, Department of Computer Science and Applications, M.G. Kashi Vidyapith, Varanasi (U.P.), India
Ratnesh Kumar Sharma, Department of Computer Science and Applications, M.G. Kashi Vidyapith, Varanasi (U.P.), India

DOI:

https://doi.org/10.32628/IJSRSET2411211

Keywords:

Reinforcement Learning, Human Feedback, Machine Learning, Reward Signals, Interactive Learning

Abstract

Reinforcement Learning from Human Feedback (RLHF) is a growing field at the intersection of artificial intelligence and human interaction. The approach trains models to make decisions in dynamic environments by iteratively incorporating feedback from human evaluators. An initial model interacts with the environment, human evaluators rate or comment on the model's actions, and the model is then updated based on that feedback, improving its decision-making over time. RLHF is particularly valuable in scenarios where predefined rules are inadequate, because it emphasizes adaptability and learning from real-world experience. This review explores the applications, advantages, and challenges of RLHF and highlights its promising results in domains such as robotics, gaming, and natural language processing. By combining machine learning algorithms with human intuition, RLHF can address complex problems that are difficult to specify with hand-written rules or fixed reward functions. As technology advances, RLHF is poised to open new possibilities and drive innovation across diverse fields.
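
The feedback loop described in the abstract (act, receive human feedback, update) can be illustrated with a minimal sketch. This is not the method of any particular paper in the review: it is a toy tabular bandit in which a hypothetical function, `simulated_human_feedback`, stands in for a human evaluator, and all names and parameter values are assumptions chosen for illustration. Practical RLHF systems for language models typically train a separate reward model and optimize a neural policy instead.

```python
import random


def simulated_human_feedback(action: int) -> float:
    """Hypothetical stand-in for a human evaluator.

    Assumption for this sketch: the evaluator approves of action 2 (+1)
    and disapproves of every other action (-1).
    """
    return 1.0 if action == 2 else -1.0


def train(num_actions: int = 3, rounds: int = 500,
          lr: float = 0.1, epsilon: float = 0.1, seed: int = 0) -> list:
    """Epsilon-greedy loop: act, collect feedback, update the estimate."""
    rng = random.Random(seed)
    values = [0.0] * num_actions  # estimated feedback per action
    for _ in range(rounds):
        # The model acts: usually the best-rated action, sometimes a random one.
        if rng.random() < epsilon:
            action = rng.randrange(num_actions)
        else:
            action = max(range(num_actions), key=lambda a: values[a])
        # The (simulated) human evaluator rates the action.
        feedback = simulated_human_feedback(action)
        # The model moves its estimate for that action toward the feedback.
        values[action] += lr * (feedback - values[action])
    return values


values = train()
best_action = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", values)
print("preferred action:", best_action)
```

In this sketch the preferred action converges to the one the simulated evaluator approves of, which mirrors the abstract's claim that repeated feedback improves decision-making over time.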


