Reinforcement Learning from Human Feedback - A Review

Authors

  • Prof. (Dr) Satya Singh, Department of Computer Science and Applications, M.G. Kashi Vidyapith, Varanasi (U.P.), India
  • Ratnesh Kumar Sharma, Department of Computer Science and Applications, M.G. Kashi Vidyapith, Varanasi (U.P.), India

DOI:

https://doi.org/10.32628/IJSRSET2411211

Keywords:

Reinforcement Learning, Human Feedback, Machine Learning, Reward Signals, Interactive Learning

Abstract

Reinforcement Learning from Human Feedback (RLHF) is a burgeoning field at the intersection of artificial intelligence and human interaction. This approach involves training models to make decisions in dynamic environments by iteratively receiving feedback from human evaluators. In this process, initial models interact with the environment, and human evaluators provide feedback on the model's actions. The model is then updated based on this feedback, enhancing its decision-making capabilities over time. RLHF is particularly valuable in scenarios where predefined rules may be inadequate, emphasizing adaptability and learning from real-world experiences. This abstract explores the applications, advantages, and challenges of RLHF, highlighting its promising results in domains such as robotics, gaming, and natural language processing. The collaboration between machine learning algorithms and human intuition in RLHF presents a compelling synergy that addresses complex problems more effectively than traditional methods. As technology advances, RLHF is poised to unlock new possibilities and drive innovations across diverse fields.
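
The act, receive feedback, update loop described in the abstract can be illustrated with a minimal sketch. It is not the method of the reviewed paper: it assumes a toy setting with a handful of discrete actions, a softmax policy, and a simulated evaluator who approves exactly one action. The names `simulated_human_feedback`, `train_from_feedback`, and `preferred_action` are hypothetical and exist only for this illustration. Practical RLHF systems for language models usually train a separate reward model from human preference data and then optimize the policy against it; this sketch skips that step and applies the feedback directly.

```python
import math
import random


def softmax(prefs):
    """Turn raw action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_human_feedback(action, preferred_action=2):
    """Hypothetical stand-in for a human evaluator.

    Returns +1.0 when the evaluator approves of the action, -1.0 otherwise.
    A real evaluator would be a person rating or comparing model outputs.
    """
    return 1.0 if action == preferred_action else -1.0


def train_from_feedback(n_actions=4, rounds=500, lr=0.1, seed=0):
    """Repeat the loop: act, collect feedback, update the policy."""
    rng = random.Random(seed)
    prefs = [0.0] * n_actions  # start with no preference among actions
    for _ in range(rounds):
        probs = softmax(prefs)
        # 1. The model acts by sampling from its current policy.
        action = rng.choices(range(n_actions), weights=probs)[0]
        # 2. The (simulated) human evaluator scores that action.
        feedback = simulated_human_feedback(action)
        # 3. Policy-gradient style update: the gradient of log pi(action)
        #    with respect to each preference, scaled by the feedback.
        for a in range(n_actions):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * feedback * grad
    return softmax(prefs)


final_probs = train_from_feedback()
print([round(p, 3) for p in final_probs])
```

Over repeated rounds, the policy shifts probability toward the action the evaluator approves of, which is the sense in which feedback improves decision-making over time in this toy case.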

Published

30-03-2024

Section

Research Articles

How to Cite

[1]
Prof. (Dr) Satya Singh and Ratnesh Kumar Sharma, “Reinforcement Learning from Human Feedback - A Review”, Int J Sci Res Sci Eng Technol, vol. 11, no. 2, pp. 133–141, Mar. 2024, doi: 10.32628/IJSRSET2411211.
