A Case Study: Characterization of Performance Inconsistency for Reinforcement Learning on Flappy Bird Game

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A serious problem with Reinforcement Learning (RL) algorithms is that their performance usually varies when the same experiment is repeated or reproduced. Although RL results are known to be hard to reproduce because of the algorithms' intrinsic variance, this variance has not been investigated systematically. Through this case study on the Flappy Bird environment, we introduce and characterize four important factors in the performance inconsistency of RL algorithms: 1) the level of environment randomness, 2) the order of the action-value update process, 3) the exploration rate strategy, and 4) the choice between on- and off-policy algorithms. Using a quantitative metric (the coefficient of variation), we compare and analyze the effect of each factor on the performance inconsistency (variance) of RL. We believe our experimental results and analysis will help in obtaining an efficient agent that reproduces more consistent performance results.
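As a minimal sketch of the metric named in the abstract, the coefficient of variation (CV) is the standard deviation of a set of results divided by their mean, so a lower CV means more consistent performance across repeated runs. The function name, the use of the sample standard deviation, and the score lists below are assumptions for illustration; the abstract does not specify how the authors computed the metric, and the numbers are not results from the paper.

```python
import statistics


def coefficient_of_variation(scores):
    """Return the sample standard deviation of scores divided by their mean.

    Assumption: the sample (n - 1) standard deviation is used; the paper's
    exact definition is not given in the abstract.
    """
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(scores) / mean


# Hypothetical final scores from five repeated runs of the same experiment.
consistent_runs = [100, 102, 98, 101, 99]
inconsistent_runs = [100, 40, 160, 20, 180]

if __name__ == "__main__":
    print("consistent CV:  ", coefficient_of_variation(consistent_runs))
    print("inconsistent CV:", coefficient_of_variation(inconsistent_runs))
```

Because the CV is normalized by the mean, it allows configurations with very different average scores to be compared on the same scale.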

Original language: English
Title of host publication: 2021 International Conference on Information and Communication Technology Convergence (ICTC)
Subtitle of host publication: Beyond the Pandemic Era with ICT Convergence Innovation
Publisher: IEEE Computer Society
Pages: 611-615
Number of pages: 5
ISBN (Electronic): 9781665423830
Publication status: Published - 2021
Event: 12th International Conference on Information and Communication Technology Convergence, ICTC 2021 - Jeju Island, Korea, Republic of
Duration: Oct 20 2021 → Oct 22 2021

Publication series

Name: International Conference on ICT Convergence
Volume: 2021-October
ISSN (Print): 2162-1233
ISSN (Electronic): 2162-1241

Conference

Conference: 12th International Conference on Information and Communication Technology Convergence, ICTC 2021
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 10/20/21 → 10/22/21

Keywords

  • Performance Inconsistency
  • Q-learning
  • Reinforcement Learning (RL)
  • Sarsa algorithm
  • State Discretization

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
