Journal of Management Information and Decision Sciences

ISSN: 1532-5806

Abstract

Investigating the mechanism of the initial belief effect in the learning process: A simulation study using the multi-armed bandit model

Lee, J.

The multi-armed bandit model is used to determine an optimal balance between exploration and exploitation. In this model, setting the initial belief to a value other than the average success probability of the alternatives might improve performance. However, the detailed mechanism of this initial belief effect has not been explained sufficiently. This study therefore aims to reveal that mechanism using simulation experiments. We demonstrate that changing the initial belief can improve performance by mitigating the exploration–exploitation tradeoff. Increasing the initial belief while decreasing the exploration level leads to gathering more knowledge in the early stage and exploiting that knowledge in the later stage, which improves performance. We provide explanations based on three concepts: the probability of exploring, the quality of knowledge, and the proportion of deviations from superior alternatives.
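The kind of simulation described above can be illustrated with a minimal sketch. It is not the study's actual model: the softmax choice rule, the temperature `tau` standing in for the exploration level, the uniformly drawn success probabilities, and the names `run_bandit` and `mean_performance` are all assumptions made for illustration.

```python
import math
import random


def run_bandit(initial_belief, tau, n_arms=10, n_periods=500, seed=0):
    """Simulate one Bernoulli bandit run and return the average reward per period.

    Assumptions (not taken from the study): softmax choice with temperature
    `tau` as the exploration level, and beliefs updated as running averages.
    """
    rng = random.Random(seed)
    # Hypothetical environment: true success probabilities drawn uniformly,
    # so the average success probability of the alternatives is about 0.5.
    true_p = [rng.random() for _ in range(n_arms)]
    beliefs = [initial_belief] * n_arms
    # The initial belief is treated as one pseudo-observation per arm.
    counts = [1] * n_arms
    total_reward = 0
    for _ in range(n_periods):
        # Softmax choice; subtracting the maximum keeps exp() numerically stable.
        top = max(beliefs)
        weights = [math.exp((b - top) / tau) for b in beliefs]
        arm = rng.choices(range(n_arms), weights=weights)[0]
        reward = 1 if rng.random() < true_p[arm] else 0
        # Incremental running-average update of the chosen arm's belief.
        counts[arm] += 1
        beliefs[arm] += (reward - beliefs[arm]) / counts[arm]
        total_reward += reward
    return total_reward / n_periods


def mean_performance(initial_belief, tau, n_runs=200):
    """Average performance over independent runs (one seed per run)."""
    runs = [run_bandit(initial_belief, tau, seed=s) for s in range(n_runs)]
    return sum(runs) / n_runs
```

Under these assumptions, one way to probe the effect is to compare `mean_performance(0.5, 0.1)` with, for example, `mean_performance(0.9, 0.02)`, that is, a higher initial belief combined with a lower exploration level. The specific values are illustrative only, and the outcome depends on the assumed environment.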
