Does knowledge transfer always help to learn a better policy?

Fei Feng, Wotao Yin, Lin F. Yang

Submitted: December 2019

Overview

One of the key approaches to saving samples when learning a policy for a reinforcement learning problem is to use knowledge from an approximate model, such as a simulator. However, does the knowledge we transfer from an approximate model to the true model always help us learn a better policy? Despite numerous empirical studies of transfer reinforcement learning, an answer to this question remains elusive.

In this paper, we provide a strong negative result: even full knowledge of an approximate model may not reduce the number of samples required to learn an accurate policy for the true model. Our result is based on a constructed family of reinforcement learning instances for which the sample complexity with and without knowledge transfer is of the same order.
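To make "the same order" concrete, the following schematic LaTeX statement (our own notation, not a verbatim theorem from the paper) shows the shape of such a negative result; the generative-model rate for gamma-discounted MDPs is included purely as a familiar reference point and is not necessarily the exact setting of the paper.

    % Schematic only: N(eps) is the number of samples needed to learn an
    % eps-optimal policy; "transfer" means the learner is additionally given
    % full knowledge of an approximate model \widehat{M} of the true model M.
    \[
      N_{\mathrm{transfer}}(\varepsilon)
        \;=\; \Theta\!\bigl( N_{\mathrm{no\text{-}transfer}}(\varepsilon) \bigr),
      \qquad
      N_{\mathrm{no\text{-}transfer}}(\varepsilon)
        \;=\; \widetilde{\Theta}\!\Bigl( \frac{|S||A|}{(1-\gamma)^{3}\varepsilon^{2}} \Bigr)
      \ \text{(generative-model reference rate).}
    \]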

On the bright side, effective knowledge transfer is still possible under additional assumptions. In particular, we demonstrate that knowing the (linear) bases of the true model significantly reduces the number of samples needed to learn an accurate policy. A minimal sketch of this intuition follows.
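The sketch below is our own illustration, not the paper's algorithm: if the model is known to lie in the span of a known d-dimensional feature basis, it can be recovered by least squares from roughly d noisy samples, rather than one sample per (state, action) pair. The feature map Phi, the noise level, and all sizes are assumptions chosen for the demo.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes (assumptions for the demo, not from the paper).
    n_states, n_actions, d = 50, 4, 5

    # Known linear basis: each (s, a) pair has a feature vector phi(s, a) in R^d.
    Phi = rng.normal(size=(n_states * n_actions, d))

    # True model: the quantity of interest is linear in the features,
    # i.e. E[y | s, a] = Phi[(s, a)] @ theta for some unknown theta in R^d.
    theta_true = rng.normal(size=d)

    # With the basis known, on the order of d noisy samples suffice ...
    n_samples = 10 * d  # far fewer than the n_states * n_actions pairs
    idx = rng.integers(0, n_states * n_actions, size=n_samples)
    X = Phi[idx]
    y = X @ theta_true + 0.1 * rng.normal(size=n_samples)

    # ... to pin down theta via ordinary least squares.
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

    # The estimate generalizes to every (s, a) pair, sampled or not.
    err = np.max(np.abs(Phi @ (theta_hat - theta_true)))
    print(f"max model error over all {n_states * n_actions} pairs: {err:.3f}")

Without the basis, each of the n_states * n_actions entries would have to be estimated from its own samples, which is the gap the positive result exploits.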

Citation

F. Feng, W. Yin, and L. F. Yang, "Does knowledge transfer always help to learn a better policy?" arXiv:1912.02986, 2019.
