Kwee, Ivo and Hutter, Marcus and Schmidhuber, Juergen (2001) Gradient-based Reinforcement Planning in Policy-Search Methods. Technical Report.