The current trends in next-generation exascale systems go towards integrating a wide range of specialized (co-)processors into traditional supercomputers. Due to the efficiency of heterogeneous systems in terms of Watts and FLOPS per surface unit, opening the access of heterogeneous platforms to a range of users as wide as possible is an important problem to be tackled. However, the integration of heterogeneous, specialized devices increases programming complexity, restricting it to a few experts, and makes porting applications onto different computational infrastructures extremely costly. In order to bridge the gap between the programming needs of heterogeneous systems and the expertise of programmers, program transformation has been proposed elsewhere as a means to ease program generation and adaptation. This brings about several issues such as how to plan a transformation strategy which eventually generates code with increased performance. In this paper we propose a machine learning-based approach to learn heuristics for defining transformation strategies of a program transformation system. Our approach proposes a novel combination of reinforcement learning and classification methods to efficiently tackle the problems inherent to this type of systems. Preliminary results demonstrate the suitability of this approach.