Efficient resource allocation is a complex and dynamic task in business process management. Although a wide variety of mechanisms are emerging to support resource allocation in business process execution, these approaches do not consider performance optimization. This paper introduces a mechanism in which the resource allocation optimization problem is modeled as Markov decision processes and solved using reinforcement learning. The proposed mechanism observes its environment to learn appropriate policies which optimize resource allocation in business process execution. The experimental results indicate that the proposed approach outperforms well known heuristic or hand-coded strategies, and may improve the current state of business process management.