Multi-processor systems-on-chips are widely adopted in implementing modern streaming applications to satisfy the ever increasing computation requirements. To take advantage of this kind of platform, it is necessary to map tasks of the application properly to different processors, so as to fully exploit the inherent task-level parallelism and satisfy the stringent timing requirements. We propose the Parallelism Graph to capture the task-level parallelism of the application and transform the mapping problem to a graph partitioning problem. The graph partitioning problem is formulated as an Integer Linear Programming problem, which is solved optimally using the ILP solver. To reduce the complexity, a two-step local search algorithm, i.e., the greedy partition and refinement algorithm, is proposed. Since one-shot heuristics cannot guarantee the solution quality, evolutionary algorithms are widely used to search the solution space such that better results can be found. We also integrate the idea of parallelism enhancement into the genetic algorithm and propose a hybrid genetic algorithm to improve the performance. Sets of synthesized Synchronous Data Flow Graphs and some practical applications are used to evaluate the performance of the proposed algorithms. Experiment results demonstrate that the proposed algorithms outperform available algorithms.