Investigating cell proliferation can provide useful insights into cancer progression, resistance to chemotherapy, and relapse. To this end, computational methods can be coupled with experimental measurements from in vivo label-retaining assays to explore the dynamic behavior of tumor cells. ProCell is a software tool that uses flow cytometry data to model and simulate the kinetics of the fluorescence loss caused by stochastic cell division events. Since the rate of cell division is not known, ProCell embeds a calibration process that might require thousands of stochastic simulations to infer the parameterization of cell proliferation models. To mitigate this high computational cost, in this paper we introduce a parallel implementation of ProCell's simulation algorithm, named cuProCell, which leverages Graphics Processing Units (GPUs). Dynamic Parallelism is used to manage cell duplication events efficiently, in a way that differs radically from common computing architectures. We present the advantages of cuProCell for the analysis of different models of cell proliferation in Acute Myeloid Leukemia (AML), using data collected from the spleen of human xenografts in mice. We show that, by exploiting GPUs, our method not only infers the models' parameterization automatically but is also 237x faster than the sequential implementation. This study highlights the presence of a relevant percentage of quiescent and potentially chemoresistant cells in AML in vivo, and suggests that maintaining a dynamic equilibrium among the different proliferating cell populations might play an important role in disease progression.
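To make the idea of fluorescence loss through stochastic division concrete, the following is a minimal sequential sketch in Python. It is not the ProCell or cuProCell code. The function name `simulate_fluorescence`, the normally distributed division times, the equal split of fluorescence between daughter cells, and the `quiescent_fraction` parameter are all assumptions made for illustration only.

```python
import random


def simulate_fluorescence(initial, mean_div_time, sd_div_time, t_max,
                          quiescent_fraction=0.0, seed=0):
    """Return the fluorescence of every cell alive at time t_max.

    Hypothetical illustration: division times are drawn from a normal
    distribution (an assumption) and each division halves the fluorescence.
    """
    rng = random.Random(seed)
    # Each stack entry is (fluorescence, current time, is_quiescent).
    stack = [(f, 0.0, rng.random() < quiescent_fraction) for f in initial]
    final = []
    while stack:
        fluo, t, quiescent = stack.pop()
        if quiescent:
            # Quiescent cells never divide, so they retain their label.
            final.append(fluo)
            continue
        # Waiting time until the next division, kept strictly positive.
        wait = max(rng.gauss(mean_div_time, sd_div_time), 0.1 * mean_div_time)
        if t + wait > t_max:
            # No further division happens before the end of the experiment.
            final.append(fluo)
        else:
            # Both daughter cells inherit half of the parent's fluorescence.
            stack.append((fluo / 2.0, t + wait, False))
            stack.append((fluo / 2.0, t + wait, False))
    return final
```

In this sketch, a calibration loop would call the function repeatedly with different candidate parameters and compare the histogram of the returned values with the measured one. Every such call is independent of the others, and so is every cell lineage within a call, which is the kind of workload that a GPU implementation can run in parallel.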