This article presents a way to implement many-sorted term rewriting on a GPU. This is done by letting the GPU repeatedly perform a massively parallel evaluation of all subterms. Innermost many-sorted term rewriting is experimentally compared with a relaxed form of innermost many-sorted term rewriting, and two different garbage collection mechanisms, to remove terms that are no longer needed, are discussed and experimentally compared. It is concluded that when the many-sorted term rewrite systems exhibit sufficient internal parallelism, GPU rewriting substantially outperforms the CPU. Both relaxed innermost many-sorted rewriting and garbage collection further improve this performance. Since the implementation can probably be even further optimised, and because in any case GPUs will become much more powerful in the future, this suggests that GPUs are an interesting platform for (many-sorted) term rewriting. As term rewriting can be viewed as a universal programming language, this also opens a route towards programming GPUs by term rewriting, especially for irregular computations.
Bibliographical noteFunding Information:
This work is carried out in the context of the NWO AVVA project 612.001751.
- Parallel computing
- Term rewriting