Processors used in embedded systems have specific requirements which are not always met by off-the-shelf processors. A templated processor architecture, which can easily be tuned towards a certain application (domain) offers a solution. The transport triggered architecture (TTA) template presented in this paper has a number of properties that make it very suitable for embedded system design. Key to its success is to give the compiler more control; it has to schedule all data transports within the processor. This paper highlights two important TTA-related issues. First a new code generation method for TTAs is discussed; it integrates scheduling and register allocation, thereby avoiding the notorious phase ordering problem between these two steps. Secondly, we discuss how to tune the instruction repertoire for an embedded processor. A tool is described which automatically detects frequent patterns of operations. These patterns can then be implemented on special function units.