A common approach to enhancing processor performance is to increase the number of function units that operate concurrently. We observe this development in all recent general-purpose superscalar processors, and in VLIW (very long instruction word) processors used for more dedicated application domains, such as multimedia. This paper analyzes the data path complexity of ILP (instruction-level parallelism) processors, in particular VLIWs, and shows that they may soon hit the complexity wall: their complexity gets out of control when scaling to very high performance. Several methods for reducing this complexity are investigated. Essentially, these methods trade hardware complexity for software complexity, i.e., they perform as much work as possible at compile time. Combining these methods results in a new architecture, called the transport triggered architecture (TTA). The concept of transport triggering is outlined together with its characteristics. We show that applying this concept yields a number of hardware advantages and enables a number of new scheduling optimizations. Together, these substantially reduce the ILP complexity bottleneck, as demonstrated by a number of experiments.