One of the challenges of engineering is to make the best possible use of the available resources, in other words, to allocate the resources in such a way as to maximize the overall profit. In the context of networks on chip (NoCs), the resources are represented by the communication bandwidth, and the final profit is the performance of an application supported by the network on chip. In this thesis we focus on networks on chip providing guaranteed performance, i.e., guaranteeing for each application the delivery of a requested bandwidth. In these networks, hardware resources are allocated and assigned to each application for its entire lifetime. We discuss several solutions for delivering the allocated bandwidth, and we propose models that allow us to evaluate the performance of these solutions. Starting from a general, rate-based allocation model, we gradually add architectural restrictions that lower the implementation cost but at the same time sacrifice some performance. NoCs with allocation based on discrete rates are very common and include priority-based, TDM, SDM, FDM, and other NoCs. They all partition the bandwidth available on the network links into discrete units. In the case of TDM NoCs these units are called time slots. The problem of resource allocation in TDM NoCs consists of finding paths through the network between the nodes that wish to communicate, and selecting along these paths a set of free time slots that is sufficiently large to fulfill the application requirements. After allocation, the bandwidth is guaranteed. In this thesis, we propose, implement, and evaluate allocation algorithms for all the proposed performance models. Particular effort is dedicated to allocation algorithms for the contention-free routing model, a restrictive but low-cost form of TDM in which allocation is particularly challenging.
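To make the TDM allocation problem concrete, the following is a minimal sketch of slot selection along one already chosen path. It is not one of the algorithms proposed in the thesis: it uses a simple first-fit search, and it assumes the usual contention-free routing convention that data sent in slot s on one link occupies slot s + 1 (modulo the slot table size) on the next link. The names `free_start_slots`, `allocate`, `occupied`, and `table_size` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Optional, Set, Tuple

Link = Tuple[str, str]  # (source node, destination node); illustrative type


def free_start_slots(path: List[Link],
                     occupied: Dict[Link, Set[int]],
                     table_size: int) -> List[int]:
    """Return the injection slots that are free on every link of the path.

    Assumption: a flit injected in slot s uses slot (s + hop) % table_size
    on the hop-th link of the path (one slot of delay per hop).
    """
    free = []
    for s in range(table_size):
        if all((s + hop) % table_size not in occupied.get(link, set())
               for hop, link in enumerate(path)):
            free.append(s)
    return free


def allocate(path: List[Link],
             occupied: Dict[Link, Set[int]],
             table_size: int,
             slots_needed: int) -> Optional[List[int]]:
    """First-fit allocation: reserve slots_needed injection slots, or fail.

    Returns the chosen injection slots and marks them as occupied on every
    link of the path; returns None (and changes nothing) if too few are free.
    """
    candidates = free_start_slots(path, occupied, table_size)
    if len(candidates) < slots_needed:
        return None
    chosen = candidates[:slots_needed]
    for s in chosen:
        for hop, link in enumerate(path):
            occupied.setdefault(link, set()).add((s + hop) % table_size)
    return chosen


# Usage: a two-hop path A -> B -> C with a slot table of four slots.
path = [("A", "B"), ("B", "C")]
occupied = {("A", "B"): {0}, ("B", "C"): {2}}
first = allocate(path, occupied, table_size=4, slots_needed=1)
second = allocate(path, occupied, table_size=4, slots_needed=2)
```

In this example only injection slots 2 and 3 are free along the whole path, so the first request succeeds and the second, which needs two slots, fails. A first-fit choice ignores latency and slot spreading, which is why the sketch should be read as an illustration of the constraint and not as an optimal allocator.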
Our allocation algorithms deal both with spatial allocation, i.e., the selection of a specific path out of the available paths through the network, and with temporal allocation, i.e., allocation along the time axis. The latter is used for optimizing bandwidth usage and latency, both of which we discuss in depth. We propose two algorithms for the allocation of slots in the time domain, both of which we show to be optimal. We also demonstrate how the TDM schedule can be computed at run time with low computational requirements. We demonstrate a system performing run-time allocation on an FPGA, and we implement hardware acceleration for the more expensive operations used by the allocation algorithm. We propose a synthesizable NoC implementation based on the contention-free routing model, called dAElite. Our proposal uses existing design flows but has better performance and reduced hardware cost. The network supports some of the less restrictive models that we have previously introduced, thus allowing a better allocation of resources. Finally, we present how the communication requests of the IPs are handled by the interconnect. We propose optimizations such as write coalescing and latency-hiding techniques at the interface between the IPs and the NoC, and we demonstrate the performance benefits of the proposed approach in real applications. The main conclusions of this thesis are that, compared to an ideal rate-based NoC offering guaranteed bandwidth, introducing fixed discrete allocation units causes a performance loss of 18%, while using headers loses another 15%, under the considered realistic scenarios. Other factors, such as topology and in-order delivery, cause only a minor performance loss. We find that Æthereal loses 46% compared to an ideal rate-based network, while the dAElite network introduced here loses less than 26% and is at the same time less expensive to implement.
|Qualification||Doctor of Philosophy|
|Award date||24 Apr 2012|
|Place of publication||Delft|
|Status||Published - 2012|