Most modern processing platforms use cache memories to speed up the tasks running on their processors. However, when a processor switches from one task to another, the caches must be reloaded with the context of the incoming task. This reloading is time-consuming and usually unpredictable, and it therefore affects the worst-case execution time of the task. Such unpredictability should be avoided in real-time systems, in which the instant at which a result becomes available is as important as the result itself. In this paper, we present a hardware component named hardware context switch (HwCS), which replaces the standard L1 cache controller of a processor. It divides the cache into two interchangeable layers and makes it possible to save or restore the content of one layer while the other is simultaneously used as a regular cache by the processor. Saving the cache content after a preemption, and restoring it before the preempted task resumes, makes the preemption overheads negligible in comparison with the task worst-case execution times. It is proven theoretically that existing scheduling theory can be used 'as is' with the HwCS by simply reducing the task deadlines, thereby bridging the gap between theory and practice. The HwCS has been implemented in a uniprocessor system as a proof of concept. The first results show a clear improvement in processor utilisation for a small cost in silicon area.
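
To illustrate what reusing existing scheduling theory with reduced deadlines could look like, the following minimal sketch applies a standard fixed-priority response-time analysis after shortening every deadline by a constant. It is an illustration under stated assumptions, not the analysis proven in the paper: the names `Task`, `response_time` and `schedulable_with_reduced_deadlines`, the parameter `delta` (a hypothetical bound on the time needed to restore a cache layer) and the task values are all introduced here, and the abstract does not state by how much the deadlines are actually reduced.

```python
from typing import List, NamedTuple, Optional


class Task(NamedTuple):
    wcet: int      # worst-case execution time, preemption overheads excluded
    period: int    # minimum inter-arrival time
    deadline: int  # relative deadline before any reduction


def response_time(tasks: List[Task], i: int, limit: int) -> Optional[int]:
    """Classical fixed-point response-time iteration for task i.

    tasks is assumed to be sorted by decreasing priority, so tasks[:i]
    are the higher-priority tasks. Returns None if the response time
    exceeds limit.
    """
    r = tasks[i].wcet
    while r <= limit:
        # -(-r // p) is an integer ceiling of r / p.
        nxt = tasks[i].wcet + sum(-(-r // t.period) * t.wcet for t in tasks[:i])
        if nxt == r:
            return r
        r = nxt
    return None


def schedulable_with_reduced_deadlines(tasks: List[Task], delta: int) -> bool:
    """Run the unmodified analysis against deadlines shortened by delta.

    delta is a hypothetical constant, assumed here to bound the cache
    restore time; the real reduction is defined in the paper body.
    """
    for i, task in enumerate(tasks):
        reduced = task.deadline - delta
        if reduced < task.wcet:
            return False
        if response_time(tasks, i, reduced) is None:
            return False
    return True


# Illustrative task set (values invented for this example).
TASKS = [Task(1, 4, 4), Task(2, 6, 6), Task(3, 12, 12)]
```

With this task set, calling `schedulable_with_reduced_deadlines(TASKS, 2)` accepts the set while a reduction of 3 rejects it, which shows how the size of the reduction trades directly against schedulability.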