About Transport-Triggered Architectures
The TTA Principle
Transport-Triggered Architecture (TTA) is a processor design philosophy in which the processor's internal datapaths are exposed in the instruction set. All operand reads and result writes are stated explicitly in the program.
The datapath of a TTA processor consists of execution units (function units), register files, and the buses that connect them. The processor is programmed by directly controlling the buses: the program specifies data moves, and operations are executed as a side effect of these moves, when data is moved to a specific trigger port of a function unit.
One operation typically consists of multiple moves, one for each operand and one for each result value. The instruction word of a TTA processor typically contains one move slot for every datapath bus, so every instruction word can contain multiple moves, one per bus. Because a single instruction word can contain moves belonging to several operations, the instruction set expresses instruction-level parallelism explicitly, as in VLIW processors. Unlike on a VLIW, however, the moves needed to perform a single operation typically span multiple instruction words.
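The move-based programming model described above can be sketched with a toy simulator. This is a minimal illustration, not a model of any real TTA: the port names (add.o, add.t, add.r), the register names, the single-cycle adder, and the rule that all sources are read before any destination is written are assumptions made for the example.

```python
class Adder:
    """Hypothetical adder: operand port 'o', trigger port 't', result port 'r'."""

    def __init__(self):
        self.o = 0  # operand latched by a plain move
        self.r = 0  # result of the most recent add

    def write(self, port, value):
        if port == "o":
            self.o = value           # operand move: stores the value, nothing else
        elif port == "t":
            self.r = self.o + value  # trigger move: the add happens as a side effect
        else:
            raise ValueError(f"unknown port {port!r}")


def run(program, regs, add):
    """Execute instruction words; each word is a list of (source, destination) moves."""
    for word in program:
        # Assumption: all sources are read before any destination is written.
        values = [add.r if src == "add.r" else regs[src] for src, _ in word]
        # Operand moves are applied before trigger moves of the same word.
        moves = sorted(zip(word, values), key=lambda m: m[0][1] == "add.t")
        for (_, dst), value in moves:
            if dst.startswith("add."):
                add.write(dst[4:], value)
            else:
                regs[dst] = value
    return regs


# r3 = r1 + r2 on a two-bus machine: three moves spread over two instruction words.
program = [
    [("r1", "add.o"), ("r2", "add.t")],  # word 1: operand move and trigger move
    [("add.r", "r3")],                   # word 2: result move (second bus idle)
]
regs = run(program, {"r1": 2, "r2": 3, "r3": 0}, Adder())
print(regs["r3"])  # 5
```

Note that the program never names an "add" instruction: the addition happens only because a value arrives at the trigger port.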
TTA Compared to VLIW
Compared to VLIW, TTA exposes more of the internal architecture in the instruction set. In a VLIW, register file bypassing is either not done at all or done implicitly by the hardware, whereas on a TTA bypassing is done explicitly, by programming a move directly from one function unit to another.
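As a rough illustration of explicit bypassing, the sketch below writes r4 = (r1 + r2) + r3 as two move programs, one that stores the intermediate sum in a register and one that moves it straight from the adder's result port to its operand port, and counts the register file accesses of each. The port and register names are hypothetical, and the bypassed version assumes that sources are read before destinations are written within one instruction word.

```python
def is_register(name):
    # In this sketch, register names have no dot; function unit ports do ("add.r").
    return "." not in name


def summarize(program):
    """Count instruction words and register file accesses of a move program."""
    moves = [move for word in program for move in word]
    return {
        "words": len(program),
        "rf_reads": sum(is_register(src) for src, _ in moves),
        "rf_writes": sum(is_register(dst) for _, dst in moves),
    }


# Intermediate sum goes through the register file (temporary register r5).
via_register_file = [
    [("r1", "add.o"), ("r2", "add.t")],
    [("add.r", "r5")],
    [("r5", "add.o"), ("r3", "add.t")],
    [("add.r", "r4")],
]

# Intermediate sum is moved directly from the result port to the operand port.
bypassed = [
    [("r1", "add.o"), ("r2", "add.t")],
    [("add.r", "add.o"), ("r3", "add.t")],
    [("add.r", "r4")],
]

print(summarize(via_register_file))  # {'words': 4, 'rf_reads': 4, 'rf_writes': 2}
print(summarize(bypassed))           # {'words': 3, 'rf_reads': 3, 'rf_writes': 1}
```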
Because all the datapath buses inside the processor are visible in the instruction set, the designer can prune them to contain only the necessary connections: the bypasses that give the greatest performance benefit are kept, while rarely used, non-performance-critical bypass paths are left out. Having fewer connections on the datapath buses reduces the area of the interconnection network and allows the processor to be faster and more energy-efficient.
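A pruned interconnect can be pictured as a table listing, for each bus, which source and destination ports are wired to it. The sketch below is an assumed, simplified description (the bus names, port names, and dictionary layout are invented for illustration): a move can be scheduled only on a bus that connects both of its endpoints, and the total number of connections is a crude proxy for interconnect cost.

```python
# Hypothetical two-bus interconnect: each bus lists the ports wired to it.
buses = {
    "B0": {"src": {"rf.out", "add.r"}, "dst": {"add.o", "add.t", "rf.in"}},
    "B1": {"src": {"rf.out"}, "dst": {"add.t", "mul.t"}},
}


def buses_for(src, dst):
    """Return the buses on which the move src -> dst could be scheduled."""
    return sorted(
        name for name, conn in buses.items()
        if src in conn["src"] and dst in conn["dst"]
    )


def connection_count():
    """Total number of port-to-bus connections in the interconnect."""
    return sum(len(conn["src"]) + len(conn["dst"]) for conn in buses.values())


print(buses_for("add.r", "add.o"))   # ['B0']  bypass kept by the designer
print(buses_for("add.r", "mul.t"))   # []      bypass pruned away
print(buses_for("rf.out", "add.t"))  # ['B0', 'B1']
print(connection_count())            # 8
```

A compiler targeting such a machine has to route the pruned case through the register file instead, which is why only rarely needed paths are good candidates for removal.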
Because the moves of one operation can span multiple instruction words, the processor state cannot be fully represented by the registers alone; it also includes state held inside the execution units. This makes interrupt support difficult and costly to implement, and most TTA processors do not support interrupts. For the same reason, hardware-based pre-emptive multi-threading cannot easily be implemented on a TTA processor. This limits the applicability of TTA to applications that need not be interrupted but can run to completion, which is often the case with accelerators and application-specific processors.
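The problem with function unit state can be shown with a small thought experiment in code. In this sketch (all names hypothetical), an interrupt arrives between the operand move and the trigger move of an addition, and the handler saves and restores only the register file. The operand latched inside the adder is overwritten by the handler, so the interrupted operation produces a wrong result.

```python
class Adder:
    """Hypothetical adder with an internal operand latch."""

    def __init__(self):
        self.o = 0
        self.r = 0

    def operand(self, value):
        self.o = value           # operand move: state now lives inside the unit

    def trigger(self, value):
        self.r = self.o + value  # trigger move: performs the add


def add_r1_r2(regs, adder, interrupt=None):
    adder.operand(regs["r1"])    # word 1: r1 -> add.o
    if interrupt is not None:
        saved = dict(regs)       # context save covers the register file only
        interrupt(adder)         # the handler is free to use the adder
        regs = saved             # context restore
    adder.trigger(regs["r2"])    # word 2: r2 -> add.t
    return adder.r               # word 3: add.r -> destination


def handler(adder):
    # The handler computes 100 + 1 and clobbers the latched operand.
    adder.operand(100)
    adder.trigger(1)


regs = {"r1": 2, "r2": 3}
print(add_r1_r2(regs, Adder()))                     # 5
print(add_r1_r2(regs, Adder(), interrupt=handler))  # 103
```

Making this safe would require saving and restoring the internal latches of every function unit, which is the extra cost the paragraph above refers to.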
TTAs as Application-Specific Processors
TTA processors are easy to customize: new function units and register files can be added and connected to the existing buses, and new transport buses can be added to increase parallel data transport capacity. TTA is therefore a good processor template for designing application-specific processors (ASPs), where custom hardware operations are often important for performance. The drawbacks of TTAs are often not an issue for ASPs, which frequently have no need to run multitasking operating systems or third-party code.
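In a software model, this kind of customization amounts to registering one more function unit next to the existing ones. The sketch below assumes a toy machine description (a dictionary of function units, each with an operand port "o", a trigger port "t", and a result "r") and adds a hypothetical absolute-difference unit; the existing adder and the way it is programmed are unchanged.

```python
class FunctionUnit:
    """Generic two-input unit: operand port 'o', trigger port 't', result 'r'."""

    def __init__(self, operation):
        self.operation = operation
        self.o = 0
        self.r = 0

    def write(self, port, value):
        if port == "o":
            self.o = value                          # operand move
        elif port == "t":
            self.r = self.operation(self.o, value)  # trigger move runs the operation
        else:
            raise ValueError(f"unknown port {port!r}")


# Baseline machine with a single adder.
machine = {"add": FunctionUnit(lambda a, b: a + b)}

# Customization step: add a hypothetical absolute-difference unit.
machine["absdiff"] = FunctionUnit(lambda a, b: abs(a - b))


def execute(machine, unit, a, b):
    """Issue the operand move and the trigger move, then read the result."""
    fu = machine[unit]
    fu.write("o", a)
    fu.write("t", b)
    return fu.r


print(execute(machine, "add", 7, 10))      # 17
print(execute(machine, "absdiff", 7, 10))  # 3
```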