This tutorial illustrates how you can run your TTA designs on an FPGA board. The tutorial consists of two simple example sections and a more general case description section.
Download the tutorial file package from:
Unpack it to a working directory and cd to tce_tutorials/fpga_tutorial
As stated in the introduction, the application is written in assembly. If you went through the assembly tutorial, the code should be easy to understand. The code is in the file blink.tceasm. The same application is also written in C in the file blink.c.
As you can see, it is a simple one-bus architecture without an LSU. There are also 2 ``new'' function units: rtimer and leds. Rtimer is a simple tick counter which provides real time clock or countdown timer operations. Leds is a function unit that can write '0' or '1' to an FPGA output port. If those ports are connected to leds, the FU can control them.
The leds FU requires a new operation definition, and the operation is defined in led.opp and led.cc. You need to build this operation definition:
Now you can compile the assembly code:
tceasm -o asm.tpef tutorial1.adf blink.tceasm
If you wish, you can simulate the program with proxim and see how it works, but the program runs in an endless loop and most of the time it stays in the ``sleep'' loop.
Now you need to select implementations for the function units. This can be done in ProDe. See TCE tour section 3.1.9 for more information. Implementations for leds and rtimer are found in the fpga.hdb shipped with the tutorial files. Notice that there are 2 implementations for the rtimer: ID 3 is for a 50 MHz clock frequency and ID 4 for 100 MHz. All other FUs are found in the default hdb.
Save the implementation configuration to tutorial1.idf.
The next step is to generate the VHDL implementation of the processor:
generateprocessor -i tutorial1.idf -o asm_vhdl/proge-output tutorial1.adf
Then create the program image:
generatebits -f vhdl -p asm.tpef -x asm_vhdl/proge-output tutorial1.adf
Notice that the instruction image format is ``vhdl'' and that generatebits is not asked to create a data image at all. Now move the generated asm_imem_pkg.vhdl to the asm_vhdl directory and cd there.
mv asm_imem_pkg.vhdl asm_vhdl/
The next step is to connect the TTA toplevel core to the memory component and to route the global signals out of that component. This has already been done for you in the file tutorial_processor1.vhdl. If you are curious how this is done, open the file with your preferred text editor. All the signals coming out of this component are later connected to FPGA pins.
Now open your FPGA vendor's design/synthesis program and create a new project for your target FPGA. Add the three files in the asm_vhdl directory (the toplevel file tutorial_processor1.vhdl, inst_mem_logic.vhd and asm_imem_pkg.vhdl) and all the files in the proge-output/gcu_ic/ and proge-output/vhdl directories to the project. The toplevel entity name is 'tutorial_processor1'.
Then connect the toplevel signals to the appropriate FPGA pins. The pins are most probably described in the FPGA board's user manual. Signal 'clk' is connected to the pin that provides the clock signal. Signal 'rstx' is the reset signal of the system and it is active low. Connect it to a switch or pushbutton that provides '1' when not pressed. Signal bus 'leds' is 8 bits wide and every bit of the bus should be connected to an individual led. Don't worry if your board doesn't have 8 user controllable leds; you can leave some of the bits unconnected. In that case all of the connected leds will be off part of the time.
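To illustrate why leaving some bits of the 'leds' bus unconnected produces dark periods, here is a minimal host-side sketch. It assumes a one-hot pattern that walks across all 8 bits of the bus; the actual pattern in blink.tceasm may differ. The helper names `walk_step` and `visible_leds` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assumption: the program lights one led at a time and rotates the lit bit
 * across the 8-bit 'leds' bus. The real blink.tceasm pattern may differ. */
static uint8_t walk_step(uint8_t pattern)
{
    /* Rotate left by one position within 8 bits. */
    return (uint8_t)((pattern << 1) | (pattern >> 7));
}

/* Returns the part of the pattern that reaches physically connected leds.
 * 'connected' is the number of low-order bus bits wired to leds (0..8). */
static uint8_t visible_leds(uint8_t pattern, unsigned connected)
{
    uint8_t mask = (connected >= 8) ? 0xFFu
                                    : (uint8_t)((1u << connected) - 1u);
    return (uint8_t)(pattern & mask);
}
```

With only 4 leds connected, `visible_leds` returns 0 whenever the lit bit sits on bits 4 to 7, so under this assumed pattern the board would appear dark for half of each cycle.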
Compile and synthesize your design with the FPGA tools, program your FPGA and behold the light show!
In this tutorial we will implement the same kind of system as above, but this time we include data memory and use an application written in C. The application has the same functionality but the algorithm is a bit different. This time we read the led pattern from a look-up table and, to test the store operation as well, the pattern is stored back to the look-up table. Take a look at the file blink_mem.c to see how the timer and led operations are used in C code.
The architecture for this tutorial is tutorial2.adf. This architecture is the same as tutorial1.adf, except that it now has a load-store unit to interface it with data memory.
You need to compile the operation behaviour for the led function unit if you haven't already done so:
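The load, display and store-back idea described above can be sketched in plain C as follows. This is not the content of blink_mem.c: `write_leds` and `wait_ticks` are hypothetical host-side stand-ins for the leds and rtimer operations, and the table contents and the delay value are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define LUT_SIZE 8

/* Look-up table held in data memory; the values are illustrative only. */
static volatile uint8_t led_lut[LUT_SIZE] = {
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
};

/* Hypothetical stand-in for the leds operation: on a host it only records
 * the last value written. On the TTA this would be the custom operation. */
static uint8_t last_led_value;
static void write_leds(uint8_t pattern) { last_led_value = pattern; }

/* Hypothetical stand-in for the rtimer-based delay; a no-op on a host. */
static void wait_ticks(unsigned ticks) { (void)ticks; }

/* One blink step: load a pattern, show it, and store it back so that both
 * the load and the store path of the LSU are exercised.
 * Returns the index to use for the next step. */
static size_t blink_step(size_t index)
{
    uint8_t pattern = led_lut[index];   /* load from data memory */
    write_leds(pattern);
    led_lut[index] = pattern;           /* store back to data memory */
    wait_ticks(500);                    /* arbitrary delay value */
    return (index + 1) % LUT_SIZE;
}
```

On the target, an endless loop would call `blink_step` repeatedly, feeding each returned index into the next call.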
Then compile the program:
tcecc -O3 -a tutorial2.adf -o blink.tpef blink_mem.c
Before you can generate the processor VHDL you must select implementations for the function units. Open the architecture in ProDe and select Tools->Processor Implementation...
It is important that you choose the implementation for the LSU from the fpga.hdb shipped with the tutorial files. This implementation has a more FPGA-friendly byte enable definition. The implementations for the leds and timer FUs are also found in fpga.hdb. As mentioned in the previous tutorial, timer implementation ID 3 is meant for a 50 MHz clock frequency and ID 4 for a 100 MHz clock. Other FUs are found in the default hdb.
Generate the processor VHDL:
generateprocessor -i tutorial2.idf -o c_vhdl/proge-output tutorial2.adf
The next step is to generate binary images of the program. The instruction image is again generated as a VHDL array package, but the data memory image needs some consideration. If you are using an Altera FPGA board, the Program Image Generator (PIG) can output Altera's Memory Initialization File format (mif). Otherwise, consult the FPGA vendor's documentation to see what format is used for memory initialization, then select the PIG output format that can be converted to the needed format with the least amount of work. You can also implement a new image writer class for PIG. Patches are welcome.
The image generation command is basically the following:
generatebits -f vhdl -d -w 4 -o mif -p blink.tpef -x c_vhdl/proge-output tutorial2.adf
Switch '-d' tells PIG to generate a data image. Switch '-o' defines the data image output format; change it to suit your needs if necessary. Switch '-w' defines the width of the data memory in MAUs. By default the MAU is assumed to be 8 bits, and the default LSU implementations are made for memories with a 32-bit data width. Thus the width of the data memory is 4 MAUs.
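The value given to '-w' follows directly from the two widths mentioned above. As a small sketch of that arithmetic (the function name `width_in_maus` is hypothetical and not part of TCE):

```c
#include <assert.h>

/* Number of MAUs (minimum addressable units) in one data memory word.
 * With the defaults described above: 32-bit memory / 8-bit MAU = 4. */
static unsigned width_in_maus(unsigned mem_width_bits, unsigned mau_bits)
{
    /* The memory word must hold a whole number of MAUs. */
    assert(mau_bits > 0 && mem_width_bits % mau_bits == 0);
    return mem_width_bits / mau_bits;
}
```

If your memory or MAU width differs from the defaults, the same division gives the value to pass with '-w'.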
Move the created images to the vhdl directory:
mv blink_imem_pkg.vhdl c_vhdl/
mv blink_data.mif c_vhdl/
The TTA VHDL code is in the proge-output directory. As in the previous tutorial, the file inst_mem_logic.vhd holds the instruction memory component, which uses the created blink_imem_pkg.vhdl. The file tutorial_processor2.vhdl is the toplevel design file; again, the TTA core toplevel is connected to the instruction memory component and the global signals are routed out from this design file.
Creating data memory component
Virtually all FPGA chips have some amount of internal memory which can be used in your own designs. FPGA design tools usually provide some method to easily create memory controllers for those internal memory blocks. For example, Altera's Quartus II design toolset has a MegaWizard Plug-In Manager utility which can be used to create a RAM that utilizes the FPGA's internal resources.
There are a few points to consider when creating a data memory controller:
In TCE, the data memory address width defined in the ADF is 2 bits wider than the actual address bus coming out of the LSU (note the fu_lsu_addrw-2 in the LSU interface). The ADF address counts 8-bit MAUs, while the memory is addressed in 32-bit words. Take this into account when you choose the address width and size of the memory component.
The memory controller should also support byte enable signals.
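The relationship between the ADF address width and the memory component can be sketched as below. The sketch assumes the default 8-bit MAU and 32-bit memory width; all function names are hypothetical helpers for illustration, not part of TCE.

```c
#include <stdint.h>

/* Assumptions: 8-bit MAU and a 32-bit wide data memory, so 4 bytes per word
 * and 2 address bits that select the byte within a word. */
enum { BYTES_PER_WORD = 4, BYTE_SELECT_BITS = 2 };

/* Width of the address bus coming out of the LSU for a given ADF address
 * width. Expects adf_addr_bits to be at least 2. */
static unsigned lsu_addr_bits(unsigned adf_addr_bits)
{
    return adf_addr_bits - BYTE_SELECT_BITS;
}

/* Number of 32-bit words needed to cover the whole address space defined
 * in the ADF. Expects adf_addr_bits in the range 2..33. */
static uint32_t mem_depth_words(unsigned adf_addr_bits)
{
    return (uint32_t)1 << lsu_addr_bits(adf_addr_bits);
}

/* Word address seen by the memory for a byte address used by the program. */
static uint32_t word_address(uint32_t byte_address)
{
    return byte_address >> BYTE_SELECT_BITS;
}

/* Position of the addressed byte within its 32-bit word. */
static unsigned byte_lane(uint32_t byte_address)
{
    return (unsigned)(byte_address & (BYTES_PER_WORD - 1));
}
```

For example, a 16-bit data address space in the ADF corresponds to a 14-bit LSU address bus, which a memory of 16384 words of 32 bits would cover completely. A smaller memory is possible if the program and its data fit in it.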
Connecting the data memory component
The next step is to interface the newly generated data memory component to the TTA core. The LSU interface is the following:
fu_lsu_data_in  : in  std_logic_vector(fu_lsu_dataw-1 downto 0);
fu_lsu_data_out : out std_logic_vector(fu_lsu_dataw-1 downto 0);
fu_lsu_addr     : out std_logic_vector(fu_lsu_addrw-2-1 downto 0);
fu_lsu_mem_en_x : out std_logic_vector(0 downto 0);
fu_lsu_wr_en_x  : out std_logic_vector(0 downto 0);
fu_lsu_bytemask : out std_logic_vector(fu_lsu_dataw/8-1 downto 0);
Meanings of these signals are:
|fu_lsu_data_in||Data from the memory to LSU|
|fu_lsu_data_out||Data from LSU to memory|
|fu_lsu_addr||Address to memory|
|fu_lsu_mem_en_x||Memory enable signal which is active low. The LSU drives this signal to '0' when memory operations are performed. Otherwise it is '1'. Connect this to the memory enable or clock enable signal of the memory controller.|
|fu_lsu_wr_en_x||Write enable signal which is active low. During a write operation this signal is '0'. A read operation is performed when this signal is '1'. Depending on the memory controller you might need to invert this signal.|
|fu_lsu_bytemask||Byte mask / byte enable signal. In this case the signal width is 4 bits and each bit represents a single byte. When the enable bit is '1' the corresponding byte is enabled and value '0' means that the byte is ignored.|
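The behaviour the memory component is expected to show for these signals can be modelled in a few lines of C. This is a behavioural sketch only, not the VHDL you will write. It assumes that bit i of the byte mask enables byte i of the word (bits 8*i+7 down to 8*i); check the byte ordering against your memory controller. The function names are hypothetical.

```c
#include <stdint.h>

/* Expand a 4-bit byte mask into a 32-bit bit mask.
 * Assumption: mask bit i covers data bits 8*i+7 .. 8*i. */
static uint32_t expand_bytemask(unsigned bytemask)
{
    uint32_t bits = 0;
    for (unsigned i = 0; i < 4; ++i) {
        if (bytemask & (1u << i)) {
            bits |= (uint32_t)0xFF << (8 * i);
        }
    }
    return bits;
}

/* Word stored in memory after a write: only the enabled bytes are replaced
 * by the new data, the other bytes keep their old value. */
static uint32_t masked_write(uint32_t old_word, uint32_t data,
                             unsigned bytemask)
{
    uint32_t bits = expand_bytemask(bytemask);
    return (old_word & ~bits) | (data & bits);
}

/* Active-high write strobe derived from the two active-low LSU signals,
 * for memory controllers that expect an active-high write enable. */
static int write_strobe(unsigned mem_en_x, unsigned wr_en_x)
{
    return mem_en_x == 0 && wr_en_x == 0;
}
```

For instance, a byte store with mask 0x1 replaces only the lowest byte of the addressed word, while mask 0xF corresponds to a full 32-bit store.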
Open the file tutorial_processor2.vhdl with your preferred text editor. The comments show where you should add the memory component declaration and the component instantiation. Notice that the LSU signals are connected to wires (signals with the suffix '_w' in the name). Use these wires to connect the memory component.
After you have successfully created the data memory component and connected it, add the rest of the design VHDL files to the design project. All of the files in the proge-output/gcu_ic/ and proge-output/vhdl/ directories need to be added.
The next phase is to connect the toplevel signals to FPGA pins. See the final section of the previous tutorial for more detailed instructions on how to perform the pin mapping.
The final step is to synthesize the design and configure the FPGA board. Then sit back and enjoy the light show.
Whenever you change the source code you need to recompile your program and generate the binary images again. Move the images to the right folder if necessary.
In addition, you can compile the code without optimizations. This way the compiler leaves function calls in place and uses the stack. The compilation command is then:
tcecc -O0 -a tutorial2.adf -o blink.tpef blink_mem.c
Pekka Jääskeläinen 2018-03-12