1 Architecture Simulation and Debugging

The TTA Processor Simulator simulates the process of running a TTA program on its target TTA processor. It provides profiling, utilization, and tracing data for the Explorer, Estimator, and Compiler Backend. Additionally, it offers debugging capabilities.

Input: TPEF, [ADF]

Output: TraceDB

There are two user interfaces to the simulation and debugging functionality: a command line interface, which is better suited for scripting, and a more user-friendly graphical interface, which is better suited for program debugging sessions. Both interfaces provide a console which supports the Tcl scripting language.

1 Processor Simulator CLI (ttasim)

The command line user interface of the TTA Simulator is called 'ttasim'. The same interface is also available in the graphical user interface in the form of a console window. This manual covers the simulator control language used to control the command line simulator and gives examples of its usage.

1 Usage

The usage of the command line user interface of the simulator is as follows:

ttasim <options>

In case of a parallel simulation, a machine description file can be given before the simulated program file. Neither the machine file nor the program file is mandatory; both can also be given by means of the simulator control language.

The possible options for the application are as follows:

Short Name	Long Name	Description
a	adf	Sets the architecture definition file (ADF).
d	debugmode	Start simulator in interactive "debugging mode". This is enabled by default. Use `--no-debugmode` to disable.
e	execute-script	Executes the given string as a simulator control language script. For examples of usage, see later in this section.
p	program	Sets the program to be simulated. The program must be given as a TTA program exchange format file (.TPEF).
q	quick	Simulates the program as fast as possible using the compiled simulation engine.

1 Example: Simulating a Parallel Program Without Entering Interactive Mode

The following command simulates a parallel program until the program ends, without entering the debugging mode after simulation.

ttasim --no-debugmode -a machine.adf -p program.tpef

2 Example: Simulating a Program Until Main Function Without Entering Interactive Mode

The following command simulates a program until its main function and prints the clock cycles consumed so far. This is achieved by utilizing the simulator control language and the '-e' option, which allows entering scripts from the command line.

ttasim --no-debugmode -e "until main; puts [info proc cycles];" -a machine.adf -p program.tpef
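
When the cycle count is needed inside a larger shell script, the invocation above can be wrapped in a function. The following is only a sketch: the function name cycles_until is hypothetical, and it assumes that the cycle count is the last line ttasim writes to standard output, which this manual does not guarantee.

```shell
#!/bin/sh
# cycles_until LABEL ADF TPEF
# Hypothetical helper: simulate TPEF on ADF until LABEL is reached and print
# the value reported by 'info proc cycles'.
# Assumption: the cycle count is the last line ttasim writes to stdout.
cycles_until() {
    label=$1
    adf=$2
    tpef=$3
    # Square brackets are not special inside double quotes, so the Tcl
    # command substitution reaches ttasim unchanged.
    ttasim --no-debugmode \
        -e "until ${label}; puts [info proc cycles];" \
        -a "$adf" -p "$tpef" | tail -n 1
}

# Usage (requires ttasim on PATH):
#   cycles_until main machine.adf program.tpef
```

Keeping the label as a parameter makes it easy to measure the cycles consumed before any global procedure, not only main.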

3 Using the Interactive Debugging Mode

The Simulator is started in debugging mode by default. In interactive mode, the Simulator prints the prompt "(ttasim)" and waits for simulator control language commands. This example uses the simulator control language to load a machine and a program, run the simulation, print the consumed clock cycles, and quit the simulation.

ttasim
(ttasim) mach machine.adf
(ttasim) prog program.tpef
(ttasim) run
(ttasim) info proc cycles
54454
(ttasim) quit

2 Fast Compiled Simulation Engine

The command line version of the Simulator, 'ttasim', supports two simulation engines. The default engine interprets each instruction and then simulates the processor behavior accordingly. While this is adequate for many cases, it can be slow compared to running natively on the host computer. Therefore, the Simulator also has a highly optimized mode that uses compiled simulation techniques to achieve faster simulation. In this mode, the TTA program and machine are compiled into a single binary plug-in file that contains functions for simulating basic blocks directly in native machine code, which makes execution as fast as possible.

1 Usage

1 Example: Simulating a Parallel Program Using The Compiled Simulation Engine

The following command simulates a parallel program using the compiled simulation engine (option '-q').

ttasim -a machine.adf -p program.tpef -q

Currently, the behaviour of the compiled simulation can only be controlled with a limited set of Simulator commands (such as 'stepi', 'run', 'until', 'kill'). Also, the simulation runs only at basic-block granularity, so processor components cannot be inspected between individual cycles.

The following environment variables can be used to control the compiled simulation behavior:

Environment variable	Description	Default value
TTASIM_COMPILER	Specifies the compiler to use.	gcc
TTASIM_COMPILER_FLAGS	Compile flags given to the compiler.	-O0
TTASIM_COMPILER_THREADS	Number of threads used to compile.	3

2 ccache

http://ccache.samba.org/

The compiled simulator can benefit considerably from third-party software. The first tool described here is a compiler cache called ccache. Ccache works by saving compiled binary files into a cache. When ccache notices that a file about to be compiled is identical to a file already in the cache, it simply reuses the cached result, which avoids recompiling unmodified files and saves time. This is very useful when running the same simulation program again, because compilation times are drastically reduced.

3 distcc

http://distcc.samba.org/

Another useful tool to use together with the compiled simulator is a distributed compiler called distcc. Distcc works by distributing the compilation of the simulation engine to multiple computers and compiling the generated source files in parallel.

After installing distcc, you can set ttasim to use the distcc compiler using the following environment variable:

export TTASIM_COMPILER="distcc"

or if ccache is also installed, use:

export TTASIM_COMPILER="ccache distcc"

Also, remember to set the number of compiler threads high enough. A good choice is approximately the number of CPU cores available. For example, 6 compiler threads can be set as follows:

export TTASIM_COMPILER_THREADS=6
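
The environment settings above can be derived automatically from the tools found on the build host. The sketch below is one possible way to do it; the function names are hypothetical, and the value "ccache gcc" for a host with only ccache installed is an assumption based on how ccache is normally used as a compiler prefix, not something this manual states.

```shell
#!/bin/sh
# choose_ttasim_compiler HAVE_CCACHE HAVE_DISTCC
# Prints a value for TTASIM_COMPILER. Each argument is 1 (installed) or 0.
choose_ttasim_compiler() {
    have_ccache=$1
    have_distcc=$2
    if [ "$have_ccache" = 1 ] && [ "$have_distcc" = 1 ]; then
        echo "ccache distcc"
    elif [ "$have_distcc" = 1 ]; then
        echo "distcc"
    elif [ "$have_ccache" = 1 ]; then
        # Assumption: ccache alone wraps the default compiler, gcc.
        echo "ccache gcc"
    else
        echo "gcc"
    fi
}

# has_tool NAME: prints 1 if NAME is found on PATH, otherwise 0.
has_tool() {
    if command -v "$1" >/dev/null 2>&1; then echo 1; else echo 0; fi
}

TTASIM_COMPILER=$(choose_ttasim_compiler "$(has_tool ccache)" "$(has_tool distcc)")
# One compile thread per CPU core; fall back to the documented default of 3
# if the core count cannot be determined.
TTASIM_COMPILER_THREADS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 3)
export TTASIM_COMPILER TTASIM_COMPILER_THREADS
```

Sourcing such a file from a shell profile keeps the compiled simulation settings consistent across sessions.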

3 Remote Debugger

When a TTA has been implemented on an FPGA (or ASIC), ttasim can be used as a remote debug interface to the processor. 'ttasim' can connect to the TCE built-in debugger (under construction) with:

ttasim -a machine.adf -p program.tpef -r

A convenience stub implementation for user-implemented debugging support in the TTA is given in 'tce/src/applibs/Simulator/CustomDBGController.cc'. It is used with:

ttasim -a machine.adf -p program.tpef -c

Integration with 'proxim' is currently missing.

4 Simulator Control Language

This section describes all the Simulator commands that can be entered when the Simulator runs in debug mode. The Simulator displays a new line with the prompt string only when it is ready to accept new commands (the simulation is not running). The running simulation can be interrupted at any time by the key combination CTRL-c. The simulator stops simulation and prompts the user for new commands as if it had been stopped by a breakpoint.

The Simulator control language is based on the Tool Command Language (Tcl). It extends the predefined set of Tcl commands with a set of commands that perform the simulator functions described in this section. In addition to the predefined commands, all basic properties of Tcl (expression evaluation, parameter substitution rules, operators, loop constructs, functions, and so on) are supported.

1 Initialization

When the Simulator is run in debug mode, it automatically reads and executes the initialization command file '.ttasim-init' if one is found in the user's home directory. The '.ttasim-init' file allows the user to define specific simulator settings (described in Section 6.1.4.2, Simulation Settings) which are enabled every time ttasim is executed.

After the initialization command sequence is completed, the Simulator processes the command line options, and then reads the initialization command file with the same name in the current working directory.

After it has processed the initialization files and the command line options, the Simulator is ready to accept new commands, and prompts the user for input. The prompt line contains the string `(ttasim)'.
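
As an illustration, an initialization file can be generated from the shell. This is a minimal sketch: the function name write_ttasim_init is hypothetical, the setting names are taken from the Simulation Settings list, and the chosen values and the history file name are only examples.

```shell
#!/bin/sh
# write_ttasim_init DIR
# Writes a sample initialization command file DIR/.ttasim-init.
# The setting names come from the Simulation Settings list; the values and
# the history file name are only examples.
write_ttasim_init() {
    dir=$1
    cat > "$dir/.ttasim-init" <<'EOF'
setting history_save 1
setting history_filename ttasim_history.log
setting next_instruction_printing 1
EOF
}

# Usage:
#   write_ttasim_init "$HOME"   # read first, on every start in debug mode
#   write_ttasim_init .         # read after the command line options
```

Because the file in the current working directory is read last, it is a natural place for project-specific settings.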

2 Simulation Settings

Simulation settings are inspected and modified with the following commands.

setting variable value: Sets a new value of environment variable variable.
setting variable: Prints the current value contained by environment variable variable.
setting: Prints all settings and their current values.

Currently, the following settings are supported.

bus_trace boolean: Enables writing of the bus trace. Bus trace stores values written to each bus in each simulated clock cycle.
execution_trace boolean: Enables writing of the basic execution trace. Basic execution trace stores the address of the executed instruction in each simulated clock cycle.
history_filename string: The name of the file to store the command history, if command history saving is enabled.
history_save boolean: Enables saving command history to a file.
history_size integer: Maximum number of most recent commands stored in memory. This does not affect the command history log: all commands are written to the log if logging is enabled.
next_instruction_printing boolean: Print the next executed instruction when simulation stops, for example, after single-stepping or at a breakpoint.
procedure_transfer_tracking boolean: Enables procedure transfer tracking. This trace can be used to easily observe which procedures were called and in which order. The trace is saved in the 'procedure_transfer' table of the Trace DB. The same information could be derived from 'execution_trace', but simulation is very slow when that setting is enabled; this type of tracking should be faster.
profile_data_saving boolean: Save program profile data to trace database after simulation.
rf_tracking boolean: Enables concurrent register file access tracking. This type of tracking slows the simulation down considerably, so it is not enabled by default. The produced statistics can be browsed after simulation with the command 'info proc stats'.
simulation_time_statistics boolean: Prints time statistics for the last command run (run, until, nexti, stepi).
simulation_timeout integer: Stops the simulation after specified timeout. Value of zero means no timeout.
static_compilation boolean: Switch between static and dynamic compilation when running compiled simulation.
utilization_data_saving boolean: Save processor utilization data to trace database after simulation.

3 Control of How the Simulation Runs

The commands described in this section control the simulation process.

Before simulation can start, a program must be loaded into the Simulator. If no program is loaded, the command run causes the following message:

  Simulation not initialized.

run: Starts simulation of the program currently loaded into the Simulator. The program can be loaded by prog command (see Section 6.1.4.6) or may be given directly as argument, on the command line. Simulation runs until either a breakpoint is encountered or the program terminates.
resume [count]: Resumes simulation of the program until the simulation is finished or a breakpoint is reached. The count argument gives the number of times the resume command is repeated, that is, the number of times breakpoints should be ignored.
stepi [count]: Advances simulation to the next machine instruction, stepping into the first instruction of a new procedure if a function call is simulated. The count argument gives the number of machine instructions to simulate.
nexti [count]: Advances simulation to the next machine instruction in the current procedure. If the instruction contains a function call, simulation proceeds until control returns from it, to the instruction past the function call. The count argument gives the number of machine instructions to simulate.
until [arg]: Continue running until the program location specified by arg is reached. Any valid argument that applies to command break (see Section 6.1.4.5) is also a valid argument for until. If the argument is omitted, the implied program location is the next instruction. In practice, this command is useful when simulation control is inside a loop and the given location is outside it: simulation will continue for as many iterations as required in order to exit the loop (and reach the designated program location).
kill: Terminate the simulation. The program being simulated remains loaded and the simulation can be restarted from the beginning by means of command run. The Simulator will prompt the user for confirmation before terminating the simulation.
quit: This command is used to terminate simulation and exit the Simulator.

4 Examining and Modifying Program Code and Data

The Simulator makes it possible to examine and modify the program being simulated and the data it uses.

x [/anfu][addr]

This low-level command prints the data in memory starting at the specified address addr. The optional parameters n and u specify how much memory to display and how to format it.

a: Parameter [/a address_space] can be used to select the address space if there are multiple address spaces in the target machine.
n: Repeat count: how many data words (counting by units u) to display. If omitted, it defaults to 1.
f: Target filename. Setting this causes the memory contents to be printed as binary data to the given file.
u: Unit size: 'b' (MAU, a byte in byte-addressed memories), 'h' (double MAU), 'w' (quadruple MAU, a 'word' in byte-addressed 32-bit architectures). The unit size is ignored for formats 's' and 'i'.

If addr is omitted, then the first address past the last address displayed by the previous x command is implied. If the value of n or u is not specified, the value given in the most recent x command is maintained.

The values printed by command x are not entered in the value history (see Section 6.1.4.9).

load_data [/a address_space] address file [size]

Reads binary data from file to the specified address in memory. The optional parameter /a address_space can be used to select the address space if there are multiple address spaces in the target machine. The optional parameter size specifies the read size in bytes.

symbol_address datasym

Returns the address of the given data symbol (usually a global variable).

disassemble [addr1 [addr2]]

Prints a range of memory addresses as machine instructions. When two arguments addr1, addr2 are given, addr1 specifies the first address of the range to display, and addr2 specifies the last address (not displayed). If only one argument, addr1, is given, then the function that contains addr1 is disassembled. If no argument is given, the default memory range is the function surrounding the program counter of the selected frame.

5 Control Where and When to Stop Program Simulation

A breakpoint stops the simulation whenever the Simulator reaches a certain point in the program. It is possible to add a condition to a breakpoint, to control when the Simulator must stop with increased precision. There are two kinds of breakpoints: breakpoints (proper) and watchpoints. A watchpoint is a special breakpoint that stops simulation as soon as the value of an expression changes.

Each breakpoint or watchpoint is listed by its number num and a description, where num is a unique number that identifies the breakpoint or watchpoint and description describes the properties of the breakpoint. The properties include: whether the breakpoint must be deleted or disabled after it is reached; whether the breakpoint is currently disabled; the program address of the breakpoint, in case of a program breakpoint; the expression that, when modified by the program, causes the Simulator to stop, in case of a watchpoint.

bp address: Sets a breakpoint at address address. The argument can also be a code label, such as a global procedure name (e.g. 'main').
bp args if: Sets a conditional breakpoint. The arguments args are the same as for unconditional breakpoints. After this command is entered, the Simulator prompts for the condition expression. The condition is evaluated each time the breakpoint is reached, and the simulation stops only when the condition evaluates as true.
tbp args: Sets a temporary breakpoint, which is automatically deleted after the first time it stops the simulation. The arguments args are the same as for the bp command. Conditional temporary breakpoints are also possible (see command condition below).
watch expr: Sets a watchpoint for the expression expr. The Simulator will stop when the value of the given expression is modified by the program. Conditional watchpoints are also possible (see command condition below).
condition [num] [expr]: Specifies a condition under which breakpoint num stops simulation. The Simulator evaluates the expression expr whenever the breakpoint is reached, and stops simulation only if the expression evaluates as true (nonzero). The Simulator checks expr for syntactic correctness as the expression is entered.
When condition is given without expression argument, it removes any condition attached to the breakpoint, which becomes an ordinary unconditional breakpoint.
ignore [num] [count]: Sets the number of times the breakpoint num must be ignored when reached. A count value zero means that the breakpoint will stop simulation next time it is reached.
enablebp [delete|once] [num ...]: Enables the breakpoint specified by num. If once flag is specified, the breakpoint will be automatically disabled after it is reached once. If delete flag is specified, the breakpoint will be automatically deleted after it is reached once.
disablebp [num ...]: Disables the breakpoint specified by num. A disabled breakpoint has no effect, but all its options (ignore-counts, conditions and commands) are remembered in case the breakpoint is enabled again.
deletebp [num ...]: Deletes the breakpoint specified by num. If no arguments are given, deletes all breakpoints currently set, asking first for confirmation.
info breakpoints [num]: Prints a table of all breakpoints and watchpoints. Each breakpoint is printed in a separate line. The two commands are synonymous.
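
To illustrate how bp, run, and resume work together, the following sketch generates a simulator command file that prints the cycle count each time a breakpoint is hit. The function name breakpoint_script is hypothetical. The sketch assumes that the machine and program are already loaded when the file is executed, and that the breakpoint is reached at least as many times as requested; otherwise a command in the file may fail.

```shell
#!/bin/sh
# breakpoint_script LABEL HITS
# Prints simulator commands that set a breakpoint at LABEL, start the
# simulation, and print the cycle count at each of the first HITS stops.
# Assumption: LABEL is reached at least HITS times during the run.
breakpoint_script() {
    label=$1
    hits=$2
    printf '%s\n' "bp $label" "run"
    i=0
    while [ "$i" -lt "$hits" ]; do
        # The simulation is stopped at the breakpoint here.
        printf '%s\n' "puts [info proc cycles]" "resume"
        i=$((i + 1))
    done
}

# Usage: breakpoint_script main 2 > bp_cycles.cmd
# Then, in ttasim, after loading the machine and the program:
#   source bp_cycles.cmd
```

The file name bp_cycles.cmd is arbitrary. The generated commands can also be typed at the prompt one by one.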

6 Specifying Files and Directories

The Simulator needs to know the file name of the program to simulate/debug and, usually, the Architecture Definition File (ADF) that describes the architecture of the target processor on which the program is going to run.

prog [filename]

Load the program to be simulated from file filename. If no directory is specified with set directory, the Simulator will search in the current directory.

If no argument is specified, the Simulator discards any information it has on the program.

mach [filename]

Load the machine to be simulated from file filename. If no directory is specified with set directory, the Simulator will search in the current directory.

If an attempt is made to simulate a parallel program without a machine, an error message is printed and the simulation is terminated immediately. In some cases the machine file can be stored in the TPEF file.

conf [filename]

Load the processor configuration to be simulated from file filename. If no directory is specified with set directory, the Simulator will search in the current directory.

The Simulator expects to find the simulated machine in the processor configuration file. Other settings are ignored. This command can be used as a replacement for the mach command.

7 Examining State of Target Processor and Simulation

The current contents of any programmer visible state, which includes any programmable register, bus, or the last data word read from or written to a port, can be displayed. The value is displayed in base 10 to allow using it easily in Tcl expressions or conditions. This makes it possible, for example, to set a conditional breakpoint which stops simulation only if the value of some register is greater than some constant.

info proc cycles

Displays the total execution cycle count and the total stall cycles count.

info proc mapping

Displays the address spaces and the address ranges occupied by the program: address space, start and end address occupied, size.

info proc stats

In case of parallel simulation, displays current processor utilization statistics. If the 'rf_tracking' setting is enabled during a parallel simulation, also lists detailed register file access information.

info regfiles

Prints the name of all the register files of the target processor.

info registers regfile [regname]

Prints the value of register regname in register file regfile, where regfile is the name of a register file of the target processor, and regname is the name of a register that belongs to the specified register file.

If regname is omitted, the value of all registers of the specified register file is displayed.

info funits

Prints the name of all function units of the target processor.

info iunits

Prints the name of all immediate units of the target processor.

info immediates iunit [regname]

Prints the value of immediate register regname in immediate unit iunit, where iunit is the name of an immediate unit of the target processor, and regname is the name of a register that belongs to the specified unit.

If regname is omitted, the value of all registers of the specified immediate unit is displayed.

info ports unit [portname]

Prints the last data word read from or written to port portname of unit unit, where unit may be any function unit, register file or immediate unit of the target processor. The value of the data word is relative to the selected stack frame.

If portname is omitted, the last value on every port of the specified unit is displayed.

info busses [busname]

Displays the name of all bus segments of transport bus busname. If the argument is omitted, displays the name of the segments of all busses of the target processor.

info segments bus [segmentname]

Prints the value currently transported by bus segment segmentname of the transport bus bus.

If no segment name is given, the Simulator displays the contents of all segments of transport bus bus.

info program

Displays information about the status of the program: whether it is loaded or running, why it stopped.

info program is_instruction_reference ins_addr move_index

Returns 1 if the source of the given move refers to an instruction address, 0 otherwise.

info stats executed_operations

Prints the total count of executed operations.

info stats register_reads

Prints the total count of register reads.

info stats register_writes

Prints the total count of register writes.

8 Miscellaneous Support Commands and Features

The following commands provide finer control over the behaviour of the simulation control language.

help [command]: Prints a help message briefly describing command command. If no argument is given, prints a general help message and a listing of supported commands.

9 Command and Value History Logs

All commands given during a simulation/debugging session are saved in a command history log. This forms a complete log of the session, and can be stored or reloaded at any moment. By loading and running a complete session log, it is possible to resume the same state in which the session was saved.

It is possible to run a sequence of commands stored in a command file at any time during simulation in debug mode using the source command. The lines in a command file are executed sequentially and are not printed as they are executed. An error in any command terminates execution of the command file.

commands [num]: Displays the last num commands in the command history log. If the argument is omitted, the num value defaults to 10.
source filename: Executes the command file filename.

5 Traces

Simulation traces are stored in a SQLite 3 binary file and multiple plain ASCII text files. The SQLite file is named after the program file by appending '.trace' to its name. The additional trace files append yet another extension to this, such as '.calls' for the call profile and '.profile' for the instruction execution profile. The SQLite file can be browsed by executing SQL queries using the sqlite client, and the plain text files can be browsed using any text viewer or editor.

By default, simulation traces are written to the same directory as the loaded program file. This directory can be overridden by setting the environment variable TTASIM_TRACE_DIR to point to the desired location.
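
The naming rule can be expressed as a small shell helper that computes where the trace database of a program is expected to be. This is a sketch under assumptions: the function name trace_db_path is hypothetical, and it assumes that TTASIM_TRACE_DIR changes only the directory and not the file name.

```shell
#!/bin/sh
# trace_db_path PROGRAM
# Prints the expected path of the SQLite trace file for PROGRAM.
# Assumption: TTASIM_TRACE_DIR replaces only the directory part.
trace_db_path() {
    program=$1
    # Use the override directory if it is set, else the program's directory.
    dir=${TTASIM_TRACE_DIR:-$(dirname "$program")}
    echo "$dir/$(basename "$program").trace"
}

# Usage: list the tables of the trace database with the sqlite3 client.
#   sqlite3 "$(trace_db_path program.tpef)" .tables
```

Listing the tables first avoids guessing at table names before writing SQL queries against the trace.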

1 Profile Data

The simulator is able to produce enough data to provide an inclusive call profile. In order to visualize this data in a call graph, the kcachegrind GUI tool can be used.

First, produce a trace by running the following commands before initializing the simulation by loading a machine and a program:

 setting profile_data_saving 1
 setting procedure_transfer_tracking 1

It is recommended to produce an assembly file from the program, so that the profile data contains information about the program and kcachegrind is able to show the assembly lines for the cost data:

 tcedisasm -F mymachine.adf myprogram.tpef

After this, the simulation should collect the information necessary to build a kcachegrind-compatible trace file. The file can be produced with a helper script shipped with TCE as follows:

 generate_cachegrind myprogram.tpef.trace

This command generates a file myprogram.tpef.trace.cachegrind, which can be loaded into kcachegrind to visualize the inclusive call profile:

 kcachegrind myprogram.tpef.trace.cachegrind

Alternatively, the call profile can be dumped to the command line using the 'callgrind_annotate' tool from the 'valgrind' package:

 callgrind_annotate myprogram.tpef.trace.cachegrind --inclusive=yes

If --inclusive=yes is not given, an exclusive call profile is printed. An exclusive profile shows the cycles spent in each function itself and does not include the cost of the functions it calls.
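
The order of the simulator commands matters in this workflow, because the two settings must be in effect before the machine and the program are loaded. The sketch below generates the commands in that order; the function name profile_commands and the file name profile.cmd are hypothetical.

```shell
#!/bin/sh
# profile_commands ADF TPEF
# Prints simulator commands for a profiling run. The settings come first
# because they must be in effect before the machine and program are loaded.
profile_commands() {
    adf=$1
    tpef=$2
    printf '%s\n' \
        "setting profile_data_saving 1" \
        "setting procedure_transfer_tracking 1" \
        "mach $adf" \
        "prog $tpef" \
        "run"
}

# Possible use:
#   profile_commands mymachine.adf myprogram.tpef > profile.cmd
#   ttasim                       # then enter: source profile.cmd
#   tcedisasm -F mymachine.adf myprogram.tpef
#   generate_cachegrind myprogram.tpef.trace
#   callgrind_annotate myprogram.tpef.trace.cachegrind --inclusive=yes
```

Generating the command file once and executing it with the source command keeps repeated profiling runs identical.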

6 Processor Simulator GUI (Proxim)

Processor Simulator GUI (Proxim) is a graphical frontend for the TTA Processor Simulator.

1 Usage

This section is intended to familiarize the reader with the basic usage of Proxim. It covers only the most common tasks, to get the user started with the Simulator GUI.

The following windows are available:

Machine State window: Displays the state of the simulated processor.
Disassembly window: Displays the machine-level source code of the simulated application.
Simulator console: Controls the simulator using the simulator control language.
Simulation Control window: Floating tool window with shortcut buttons for items in the Program menu.

1 Console Window

Textual output from the simulator and all commands sent to the simulator engine are displayed in the Simulator Console window, as well as the input and output of the simulated program. Using the window, the simulator can be controlled directly with the Simulator Control Language, which is also accepted by the command line interface of the simulator. For a list of available commands, enter 'help' in the console.

Most of the commands can be executed using graphical dialogs and menus, but the console allows faster access to simulator functionality for users familiar with the Simulator Control Language. Additionally, all commands performed using the GUI are echoed to the console, and appended to the console command history.

The console keeps track of performed commands in command history. Commands in the command history can be previewed and reused either by selecting the command using up and down arrow keys in the console window, or by selecting the command from the Command History.

The Command menu in the main window menubar contains all GUI functionality related to the console window.

2 Simulation Control Window

Running simulation can be controlled using the Simulation Control window.

The window buttons work as follows:

Run/Stop: If simulation is not running, the button is labeled 'Run', and it starts simulation of the program loaded in the simulator. If simulation is running, the button is labeled 'Stop', and it stops the simulation.
Stepi: Advances simulation to the next machine instruction.
Nexti: Advances simulation to the next machine instruction in the current procedure.
Continue: Resumes simulation of the program until the simulation is finished or a breakpoint is reached.
Kill: Terminates the simulation. The program being simulated remains loaded and the simulation can be restarted from the beginning.

3 Disassembly Window

The disassembly window displays the machine code of the simulated program, one instruction per line. The instruction address and the instruction moves are displayed on each line. Right-clicking an instruction displays a context menu with the following items:

Toggle breakpoint: Sets a breakpoint or deletes existing breakpoint at the selected instruction.
Edit breakpoint...: Opens selected breakpoint in Breakpoint Properties dialog.

4 Machine State Window

The Machine State Window displays the state of the processor running the simulated program. The window is split horizontally into two subwindows. The subwindow on the left is called the Status Window, and it displays general information about the state of the processor, the simulation, and the selected processor block. The subwindow on the right, called the Machine Window, displays the machine running the simulation.

The blocks used by the current instruction are drawn in red. The block utilization is updated every time the simulation stops.

Blocks can be selected by clicking them with the left mouse button. When a block is selected, the bottom of the Status Window shows the status of the selected block.

2 Profiling with Proxim

Proxim offers simple methods for profiling your program. After you have executed your program, you can select 'Source' -> 'Profile data' -> 'Highlight top execution count' from the top menu. This opens a dialog which shows the execution counts of various instruction address ranges. The list is sorted in descending order by execution count.

If you click a line in the list, the disassembly window focuses on the address range specified on that line. To find out which function an address range belongs to, scroll the disassembly window up until you find a label which identifies the function. You need at least some understanding of assembly code to find the actual spot in the C code that produces the assembly code.

Pekka Jääskeläinen 2018-03-12