For next time: Edit the instructions to include...
Here is what I propose you may want do for lab 3:
(1) Recompile without optimization to get a .s file.
(2) One loop will have one fewer instruction. Insert that
missing instruction.
(3) Compile the .s into an executable. I found I had to do so on a
P3 to get machine code which would run on either platform without
seg faulting. (Not sure why yet.)
(4) When you take data, TAKE ONLY MINIMAL DATA. In particular,
you might just take data on the two values n=100 and n=200.
You can get a clear slope from that. (If you want to confirm the
slope, you can take n=300 as well.)
Now you can focus your attention on content rather than on the
presentation data; most of the data was tangential to your main points
anyway.
We've composed a list of processors
found in the lab which are Pentium III's. You guys are usually
working on Pentium 4's. In any case, be sure to double-check by
looking at /proc/cpuinfo
You should be able to finish this lab quite quickly, and I therefore expect each lab group to find a way to enhance their lab report by conducting an additional experiment. I recommend that the experiment be simple and that you investigate other differences between the Intel Pentium II (or III) and Pentium 4 pipelines.
j instruction,
while the other jumps to an address stored in a register, like the
MIPS jr instruction. (The register is initialized before
the loop with the address of the label at the top of the loop.)
The two computers you will use are not variants of the MIPS architecture,
but rather of the IA-32 architecture:
You can find out what kind of processor a computer has by in a shell window giving the command
cat /proc/cpuinfoYour lab report should specify the particular processor model name.
It would be unreasonable for me to expect you to learn Intel's assembly language after having just struggled with MIPS's. So instead, you'll use a different approach: we'll write the programs in C and compiling them with the gcc compiler. You'll then look at the assembly language output produced by the compiler so that you can see the difference between the two versions, but it is much easier to read an unfamiliar assembly language than to write it.
loop.cTo compile this program using gcc, you can use the following two commands:
gcc -S loop.c gcc -o loop loop.sThe first command compiles the C program loop.c and stops at assembly language (that is what the -S option does). This assembly language version is left in the file loop.s. The second command says to compile (actually assemble) the loop.s assembly language program the rest of the way to an executable machine language program, which it should put in the output file loop. (The -o option just says that what follows it is the output file.)
In order to run the loop program, choose a number of iterations n, and issue the command
./loop nThe program will then print out the number of cycles the loop took. You can repeat this several times with the same value of n to get an idea how repeatable it is, and repeat it with several different values of n to see how the number of cycles grows with n, in particular, how many additional cycles are done for each additional trip through the loop. If there is some variation in the cycle counts for a particular n, you should use the smallest rather than taking an average. This is because we only want to pay attention to cycles actually spent in the processor, not the extra cycles that can sometimes be needed because of memory delays or other effects outside of chapter 6's scope.
i to 0. Therefore, you
should calculate marginal, not average, values in this lab.
goto loop;This is what gets translated by the C compiler into a jump to a label. We want to change this to the C equivalent of jumping to an address specified by a register, which is using goto with the target pointed to by a variable. This can't be done in standard C, but can using a non-standard extension to C that the gcc compiler provides. To learn about this non-standard extension, you'll need to look at the gcc documentation. Here are two ways to locate the documentation:
C-h i. In emacs
info, you can select a menu item by moving the cursor to it and
hitting return, or by typing m, followed by the beginning
of the name of the menu item. (Hit space for name completion.)
Select menu gcc, submenu C Extensions, and
finally Labels as Values.
foo is used in the example without explanation. The idea
is that this is a label appearing in part of the example program that
isn't shown. That is, some statement would be labeled with
foo: (note the colon here). Then after assigning
&&foo to ptr, the statement
goto *ptr; has the same effect as goto foo;
would.
You can compile and run the new version of the program the same way as
the original. You should also compare the two assembly language
versions (the .s files) to make sure that the loop is really the same
except for the jump instruction at the end, and possibly which
specific register has been assigned to hold each variable. The jump
instruction in the IA-32 assembly language is jmp. This
is used for both jumping to a label (MIPS's j) and
jumping via a register (MIPS's jr). The difference is
that if you are jumping to a label, the label appears after the
jmp, while if you are jumping to a location specified by
a register, the jmp is followed by a * and
then the register name. The labels generated by gcc start with .L,
and the register names all start with % (which is like $ for MIPS).
If you have trouble identifying what lines of assembly language are
the loop, or understanding the difference, please ask for help.
You may observe another difference between the loops in the
.s files. While the loops should have different jump
instructions, they should not differ in other ways, since that
would introduce too many variables. You'd like to just compare how the
two jump instructions affect the pipelines. Consequently, you should
edit one the shorter .s files so that it includes the
additional instruction in the loop of the larger .s file.
Note, by the way, that neither version of the program is good C programming style, and there certainly is no reason why a programmer would use the second version, where a variable (or register) is used to jump to the fixed address at the top of the loop. However, these programs let us try out in a simple context the two forms of jump instructions that real programs do use. For example, jumping to an address contained in a register is an important part of how modern object-oriented languages are implemented.
If one goal is to compare data, it's best of all the data to be compared is in one table or graph so that the reader doesn't have to glance back and forth between two tables or graphs.
Graphs can be useful for showing your data, but you need to be careful about some common pitfalls, particularly if you use a computer to generate the graphs rather than doing it by hand:
In your lab report, you should show what the difference in assembly language code is, explain how you determined the cycles-per-iteration and CPI of each version on each processor, and provide your results. You should also try to draw some bigger picture conclusions from these results. What can you say about the pipelines of the two processors? How coping with control hazards is influenced by the kind of jump?