ILP: Instruction-Level Parallelism

Mara Cristina Maia da Silva
Departamento de Informática
Universidade do Minho
4710 Braga, Portugal
Abstract. In the quest to build high-performance computers, Intel has incorporated into its
processors a series of technologies aimed at improving performance. Techniques such as
pipelining, superscalar architecture, and superpipelining were added to its architecture to
create fast machines capable of processing and executing several instructions simultaneously,
each at a different stage. This gave rise to instruction-level parallelism, the approach
described in this article.
1. Introduction
Increasing computer performance has, without a doubt, been one of the main concerns
of the technological world for quite some time. Processor performance is studied
intensively in this context, and the parallel execution of instructions is increasingly
used to achieve this goal. The best-known approach is the use of pipelines. A large
proportion of processors use this structure to overlap the execution of instructions and
so improve their performance. Aiming at high-performance machines, Intel developed
processors with a complex instruction set and incorporated a more sophisticated design
into each new version. The use of pipelines has perhaps been the biggest step in the
evolution of processor architecture. This structure has been reorganized and extended
into what is currently known as superscalar architecture.
2. Pentium
Characteristics
• Two 5-stage integer pipelines and one 8-stage floating-point pipeline.
• Superscalar architecture:
o It can execute, simultaneously, one instruction in pipeline U and another in
pipeline V. In some circumstances the integer and floating-point pipelines can
also work in parallel [1].
• Vector architecture: executes vector instructions on arrays of data. Each instruction
therefore involves a chain of repeated operations (ideal for the use of pipelines). In
computing, a vector is a set of data items, all of the same type, stored in memory in
an ordered form. A vector processor has appropriate resources for executing vector
instructions, such as vector registers and functional pipelines, which carry out the
operations (arithmetic, logical, storage, etc.) on the vectors [1].
Organization of the Pipeline
• The main characteristics of the Pentium pipeline are: branch prediction capability;
hardware support for code optimization; and logic for detecting and handling
self-modifying code.
• Each of the two pipelines consists basically of 5 stages: instruction fetch (from the
instruction cache); instruction decode; address generation; execution; and
write-back.
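As a rough illustration of why overlapping the five stages helps, the sketch below compares cycle counts with and without pipelining. It assumes an idealized pipeline in which every stage takes exactly one clock cycle and no stalls occur; the function names are illustrative and are not taken from any Intel documentation.

```python
def sequential_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles needed when each instruction finishes every stage before the next starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles needed in an ideal pipeline: fill it once, then complete one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    n = 100
    seq = sequential_cycles(n)
    pipe = pipelined_cycles(n)
    print(f"sequential: {seq} cycles, pipelined: {pipe} cycles, speedup: {seq / pipe:.2f}x")
```

Under these assumptions, 100 instructions take 500 cycles without overlap and 104 cycles with it, so the speedup approaches the number of stages as the instruction count grows.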
The superscalar organization of the two pipelines, U and V, allows the Pentium to fetch
and decode two instructions simultaneously. The operation proceeds as follows:
• Verify whether two consecutive instructions can be executed in parallel:
o both must be simple instructions, in other words, they do not require
microcode and can be executed in one machine cycle;
o the second instruction must not depend on the result of the first.
• If pairing is possible, each instruction is sent to one pipeline (U and V).
• If it is not possible, the first instruction goes to pipeline U, and the second
instruction is compared with a third instruction to identify possible parallelism.
If that is not possible either, the second instruction also goes to pipeline U.
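The pairing steps above can be sketched as a small simulation. This is a minimal model under stated assumptions, not the actual Pentium pairing logic, which has more rules than the two checks listed in the text. The `Instr` record, its fields, and the functions `can_pair` and `issue` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    text: str                # assembly text, used only for display
    dest: Optional[str]      # register written, if any
    srcs: Tuple[str, ...]    # registers read
    simple: bool = True      # True if no microcode is needed and it runs in one cycle


def can_pair(first: Instr, second: Instr) -> bool:
    """Apply the two checks from the text to a candidate pair."""
    if not (first.simple and second.simple):
        return False
    # The second instruction must not depend on the result of the first.
    if first.dest is not None and first.dest in second.srcs:
        return False
    return True


def issue(program: List[Instr]) -> List[Tuple[Instr, Optional[Instr]]]:
    """Return one (U, V) tuple per cycle; V is None when no pairing was possible."""
    cycles: List[Tuple[Instr, Optional[Instr]]] = []
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_pair(program[i], program[i + 1]):
            cycles.append((program[i], program[i + 1]))  # both pipelines used
            i += 2
        else:
            cycles.append((program[i], None))            # U only; retry with the next pair
            i += 1
    return cycles


if __name__ == "__main__":
    program = [
        Instr("mov eax, 1", "eax", ()),
        Instr("add ebx, eax", "ebx", ("ebx", "eax")),  # reads eax: cannot pair with the mov
        Instr("mov ecx, 2", "ecx", ()),
        Instr("mov edx, 3", "edx", ()),
    ]
    for u, v in issue(program):
        print(f"U: {u.text:<14} V: {v.text if v else '(empty)'}")
```

In this example the first `mov` issues alone in U because the `add` reads its result; the `add` is then paired with the following `mov`, and the last instruction issues alone, for three cycles in total.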
3. Pentium Pro
Characteristics
This architecture introduced a more consistent superscalar design.
• Dynamic execution with:
o multiple branch prediction, with more than 90% success;
o data flow analysis, which allows the processor to execute instructions out of
order, independently of their original sequence;
o speculative execution of instructions.
• 14-stage superpipeline divided into 3 sections, with a superscalar architecture.
Three-way superscalar architecture ("3x pipelines"). Internally, the Pentium Pro
functions as if there were three processors in parallel, and is capable of executing up to
three instructions per internal clock cycle.
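Data flow analysis combined with three-way issue can be illustrated with a toy scheduler. The sketch below is a simplification under several assumptions: every instruction takes one cycle, only true (read-after-write) dependencies matter, and branches are ignored. The tuple layout and the names `true_dependencies` and `schedule` are hypothetical and do not describe the real Pentium Pro hardware.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Each instruction is modeled as (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]


def true_dependencies(program: Sequence[Instr]) -> List[Set[int]]:
    """For each instruction, the indices of earlier instructions whose results it reads."""
    last_writer: Dict[str, int] = {}
    deps: List[Set[int]] = []
    for index, (dest, srcs) in enumerate(program):
        deps.append({last_writer[r] for r in srcs if r in last_writer})
        last_writer[dest] = index
    return deps


def schedule(program: Sequence[Instr], width: int = 3) -> List[List[int]]:
    """Greedy out-of-order issue: each cycle, start up to `width` ready instructions."""
    deps = true_dependencies(program)
    done: Set[int] = set()
    pending = list(range(len(program)))
    cycles: List[List[int]] = []
    while pending:
        # An instruction is ready once every producer it depends on has completed.
        ready = [i for i in pending if deps[i] <= done][:width]
        cycles.append(ready)
        done.update(ready)
        pending = [i for i in pending if i not in ready]
    return cycles


if __name__ == "__main__":
    program = [
        ("r1", ("r0",)),        # 0
        ("r2", ("r1",)),        # 1: needs the result of 0
        ("r3", ("r0",)),        # 2: independent
        ("r4", ("r0",)),        # 3: independent
        ("r5", ("r2", "r3")),   # 4: needs the results of 1 and 2
    ]
    print(schedule(program))    # [[0, 2, 3], [1], [4]]
```

Instructions 2 and 3 overtake instruction 1 because they do not depend on it, which is the out-of-order behavior described above; the five instructions finish in three cycles instead of five.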
4. Pentium II
Characteristics
Basically the same as the Pentium Pro.
Pipeline in 3 parts:
• Fetch & Decode Unit
• Dispatch / Execute Unit
• Retirement Unit
5. Pentium III
Characteristics
What basically distinguishes it from the Pentium II is an attempt to gain speed by
extending the instruction set.
• Microarchitecture with dynamic execution:
o speculative execution;
o multiple branch prediction;
o data flow analysis.
By analyzing the data dependencies between instructions, the processor creates a
reordered version of the instruction stream to optimize performance.
6. Pentium 4
Characteristics
Hyper-pipeline technology:
• At least 20 stages.
• Some micro-operations need multiple execution stages.
• The design is more complex than the 5-stage pipeline used in x86 processors up
to the Pentium [3].
Because the pipeline had so many stages, a stall caused a considerable loss of
performance, since many clock cycles were wasted while the pipeline was halted.
Although the biggest advantage of this design was its high clock speed, these losses
compromised its performance [4].
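The trade-off described above, a higher clock speed against a larger stall cost, can be made concrete with a back-of-the-envelope model. All numbers below (clock frequencies, penalties, and the stall rate) are invented for illustration and are not measurements of any real processor; the model also assumes that a stall costs roughly one cycle per pipeline stage that must be refilled.

```python
def time_per_instruction_ns(
    clock_ghz: float,
    stall_penalty_cycles: int,
    stalls_per_instruction: float,
    base_cpi: float = 1.0,
) -> float:
    """Average time per instruction for a simple stall model.

    effective CPI = base CPI + (stalls per instruction) * (cycles lost per stall)
    """
    effective_cpi = base_cpi + stalls_per_instruction * stall_penalty_cycles
    return effective_cpi / clock_ghz


if __name__ == "__main__":
    # Hypothetical 5-stage design at 1 GHz versus a hypothetical 20-stage design at 2 GHz.
    shallow = time_per_instruction_ns(1.0, stall_penalty_cycles=4, stalls_per_instruction=0.02)
    deep = time_per_instruction_ns(2.0, stall_penalty_cycles=19, stalls_per_instruction=0.02)
    print(f"shallow: {shallow:.2f} ns, deep: {deep:.2f} ns, gain: {shallow / deep:.2f}x")
```

With these assumed values the deeper pipeline is still faster, but the doubled clock yields a gain of about 1.57x rather than 2x, because each stall wastes far more cycles.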
7. Conclusion
Increasing the number of pipelines in Intel processors increased their performance: more
instructions could be executed in parallel within a single clock cycle. As a result, a
given sequence of instructions is executed in less time.
References
[1] Architecture and Organization of Computers, William Stallings, 5th edition,
Prentice Hall, 2002
[2] www.scholar.google.com.br, http://gppd.inf.ufrgs.br/projects/apse/papers/simman.pd),
1999
[3] Architecture of Computers, Cesar A. F. De Rose (article published on the Internet), 2004
[4] Processador Pentium 4, Sandro Rogério Pereira (article published on the Internet), 2000
[5] Architecture of Computers I, Paulo Cesar Centoducatte (article,
www.ic.unicamp.br/~ducatte), 2002