The Pentium Pro is a sixth-generation x86 microprocessor developed and manufactured by Intel and introduced on November 1, 1995. It introduced the P6 microarchitecture (sometimes termed i686) and was originally intended to replace the original Pentium in a full range of applications. While the Pentium and Pentium MMX had 3.1 and 4.5 million transistors, respectively, the Pentium Pro contained 5.5 million transistors. Later, it was reduced to a narrower role as a server and high-end desktop processor and was used in supercomputers like ASCI Red, the first computer to reach the trillion floating-point operations per second (teraFLOPS) performance mark in 1996. The Pentium Pro was capable of both dual- and quad-processor configurations. It came in only one form factor, the relatively large rectangular Socket 8. The Pentium Pro was succeeded by the Pentium II Xeon in 1998.

Microarchitecture

Figures: block diagram of the Pentium Pro's microarchitecture; 200 MHz Pentium Pro with a 512 KB L2 cache in PGA package; 200 MHz Pentium Pro with a 1 MB L2 cache in PPGA package.

The lead architect of the Pentium Pro was Fred Pollack, who specialized in superscalarity and had also worked as the lead engineer of the Intel iAPX 432. The Pentium Pro incorporated a new microarchitecture, different from the Pentium's P5 microarchitecture. It has a decoupled, 14-stage superpipelined architecture which uses an instruction pool. The Pentium Pro (P6) implemented many radical architectural differences mirroring other contemporary x86 designs such as the NexGen Nx586 and Cyrix 6x86. The Pentium Pro pipeline had extra decode stages to dynamically translate IA-32 instructions into buffered micro-operation sequences, which could then be analysed, reordered, and renamed in order to detect parallelizable operations that may be issued to more than one execution unit at once. The Pentium Pro thus featured out-of-order execution, including speculative execution via register renaming. It also had a wider 36-bit address bus, usable by Physical Address Extension (PAE), allowing it to access up to 64 GB of memory.

The Pentium Pro has an 8 KB instruction cache, from which up to 16 bytes are fetched on each cycle and sent to the instruction decoders. The three decoders are unequal in ability: only one can decode any x86 instruction, while the other two can only decode simple x86 instructions. This restricts the Pentium Pro's ability to decode multiple instructions simultaneously, limiting superscalar execution. x86 instructions are decoded into 118-bit micro-operations (micro-ops). The micro-ops are reduced instruction set computer (RISC)-like; that is, they encode an operation, two sources, and a destination. The general decoder can generate up to four micro-ops per cycle, whereas the simple decoders can generate one micro-op each per cycle. Thus, x86 instructions that operate on memory (e.g., add this register to this location in memory) can only be processed by the general decoder, as this operation requires a minimum of three micro-ops. Likewise, the simple decoders are limited to instructions that can be translated into one micro-op. Instructions that require more than four micro-ops are translated with the assistance of a sequencer, which generates the required micro-ops over multiple clock cycles. The Pentium Pro was the first processor in the x86 family to support upgradeable microcode under BIOS and/or operating system (OS) control.

Micro-ops exit the re-order buffer (ROB) and enter a reservation station (RS), where they await dispatch to the execution units. In each clock cycle, up to five micro-ops can be dispatched to five execution units. The Pentium Pro has a total of six execution units: two integer units, one floating-point unit (FPU), a load unit, a store address unit, and a store data unit. One of the integer units shares the same ports as the FPU, and therefore the Pentium Pro can only dispatch one integer micro-op and one floating-point micro-op, or two integer micro-ops, per cycle, in addition to micro-ops for the other three execution units. Of the two integer units, only the one that shares the path with the FPU on port 0 has the full complement of functions such as a barrel shifter, multiplier, divider, and support for LEA instructions. The second integer unit, which is connected to port 1, does not have these facilities and is limited to simple operations such as add, subtract, and the calculation of branch target addresses.

The FPU executes floating-point operations. Addition and multiplication are pipelined and have latencies of three and five cycles, respectively. Division and square root are not pipelined and are executed in separate units that share the FPU's ports.
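The decoder limits described in the text (one general decoder producing up to four micro-ops per cycle, two simple decoders producing one micro-op each, and a sequencer for longer instructions) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not a description of the real hardware: the function name `decode_cycles`, the strictly in-order grouping rule, and the sequencer rate of four micro-ops per cycle are all hypothetical choices made for illustration.

```python
from math import ceil

GENERAL_MAX_UOPS = 4  # from the text: the general decoder emits up to four micro-ops per cycle
SIMPLE_DECODERS = 2   # from the text: two decoders handle only one-micro-op instructions
SEQUENCER_RATE = 4    # ASSUMPTION: micro-ops per cycle from the sequencer (not stated in the text)


def decode_cycles(uop_counts):
    """Count the cycles a toy 4-1-1 decoder needs for an in-order list of
    per-instruction micro-op counts (hypothetical, simplified model)."""
    cycles = 0
    i = 0
    n = len(uop_counts)
    while i < n:
        first = uop_counts[i]
        if first > GENERAL_MAX_UOPS:
            # Too long for any decoder: the sequencer handles it alone
            # over several cycles (rate is an assumption, see above).
            cycles += ceil(first / SEQUENCER_RATE)
            i += 1
            continue
        # The general decoder takes the oldest pending instruction.
        i += 1
        # Each simple decoder takes the next instruction only if it
        # translates into exactly one micro-op; otherwise the group ends.
        taken = 0
        while taken < SIMPLE_DECODERS and i < n and uop_counts[i] == 1:
            i += 1
            taken += 1
        cycles += 1
    return cycles
```

In this model, `decode_cycles([3, 1, 1])` takes one cycle while `decode_cycles([1, 3, 1])` takes two, because the three-micro-op instruction in the second case has to wait for the general decoder. That ordering sensitivity is the kind of restriction on simultaneous decoding the text refers to.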