========================================================================== PARC Instruction Set Architecture ========================================================================== # Author : Christopher Batten, Ji Kim, Berkin Ilbeyi, Shreesha Srinath # Date : August 26, 2015 The PARC ISA is a subset of the MIPS32 ISA with some modifications to match the PARC architecture. It is categorized into several versions, each of which builds off of the previous version as it increases in complexity. It has several differences from MIPS32, in addition to having a different coprocessor 0 (cp0) register space. A system that implements the full PARC ISA will be able to run real parallel C++ programs on a multicore architecture. Table of Contents 1. Differences from MIPS32 2. Architectural State 3. PARC Instruction Overview 4. PARC Instruction Encoding 5. PARC Instruction Details 5.1. Read-Write Coprocessor Register Instructions 5.2. Register-Register Arithmetic Instructions 5.3. Multiply/Divide Instructions 5.4. Register-Immediate Arithmetic Instructions 5.5. Memory Instructions 5.6. Unconditional Jump Instructions 5.7. Conditional Branch Instructions 5.8. Conditional Moves 5.9. Concurrency Instructions 5.10. Exception Instructions 5.11. Accelerator Instructions -------------------------------------------------------------------------- 1. Differences from MIPS32 -------------------------------------------------------------------------- The PARC ISA has several important differences from the MIPS32 ISA. * Little-Endian Although MIPS32 supports both big- and little-endian architectures, the PARC ISA is strictly little-endian. This means that the least significant bytes in a word are stored in the lower-order addresses in memory. * No branch delay slot This means that link address for jal/jalr instructions needs to be PC + 4 not PC + 8. 
   Technically, without a branch delay slot there is no reason to keep
   using PC + 4 for the PC-relative branch and jump targets, but it
   simplifies the compiler, so for now the following instructions all use
   PC + 4 as the base address for determining their target: jal, jalr,
   bne, beq, blez, bgtz, bltz, and bgez.

 * No HI/LO registers

   MIPS32 uses HI and LO registers to store the 64-bit results of mult,
   multu, div, and divu instructions. The PARC ISA does not have HI and
   LO registers. Instead it has its own set of multiply/divide
   instructions (mul, div, divu, rem, and remu), which all target a
   general purpose register. The multiply instruction only has a signed
   variant, whereas the divide and remainder instructions have both
   signed and unsigned variants. Notice that the names div and divu are
   the same as in MIPS32, but the functionality is different. The PARC
   multiply/divide/remainder instructions always return a 32-bit result
   into a general purpose register; this means that mul returns only the
   lower half of the 64-bit product.

 * Atomic instructions

   PARC supports atomic instructions in PARCv3. Atomic instructions
   embody multiple operations that complete atomically with respect to
   other memory operations. Atomic instructions are important in
   multicore systems for efficient synchronization.

 * Address translation

   PARC does not yet have a virtual memory space, and thus does not use
   any address translation to access memory. Memory addresses used by
   processor requests map directly to physical memory, except that the
   higher-order bits are truncated to the length of the physical memory
   address.

 * Other features not included from MIPS32

   - Branch likely instructions (b*l)
   - Branch and link instructions (b*al)
   - Test and trap instructions (teq, tge, tlt, ...)
   - Unaligned loads and stores (lwl, lwr, swl, swr)
   - Merged multiply accumulates (madd, maddu, msub, msubu)
   - Rotate instructions (rotr, rotrv)
   - Bit manipulation instructions (clz, clo, ext, ins, seb, seh)
   - Load-link and store-conditional instructions (ll, sc)

--------------------------------------------------------------------------
2. Architectural State
--------------------------------------------------------------------------

 * General Purpose Registers

   - 32 GPRs: PARC uses the same symbolic register names as MIPS32.

     + r0  : $zero  the constant value 0
     + r1  : $at    assembler temporary register
     + r2  : $v0    function return value
     + r3  : $v1    "
     + r4  : $a0    function argument register
     + r5  : $a1    "
     + r6  : $a2    "
     + r7  : $a3    "
     + r8  : $a4    "
     + r9  : $a5    "
     + r10 : $a6    "
     + r11 : $a7    "
     + r12 : $t4    temporary registers (caller saved)
     + r13 : $t5    "
     + r14 : $t6    "
     + r15 : $t7    "
     + r16 : $s0    saved registers (callee saved)
     + r17 : $s1    "
     + r18 : $s2    "
     + r19 : $s3    "
     + r20 : $s4    "
     + r21 : $s5    "
     + r22 : $s6    "
     + r23 : $s7    "
     + r24 : $t8    temporary registers (caller saved)
     + r25 : $t9    "
     + r26 : $k0    kernel registers
     + r27 : $k1    "
     + r28 : $gp    global pointer
     + r29 : $sp    stack pointer
     + r30 : $fp    stack frame pointer
     + r31 : $ra    return address

   - epc: exception PC (PARCv3 and higher)

     + Stores return address from exception

 * Coprocessor 0 Registers

   - mngr2proc: cpr1 (PARCv1 and higher)

     Used to communicate data from the manager to the processor. This
     register has register-mapped FIFO-dequeue semantics, meaning that
     reading the register dequeues the data from the head of a FIFO.
     Reading the register will stall if the FIFO has no valid data.
     Writing the register is undefined.

   - proc2mngr: cpr2 (PARCv1 and higher)

     Used to communicate data from the processor to the manager. This
     register has register-mapped FIFO-enqueue semantics, meaning that
     writing the register enqueues the data on the tail of a FIFO.
     Writing the register will stall if the FIFO is not ready. Reading
     the register is undefined.
- stats_en: cpr21 (PARCv2 and higher) Used to enable or disable the statistics tracking feature of the processor (i.e. counting cycles and instructions) - numcores: cpr16 (PARCv2 and higher) Used to store the number of cores present in a multi-core system. Writing the register is undefined. - coreid: cpr17 (PARCv2 and higher) Used to communicate the core id in a multi-core system. Writing the register is undefined. * Reset Vector - The reset vector for PARC points to the memory address 0x00001000, which is where assembly tests should reside, as well as user code in PARCv2, and the kernel bootstrap code for PARCv3. -------------------------------------------------------------------------- 3. PARC ISA Overview -------------------------------------------------------------------------- Here is a brief list of the instructions which make up each version of the PARC ISA. * PARCv1 PARCv1 contains a very small subset of the full PARCv3 ISA suitable for illustrating how small assembly sequences execute on various microarchitectures in lecture, problem sets, and exams. - addu, addiu, mul - nop - lw, sw - j, jal, jr - bne - mfc0, mtc0 (proc2mngr, mngr2proc) * PARCv2 PARCv2 contains the subset of the full PARCv3 ISA suitable for executing simple C programs that do not use system calls. - subu, and, or, slt - lui, ori, sra, sll - xor, nor, sltu - srav, srlv, sllv - andi, xori, slti, sltiu, srl - beq, bgtz, bltz, bgez, blez - mfc0, mtc0 (stats_en, core_id, num_cores) * PARCv3 PARCv3 is the full PARC ISA and includes the additional instructions required to compile arbitrary user-level C programs (jalr, div/rem, subword load/stores, conditional moves), atomically update memory, handle exceptions, perform floating-point arithmetic, and communicate with custom accelerators. 
- jalr - div, divu, rem, remu - lb, lbu, lh, lhu, sb, sh - movn, movz - amo.add, amo.and, amo.or, sync - syscall, eret - floating-point - mtx, mfx, mtxr, mfxr -------------------------------------------------------------------------- 4. PARC Instruction Encoding -------------------------------------------------------------------------- The 32-bit PARC instructions have different fields depending on the format of the instruction used. The following are the various instruction encoding formats used in the PARC ISA. * R-Type: 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | +--------+-------+-------+-------+-------+--------+ * I-Type: 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | +--------+-------+-------+------------------------+ * J-Type: 31 26 25 0 +--------+----------------------------------------+ | op | target | +--------+----------------------------------------+ * FR-Type: 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | fmt | ft | fs | fd | cmd | +--------+-------+-------+-------+-------+--------+ * FCMP-Type: 31 26 25 21 20 16 15 11 10 6 5 4 3 0 +--------+-------+-------+-------+-------+----+------+ | op | | ft | fs | fd | | cmp | +--------+-------+-------+-------+-------+----+------+ * COP2-Type: 31 26 25 21 20 16 15 11 10 0 +--------+-------+-------+-------+----------------+ | op | rs | rt | mt | imm | +--------+-------+-------+-------+----------------+ -------------------------------------------------------------------------- 5. PARC Instruction Details -------------------------------------------------------------------------- For each instruction we include a brief summary, assembly syntax, instruction semantics, encoding format, and the actual encoding for the instruction. 
We use the following conventions when specifying the instruction
semantics:

 - R[r_a]     : general-purpose register value for register specifier r_a
 - CP0[r_a]   : coprocessor 0 register value for register specifier r_a
 - zext       : zero extend to 32 bits
 - sext       : sign extend to 32 bits
 - M_4B[addr] : 4-byte memory value at address addr
 - M_2B[addr] : 2-byte memory value at address addr
 - M_1B[addr] : 1-byte memory value at address addr
 - PC         : current program counter
 - PC_next    : next program counter
 - atomic {}  : atomic with respect to memory
 - <s         : signed less-than comparison
 - >s         : signed greater-than comparison
 - <u         : unsigned less-than comparison
 - >u         : unsigned greater-than comparison

Unless otherwise specified, assume the instruction updates PC_next with
PC+4.

--------------------------------------------------------------------------
5.1. Read-Write Coprocessor Register Instructions
--------------------------------------------------------------------------

* mfc0

 - Summary   : Move value in coprocessor 0 register to GPR
 - Assembly  : mfc0 r_dst, r_src
 - Semantics : R[r_dst] = CP0[r_src]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | mf    | rt    | rd    | sa    | cmd    |
   | 010000 | 00000 | dst   | src   | 00000 | 000000 |
   +--------+-------+-------+-------+-------+--------+

* mtc0

 - Summary   : Move value in GPR to coprocessor 0 register
 - Assembly  : mtc0 r_src, r_dst
 - Semantics : CP0[r_dst] = R[r_src]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | mt    | rt    | rd    | sa    | cmd    |
   | 010000 | 00100 | src   | dst   | 00000 | 000000 |
   +--------+-------+-------+-------+-------+--------+

--------------------------------------------------------------------------
5.2.
Register-Register Arithmetic Instructions -------------------------------------------------------------------------- * addu - Summary : Signed addition with 3 GPRs, no overflow exception - Assembly : addu r_dst, r_src0, r_src1 - Semantics : R[r_dst] = R[r_src0] + R[r_src1] - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 000000 | src0 | src1 | dst | 00000 | 100001 | +--------+-------+-------+-------+-------+--------+ The 'unsigned' keyword in the instruction name is a misnomer in most cases. The 'unsigned' variant of an instruction simply means that the operation will not trap on an overflow and does *not* imply that operands will be treated as unsigned values. The exceptions to this are the mul/div instructions, included in PARCv2. The PARC ISA, in general, does not support any instructions that use traps. * subu - Summary : Signed subtraction with 3 GPRs, no overflow exception - Assembly : subu r_dst, r_src0, r_src1 - Semantics : R[r_dst] = R[r_src0] - R[r_src1] - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 000000 | src0 | src1 | dst | 00000 | 100011 | +--------+-------+-------+-------+-------+--------+ The 'unsigned' keyword in the instruction name is a misnomer in most cases. The 'unsigned' variant of an instruction simply means that the operation will not trap on an overflow and does *not* imply that operands will be treated as unsigned values. The exceptions to this are the mul/div instructions, included in PARCv2. The PARC ISA, in general, does not support any instructions that use traps. 
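
As an illustration of the addu and subu entries above, the following
Python sketch packs the R-Type fields into a 32-bit word and models the
wrap-around (no-trap) semantics. The helper names encode_rtype, addu, and
subu are hypothetical and exist only for this example; the field positions
and cmd values are taken from the encoding tables above.

```python
MASK32 = 0xFFFFFFFF  # GPRs are 32 bits wide


def encode_rtype(op, rs, rt, rd, sa, cmd):
    """Pack R-Type fields: op[31:26] rs[25:21] rt[20:16] rd[15:11] sa[10:6] cmd[5:0]."""
    return (((op & 0x3F) << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16)
            | ((rd & 0x1F) << 11) | ((sa & 0x1F) << 6) | (cmd & 0x3F))


def addu(a, b):
    """R[r_dst] = R[r_src0] + R[r_src1]; overflow wraps instead of trapping."""
    return (a + b) & MASK32


def subu(a, b):
    """R[r_dst] = R[r_src0] - R[r_src1]; overflow wraps instead of trapping."""
    return (a - b) & MASK32


# addu r3, r1, r2 : src0 goes in rs, src1 in rt, dst in rd, cmd = 100001
word = encode_rtype(0b000000, 1, 2, 3, 0, 0b100001)
print(hex(word))
```

Because the result is simply masked to 32 bits, the same bit pattern is
produced whether the operands are interpreted as signed or unsigned, which
is why the 'unsigned' name is a misnomer here.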
* and

 - Summary   : Bitwise logical AND with 3 GPRs
 - Assembly  : and r_dst, r_src0, r_src1
 - Semantics : R[r_dst] = R[r_src0] & R[r_src1]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | src0  | src1  | dst   | 00000 | 100100 |
   +--------+-------+-------+-------+-------+--------+

* or

 - Summary   : Bitwise logical OR with 3 GPRs
 - Assembly  : or r_dst, r_src0, r_src1
 - Semantics : R[r_dst] = R[r_src0] | R[r_src1]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | src0  | src1  | dst   | 00000 | 100101 |
   +--------+-------+-------+-------+-------+--------+

* xor

 - Summary   : Bitwise logical XOR with 3 GPRs
 - Assembly  : xor r_dst, r_src0, r_src1
 - Semantics : R[r_dst] = R[r_src0] ^ R[r_src1]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | src0  | src1  | dst   | 00000 | 100110 |
   +--------+-------+-------+-------+-------+--------+

* nor

 - Summary   : Bitwise logical NOR with 3 GPRs
 - Assembly  : nor r_dst, r_src0, r_src1
 - Semantics : R[r_dst] = ~( R[r_src0] | R[r_src1] )
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | src0  | src1  | dst   | 00000 | 100111 |
   +--------+-------+-------+-------+-------+--------+

* slt

 - Summary   : Record result of signed less-than comparison with 2 GPRs
 - Assembly  : slt r_dst, r_src0, r_src1
 - Semantics : R[r_dst] = ( R[r_src0] <s R[r_src1] )
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | src0  | src1  | dst   | 00000 | 101010 |
   +--------+-------+-------+-------+-------+--------+

* sltu

 - Summary   : Record result of unsigned less-than comparison with 2 GPRs
 - Assembly  : sltu r_dst, r_src0, r_src1
 - Semantics : R[r_dst] = ( R[r_src0] <u R[r_src1] )
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | src0  | src1  | dst   | 00000 | 101011 |
   +--------+-------+-------+-------+-------+--------+

* srav

 - Summary   : Shift right arithmetic by register value (sign-extend)
 - Assembly  : srav r_dst, r_src, r_shamt
 - Semantics : R[r_dst] = R[r_src] >>> R[r_shamt][4:0]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | shamt | src   | dst   | 00000 | 000111 |
   +--------+-------+-------+-------+-------+--------+

   Note that we should ensure that the sign bit of the source is extended
   to the right as we do the right shift. We only use the bottom five
   bits of the shift amount.

* srlv

 - Summary   : Shift right logical by register value (append zeros)
 - Assembly  : srlv r_dst, r_src, r_shamt
 - Semantics : R[r_dst] = R[r_src] >> R[r_shamt][4:0]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | shamt | src   | dst   | 00000 | 000110 |
   +--------+-------+-------+-------+-------+--------+

   Append zeros to the left as we do the right shift. We only use the
   bottom five bits of the shift amount.

* sllv

 - Summary   : Shift left logical by register value (append zeros)
 - Assembly  : sllv r_dst, r_src, r_shamt
 - Semantics : R[r_dst] = R[r_src] << R[r_shamt][4:0]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | shamt | src   | dst   | 00000 | 000100 |
   +--------+-------+-------+-------+-------+--------+

   Append zeros to the right as we do the left shift. We only use the
   bottom five bits of the shift amount.

--------------------------------------------------------------------------
5.3.
Multiply/Divide Instructions -------------------------------------------------------------------------- * mul - Summary : Signed multiplication with 3 GPRs - Assembly : mul r_dst, r_src0, r_src1 - Semantics : R[r_dst] = R[r_src0] * R[r_src1] - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 011100 | src0 | src1 | dst | 00000 | 000010 | +--------+-------+-------+-------+-------+--------+ * div - Summary : Signed division with 3 GPRs - Assembly : div r_dst, r_src0, r_src1 - Semantics : R[r_dst] = R[r_src0] / R[r_src1] - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 100111 | src0 | src1 | dst | 00000 | 000101 | +--------+-------+-------+-------+-------+--------+ * divu - Summary : Unsigned division with 3 GPRs - Assembly : divu r_dst, r_src0, r_src1 - Semantics : R[r_dst] = R[r_src0] / R[r_src1] - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 100111 | src0 | src1 | dst | 00000 | 000111 | +--------+-------+-------+-------+-------+--------+ * rem - Summary : Signed remainder with 3 GPRs - Assembly : rem r_dst, r_src0, r_src1 - Semantics : R[r_dst] = R[r_src0] % R[r_src1] - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 100111 | src0 | src1 | dst | 00000 | 000110 | +--------+-------+-------+-------+-------+--------+ * remu - Summary : Unsigned remainder with 3 GPRs - Assembly : remu r_dst, r_src0, r_src1 - Semantics : R[r_dst] = R[r_src0] % R[r_src1] - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 100111 | src0 | src1 | dst | 00000 | 001000 | +--------+-------+-------+-------+-------+--------+ 
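
The difference between the signed and unsigned variants above can be
sketched in Python as follows. This is a minimal model, not a definitive
implementation: the function names are hypothetical, and it assumes that
signed division truncates toward zero and that the remainder takes the
sign of the dividend (as in C), which this document does not state
explicitly. Division by zero is not specified here, so the sketch simply
lets Python raise ZeroDivisionError.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit register value as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul(a, b):
    """Signed multiply; only the lower 32 bits of the 64-bit product are kept."""
    return (to_signed(a) * to_signed(b)) & MASK32


def div(a, b):
    """Signed divide (assumption: quotient truncates toward zero)."""
    sa, sb = to_signed(a), to_signed(b)
    q = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        q = -q
    return q & MASK32


def divu(a, b):
    """Unsigned divide: operands are treated as non-negative 32-bit values."""
    return ((a & MASK32) // (b & MASK32)) & MASK32


def rem(a, b):
    """Signed remainder (assumption: result takes the sign of the dividend)."""
    sa, sb = to_signed(a), to_signed(b)
    r = abs(sa) % abs(sb)
    if sa < 0:
        r = -r
    return r & MASK32


def remu(a, b):
    """Unsigned remainder."""
    return ((a & MASK32) % (b & MASK32)) & MASK32


# 0xFFFFFFF9 is -7 when signed and 4294967289 when unsigned
print(hex(div(0xFFFFFFF9, 2)), hex(divu(0xFFFFFFF9, 2)))
```

The same source bit pattern gives different results for div and divu,
whereas mul needs only one variant because the lower 32 bits of the
product do not depend on the signedness of the operands.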
--------------------------------------------------------------------------
5.4. Register-Immediate Arithmetic Instructions
--------------------------------------------------------------------------

* addiu

 - Summary   : Add constant with no overflow exception
 - Assembly  : addiu r_dst, r_src, i_val
 - Semantics : R[r_dst] = R[r_src] + sext(i_val)
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 001001 | src   | dst   | val                    |
   +--------+-------+-------+------------------------+

   The 'unsigned' keyword in the instruction name is a misnomer in most
   cases. The 'unsigned' variant of an instruction simply means that the
   operation will not trap on an overflow and does *not* imply that
   operands will be treated as unsigned values. The exceptions to this
   are the mul/div instructions, included in PARCv2. The PARC ISA, in
   general, does not support any instructions that use traps.

   Note that the 16-bit immediate value is sign-extended before being
   added.
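
To make the sign extension in the addiu semantics concrete, here is a
small Python sketch. The helper names sext16, addiu, and encode_itype are
hypothetical; the field layout and opcode come from the I-Type table
above.

```python
MASK32 = 0xFFFFFFFF


def sext16(imm):
    """Sign-extend a 16-bit immediate to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addiu(src, imm):
    """R[r_dst] = R[r_src] + sext(i_val), wrapping to 32 bits without trapping."""
    return (src + sext16(imm)) & MASK32


def encode_itype(op, rs, rt, imm):
    """Pack I-Type fields: op[31:26] rs[25:21] rt[20:16] imm[15:0]."""
    return (((op & 0x3F) << 26) | ((rs & 0x1F) << 21)
            | ((rt & 0x1F) << 16) | (imm & 0xFFFF))


# addiu r2, r1, -1 : src goes in rs, dst in rt, and -1 is stored as 0xFFFF
word = encode_itype(0b001001, 1, 2, -1)
print(hex(word), addiu(5, 0xFFFF))
```

An immediate of 0xFFFF therefore subtracts one from the source register
rather than adding 65535.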
* lui

 - Summary   : Load constant into upper half of word
 - Assembly  : lui r_dst, i_val
 - Semantics : R[r_dst] = i_val << 16
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 001111 | 00000 | dst   | val                    |
   +--------+-------+-------+------------------------+

* ori

 - Summary   : Bitwise logical OR with constant
 - Assembly  : ori r_dst, r_src, i_val
 - Semantics : R[r_dst] = R[r_src] | zext(i_val)
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 001101 | src   | dst   | val                    |
   +--------+-------+-------+------------------------+

* andi

 - Summary   : Bitwise logical AND with constant
 - Assembly  : andi r_dst, r_src, i_val
 - Semantics : R[r_dst] = R[r_src] & zext(i_val)
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 001100 | src   | dst   | val                    |
   +--------+-------+-------+------------------------+

* xori

 - Summary   : Bitwise logical XOR with constant
 - Assembly  : xori r_dst, r_src, i_val
 - Semantics : R[r_dst] = R[r_src] ^ zext(i_val)
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 001110 | src   | dst   | val                    |
   +--------+-------+-------+------------------------+

* slti

 - Summary   : Set GPR if source GPR < constant, signed comparison
 - Assembly  : slti r_dst, r_src, i_val
 - Semantics : R[r_dst] = ( R[r_src] <s sext(i_val) )
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 001010 | src   | dst   | val                    |
   +--------+-------+-------+------------------------+

* sltiu

 - Summary   : Set GPR if source GPR < constant, unsigned comparison
 - Assembly  : sltiu r_dst, r_src, i_val
 - Semantics : R[r_dst] = ( R[r_src] <u sext(i_val) )
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 001011 | src   | dst   | val                    |
   +--------+-------+-------+------------------------+

   Note that the 16-bit immediate value is sign-extended before being
   used in the unsigned comparison.

* sra

 - Summary   : Shift right arithmetic by constant (sign-extend)
 - Assembly  : sra r_dst, r_src, i_shamt
 - Semantics : R[r_dst] = R[r_src] >>> i_shamt
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | 00000 | src   | dst   | shamt | 000011 |
   +--------+-------+-------+-------+-------+--------+

   Note that we should ensure that the sign bit of the source is extended
   to the right as we do the right shift.
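
The note above about extending the sign bit can be illustrated with a
short Python sketch that contrasts an arithmetic right shift with a
logical one. The function names are hypothetical, and masking the shift
amount to five bits is an assumption made to mirror the width of the
shamt field.

```python
MASK32 = 0xFFFFFFFF


def shift_right_arithmetic(x, shamt):
    """Right shift that replicates the sign bit into the vacated positions."""
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    # Python's >> on a negative int is already an arithmetic shift
    return (signed >> (shamt & 0x1F)) & MASK32


def shift_right_logical(x, shamt):
    """Right shift that fills the vacated positions with zeros."""
    return (x & MASK32) >> (shamt & 0x1F)


value = 0x80000000
print(hex(shift_right_arithmetic(value, 4)), hex(shift_right_logical(value, 4)))
```

The two shifts agree whenever the sign bit of the source is zero and
differ only in the upper bits when it is one.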
* srl - Summary : Shift right logical by constant (append zeroes) - Assembly : srl r_dst, r_src, i_shamt - Semantics : R[r_dst] = R[r_src] >> i_shamt - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 000000 | 00000 | src | dst | shamt | 000010 | +--------+-------+-------+-------+-------+--------+ Append zeros to the left as we do the right shift. * sll - Summary : Shift left logical constant (append zeroes) - Assembly : sll r_dst, r_src, i_shamt - Semantics : R[r_dst] = R[r_src] << i_shamt - Format : R-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | rs | rt | rd | sa | cmd | | 000000 | 00000 | src | dst | shamt | 000000 | +--------+-------+-------+-------+-------+--------+ Append zeros to the right as we do the left shift. -------------------------------------------------------------------------- 5.5. Memory Instructions -------------------------------------------------------------------------- * lw - Summary : Load word from memory as signed value - Assembly : lw r_dst, i_offset(r_base) - Semantics : R[r_dst] = M_4B[ R[r_base] + sext(i_offset) ] - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 100011 | base | dst | offset | +--------+-------+-------+------------------------+ * lh - Summary : Load a halfword from memory as signed value - Assembly : lh r_dst, i_offset(r_base) - Semantics : R[r_dst] = sext( M_2B[ R[r_base] + sext(i_offset) ] ) - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 100001 | base | dst | offset | +--------+-------+-------+------------------------+ * lhu - Summary : Load a halfword from memory as unsigned value - Assembly : lhu r_dst, i_offset(r_base) - Semantics : R[r_dst] = zext( M_2B[ R[r_base] + sext(i_offset) ] ) - Format : I-Type 31 26 25 21 20 16 15 0 
+--------+-------+-------+------------------------+ | op | rs | rt | imm | | 100101 | base | dst | offset | +--------+-------+-------+------------------------+ * lb - Summary : Load a byte from memory as signed value - Assembly : lb r_dst, i_offset(r_base) - Semantics : R[r_dst] = sext( M_1B[ R[r_base] + sext(i_offset) ] ) - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 100000 | base | dst | offset | +--------+-------+-------+------------------------+ * lbu - Summary : Load a byte from memory as unsigned value - Assembly : lbu r_dst, i_offset(r_base) - Semantics : R[r_dst] = zext( M_1B[ R[r_base] + sext(i_offset) ] ) - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 100100 | base | dst | offset | +--------+-------+-------+------------------------+ * sw - Summary : Store word into memory - Assembly : sw r_src, i_offset(r_base) - Semantics : M_4B[ R[r_base] + sext(i_offset) ] = R[r_src] - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 101011 | base | src | offset | +--------+-------+-------+------------------------+ * sh - Summary : Store a halfword to memory - Assembly : sh r_src, i_offset(r_base) - Semantics : M_2B[ R[r_base] + sext(i_offset) ] = R[r_src] - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 101001 | base | src | offset | +--------+-------+-------+------------------------+ * sb - Summary : Store a byte to memory - Assembly : sb r_src, i_offset(r_base) - Semantics : M_1B[ R[r_base] + sext(i_offset) ] = R[r_src] - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 101000 | base | src | offset | +--------+-------+-------+------------------------+ -------------------------------------------------------------------------- 
5.6. Unconditional Jump Instructions
--------------------------------------------------------------------------

* j

 - Summary   : Jump to address
 - Assembly  : j i_targ
 - Semantics : PC_plus4 = PC + 4;
               PC_next = { PC_plus4[31:28], i_targ << 2 }
 - Format    : J-Type

    31    26 25                                     0
   +--------+----------------------------------------+
   | op     | imm                                    |
   | 000010 | targ                                   |
   +--------+----------------------------------------+

   i_targ is shifted to the left by 2 bits and the resulting 28 bits are
   combined with the 4 most significant bits of PC+4 to generate the
   effective target address.

* jr

 - Summary   : Jump to address in register
 - Assembly  : jr r_src
 - Semantics : PC_next = R[r_src]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | src   | 00000 | 00000 | 00000 | 001000 |
   +--------+-------+-------+-------+-------+--------+

   The target address in r_src must be naturally aligned.

* jalr

 - Summary   : Jump to address and place return address in GPR
 - Assembly  : jalr r_ret, r_targ
 - Semantics : R[r_ret] = PC + 4; PC_next = R[r_targ]
 - Format    : R-Type

    31    26 25   21 20   16 15   11 10    6 5      0
   +--------+-------+-------+-------+-------+--------+
   | op     | rs    | rt    | rd    | sa    | cmd    |
   | 000000 | targ  | 00000 | ret   | 00000 | 001001 |
   +--------+-------+-------+-------+-------+--------+

   The return address is the address of the instruction immediately
   following the jump instruction. Keep in mind that this is different
   from the MIPS ISA, in which the return address is 2 instructions after
   the jump instruction to account for the branch delay slot. If r_ret is
   not given in the assembly, the return address will be stored in GPR 31
   by default. The target address in r_targ must be naturally aligned.
   r_targ and r_ret should not be equal, as that would make the
   instruction non-idempotent.
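
The target computation described for the j instruction, and the PC + 4
link address used by jalr, can be sketched in Python as follows. The
function names jump_target and link_address are hypothetical and exist
only for this example.

```python
MASK32 = 0xFFFFFFFF


def jump_target(pc, i_targ):
    """PC_next = { (PC + 4)[31:28], i_targ << 2 } for a 26-bit target field."""
    pc_plus4 = (pc + 4) & MASK32
    return (pc_plus4 & 0xF0000000) | ((i_targ & 0x03FFFFFF) << 2)


def link_address(pc):
    """Return address written by a linking jump: PC + 4, since there is no delay slot."""
    return (pc + 4) & MASK32


# A jump at the reset vector 0x00001000 whose target field is 0x400
print(hex(jump_target(0x00001000, 0x400)), hex(link_address(0x00001000)))
```

Note that the upper four bits come from PC + 4 rather than PC, so a jump
placed in the last word of a 256 MB region lands in the following region.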
* jal - Summary : Jump to address and place return address in GPR 31 - Assembly : jal i_targ - Semantics : R[31] = PC + 4; PC_plus4 = PC + 4; PC_next = { PC_plus4[31:28], i_targ << 2 } - Format : J-Type 31 26 25 0 +--------+----------------------------------------+ | op | imm | | 000011 | targ | +--------+----------------------------------------+ i_targ is shifted to the left by 2 bits and the resulting 28 bits are combined with the 4 msb of PC+4 to generate the effective target address. -------------------------------------------------------------------------- 5.7. Conditional Branch Instructions -------------------------------------------------------------------------- * beq - Summary : Branch if 2 GPRs are equal - Assembly : beq r_src0, r_src1, i_offset - Semantics : if ( R[r_src0] == R[r_src1] ) PC_next = PC + 4 + ( sext(i_offset) << 2 ) - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 000100 | src0 | src1 | offset | +--------+-------+-------+------------------------+ The target address offset is relative to the PC of the instruction *after* the actual branch. * bne - Summary : Branch if 2 GPRs are not equal - Assembly : bne r_src0, r_src1, i_offset - Semantics : if ( R[r_src0] != R[r_src1] ) PC_next = PC + 4 + ( sext(i_offset) << 2 ) - Format : I-Type 31 26 25 21 20 16 15 0 +--------+-------+-------+------------------------+ | op | rs | rt | imm | | 000101 | src0 | src1 | offset | +--------+-------+-------+------------------------+ The target address offset is relative to the PC of the instruction *after* the actual branch. 
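
As a worked illustration of the beq and bne semantics above, the
following Python sketch computes the next PC for both outcomes. The names
branch_target, next_pc_beq, and next_pc_bne are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sext16(imm):
    """Sign-extend a 16-bit immediate to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, i_offset):
    """PC + 4 + ( sext(i_offset) << 2 ): offset counts words from the next instruction."""
    return (pc + 4 + (sext16(i_offset) << 2)) & MASK32


def next_pc_beq(pc, src0, src1, i_offset):
    """Taken if the two register values are equal, otherwise fall through to PC + 4."""
    return branch_target(pc, i_offset) if src0 == src1 else (pc + 4) & MASK32


def next_pc_bne(pc, src0, src1, i_offset):
    """Taken if the two register values differ, otherwise fall through to PC + 4."""
    return branch_target(pc, i_offset) if src0 != src1 else (pc + 4) & MASK32


# An offset of -1 (0xFFFF) branches back to the branch instruction itself
print(hex(branch_target(0x1000, 0xFFFF)))
```

Because the base is PC + 4, an offset of 0 targets the instruction right
after the branch and an offset of -1 targets the branch itself.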
* bgtz

 - Summary   : Branch if GPR is greater than zero
 - Assembly  : bgtz r_src, i_offset
 - Semantics : if ( R[r_src] >s 0 )
                 PC_next = PC + 4 + ( sext(i_offset) << 2 )
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 000111 | src   | 00000 | offset                 |
   +--------+-------+-------+------------------------+

   The target address offset is relative to the PC of the instruction
   *after* the actual branch.

* bltz

 - Summary   : Branch if GPR is less than zero
 - Assembly  : bltz r_src, i_offset
 - Semantics : if ( R[r_src] <s 0 )
                 PC_next = PC + 4 + ( sext(i_offset) << 2 )
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 000001 | src   | 00000 | offset                 |
   +--------+-------+-------+------------------------+

   The target address offset is relative to the PC of the instruction
   *after* the actual branch.

* bgez

 - Summary   : Branch if GPR is greater than or equal to zero
 - Assembly  : bgez r_src, i_offset
 - Semantics : if ( ( R[r_src] >s 0 ) || ( R[r_src] == 0 ) )
                 PC_next = PC + 4 + ( sext(i_offset) << 2 )
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 000001 | src   | 00001 | offset                 |
   +--------+-------+-------+------------------------+

   The target address offset is relative to the PC of the instruction
   *after* the actual branch.

* blez

 - Summary   : Branch if GPR is less than or equal to zero
 - Assembly  : blez r_src, i_offset
 - Semantics : if ( ( R[r_src] <s 0 ) || ( R[r_src] == 0 ) )
                 PC_next = PC + 4 + ( sext(i_offset) << 2 )
 - Format    : I-Type

    31    26 25   21 20   16 15                     0
   +--------+-------+-------+------------------------+
   | op     | rs    | rt    | imm                    |
   | 000110 | src   | 00000 | offset                 |
   +--------+-------+-------+------------------------+

   The target address offset is relative to the PC of the instruction
   *after* the actual branch.

* c.<cond>.s

 - Summary     : Comparison with single-precision floating-point values
 - Assembly    : c.<cond>.s r_dst, r_src0, r_src1
 - Description : R[r_dst] = R[r_src0] <cond> R[r_src1]
 - Format      : FCMP-Type

    31    26 25   21 20   16 15   11 10    6 5  4 3    0
   +--------+-------+-------+-------+-------+----+------+
   | cop1   |       | ft    | fs    | fd    |    | cmp  |
   | 010001 | 10000 | src0  | src1  | dst   | 11 | cond |
   +--------+-------+-------+-------+-------+----+------+

   The type of comparison is specified in the <cond> field. The supported
   comparisons and their encodings are listed below.

   - 0000 f   : false
   - 0001 un  : unordered
   - 0010 eq  : equal
   - 1011 ngl : not greater than or less than
   - 1100 lt  : less than
   - 1101 nge : not greater than or equal
   - 1110 le  : less than or equal
   - 1111 ngt : not greater than

   Floating-point values are stored in the same GPRs as integer values.
   The fmt field defines the precision format of the operands.
Currently, only the single-precision format is supported. Note that the positions of the fs and fd fields are different from the rs and rd fields in the R-Type instruction format. * cvt.w.s - Summary : Convert single-precision floating-point value to integer value - Assembly : cvt.w.s r_dst, r_src - Description : R[r_dst] = (int)R[r_src] - Format : FR-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | fmt | ft | fs | fd | cmd | | 010001 | 00000 | 00000 | src0 | dst | 100100 | +--------+-------+-------+-------+-------+--------+ Behavior is unpredictable if the source value represents Infinity, NaN, or out of integer range. Floating-point values are stored in the same GPR as integer values. The fmt field defines the precision format of the operands. Currently, only the single-precision format is supported. Note that the positions of the fs and fd fields are different from the rs and rd fields in the R-Type instruction format. * cvt.s.w - Summary : Convert integer value to single-precision floating-point value - Assembly : cvt.s.w r_dst, r_src - Description : R[r_dst] = (float)R[r_src] - Format : FR-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | fmt | ft | fs | fd | cmd | | 010001 | 00000 | 00000 | src0 | dst | 100000 | +--------+-------+-------+-------+-------+--------+ Floating-point values are stored in the same GPR as integer values. The fmt field defines the precision format of the operands. Currently, only the single-precision format is supported. Note that the positions of the fs and fd fields are different from the rs and rd fields in the R-Type instruction format. 
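
Since floating-point values share the GPRs with integers, a conversion
instruction reinterprets the 32-bit register contents as an IEEE 754
single-precision value. The Python sketch below models this with the
standard struct module. The function names are hypothetical, and the
float-to-integer direction assumes truncation toward zero, following the
(int) cast used in the descriptions above; the rounding mode is otherwise
an assumption. For Infinity or NaN, where the document says behavior is
unpredictable, this sketch simply lets Python raise an exception.

```python
import struct

MASK32 = 0xFFFFFFFF


def bits_to_float(bits):
    """Reinterpret a 32-bit register value as a single-precision float."""
    return struct.unpack('<f', struct.pack('<I', bits & MASK32))[0]


def float_to_bits(value):
    """Reinterpret a float as its 32-bit single-precision encoding."""
    return struct.unpack('<I', struct.pack('<f', value))[0]


def to_signed(x):
    """Interpret a 32-bit register value as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def int_to_single(reg):
    """Model of R[r_dst] = (float)R[r_src]: signed integer to float bits."""
    return float_to_bits(float(to_signed(reg)))


def single_to_int(reg):
    """Model of R[r_dst] = (int)R[r_src]; assumes truncation toward zero."""
    return int(bits_to_float(reg)) & MASK32


print(hex(int_to_single(1)), hex(single_to_int(0xC0200000)))
```

Here 0xC0200000 is the encoding of -2.5, which truncates to -2 and is
stored as 0xFFFFFFFE.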
* trunc.w.s - Summary : Convert single-precision floating-point value to integer value, round toward zero - Assembly : trunc.w.s r_dst, r_src - Description : R[r_dst] = (int)R[r_src] - Format : FR-Type 31 26 25 21 20 16 15 11 10 6 5 0 +--------+-------+-------+-------+-------+--------+ | op | fmt | ft | fs | fd | cmd | | 010001 | 00000 | 00000 | src0 | dst | 001101 | +--------+-------+-------+-------+-------+--------+ Behavior is unpredictable if the source value represents Infinity, NaN, or out of integer range. Floating-point values are stored in the same GPR as integer values. The fmt field defines the precision format of the operands. Currently, only the single-precision format is supported. Note that the positions of the fs and fd fields are different from the rs and rd fields in the R-Type instruction format. --------------------------------------------------------------------------- 5.12. Accelerator Instructions --------------------------------------------------------------------------- * mtx - Summary : Move word to accelerator from GP register file - Assembly : mtx rt, rs, accel_id - Description : XCEL_R[r_dst] = R[r_src] where XCEL is identified by 'accel-id' bits - Format : COP2 31 26 25 21 20 16 15 11 10 0 +--------+-------+-------+-------+----------------+ | op | rs | rt | mt | imm | | 010010 | dst | src | 00000 | accel-id | +--------+-------+-------+-------+----------------+ The 'accel-id' is used to identify the accelerator the control processor wants to move values to. The 'accel_id' is an immediate field. 
* mfx

 - Summary     : Move word from accelerator to GP register file
 - Assembly    : mfx rt, rs, accel_id
 - Description : R[r_dst] = XCEL_R[r_src]
                 where XCEL is identified by 'accel-id' bits
 - Format      : COP2

    31    26 25   21 20   16 15   11 10             0
   +--------+-------+-------+-------+----------------+
   | op     | rs    | rt    | mf    | imm            |
   | 010010 | src   | dst   | 00001 | accel-id       |
   +--------+-------+-------+-------+----------------+

   The 'accel-id' is used to identify the accelerator the control
   processor wants to move values from. The 'accel-id' is an immediate
   field.

* mtxr

 - Summary     : Move word to accelerator from GPR (register-based)
 - Assembly    : mtxr rt, rs, r_accel
 - Description : XCEL_R[r_dst] = R[r_src]
                 where XCEL is identified by R[r_accel]
 - Format      : COP2

    31    26 25   21 20   16 15   11 10      6 5      0
   +--------+-------+-------+-------+---------+--------+
   | op     | rs    | rt    | mt    |         |        |
   | 010010 | dst   | src   | 00010 | r_accel | 000000 |
   +--------+-------+-------+-------+---------+--------+

* mfxr

 - Summary     : Move word from accelerator to GPR (register-based)
 - Assembly    : mfxr rt, rs, r_accel
 - Description : R[r_dst] = XCEL_R[r_src]
                 where XCEL is identified by R[r_accel]
 - Format      : COP2

    31    26 25   21 20   16 15   11 10      6 5      0
   +--------+-------+-------+-------+---------+--------+
   | op     | rs    | rt    | mf    |         |        |
   | 010010 | src   | dst   | 00011 | r_accel | 000000 |
   +--------+-------+-------+-------+---------+--------+
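
The immediate forms of the accelerator instructions can be encoded and
decoded with a few shifts and masks. The Python sketch below follows the
COP2 field layout given for mtx and mfx; the function names and the
dictionary keys are hypothetical and exist only for this example.

```python
COP2_OP = 0b010010  # opcode shared by the accelerator instructions
SEL_MTX = 0b00000   # bits [15:11] for mtx
SEL_MFX = 0b00001   # bits [15:11] for mfx


def encode_cop2(rs, rt, sel, accel_id):
    """Pack COP2 fields: op[31:26] rs[25:21] rt[20:16] sel[15:11] imm[10:0]."""
    return ((COP2_OP << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16)
            | ((sel & 0x1F) << 11) | (accel_id & 0x7FF))


def encode_mtx(xcel_reg, gpr, accel_id):
    """mtx: accelerator register (dst) goes in rs, source GPR in rt."""
    return encode_cop2(xcel_reg, gpr, SEL_MTX, accel_id)


def encode_mfx(xcel_reg, gpr, accel_id):
    """mfx: accelerator register (src) goes in rs, destination GPR in rt."""
    return encode_cop2(xcel_reg, gpr, SEL_MFX, accel_id)


def decode_cop2(word):
    """Split a 32-bit word back into its COP2 fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "sel": (word >> 11) & 0x1F,
        "accel_id": word & 0x7FF,
    }


print(hex(encode_mtx(1, 2, 3)), hex(encode_mfx(1, 2, 3)))
```

The two encodings differ only in bits [15:11], so a decoder can use that
field to tell the direction of the transfer.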