Analyzing SPARC assembly
 
  Analyzing SPARC code
A step-by-step analysis of RISC assembly code, carried out on a sample of SPARC code.
Copyright © 1999 Eric Laroche.
 
Contents
Introduction
Synthetic
Delay slots
Conditionals
Branches
Final
References
 
Introduction

Analyzing SPARC assembly code is easier when a few preliminary, formal steps are carried out first, before trying to figure out which algorithms the code implements.

These formal steps are explained in introductory comments and side comments while one code sample is analyzed.

Phase #1: Replace opcodes with synthetic instructions

The very first step is to replace operation codes with synthetic instructions wherever applicable. Code that uses synthetic instructions is easier to read.

In the same step, groups of pseudo-ops can be replaced with more complex ones (e.g. .ident).
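As a rough illustration of such a pseudo-op replacement, the sketch below folds an .ascii string that ends in an explicit \000 into the equivalent .asciz form, which is the replacement noted for the sample in this section. The function name fold_ascii_nul is hypothetical, and the pattern only handles a single quoted string per line.

```python
import re

def fold_ascii_nul(line):
    """Rewrite `.ascii "text\\000"` as `.asciz "text"` (hypothetical helper).

    Lines that do not match are returned unchanged.
    """
    # Group 1 keeps the original spacing, group 2 the string without the NUL.
    return re.sub(r'\.ascii(\s+)"(.*)\\000"\s*$', r'.asciz\1"\2"', line)

# The pseudo-op from the sample, written as a raw string.
sample = r'    .ascii  "%u %u\n\000"'
print(fold_ascii_nul(sample))
```

Keeping the original whitespace in group 1 means the rewritten line still aligns with the rest of the listing.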

Original sample
 
    .section ".rodata1",#alloc
    .align  4

.l0:
    .ascii  "%u %u\n\000"           ! identify replaceable pseudo-ops: .asciz

    .section ".text",#alloc,#execinstr
    .align  4

    .global main
main:
    save    %sp,-96,%sp
    sethi   %hi(.l0),%g2
    or      %g0,1,%i1               ! replace operation codes with synthetic instructions: mov
    or      %g0,0,%o1
    add     %g2,%lo(.l0),%i0
    or      %g0,1,%o0
    or      %g0,0,%i2
.l1:
    subcc   %i1,1,%g0               ! cmp
    be      .l7
    andcc   %o0,1,%g0               ! btst
.l2:
    bne     .l3
    srl     %o0,1,%g2
    add     %i2,1,%i2               ! inc
    srl     %o0,1,%o0
    ba      .l4
    subcc   %o0,1,%g0
.l3:
    add     %o0,%g2,%g2
    add     %i2,2,%i2
    add     %g2,1,%o2
    subcc   %o2,%o0,%g0
    bcs     .l6
    or      %g0,%o2,%o0
    subcc   %o0,1,%g0
.l4:
    bne     .l2
    andcc   %o0,1,%g0
.l5:                                ! cut unused local labels (labels could be replaced with anonymous labels)
    ba      .l8
    subcc   %i2,%o1,%g0
.l6:
    jmpl    %i7+8,%g0               ! ret
    restore %g0,0,%o0
.l7:
    subcc   %i2,%o1,%g0
.l8:
    bleu,a  .l9
    add     %i1,1,%i1
    or      %g0,%i0,%o0
    or      %g0,%i1,%o1
    add     %i1,1,%i1
    call    printf                  ! call is already an opcode
    or      %g0,%i2,%o2
    or      %g0,%i2,%o1
.l9:
    or      %g0,%i1,%o0
    ba      .l1
    or      %g0,0,%i2
    .type   main,2
    .size   main,(.-main)

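anonymous labels: numeric labels such as 1:, 2:, ... which are referenced as 1f (the next definition of 1, searching forward) or 2b (the most recent definition of 2, searching backward).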
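How the f and b suffixes pick their target can be sketched in a few lines of Python. This is only an illustration of the lookup rule, not an assembler: the function name resolve_numeric_labels and the short code fragment are hypothetical, and the fragment is not taken from the sample.

```python
import re

def resolve_numeric_labels(lines):
    """Map (line index, reference) to the line index of the label it targets."""
    definitions = {}  # label number -> line indices that define it, in order
    for index, line in enumerate(lines):
        match = re.match(r"\s*(\d+):", line)
        if match:
            definitions.setdefault(match[1], []).append(index)

    resolved = {}
    for index, line in enumerate(lines):
        for number, direction in re.findall(r"\b(\d+)([fb])\b", line):
            candidates = definitions.get(number, [])
            if direction == "f":
                # forward: first definition after the referencing line
                target = next((d for d in candidates if d > index), None)
            else:
                # backward: last definition at or before the referencing line
                target = next((d for d in reversed(candidates) if d <= index), None)
            resolved[(index, number + direction)] = target
    return resolved

# Hypothetical fragment using two anonymous labels.
fragment = [
    "1:",
    "    cmp     %o0,1",
    "    be      2f",
    "    nop",
    "    ba      1b",
    "    nop",
    "2:",
    "    ret",
]
print(resolve_numeric_labels(fragment))
```

Because the same number may be defined many times in one file, a reference is only meaningful together with its direction and its position, which is why the lookup is keyed on the referencing line.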

call: call printf is already represented by an opcode. It can't be expressed as jmpl printf,%o7, since jmpl only offers a 13-bit immediate operand (as used e.g. in jmpl %i7+8,%g0) instead of the 30 bits needed to reach all 4-byte-aligned locations in a 32-bit address space.
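The bit counts in the note on call can be checked with a little arithmetic. The sketch below assumes the usual SPARC encoding, in which the 13-bit immediate of jmpl is sign-extended and the 30-bit field of call counts 4-byte words; the constant and function names are made up for this illustration.

```python
SIMM13_BITS = 13   # signed immediate field available to jmpl
DISP30_BITS = 30   # word displacement field of call

def signed_range(bits):
    """Smallest and largest value of a two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

simm13_min, simm13_max = signed_range(SIMM13_BITS)

# call addresses whole instructions, so each displacement step is 4 bytes.
call_span_bytes = (1 << DISP30_BITS) * 4

print(simm13_min, simm13_max)          # offsets jmpl can add to a register
print(call_span_bytes == 1 << 32)      # does call cover a 32-bit address space?
```

An immediate of 8, as in jmpl %i7+8,%g0, fits easily into the 13-bit field, whereas an arbitrary function address does not.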

increment: not all high-level programming languages have a notion of increments (like C's or C++'s i++ and i += 2).

synthetic instructions: Synthetic instructions are assembler aliases for the seemingly more complex opcode constructs that RISC processors need in order to implement simple operations. A SPARC example is the mov synthetic instruction, which is implemented as an or %g0,... or add %g0,... opcode. Other SPARC examples are cmp, btst and inc. ret is a synthetic instruction as well: jmpl %i7+8,%g0. Note that e.g. both or %g0,... and add %g0,... can be read as mov ..., and both or %g0,%g0,... and or %g0,0,... can be read as clr ..., whereas the reverse translation (from synthetic instruction to opcode) is more strictly defined.
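The replacements made in this phase are mechanical enough to sketch as a small rewriting table. The Python below is a minimal sketch, not a tool used in the original analysis: the names _RULES and to_synthetic are hypothetical, only the patterns that occur in the sample are covered, and or %g0,0,... is rendered as clr although mov 0,... would be just as valid.

```python
import re

# Each rule pairs an opcode pattern with the synthetic form it is rewritten to.
_RULES = [
    # jmpl %i7+8,%g0 -> ret
    (re.compile(r"^jmpl\s+%i7\+8,%g0$"), lambda m: "ret"),
    # or %g0,0,rd and or %g0,%g0,rd -> clr rd
    (re.compile(r"^or\s+%g0,(?:0|%g0),(%\w+)$"), lambda m: f"clr {m[1]}"),
    # or %g0,src,rd -> mov src,rd
    (re.compile(r"^or\s+%g0,([^,]+),(%\w+)$"), lambda m: f"mov {m[1]},{m[2]}"),
    # subcc rs1,src2,%g0 -> cmp rs1,src2
    (re.compile(r"^subcc\s+(%\w+),([^,]+),%g0$"), lambda m: f"cmp {m[1]},{m[2]}"),
    # andcc rs1,src2,%g0 -> btst src2,rs1 (operands swap places)
    (re.compile(r"^andcc\s+(%\w+),([^,]+),%g0$"), lambda m: f"btst {m[2]},{m[1]}"),
    # add rd,1,rd -> inc rd
    (re.compile(r"^add\s+(%\w+),1,\1$"), lambda m: f"inc {m[1]}"),
    # add rd,const,rd -> inc const,rd
    (re.compile(r"^add\s+(%\w+),(-?\d+),\1$"), lambda m: f"inc {m[2]},{m[1]}"),
]

def to_synthetic(instruction):
    """Return the synthetic form of one instruction, or the input if none applies."""
    text = " ".join(instruction.split())  # collapse the column alignment
    for pattern, rewrite in _RULES:
        match = pattern.match(text)
        if match:
            return rewrite(match)
    return text

for line in ["or      %g0,1,%i1", "subcc   %i1,1,%g0", "andcc   %o0,1,%g0",
             "add     %i2,1,%i2", "jmpl    %i7+8,%g0", "add     %g2,1,%o2"]:
    print(to_synthetic(line))
```

The rule order matters: the clr pattern is tried before the more general mov pattern, and an add whose source and destination registers differ is deliberately left alone because it is not an increment.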

 
Copyright © 1999 Eric Laroche December 19, 1999