xcode - ROL / ROR on a variable using inline assembly only, in Objective-C
A few days ago, I asked the question below. Because I was in need of a quick answer, I added:
The code does not need to use inline assembly. However, I haven't found a way to do this using Objective-C / C++ / C instructions.
Today, I would like to learn something. So I ask the question again, looking for an answer using inline assembly.
I would like to perform ROR and ROL operations on variables in an Objective-C program. However, I can't manage it - I am not an assembly expert.
Here is what I have done so far:
    uint8_t v1 = ....;
    uint8_t v2 = ....; // v2 is either 1, 2, 3, 4 or 5
    asm("ror v1, v2");
The error is:
    Unknown use of instruction mnemonic without a size suffix
How can I fix this?
A rotate is just two shifts - some bits go left, the others go right - and once you see this, rotating is easy without assembly. The pattern is recognised by some compilers and compiled using the rotate instructions. See Wikipedia for the code.
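For the 8-bit operands in the question, the two-shift idea might look something like the sketch below. The names rol8 and ror8 are hypothetical, and masking the count to 0..7 is an assumption made here so that every count is well defined; it is not taken from the Wikipedia code.

```c
#include <stdint.h>

/* Hypothetical helper: rotate an 8-bit value left by n bits.
   Assumption: the count is reduced modulo 8. */
static inline uint8_t rol8(uint8_t v, unsigned n)
{
    n &= 7;                                    /* keep the count in 0..7 */
    /* bits shifted out on the left re-enter on the right */
    return (uint8_t)((v << n) | (v >> ((8 - n) & 7)));
}

/* Hypothetical helper: rotate an 8-bit value right by n bits.
   Assumption: the count is reduced modulo 8. */
static inline uint8_t ror8(uint8_t v, unsigned n)
{
    n &= 7;                                    /* keep the count in 0..7 */
    /* bits shifted out on the right re-enter on the left */
    return (uint8_t)((v >> n) | (v << ((8 - n) & 7)));
}
```

With the question's variables this would be used as v1 = ror8(v1, v2); the cast back to uint8_t discards the bits that integer promotion leaves above bit 7.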
Update: Xcode 4.6.2 (others not tested) on x86-64 compiles the double shift + or to a rotate for 32 & 64 bit operands; for 8 & 16 bit operands the double shift + or is kept. Why? Maybe the compiler understands something about the performance of these instructions, maybe it just didn't optimise - but in general, if you can avoid assembler do so, the compiler invariably knows best! Also, using static inline on the functions, or using macros defined in the same way as the standard macro MAX (a macro has the advantage of adapting to the type of its operands), can be used to inline the operations.
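As a rough illustration of the macro route, here is one way such type-adapting macros could be written. ROTL, ROTR and ROT_MASK are hypothetical names, the sketch assumes unsigned operands, and it relies on the __typeof__ extension of GCC and Clang to cast the result back to the operand's type.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: bit width of x's type, minus one, used as a count mask. */
#define ROT_MASK(x) (sizeof(x) * CHAR_BIT - 1)

/* Hypothetical macros: rotate left / right by n, adapting to the type of x.
   Assumptions: x is unsigned, and __typeof__ (a GCC/Clang extension) is available.
   Both arguments are evaluated more than once, so avoid side effects. */
#define ROTL(x, n) \
    ((__typeof__(x))(((x) << ((n) & ROT_MASK(x))) | \
                     ((x) >> ((0u - (n)) & ROT_MASK(x)))))

#define ROTR(x, n) \
    ((__typeof__(x))(((x) >> ((n) & ROT_MASK(x))) | \
                     ((x) << ((0u - (n)) & ROT_MASK(x)))))
```

The cast matters for 8 and 16 bit operands, where integer promotion would otherwise leave stray high bits in the result. Note that a bare literal such as 0x81 has type int, so the operand should be a typed variable or carry an explicit cast.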
Addendum after the OP's comment
Here is an i86_64 assembler example; for full details of how to use the asm construct, start with the compiler's inline assembly documentation.
First, a non-assembler version:
    static inline uint32_t rotl32_i64(uint32_t value, unsigned shift)
    {
        // assume shift is in the range 0..31 or the subtraction would be wrong
        // however we know the compiler will spot the pattern and replace
        // the expression with a single roll and there will be no subtraction
        // so if the compiler changes this may break without:
        // shift &= 0x1f;
        return (value << shift) | (value >> (32 - shift));
    }

    void test_rotl32(uint32_t value, unsigned shift)
    {
        uint32_t shifted = rotl32_i64(value, shift);
        NSLog(@"%8x <<< %u -> %8x", value & 0xffffffff, shift, shifted & 0xffffffff);
    }
If you look at the assembler output for profiling (so the optimiser kicks in) in Xcode (Product > Generate Output > Assembly File, then select Profiling in the pop-up menu at the bottom of the window) you will see that rotl32_i64 is inlined into test_rotl32 and that it compiles down to a rotate (roll) instruction.
Now, producing the assembler directly yourself is a bit more involved than for the ARM code FrankH showed. This is because, to take a variable shift value, a specific register, cl, must be used, so we need to give the compiler enough information to do that. Here goes:
    static inline uint32_t rotl32_i64_asm(uint32_t value, unsigned shift)
    {
        // i64 - shift must be in register cl so create a register local assigned to cl
        // no need to mask as i64 will do that
        register uint8_t cl asm ( "cl" ) = shift;
        uint32_t shifted;
        // emit the rotate left long
        // %n values are replaced by args:
        //   0: "=r" (shifted) - any register (r), result (=), store in var (shifted)
        //   1: "0" (value)    - *same* register as %0 (0), load from var (value)
        //   2: "r" (cl)       - any register (r), load from var (cl - which is the cl register so that one is used)
        __asm__ ("roll %2,%0" : "=r" (shifted) : "0" (value), "r" (cl));
        return shifted;
    }
Change test_rotl32 to call rotl32_i64_asm and check the assembly output again - it should be the same, i.e. the compiler did as well as we did.
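A possible alternative way to get the count into cl is the "c" operand constraint of GCC/Clang extended asm, which asks for the c register directly instead of a register-bound local. The sketch below is an illustration rather than the version tested above; the name rotl32_c is hypothetical, and the non-x86 branch is an assumption added only so the function still compiles on other architectures.

```c
#include <stdint.h>

/* Hypothetical variant: rotate left by a variable count using the "c" constraint. */
static inline uint32_t rotl32_c(uint32_t value, unsigned shift)
{
#if defined(__x86_64__) || defined(__i386__)
    /* "+r": value is read and written in one general register (%0).
       "c" : the 8-bit count is placed in the c register, named here as %cl.
       "cc": the rotate instruction modifies the flags. */
    __asm__ ("roll %%cl, %0"
             : "+r" (value)
             : "c" ((uint8_t)shift)
             : "cc");
    return value;
#else
    /* Assumption: on other architectures fall back to the two-shift form,
       masking both counts so that a shift of 0 or 32 stays well defined. */
    return (value << (shift & 31)) | (value >> ((32 - shift) & 31));
#endif
}
```

As with the register-local version, no explicit mask is needed on x86 because the instruction itself only uses the low five bits of cl for a 32-bit operand.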
Further, note that if the commented-out masking line in rotl32_i64 is included it essentially becomes rotl32 - the compiler will do the right thing for any architecture, all for the cost of a single and instruction in the i64 version.
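A minimal sketch of that masked, architecture-neutral form is given here. The second mask on the right-shift count is an addition of this sketch, not part of the code above: in C a 32-bit shift by 32 is undefined, which is what 32 - shift becomes when shift is 0.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by shift bits; the count is reduced modulo 32. */
static inline uint32_t rotl32(uint32_t value, unsigned shift)
{
    shift &= 0x1f;                              /* the masking line, now active */
    /* Assumption of this sketch: also mask the complementary count,
       so shift == 0 gives (value >> 0) rather than (value >> 32). */
    return (value << shift) | (value >> ((32 - shift) & 0x1f));
}
```

Whether a particular compiler still reduces this to a single rotate is worth checking in the assembly output, in the same way as described for rotl32_i64.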
So asm is there if you need it, using it can be somewhat involved, and the compiler will invariably do as well or better by itself...
HTH