Monday, 28 August 2017

gcc assembler output of printf arg list

I learned MIPS assembly in a systems-level programming course last semester, and I have now started looking into the Intel and AMD architectures.



I was having trouble writing a simple x86_64 program in GAS that calls printf and prints argc and argv[0] through argv[4]. To see how to do it correctly, I used "gcc -S" to look at the assembly generated for the C source file "test.c":



#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("%d,%s,%s,%s,%s,%s\n", argc, argv[0], argv[1], argv[2], argv[3], argv[4]);
    return 0;
}


The output of "gcc -S -masm=intel test.c" was:



.file "test.c"
.intel_syntax noprefix
.section .rodata
.LC0:
.string "%d,%s,%s,%s,%s,%s\n"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
sub rsp, 32
mov DWORD PTR [rbp-4], edi
mov QWORD PTR [rbp-16], rsi
mov rax, QWORD PTR [rbp-16]
add rax, 32
mov rsi, QWORD PTR [rax]
mov rax, QWORD PTR [rbp-16]
add rax, 24
mov r8, QWORD PTR [rax]
mov rax, QWORD PTR [rbp-16]
add rax, 16
mov rdi, QWORD PTR [rax]
mov rax, QWORD PTR [rbp-16]
add rax, 8
mov rcx, QWORD PTR [rax]
mov rax, QWORD PTR [rbp-16]
mov rdx, QWORD PTR [rax]
mov eax, DWORD PTR [rbp-4]
mov QWORD PTR [rsp], rsi
mov r9, r8
mov r8, rdi
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 4.7.2 20121109 (Red Hat 4.7.2-8)"
.section .note.GNU-stack,"",@progbits


To be honest, I don't think I completely understand what is going on between "mov DWORD PTR [rbp-4], edi" and "mov esi, eax". To me, it looks like gcc stores argc and argv (a pointer to argv[0]) at [rbp-4] and [rbp-16], then loads [rbp-16] into rax as a base pointer, adds 8, 16, 24, ... so that rax points to argv[1], argv[2], argv[3], ..., and then loads the pointer stored at that address into the appropriate register to pass to printf.




With that interpretation, I understood enough about how the command-line arguments are passed to main() to fix my GAS code like this:



.intel_syntax noprefix
.globl main

.data
fmt: .asciz "%d,%s,%s,%s,%s,%s\n"

.text

main:
push rbp
mov rbp, rsp

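mov rdx, QWORD PTR [rsi]        # 3rd arg: argv[0]
mov rcx, QWORD PTR [rsi+8]      # 4th arg: argv[1]
mov r8, QWORD PTR [rsi+16]      # 5th arg: argv[2]
mov r9, QWORD PTR [rsi+24]      # 6th arg: argv[3]
push [rsi+32]                   # 7th arg: argv[4], passed on the stack
mov rsi, rdi                    # 2nd arg: argc (arrives in edi)

mov rdi, offset fmt             # 1st arg: format string
xor rax, rax                    # al = 0: no vector registers used in this variadic call
call printf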

return:
mov rsp, rbp
pop rbp
xor rax, rax
ret



This produces the same output as test.c and as the gcc-generated test.s. So my question is: is there anything wrong with the way I did it? And if not, why would gcc generate such complicated code for something this simple? Maybe it's just the way the compiler handles arrays?



I suppose my way is technically correct since it produces the same output, but I want to make sure it is an "acceptable" way to do it.
