Assembler for Mac OS


Intel assembler on Mac OS X

Apr 25, 2007 • uliwitness

I’ve always wanted to learn another assembler, and with one of my colleagues being a real assembler guru, and the Intel reference books on my bookshelf, and the Intel switch just behind us, I thought this would be a good opportunity to finally get going with x86 assembler.

Now, assembler programming under Mac OS X isn’t quite as well documented as one would wish. There’s no tutorial that I could find (lots of tutorials for Linux and Windows, but none for Mac OS X yet). This won’t be one either. Rather, this is a blog posting of me sharing what I found out about assembler on OS X, and it is probably only useful to someone who already knows some assembler, but just doesn’t know Intel on Mac OS X. My main approach is to compile C source code into assembler source files using GCC. Then I can look at that code and find out which assembler instructions correspond to which C statement. If all of this turns out to be correct and I should happen to have loads of time on my hands, I may still go out there and turn this into a decent tutorial.

The basics are pretty simple

Now, the underscore in front of “main” is how the Mac OS X toolchain names C symbols, so just accept it. When you enter the _main function, the return address (i.e. the instruction where the program will continue after the function has finished, aka “back pointer”) has already been pushed on the stack, taking up 4 bytes. We also save the base pointer (the point where our caller can find its parameters on the stack) to the stack, and set it to the current stack pointer (which is where our parameters are). That takes another 4 bytes, so we have 8 bytes now. Since the stack should be aligned on 16 bytes before you can make a call to another function, we subtract another 8 from the stack pointer, which pads out the stack (we could also just do two “pushl $0” for the same effect). If we used any local variables, we would use this opportunity to subtract more for them.

Now comes the actual body of our function. What we do is simply return 0. This is done by stuffing 0 in the eax register.

Finally, we have the tail end of our function, which executes leave (which cleans up by restoring our caller’s base pointer and stack pointer) and then ret, which pops the return address off the stack and continues execution there.
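The byte counting in the prologue can be double-checked with a few lines of C. This is only a sketch of the arithmetic, not code from the original post: `stack_padding` is a made-up helper name, and it assumes nothing beyond "round the bytes already on the stack up to the next multiple of the alignment".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many padding bytes must be subtracted from the
   stack pointer so that `used` bytes already on the stack are rounded up
   to a multiple of `alignment`. */
size_t stack_padding(size_t used, size_t alignment)
{
    assert(alignment != 0);
    size_t remainder = used % alignment;
    return remainder == 0 ? 0 : alignment - remainder;
}
```

With 4 bytes of return address plus 4 bytes of saved base pointer, `stack_padding(8, 16)` gives 8, which matches the 8 bytes subtracted from the stack pointer in the text.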

Calling a local function

Calling a function is fairly simple, as long as it’s a local one right in the same file as ours. In that case, what you do is you first declare that function:

“nop” is a do-nothing instruction I just inserted here to show where doSomething’s code would go. That’s pretty easy. You just write the function, put the parameters on the stack and use call to jump to the function, and that will take care of pushing the return address and all that. The only tricky thing is passing the parameters. You have to pad first, and then push (or mov, in our case) the parameters in reverse order (i.e. #1 ends up at the lowest address, right where the stack pointer points, #2 above it etc.). That’s because otherwise the function being called would have to skip the padding. Well, could be worse.
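To make the "reverse order" rule concrete, here is a toy model of the stack in C. It is an illustration under assumptions, not the post's assembler listing: `ToyStack`, `toy_init`, `toy_push` and `toy_push_args` are invented names, the stack is just an array of 32-bit words, and "growing" means the index moves toward 0.

```c
#include <assert.h>
#include <stdint.h>

enum { TOY_STACK_WORDS = 16 };

/* A toy 32-bit stack: sp is an index into words[], and pushing moves it
   toward index 0, mimicking a stack that grows toward lower addresses. */
typedef struct {
    uint32_t words[TOY_STACK_WORDS];
    int sp;
} ToyStack;

/* An empty stack starts just past the highest slot. */
void toy_init(ToyStack *stack)
{
    stack->sp = TOY_STACK_WORDS;
}

/* Push one word: move the stack pointer down, then store. */
void toy_push(ToyStack *stack, uint32_t value)
{
    assert(stack->sp > 0);
    stack->words[--stack->sp] = value;
}

/* Push argc arguments last-to-first, so argument #1 ends up at the
   lowest address (index sp) and argument #2 one word above it. */
void toy_push_args(ToyStack *stack, const uint32_t *args, int argc)
{
    for (int i = argc - 1; i >= 0; i--)
        toy_push(stack, args[i]);
}
```

After pushing the arguments {111, 222} this way, `words[sp]` holds 111 and `words[sp + 1]` holds 222, so the callee finds parameter #1 first no matter how many parameters follow.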

Accessing parameters

To access any parameters, you address relative to the base pointer. The value immediately at the base pointer is your caller’s base pointer, followed by the return address, so you need to skip 4 + 4 = 8 bytes to reach the first parameter. Yes, since the stack starts at the end of memory and grows towards the beginning, and you subtract from the stack pointer to make the stack larger, you need to add to the stack pointer to find something on the stack. The same applies to our base pointer, of course:

Would store your second parameter in eax and then add the first parameter to it, leaving the result in eax, where it’s ready for use as a return value. Note the ##(foo) syntax, which adds the number ## to the pointer foo. This is register-relative addressing.
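The offsets can be written down as a small C model. This is a sketch under stated assumptions, not the post's listing: every parameter is taken to be a 4-byte value, the frame is just a byte buffer whose first byte is where the base pointer points, and `param_offset`, `read_param` and `add_first_two_params` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    SLOT_SIZE          = 4, /* assumed: every parameter fills one 4-byte slot */
    SAVED_BASE_POINTER = 4, /* pushed by the callee's prologue */
    RETURN_ADDRESS     = 4  /* pushed by the call instruction */
};

/* Byte offset from the base pointer to parameter n (1-based). */
int param_offset(int n)
{
    assert(n >= 1);
    return SAVED_BASE_POINTER + RETURN_ADDRESS + (n - 1) * SLOT_SIZE;
}

/* Read parameter n from a frame image that starts at the base pointer. */
uint32_t read_param(const unsigned char *base_pointer, int n)
{
    uint32_t value;
    memcpy(&value, base_pointer + param_offset(n), sizeof value);
    return value;
}

/* Mirrors the description above: load parameter #2 into an accumulator
   (playing the role of eax), then add parameter #1 to it. */
uint32_t add_first_two_params(const unsigned char *base_pointer)
{
    uint32_t accumulator = read_param(base_pointer, 2);
    accumulator += read_param(base_pointer, 1);
    return accumulator;
}
```

So `param_offset(1)` is 8 and `param_offset(2)` is 12, which is what the number in front of the parentheses expresses in register-relative addressing.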

An added benefit of this is that you can actually pass more parameters to a function than it knows to handle, and it will just ignore the rest.

Fetching data

To access data (e.g. strings), it gets trickier. You declare data like the following:

So, you add a .cstring section at the top of the function, and in that you declare a label and use the .ascii keyword to actually stash your string there. So far, so good, there’s only one problem:

All data manipulation is done using absolute addresses. But we don’t know at what position in memory our program will be loaded. Labels aren’t absolute addresses, they get compiled into relative offsets from the start of our code. So, how do we find out at which absolute address our string myHelloWorld is? Well, the trick Mach-O uses is that it knows that our program will be loaded as one huge chunk. So, we know that the distance between any instruction in our code and our string will always stay the same.

So, if we could only get the address of one instruction in our code that has a label, we could calculate the absolute address of our string from that. Now, look above, at our function call code. Notice anything? Our return address is an absolute pointer to the next instruction after a function call. So, all we need to do to get our address is call a function. When you assemble C source code, they call this helper function ___i686.get_pc_thunk.bx, which is quite a mouthful. Let’s just call it _nextInstructionAddress:

That’s what we call somewhere at the start of our code to find our own address. Note how I cleverly already added a label myAnchorPoint, which labels the instruction whose address we’ll get. Then we somewhere (e.g. at the bottom) define that function:

We don’t even bother aligning the stack or changing and restoring the base pointer. This simply peeks at the last item on the stack (the return address) and stashes that in register ebx. Then it returns (and obviously doesn’t call leave because we pushed no base pointer that it could restore).

Once we have this address in ebx, we can do the following to get our string’s address into a register, and from there onto the stack:

LEA means “Load Effective Address”, i.e. take an address and stash it into a register. myHelloWorld-myAnchorPoint calculates the difference between our two labels, and thus tells us how far myHelloWorld is from myAnchorPoint. Since myHelloWorld is probably at the start of the program, e.g. at address 3 maybe, and myAnchorPoint further down, say at address 20, what we get is a negative value, e.g. -17. And xxx(%ebx) is how you tell the assembler that you want to add an offset to a register to get a memory address. ebx contains the address of myAnchorPoint, so what this does is subtract 17 from myAnchorPoint’s absolute address, giving us the absolute address of myHelloWorld! Whooo! And this mess is called “position-independent code”.

Now, our LEAL instruction loads a “Long” (which is 32 bits, i.e. the size of a pointer on a 32-bit CPU) and stashes it into register eax. And the movl instruction moves that long from our register into the last item on the stack, ready for use as a parameter to a function.
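The label arithmetic above can be replayed in C with the same example numbers (3, 20 and -17). This is a sketch of the calculation only: `resolve_label` is a made-up name, and the "run-time address" is simply whatever integer the caller passes in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the run-time address of the anchor label and
   the link-time offsets of both labels, recover the run-time address of
   the target label. The displacement is fixed at assembly time; only the
   anchor's address is discovered while the program runs. */
uintptr_t resolve_label(uintptr_t anchor_runtime_address,
                        ptrdiff_t target_offset,
                        ptrdiff_t anchor_offset)
{
    ptrdiff_t displacement = target_offset - anchor_offset; /* e.g. 3 - 20 = -17 */
    /* Unsigned arithmetic wraps, so adding a negative displacement
       converted to uintptr_t moves the address backwards as intended. */
    return anchor_runtime_address + (uintptr_t)displacement;
}
```

If the program happened to be loaded at 0x1000, the anchor would sit at 0x1014 and `resolve_label(0x1014, 3, 20)` would give 0x1003, the string's actual address. Load it somewhere else and the same -17 still works, which is the whole point.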

Calling a system function

Now, it’d be really nice if we could printf() or something, right? Well, trouble is, we don’t know the address of printf(). But this time it’s actually easy. We add a new section at the bottom of our code:

This is a new section named __IMPORT,__jump_table. It has the type symbol_stubs and the attributes self_modifying_code and pure_instructions. 5 is the size of each stub in bytes, and intentionally is the same as the number of hlt statements below.

This section is special, because when our code is loaded, the loader will look at it. It will see that there is an .indirect_symbol directive for a function named “printf”, and will look up that function. Then it will replace the five hlt instructions, each of which is one byte in size, with an instruction to jump to that address (hence the self_modifying_code). We also added a label for each indirect symbol, which we name the same as the symbol, just with “_stub” appended.

So, to call printf, all you have to do now is push the string on the stack and then

Which will jump to _printf_stub and immediately continue to printf itself. And just to show you that you can have several such imported symbols, I’ve also included a stub for getchar. Now note that the system usually doesn’t name these symbols “_foo_stub”, but rather “L_foo$stub” (yes, a label name can contain dollar signs. You can even put the label in quotes and have spaces in it). Same difference.
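As a rough analogy in C, a stub can be pictured as a slot that starts out empty and gets filled in by a lookup before it is first used. This is an illustration only: the real loader patches machine code in the jump table, whereas this sketch fills in function pointers, and every name in it (`Stub`, `ExportedSymbol`, `toy_library`, `bind_stubs`, `call_stub`, `toy_double`, `toy_negate`) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*StubTarget)(int);

/* One jump-table entry: the symbol name to look up, plus a target that
   stays NULL until bound (the stand-in for the hlt placeholder bytes). */
typedef struct {
    const char *symbol;
    StubTarget  target;
} Stub;

/* One entry of the pretend system library that the "loader" searches. */
typedef struct {
    const char *symbol;
    StubTarget  address;
} ExportedSymbol;

static int toy_double(int x) { return 2 * x; }
static int toy_negate(int x) { return -x; }

static const ExportedSymbol toy_library[] = {
    { "toy_double", toy_double },
    { "toy_negate", toy_negate },
};

/* "Loader" pass: resolve every stub by name.
   Returns the number of stubs that could not be bound. */
int bind_stubs(Stub *stubs, size_t count)
{
    const size_t exported = sizeof toy_library / sizeof toy_library[0];
    int unbound = 0;

    for (size_t i = 0; i < count; i++) {
        stubs[i].target = NULL;
        for (size_t j = 0; j < exported; j++) {
            if (strcmp(stubs[i].symbol, toy_library[j].symbol) == 0)
                stubs[i].target = toy_library[j].address;
        }
        if (stubs[i].target == NULL)
            unbound++;
    }
    return unbound;
}

/* Calling through a stub: jump to the stub, continue to the real function. */
int call_stub(const Stub *stub, int arg)
{
    assert(stub->target != NULL);
    return stub->target(arg);
}
```

The caller only ever names the stub; where the real function lives is decided once, at bind time, which is the property the jump table gives the assembler code.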

Okay, so that’s how much I’ve guessed my way through it so far. Comments? Corrections? If you want

PS: Thanks to John Kohr, Alexandre Colucci, Jonas Maebe, Eric Albert and Jordan Krushen, all of whom helped me figure this out one way or the other. Thanks, guys!

Update: Added mention of how to actually access parameters.

Assembler 4+

Quote-Unquote Apps

Description

Assembler is the remarkably useful utility for joining together text files — including Fountain, Markdown and .csv. If you have a bunch of little files and need to make a big one, this is your app.

Assembler saves you the hassle of lots of copy-and-pasting, or obscure terminal commands.

1. Drag-and-drop to add files. You can even add a folder at once.
2. Arrange files in the order you want them assembled.
3. Click Save and you’ll get a brand-new file with all the pieces put together.

If you have Highland installed, you can even open the new file directly in the app.

Assembler is a godsend for screenwriters working in Fountain. Write your scenes separately, then combine them only when you need to.

For writers working in plain text or Markdown, Assembler makes it simple to combine sections and chapters.

If you find yourself working with .csv files — such as PayPal exports, or Kickstarter backer reports — Assembler makes it quick and easy to merge them into a single file.

App Privacy

The developer, Quote-Unquote Apps, has not provided Apple with details about its privacy practices and handling of data.

No Details Provided

The developer will be required to provide privacy details when they submit their next app update.


Assembler on a Mac? Yes We Can!

A list of sample assembly programs that demonstrate how to program using machine instructions. Each program in this project is well documented. Use this README.md to get started, then jump to ASSEMBLER.md to go further.

| Program | Description |
| --- | --- |
| hello.s | Have a look at this first Hello World assembly code |
| formatstring.s | Displays a formatted string on screen |
| parameters.s | Shows the usage of parameters when calling a program or function |
| operations.s | Sample program for debugging common instructions |
| registers.s | Assembly program that shows how registers are addressed |

Please note that you need the Unix as (assembler) and ld (linker) utilities to use the sample programs included in this project. These utilities are installed with the command line developer tools included in Xcode. The easiest way to install them is to open Terminal and run the ld command; if you don’t have them, you should get a prompt to start the install.

Use the included shell script utility asm.sh to compile, link and run assembly code. Format is:

This utility will automatically call as to assemble the source file into an object file (.o). It will then call the linker ld to create an executable from the object file. As an example, the following command will assemble, link and run the hello.s assembly code:

This will produce hello.o object code and hello executable. This last one can also be directly started from the command line:

Important Note

You may need to specify which version of Mac OS X you are using in the asm.sh script:

Debug assembly code

You can use lldb to debug an executable program. For example the following command will start a debug session:

| Debug Command | Description |
| --- | --- |
| `b main` | Set a breakpoint at the start symbol (main) of a program |
| `run` | Run code until a breakpoint is hit |
| `run par1 par2` | Run code using input parameters |
| `b 0x1f8d` | Set a breakpoint at address 0x1f8d |
| `s` | Step into instruction (i.e. step into a call statement) |
| `n` | Step over instruction (i.e. step over a call statement) |
| `c` | Continue execution until a breakpoint is hit |
| `q` | Terminate execution and exit lldb |
| `register read` | Show content of main registers (abbreviated `re r`) |
| `re read esp eax` | Show content of the esp and eax registers |
| `re write eax 0xF12F` | Write content to the eax register |
| `memory read 0xbffffb8c` | Read content of a memory address |
| `x 0xbffffb8c` | Same as `memory read`, abbreviated form |
| `x --count 100 0xbffffb8c` | Read 100 bytes from memory address 0xbffffb8c |
| `watch set e 0x1f67` | Watch for changes at a memory address (acts like a breakpoint) |
| `gui` | When entered after `run`, shows the debugger in a GUI |

TIP: After entering a run command in lldb, try using the gui command as well.

Some handy shell commands

| Command | Description |
| --- | --- |
| `hexdump -C FileName` | Hexadecimal dump of FileName. Tip: pipe through `head -n10` |
| `gcc -S prg.c -m32 -Os` | Generate assembly code from a C program |
| `lldb Program` | Debug an executable program |

Dive deeper in Assembler by reading ASSEMBLER.md.

A continuous learning path where passion is the drive.
