Assembly programming in Windows

Assemblies in .NET

Assemblies form the fundamental units of deployment, version control, reuse, activation scoping, and security permissions for .NET-based applications. An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality. Assemblies take the form of executable (.exe) or dynamic link library (.dll) files, and are the building blocks of .NET applications. They provide the common language runtime with the information it needs to be aware of type implementations.

In .NET Core and .NET Framework, you can build an assembly from one or more source code files. In .NET Framework, assemblies can contain one or more modules. This allows larger projects to be planned so that several developers can work on separate source code files or modules, which are combined to create a single assembly. For more information about modules, see How to: Build a multifile assembly.

Assemblies have the following properties:

Assemblies are implemented as .exe or .dll files.

For libraries that target the .NET Framework, you can share assemblies between applications by putting them in the global assembly cache (GAC). You must strong-name assemblies before you can include them in the GAC. For more information, see Strong-named assemblies.

Assemblies are only loaded into memory if they are required. If they aren’t used, they aren’t loaded. This means that assemblies can be an efficient way to manage resources in larger projects.

You can programmatically obtain information about an assembly by using reflection. For more information, see Reflection (C#) or Reflection (Visual Basic).

You can load an assembly just to inspect it by using the MetadataLoadContext class in .NET Core, or the Assembly.ReflectionOnlyLoad or Assembly.ReflectionOnlyLoadFrom methods in .NET Framework. On .NET Core, the ReflectionOnlyLoad methods are not supported and throw PlatformNotSupportedException.

Assemblies in the common language runtime

Assemblies provide the common language runtime with the information it needs to be aware of type implementations. To the runtime, a type does not exist outside the context of an assembly.

An assembly defines the following information:

Code that the common language runtime executes. Note that each assembly can have only one entry point: DllMain, WinMain, or Main.

Security boundary. An assembly is the unit at which permissions are requested and granted. For more information about security boundaries in assemblies, see Assembly security considerations.

Type boundary. Every type’s identity includes the name of the assembly in which it resides. A type called MyType that is loaded in the scope of one assembly is not the same as a type called MyType that is loaded in the scope of another assembly.

Reference scope boundary. The assembly manifest has metadata that is used for resolving types and satisfying resource requests. The manifest specifies the types and resources to expose outside the assembly, and enumerates other assemblies on which it depends. Microsoft intermediate language (MSIL) code in a portable executable (PE) file won’t be executed unless it has an associated assembly manifest.

Version boundary. The assembly is the smallest versionable unit in the common language runtime. All types and resources in the same assembly are versioned as a unit. The assembly manifest describes the version dependencies you specify for any dependent assemblies. For more information about versioning, see Assembly versioning.

Deployment unit. When an application starts, only the assemblies that the application initially calls must be present. Other assemblies, such as assemblies containing localization resources or utility classes, can be retrieved on demand. This allows apps to be simple and thin when first downloaded. For more information about deploying assemblies, see Deploy applications.

Side-by-side execution unit. For more information about running multiple versions of an assembly, see Assemblies and side-by-side execution.

Create an assembly

Assemblies can be static or dynamic. Static assemblies are stored on disk in portable executable (PE) files. Static assemblies can include interfaces, classes, and resources like bitmaps, JPEG files, and other resource files. You can also create dynamic assemblies, which are run directly from memory and aren’t saved to disk before execution. You can save dynamic assemblies to disk after they have executed.

There are several ways to create assemblies. You can use development tools, such as Visual Studio, that can create .dll or .exe files. You can use tools in the Windows SDK to create assemblies with modules from other development environments. You can also use common language runtime APIs, such as System.Reflection.Emit, to create dynamic assemblies.

Compile assemblies by building them in Visual Studio, building them with .NET Core command-line interface tools, or building .NET Framework assemblies with a command-line compiler. For more information about building assemblies using .NET Core CLI, see .NET Core CLI overview.

To build an assembly in Visual Studio, on the Build menu, select Build.

Assembly manifest

Every assembly has an assembly manifest file. Similar to a table of contents, the assembly manifest contains:

The assembly’s identity (its name and version).

A file table describing all the other files that make up the assembly, such as other assemblies you created that your .exe or .dll file relies on, bitmap files, or Readme files.

An assembly reference list, which is a list of all external dependencies, such as .dlls or other files. Assembly references contain references to both global and private objects. Global objects are available to all other applications. In .NET Core, global objects are coupled with a particular .NET Core runtime. In .NET Framework, global objects reside in the global assembly cache (GAC). System.IO.dll is an example of an assembly in the GAC. Private objects must be in a directory level at or below the directory in which your app is installed.

Because assemblies contain information about content, versioning, and dependencies, the applications that use them needn’t rely on external sources, such as the registry on Windows systems, to function properly. Assemblies reduce .dll conflicts and make your applications more reliable and easier to deploy. In many cases, you can install a .NET-based application simply by copying its files to the target computer. For more information, see Assembly manifest.

Add a reference to an assembly

To use an assembly in an application, you must add a reference to it. Once an assembly is referenced, all the accessible types, properties, methods, and other members of its namespaces are available to your application as if their code were part of your source file.

Most assemblies from the .NET Class Library are referenced automatically. If a system assembly isn’t automatically referenced, for .NET Core, you can add a reference to the NuGet package that contains the assembly. Either use the NuGet Package Manager in Visual Studio, or add a <PackageReference> element for the assembly to the .csproj or .vbproj project. In .NET Framework, you can add a reference to the assembly by using the Add Reference dialog in Visual Studio, or by using the -reference command line option for the C# or Visual Basic compilers.

In C#, you can use two versions of the same assembly in a single application. For more information, see extern alias.

Any sources for learning assembly programming in Windows?

I would like to learn assembly programming for Windows, but I am having trouble finding material to learn from. All the material I see either doesn’t give enough code to program with (it shows just snippets), is too old, or is just theory.

3 Answers

For a long time, the ‘standard’ tutorial beginners start with for Windows assembly programming is Iczelion’s tutorial. Also for Windows assembler programming, the best forum (IMO) to get started is probably MASM32. It has a very active community which is very welcoming and helpful to newcomers and beginners. It sort of depends which particular flavour of assembler you want to learn but IMO, for Windows MASM32 has the best userbase (both in terms of community and resources around) for beginners.

You mention you want to learn RCE (reverse code engineering) also. A very common starting place for reversing on Windows is lena151’s tutorials which potentially is also a nice start if you already know assembler conceptually from having done Linux assembler programming.

Most assembly language programming you would do, especially in a full-OS environment like Windows, will just be snippets anyway (as opposed to a 100% assembly program). The easiest way to get started is to write a C program as a test harness and have it call your assembly language functions. Here’s a simple example:

Build and run (on my Mac with clang; modify for your compiler on windows):

The most important thing to get is the Intel manuals (other manufacturers like AMD will also have their own, but the instructions are very similar):

Those have all the instructions, costs and some guides to programming.

    Assembly Language Windows Programming

    Who says assembly language programming is dead? Keeping with my recent theme of peering inside Windows executable files, I decided to bypass C++ completely and try writing a Windows program entirely in assembly language. I was happy to discover that it’s not difficult, especially if you have a bit of prior assembly experience for any CPU. My first example ASM program is only 17 lines! Granted it doesn’t do very much, but it demonstrates a skeleton that can be extended to create exactly the program I want – no more futzing around with C compiler options to prevent mystery “features” from being added to my code. Yes, I am a control freak.

    1. Minimal Assembly Example

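    Here’s a simple example:

    .686
    .model flat, stdcall
    EXTERN MessageBoxA@16 : proc
    EXTERN ExitProcess@4 : proc
    .const
    msgText db 'Windows assembly language lives!', 0
    msgCaption db 'Hello World', 0
    .code
    Main:
    push 0
    push offset msgCaption
    push offset msgText
    push 0
    call MessageBoxA@16
    push eax
    call ExitProcess@4
    End Main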

    If you’ve got any version of Microsoft Visual Studio installed on your PC, including the free Visual Studio Express versions, then you’ve already got MASM: the Microsoft Macro Assembler. Save the example file as msgbox.asm, and use MASM to build it from the command line like this:

    ml /coff /c /Cp msgbox.asm
    link /subsystem:windows /out:msgbox.exe kernel32.lib user32.lib msgbox.obj

    That doesn’t look too complicated. Let’s examine it line by line.

    .686
    This tells the assembler to generate x86 code that’s compatible with the Intel 686 CPU or later, aka the Pentium Pro. Any Intel-based machine from the past 15-20 years will be able to run this, so it’s a good generic default. You can also use .386, .486, or .586 here if you want to avoid generating any instructions not compatible with those older CPUs.

    .model flat, stdcall
    The memory model for all Win32 programs is always flat. The second parameter gives the default calling convention for procedures exported from this file, and can be either C or stdcall. Nothing is exported in this example, so the choice doesn’t really matter, but I’ll choose stdcall.

    When one function calls another, it must somehow pass the arguments to the called function. The caller and callee must agree on where the arguments will be placed, and in what order, or else the code won’t work correctly. If the arguments are passed on the stack, then the two functions must also agree on who’s responsible for popping them off afterwards, so the stack can be restored to its original state. These details are known as the calling convention.

    Nearly all of the Win32 API functions use the __stdcall convention (variadic functions such as wsprintf are the exception), while C functions and the C library use the __cdecl (or just plain “C”) convention. You may also rarely see the __fastcall convention; look it up for more details. The stdcall and cdecl conventions are similar: both pass arguments on the stack, and the arguments are pushed in right to left order. So a function that takes three arguments, arg1, arg2, and arg3, is called by pushing arg3 onto the stack first, followed by arg2 and arg1.

    These two conventions only differ regarding stack cleanup. With cdecl, the calling function is responsible for removing arguments from the stack, whereas with stdcall it’s the called function’s responsibility to do stack cleanup before it returns.

    EXTERN MessageBoxA@16 : proc
    EXTERN ExitProcess@4 : proc
    These lines tell MASM that the code makes reference to two externally-defined procedures. When the code is assembled into an .obj file, references to these procedures will be left pending. When the .obj file is later linked to create the finished executable, it must be linked with other .obj files or libraries that provide the definitions for these external references. If definitions aren’t found, you’ll see the familiar linker error message complaining of an “unresolved external symbol”.

    The funny @4 and @16 at the end of the function names are the standard method of name mangling for stdcall functions, including nearly all Win32 functions. A suffix is added to the name of the function, with the @ symbol and the total number of bytes of arguments expected by the function. This mangled name is the symbol that appears in the .obj file or library, and not the original name. The actual symbol name is also prefixed with an underscore, e.g. _MessageBoxA@16, but MASM handles this automatically by prefixing an underscore to all statically imported or exported public symbols.

    To find the number of bytes of arguments expected by a Win32 stdcall function, you can view the online MSDN reference and add up the argument sizes manually, or you can use something like dumpbin /symbols user32.lib to view the mangled names of functions in an import library.

    For cdecl functions, there’s no name mangling. The name of the symbol is just the name of the function prefixed with an underscore, e.g. _strlen.

    Most of the time you don’t see this level of detail, because the compiler or assembler knows the calling convention and argument list of any functions you call, so it can do name mangling automatically behind the scenes. But in this example, I never told MASM what the calling convention is for MessageBox or ExitProcess, nor the number and sizes of the arguments they expect, so it can’t help with name mangling and I have to provide the mangled names manually. In a minute, I’ll show a nicer way to handle this with MASM.

    .const
    The .const directive indicates that whatever follows is constant read-only data, and should be placed in a separate section of the executable called .rdata. The memory for this section will have the read-only attribute enforced by the Windows virtual memory manager, so buggy code can’t modify it by mistake. Other possible data-related section directives are .data for read-write data, and .data? for uninitialized read-write data.

    msgText db 'Windows assembly language lives!', 0
    msgCaption db 'Hello World', 0
    The next lines allocate and initialize storage for two pieces of data named msgText and msgCaption. Because the previous line was the .const directive, this data will be placed in the executable’s .rdata section. db is the assembler directive for “define byte”, and is followed by a list of comma separated byte values. The values can be numeric constants, string literals, or a mix of both as shown here. The 0 after each string literal is the null terminator byte for C-style strings.

    .code
    .code indicates the start of a new section, and whatever follows is program code rather than data. It will be placed in a section of the executable called .text. Why doesn’t the directive match the section name?

    Main:
    Here the code defines a label called Main, which can then be used as a target for jump instructions or other instructions that reference memory. Main refers to the address at which the next line of code is assembled. There’s nothing magic about the word “Main” here, and label names can be anything you want as long as they’re not MASM keywords.

    push 0
    push offset msgCaption
    push offset msgText
    push 0
    This code pushes the arguments for MessageBox onto the stack, in right to left order as required by the stdcall convention. According to MSDN, the prototype of MessageBox is:

    int WINAPI MessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType);

    The first argument pushed onto the stack is the value for uType, a 4-byte unsigned integer. The value 0 here corresponds to the constant MB_OK, and means the MessageBox should contain a single push button labeled “OK”. Next the addresses of the caption and text string constants are pushed. The offset keyword tells MASM to push the memory address of the strings, and not the strings themselves, and is similar to the & operator in C. Finally the hWnd argument is pushed, which is a handle to the owner of the message box. The value 0 used here means the message box has no owner.

    call MessageBoxA@16
    Now the Win32 MessageBox function is finally called. call will push the return address onto the stack, and then jump to the address of _MessageBoxA@16. MessageBox will use the arguments previously pushed onto the stack, display a message box, and wait for the user to click the OK button before returning. Because it’s a stdcall function, MessageBox will also remove the arguments from the stack before returning to the caller. The return value from calling MessageBox will be placed in the EAX register, which is the standard convention for Win32 functions.

    Notice that the code specifically called MessageBoxA, with an A suffix that indicates the caption and text are 8-bit ANSI strings in the system code page. The alternative is MessageBoxW, which expects wide Unicode (UTF-16) strings with two bytes per code unit. Many Win32 functions exist with both -A and -W variants like this.

    push eax
    call ExitProcess@4
    The return value from MessageBox is pushed onto the stack, and ExitProcess is called. Its prototype looks like:

    VOID ExitProcess(UINT uExitCode);

    It takes a single argument for the program’s exit code. In this example, whatever value is returned by MessageBox will be used as the exit code. This is the end of the program – the call to ExitProcess never returns, because the program is terminated.

    End Main
    The end statement closes the last segment and marks the end of the source code. It must be at the end of every file. The optional address following end specifies the program’s entry point, where execution will begin after the program is loaded into memory. Alternatively, the entry point can be specified on the command line during the link step, using the /entry option.

    ml /coff /c /Cp msgbox.asm
    link /subsystem:windows /out:msgbox.exe kernel32.lib user32.lib msgbox.obj
    ml is the name of the MASM assembler. Running it will create the msgbox.obj file.
    /coff instructs MASM to create an object file in COFF format, compatible with recent Microsoft C compilers, so you can combine assembly and C objects into a single program.
    /c tells MASM to perform only the assembly step, stopping after creation of the .obj file, rather than also attempting to do linking.
    /Cp tells MASM to preserve the capitalization case of all identifiers.

    link is the Microsoft linker, the same one that’s invoked behind the scenes when building C or C++ programs from Visual Studio.
    /subsystem:windows means this is a Windows GUI-based program. Change this to /subsystem:console for a text-based program running in a console window.
    /out:msgbox.exe is the name to give the executable file that will be generated.

    The remainder of the line specifies the libraries and object files to be linked. MessageBox is implemented in user32 and ExitProcess in kernel32, so I’ve included those libraries. I didn’t provide the path to the libraries, so the linker will search the directories specified in the LIB environment variable. The Visual Studio installer normally creates a shortcut in the start menu to help with this: it’s called “Developer Command Prompt for Visual Studio”, and it opens a console window with the LIB and PATH environment variables set appropriately for wherever the development tools are installed.

    2. Improvements with MASM Macros and MASM32

    MASM is a “macro assembler”, and contains many macros that can make assembly programming much more convenient. For starters, I could define some constants to replace the magic zeroes in the arguments to MessageBox:

    NULL equ 0
    MB_OK equ 0

    In the preceding example, I had to do manual name mangling of Win32 function names, and push the arguments onto the stack one at a time. This can be avoided by using the MASM directives PROTO and INVOKE. Much like a function prototype in C, PROTO tells MASM what calling convention a function uses, and the number and types of the arguments it expects. The function can then be called in a single line using INVOKE, which will verify that the arguments are correct, perform any necessary name mangling, and generate push instructions to place the arguments on the stack in the proper order. Using these directives, the lines related to MessageBoxA in the example program could be condensed like this:

    MessageBoxA proto stdcall :DWORD,:DWORD,:DWORD,:DWORD
    invoke MessageBoxA, NULL, offset msgText, offset msgCaption, MB_OK

    Many people using MASM will use it in combination with MASM32, which provides a convenient set of include files containing prototypes for common Windows functions and constants. This enables the relevant lines of the MessageBox example to be further simplified to:

    include \masm32\include\windows.inc
    include \masm32\include\user32.inc
    invoke MessageBoxA, NULL, offset msgText, offset msgCaption, MB_OK

    Take a look at Iczelion’s excellent tutorial for a MessageBox example program making good use of all the MASM and MASM32 convenience features.

    3. Structured Programming with MASM

    The biggest headache writing any kind of non-trivial assembly language program is that all the little details quickly become tedious. A simple if/else construct must be written as a CMP instruction combined with a few conditional and unconditional jumps around the separate clauses. Allocating and using local variables on the stack is a pain. Working with objects and structures requires calculating the offset of each field from the base of the structure. It’s a giant hassle.

    Nothing can relieve all the tedium (this is assembly language after all), but MASM is a big help. Directives like .IF, .ELSE, and LOCAL make it possible to write assembly code that almost looks like C. Instructions are automatically generated to reserve and free space for stack-based locals, and the locals can be referenced by name instead of with awkward constructs like EBP-8. MASM also supports the declaration of C-style structs with named and typed fields. The result can be assembly code that’s surprisingly readable. Borrowing snippets from another Iczelion tutorial:

    This almost reads like C, and you might wonder how different it really is from writing C code. Despite the appearance, it’s still 100 percent assembly language, and the instructions in the .asm file are exactly what will appear in the final executable. There’s no optimization happening, no instruction reordering, and no true code generation in any complex sense. Directives like LOCAL that hide individual assembly instructions are just complex macros.

    If I find enough motivation, I’ll write another post soon that shows a more full-featured assembly language program using these techniques. Now if you want to know WHY in the 21st century someone would write Windows programs in assembly language, I don’t have a great answer. It might be useful if you need to do something extremely specific or performance critical. But if you’re like me, the only reason needed is the fact that it’s there, underlying everything that’s normally done with higher level languages. Whenever I see a black box like that, I want to open the lid and peek inside.

    7 Comments so far

    So what size binary did you end up with, and how does the layout compare to the C version? 🙂

    I personally like http://www.nasm.us/ for my assembly/disassembly needs. Cross-platform and does macros too.

    As for why I use it in the year 2015, well, I’m writing a compiler that targets 64-bit x86 (currently) and directly generates code and writes out an ELF or PE-COFF file, so I find it handy for testing stuff out/disassembling chunks of code. I wrote pretty much the direct equivalent of your program here, actually, with a bit more gubbins 😉

    64-bit x86 doesn’t seem to hold with mangling function names as described, probably because it all uses only the one calling convention anyway.

    The final executable was very similar to the one I built with C with a custom entry point. I did some fooling around afterwards, and shrunk it down further to 1024 bytes by merging all the sections into one. I think this is as small as you can get with the standard tools, because you need a header and at least one section, and the section must be 512 byte aligned and will be padded out to 512 bytes. You can get smaller if you construct the PE header and executable file by hand instead of with the linker, and do sneaky things like embed code/data inside the header itself.

    NASM seems very popular – maybe even more than MASM. I haven’t tried NASM yet. The first few examples I looked at happened to be MASM ones, so I just went with it.

    That’s interesting about the Microsoft x64 calling convention. I’d never heard of that before.

    Also procedural GUI programming in ASM was quite popular on the Apple IIgs. It’s also quite tedious without macros.

    From that webpage… ‘It runs on any Windows platform (Win3/95/98), except NT, since it depends upon the virtual timer vxd (VTD) that’s not present in NT. ‘

    That’s, uh, pretty retro!

    Dear Sir,
    I have downloaded MASM32 and installed it on my notebook, which runs 32-bit Windows Vista.
    I opened a source file (hello world) in the editor.
    So far I have not been able to find any command that I can type in the editor to make the program run and show its output.
    I would be grateful if you could let me know what commands I have to type to get the desired output on the screen.
    Kind regards
    Vahe
