core.sys.darwin.mach.loader

This file describes the format of Mach-O object files.

D header file for mach-o/loader.h from the macOS 10.15 SDK.

Members

Enums

BIND_TYPE_POINTER
anonymous enum BIND_TYPE_POINTER

The following are used to encode binding information.

DICE_KIND_DATA
anonymous enum DICE_KIND_DATA

EXPORT_SYMBOL_FLAGS_KIND_MASK
anonymous enum EXPORT_SYMBOL_FLAGS_KIND_MASK

The following are used on the flags byte of a terminal node in the export information.

INDIRECT_SYMBOL_LOCAL
anonymous enum INDIRECT_SYMBOL_LOCAL

An indirect symbol table entry is simply a 32-bit index into the symbol table to the symbol that the pointer or stub is referring to, unless it is for a non-lazy symbol pointer section for a defined symbol which strip(1) has removed, in which case it has the value INDIRECT_SYMBOL_LOCAL. If the symbol was also absolute, INDIRECT_SYMBOL_ABS is or'ed with that.
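As a minimal sketch of how a reader of the indirect symbol table might classify one entry, following the description above literally. The two constant values are those given in mach-o/loader.h; the enum indirect_kind and the function classify_indirect are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Values as defined in mach-o/loader.h. */
#define INDIRECT_SYMBOL_LOCAL 0x80000000u
#define INDIRECT_SYMBOL_ABS   0x40000000u

/* Hypothetical classification of one indirect symbol table entry. */
enum indirect_kind {
    INDIRECT_INDEX,     /* ordinary index into the symbol table */
    INDIRECT_LOCAL,     /* defined symbol removed by strip(1) */
    INDIRECT_LOCAL_ABS  /* as above, and the symbol was absolute */
};

static enum indirect_kind classify_indirect(uint32_t entry)
{
    if (entry == (INDIRECT_SYMBOL_LOCAL | INDIRECT_SYMBOL_ABS))
        return INDIRECT_LOCAL_ABS;
    if (entry == INDIRECT_SYMBOL_LOCAL)
        return INDIRECT_LOCAL;
    /* Anything else is treated as a plain symbol table index here. */
    return INDIRECT_INDEX;
}
```

A real tool would also bounds-check an INDIRECT_INDEX result against the number of symbols before using it.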

LC_LOAD_WEAK_DYLIB
enum LC_LOAD_WEAK_DYLIB

Load a dynamically linked shared library that is allowed to be missing (all symbols are weak imported).

LC_REQ_DYLD
enum LC_REQ_DYLD

After Mac OS X 10.1, when a new load command is added that is required to be understood by the dynamic linker for the image to execute properly, the LC_REQ_DYLD bit will be or'ed into the load command constant. If the dynamic linker sees such a load command that it does not understand, it will issue an "unknown load command required for execution" error and refuse to use the image. Other load commands without this bit that are not understood will simply be ignored.
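The rule can be sketched as a single bit test. The constant values below are the ones from mach-o/loader.h; must_understand is a hypothetical helper name, and the sketch only illustrates the decision, not how dyld itself is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in mach-o/loader.h. */
#define LC_REQ_DYLD        0x80000000u
#define LC_LOAD_DYLIB      0xcu
#define LC_LOAD_WEAK_DYLIB (0x18u | LC_REQ_DYLD)

/* Hypothetical helper: true if a loader that does not recognize `cmd`
 * has to reject the image instead of skipping the command. */
static bool must_understand(uint32_t cmd)
{
    return (cmd & LC_REQ_DYLD) != 0;
}
```

With these definitions, an unknown command such as LC_LOAD_WEAK_DYLIB must be rejected, while an unknown command without the bit (for example LC_LOAD_DYLIB, were it unknown) could be skipped.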

LC_SEGMENT
anonymous enum LC_SEGMENT

Constants for the cmd field of all load commands, the type.

LC_SEGMENT_64
anonymous enum LC_SEGMENT_64

MH_MAGIC
anonymous enum MH_MAGIC

Constant for the magic field of the mach_header (32-bit architectures).

MH_MAGIC_64
anonymous enum MH_MAGIC_64

Constant for the magic field of the mach_header_64 (64-bit architectures).
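A minimal sketch of using the magic field to tell the two header layouts apart. MH_MAGIC and MH_MAGIC_64 are listed above; MH_CIGAM and MH_CIGAM_64 are the byte-swapped forms that mach-o/loader.h also defines, included here as an assumption that the reader may see files of the opposite byte order. The function name mach_header_bits is hypothetical.

```c
#include <stdint.h>

/* Values as defined in mach-o/loader.h. */
#define MH_MAGIC    0xfeedfaceu
#define MH_CIGAM    0xcefaedfeu  /* MH_MAGIC with bytes swapped */
#define MH_MAGIC_64 0xfeedfacfu
#define MH_CIGAM_64 0xcffaedfeu  /* MH_MAGIC_64 with bytes swapped */

/* Hypothetical helper: returns 32 or 64 for a recognized Mach-O magic
 * value (read as a host-order uint32_t), or 0 for anything else. */
static int mach_header_bits(uint32_t magic)
{
    switch (magic) {
    case MH_MAGIC:
    case MH_CIGAM:
        return 32;
    case MH_MAGIC_64:
    case MH_CIGAM_64:
        return 64;
    default:
        return 0;
    }
}
```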

MH_NOUNDEFS
anonymous enum MH_NOUNDEFS

Constants for the flags field of the mach_header.

MH_OBJECT
anonymous enum MH_OBJECT

The layout of the file depends on the filetype. For all but the MH_OBJECT file type, the segments are padded out and aligned on a segment alignment boundary for efficient demand paging. The MH_EXECUTE, MH_FVMLIB, MH_DYLIB, MH_DYLINKER and MH_BUNDLE file types also have the headers included as part of their first segment.

PLATFORM_MACOS
anonymous enum PLATFORM_MACOS

Known values for the platform field of build_version_command.

REBASE_TYPE_POINTER
anonymous enum REBASE_TYPE_POINTER

The following are used to encode rebasing information.

SECTION_ATTRIBUTES_USR
anonymous enum SECTION_ATTRIBUTES_USR

Constants for the section attributes part of the flags field of a section structure.

SECTION_TYPE
anonymous enum SECTION_TYPE

The flags field of a section structure is separated into two parts: a section type and section attributes. The section types are mutually exclusive (a section can only have one type) but the section attributes are not (a section may have more than one attribute).
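The split can be sketched with two masks. The constant values are those from mach-o/loader.h; the accessor names section_type and section_attributes are hypothetical and exist only in this example.

```c
#include <stdint.h>

/* Values as defined in mach-o/loader.h. */
#define SECTION_TYPE             0x000000ffu  /* low 8 bits: the type */
#define SECTION_ATTRIBUTES       0xffffff00u  /* high 24 bits: attributes */
#define S_REGULAR                0x0u
#define S_ZEROFILL               0x1u
#define S_ATTR_PURE_INSTRUCTIONS 0x80000000u
#define S_ATTR_SOME_INSTRUCTIONS 0x00000400u

/* Hypothetical accessors: split a section's flags word into its parts. */
static uint32_t section_type(uint32_t flags)
{
    return flags & SECTION_TYPE;
}

static uint32_t section_attributes(uint32_t flags)
{
    return flags & SECTION_ATTRIBUTES;
}
```

Because the type occupies a single field, it is compared for equality (section_type(flags) == S_ZEROFILL), whereas attributes are tested bit by bit (section_attributes(flags) & S_ATTR_PURE_INSTRUCTIONS).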

SEG_PAGEZERO
anonymous enum SEG_PAGEZERO

The names of segments and the sections in them are mostly meaningless to the link-editor. But there are a few things to support traditional UNIX executables that require the link-editor and assembler to use some names agreed upon by convention.

SG_HIGHVM
anonymous enum SG_HIGHVM

Constants for the flags field of the segment_command.

S_REGULAR
anonymous enum S_REGULAR

Constants for the type of a section.

TOOL_CLANG
anonymous enum TOOL_CLANG

Known values for the tool field of build_tool_version.

Structs

build_tool_version
struct build_tool_version

build_version_command
struct build_version_command

The build_version_command contains the min OS version on which this binary was built to run for its platform. The known platform values are the PLATFORM_* constants and the known tool values are the TOOL_* constants.

data_in_code_entry
struct data_in_code_entry

The LC_DATA_IN_CODE load command uses a linkedit_data_command to point to an array of data_in_code_entry entries. Each entry describes a range of data in a code section.

dyld_info_command
struct dyld_info_command

The dyld_info_command contains the file offsets and sizes of the new compressed form of the information dyld needs to load the image. This information is used by dyld on Mac OS X 10.6 and later. All information pointed to by this command is encoded using byte streams, so no endian swapping is needed to interpret it.

dylib
struct dylib

Dynamically linked shared libraries are identified by two things: the pathname (the name of the library as found for execution) and the compatibility version number. The pathname must match and the compatibility number in the user of the library must be greater than or equal to the library being used. The time stamp is used to record the time a library was built and copied into the user so it can be used to determine if the library used at runtime is exactly the same as the one used to build the program.

dylib_command
struct dylib_command

A dynamically linked shared library (filetype == MH_DYLIB in the mach header) contains a dylib_command (cmd == LC_ID_DYLIB) to identify the library. An object that uses a dynamically linked shared library also contains a dylib_command (cmd == LC_LOAD_DYLIB, LC_LOAD_WEAK_DYLIB, or LC_REEXPORT_DYLIB) for each library it uses.

dylib_module
struct dylib_module

A module table entry.

dylib_module_64
struct dylib_module_64

A 64-bit module table entry.

dylib_reference
struct dylib_reference

The entries in the reference symbol table are used when loading the module (both by the static and dynamic link editors) and if the module is unloaded or replaced. Therefore all external symbols (defined and undefined) are listed in the module's reference table. The flags describe the type of reference that is being made. The constants for the flags are defined in <mach-o/nlist.h> as they are also used for symbol table entries.

dylib_table_of_contents
struct dylib_table_of_contents

A table of contents entry.

dylinker_command
struct dylinker_command

A program that uses a dynamic linker contains a dylinker_command to identify the name of the dynamic linker (LC_LOAD_DYLINKER), and a dynamic linker contains a dylinker_command to identify the dynamic linker (LC_ID_DYLINKER). A file can have at most one of these. This struct is also used for the LC_DYLD_ENVIRONMENT load command, in which case it contains a string for dyld to treat like an environment variable.

dysymtab_command
struct dysymtab_command

This is the second set of the symbolic information, which is used to support the data structures for the dynamic link editor.

encryption_info_command
struct encryption_info_command

The encryption_info_command contains the file offset and size of an encrypted segment.

encryption_info_command_64
struct encryption_info_command_64

The encryption_info_command_64 contains the file offset and size of an encrypted segment (for use in x86_64 targets).

entry_point_command
struct entry_point_command

The entry_point_command is a replacement for thread_command. It is used for main executables to specify the location (file offset) of main(). If -stack_size was used at link time, the stacksize field will contain the stack size needed for the main thread.

fvmfile_command
struct fvmfile_command

The fvmfile_command contains a reference to a file to be loaded at the specified virtual address. (Presently, this command is reserved for internal use. The kernel ignores this command when loading a program into memory).

fvmlib
struct fvmlib

Fixed virtual memory shared libraries are identified by two things. The target pathname (the name of the library as found for execution), and the minor version number. The address of where the headers are loaded is in header_addr. (THIS IS OBSOLETE and no longer supported).

fvmlib_command
struct fvmlib_command

A fixed virtual shared library (filetype == MH_FVMLIB in the mach header) contains a fvmlib_command (cmd == LC_IDFVMLIB) to identify the library. An object that uses a fixed virtual shared library also contains a fvmlib_command (cmd == LC_LOADFVMLIB) for each library it uses. (THIS IS OBSOLETE and no longer supported).

ident_command
struct ident_command

The ident_command contains a free format string table following the ident_command structure. The strings are null terminated and the size of the command is padded out with zero bytes to a multiple of 4 bytes. (THIS IS OBSOLETE and no longer supported).

linkedit_data_command
struct linkedit_data_command

The linkedit_data_command contains the offsets and sizes of a blob of data in the __LINKEDIT segment.

linker_option_command
struct linker_option_command

Undocumented in source.

load_command
struct load_command

The load commands directly follow the mach_header. The total size of all of the commands is given by the sizeofcmds field in the mach_header. All load commands must have as their first two fields cmd and cmdsize. The cmd field is filled in with a constant for that command type. Each command type has a structure specifically for it. The cmdsize field is the size in bytes of the particular load command structure plus anything that follows it that is a part of the load command (i.e. section structures, strings, etc.). To advance to the next load command, the cmdsize can be added to the offset or pointer of the current load command. The cmdsize for 32-bit architectures MUST be a multiple of 4 bytes and for 64-bit architectures MUST be a multiple of 8 bytes (these are forever the maximum alignment of any load commands). The padded bytes must be zero. All tables in the object file must also follow these rules so the file can be memory mapped. Otherwise the pointers to these tables will not work well or at all on some machines. With all padding zeroed, like objects will compare byte for byte.
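The walk described above can be sketched as follows. This is a minimal example under stated assumptions: the load_command struct is a local mirror of the two leading fields, the buffer is assumed to hold the sizeofcmds bytes that follow the header in host byte order, and count_load_commands is a hypothetical function name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the two fields every load command starts with. */
struct load_command {
    uint32_t cmd;      /* type of load command */
    uint32_t cmdsize;  /* total size of the command in bytes */
};

/* Hypothetical helper: counts the load commands in `cmds`, a buffer of
 * `sizeofcmds` bytes. `align` is 4 for 32-bit files and 8 for 64-bit
 * files. Returns -1 if any cmdsize breaks the rules described above. */
static int count_load_commands(const unsigned char *cmds,
                               uint32_t sizeofcmds, uint32_t align)
{
    uint32_t offset = 0;
    int count = 0;

    assert(align == 4 || align == 8);
    while (offset < sizeofcmds) {
        struct load_command lc;
        uint32_t remaining = sizeofcmds - offset;

        if (remaining < sizeof lc)
            return -1;                       /* truncated command header */
        memcpy(&lc, cmds + offset, sizeof lc);
        if (lc.cmdsize < sizeof lc ||        /* too small to be valid */
            lc.cmdsize % align != 0 ||       /* not properly padded */
            lc.cmdsize > remaining)          /* runs past sizeofcmds */
            return -1;
        offset += lc.cmdsize;                /* advance to next command */
        count++;
    }
    return count;
}
```

A caller would typically compare the result with the ncmds field of the header and switch on lc.cmd inside the loop to handle the command types it knows.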

mach_header
struct mach_header

The 32-bit mach header appears at the very beginning of the object file for 32-bit architectures.

mach_header_64
struct mach_header_64

The 64-bit mach header appears at the very beginning of object files for 64-bit architectures.

note_command
struct note_command

LC_NOTE commands describe a region of arbitrary data included in a Mach-O file. Its initial use is to record extra data in MH_CORE files.

prebind_cksum_command
struct prebind_cksum_command

The prebind_cksum_command contains the value of the original check sum for prebound files or zero. When a prebound file is first created or modified for other than updating its prebinding information, the value of the check sum is set to zero. When the file has its prebinding re-done and the value of the check sum is zero, the original check sum is calculated and stored in the cksum field of this load command in the output file. If the cksum field is non-zero when the prebinding is re-done, it is left unchanged from the input file.

prebound_dylib_command
struct prebound_dylib_command

A program (filetype == MH_EXECUTE) that is prebound to its dynamic libraries has one of these for each library that the static linker used in prebinding. It contains a bit vector for the modules in the library. The bits indicate which modules are bound (1) and which are not (0) from the library. The bit for module 0 is the low bit of the first byte. So the bit for the Nth module is: (linked_modules[N/8] >> N%8) & 1
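The bit lookup above, written out as a small function. The name module_is_bound is hypothetical, and the sketch assumes the caller has already located the bit vector and checked that n is less than the number of modules.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if module `n` is bound and 0 if not.
 * `linked_modules` points at the bit vector; bit 0 of byte 0 is module 0. */
static int module_is_bound(const uint8_t *linked_modules, uint32_t n)
{
    return (linked_modules[n / 8] >> (n % 8)) & 1;
}
```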

routines_command
struct routines_command

The routines command contains the address of the dynamic shared library initialization routine and an index into the module table for the module that defines the routine. Before any modules are used from the library the dynamic linker fully binds the module that defines the initialization routine and then calls it. This gets called before any module initialization routines (used for C++ static constructors) in the library.

routines_command_64
struct routines_command_64

The 64-bit routines command. Same use as routines_command.

rpath_command
struct rpath_command

The rpath_command contains a path which at runtime should be added to the current run path used to find @rpath prefixed dylibs.

section
struct section

A segment is made up of zero or more sections. Non-MH_OBJECT files have all of their segments with the proper sections in each, and padded to the specified segment alignment when produced by the link editor. The first segment of an MH_EXECUTE and MH_FVMLIB format file contains the mach_header and load commands of the object file before its first section. The zero fill sections are always last in their segment (in all formats). This allows the zeroed segment padding to be mapped into memory where zero fill sections might be. The gigabyte zero fill sections, those with the section type S_GB_ZEROFILL, can only be in a segment with sections of this type. These segments are then placed after all other segments.

section_64
struct section_64

segment_command
struct segment_command

The segment load command indicates that a part of this file is to be mapped into the task's address space. The size of this segment in memory, vmsize, may be equal to or larger than the amount to map from this file, filesize. The file is mapped starting at fileoff to the beginning of the segment in memory, vmaddr. The rest of the memory of the segment, if any, is allocated zero fill on demand. The segment's maximum virtual memory protection and initial virtual memory protection are specified by the maxprot and initprot fields. If the segment has sections then the section structures directly follow the segment command and their size is reflected in cmdsize.
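Two consequences of this description can be sketched in code: the zero-fill part of a segment is vmsize minus filesize, and cmdsize covers the command plus its section structures. The structs below are local mirrors of the 32-bit layouts in mach-o/loader.h (with vm_prot_t written as int32_t); zero_fill_bytes and expected_cmdsize are hypothetical helper names.

```c
#include <stdint.h>

/* Local mirror of the 32-bit segment_command (56 bytes). */
struct segment_command {
    uint32_t cmd;
    uint32_t cmdsize;
    char     segname[16];
    uint32_t vmaddr;
    uint32_t vmsize;
    uint32_t fileoff;
    uint32_t filesize;
    int32_t  maxprot;   /* vm_prot_t in the header */
    int32_t  initprot;  /* vm_prot_t in the header */
    uint32_t nsects;
    uint32_t flags;
};

/* Local mirror of the 32-bit section (68 bytes). */
struct section {
    char     sectname[16];
    char     segname[16];
    uint32_t addr, size, offset, align;
    uint32_t reloff, nreloc, flags;
    uint32_t reserved1, reserved2;
};

/* Bytes of the segment that are zero-filled rather than mapped from the file. */
static uint32_t zero_fill_bytes(const struct segment_command *sg)
{
    return sg->vmsize >= sg->filesize ? sg->vmsize - sg->filesize : 0;
}

/* The cmdsize implied by the command itself plus its section structures. */
static uint32_t expected_cmdsize(const struct segment_command *sg)
{
    return (uint32_t)(sizeof *sg + sg->nsects * sizeof(struct section));
}
```

A validating reader might reject a segment whose filesize exceeds its vmsize or whose cmdsize differs from expected_cmdsize; this sketch leaves that policy to the caller.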

segment_command_64
struct segment_command_64

Undocumented in source.

source_version_command
struct source_version_command

The source_version_command is an optional load command containing the version of the sources used to build the binary.

sub_client_command
struct sub_client_command

Dynamically linked shared libraries that are subframeworks of an umbrella framework can allow clients other than the umbrella framework or other subframeworks in the same umbrella framework. To do this, the subframework is built with "-allowable_client client_name" and an LC_SUB_CLIENT load command is created for each -allowable_client flag. The client_name is usually a framework name. It can also be a name used for bundle clients, where the bundle is built with "-client_name client_name".

sub_framework_command
struct sub_framework_command

A dynamically linked shared library may be a subframework of an umbrella framework. If so, it will be linked with "-umbrella umbrella_name" where "umbrella_name" is the name of the umbrella framework. A subframework can only be linked against by its umbrella framework or other subframeworks that are part of the same umbrella framework. Otherwise the static link editor produces an error and states to link against the umbrella framework. The name of the umbrella framework for subframeworks is recorded in this structure.

sub_library_command
struct sub_library_command

A dynamically linked shared library may be a sub_library of another shared library. If so, it will be linked with "-sub_library library_name" where "library_name" is the name of the sub_library shared library. When statically linking with -twolevel_namespace in effect, a two-level namespace shared library will only cause its subframeworks, those frameworks listed as sub_umbrella frameworks, and libraries listed as sub_libraries to be implicitly linked in. Any other dependent dynamic libraries will not be linked in when -twolevel_namespace is in effect. The primary library recorded by the static linker when resolving a symbol in these libraries will be the umbrella framework (or dynamic library). Zero or more sub_library shared libraries may be used by an umbrella framework (or dynamic library). The name of a sub_library framework is recorded in this structure. For example, /usr/lib/libobjc_profile.A.dylib would be recorded as "libobjc".

sub_umbrella_command
struct sub_umbrella_command

A dynamically linked shared library may be a sub_umbrella of an umbrella framework. If so, it will be linked with "-sub_umbrella umbrella_name" where "umbrella_name" is the name of the sub_umbrella framework. When statically linking with -twolevel_namespace in effect, a two-level namespace umbrella framework will only cause its subframeworks and those frameworks listed as sub_umbrella frameworks to be implicitly linked in. Any other dependent dynamic libraries will not be linked in when -twolevel_namespace is in effect. The primary library recorded by the static linker when resolving a symbol in these libraries will be the umbrella framework. Zero or more sub_umbrella frameworks may be used by an umbrella framework. The name of a sub_umbrella framework is recorded in this structure.

symseg_command
struct symseg_command

The symseg_command contains the offset and size of the GNU style symbol table information as described in the header file <symseg.h>. The symbol roots of the symbol segments must also be aligned properly in the file. So the requirement of keeping the offsets aligned to a multiple of 4 bytes translates to the length field of the symbol roots also being a multiple of a long. Also the padding must again be zeroed. (THIS IS OBSOLETE and no longer supported).

symtab_command
struct symtab_command

The symtab_command contains the offsets and sizes of the link-edit 4.3BSD "stab" style symbol table information as described in the header files <nlist.h> and <stab.h>.

thread_command
struct thread_command

Thread commands contain machine-specific data structures suitable for use in the thread state primitives. The machine-specific data structures follow the struct thread_command as follows. Each flavor of machine-specific data structure is preceded by a uint32_t constant for the flavor of that data structure and a uint32_t that is the count of uint32_t's of the size of the state data structure, and then the state data structure follows. This triple may be repeated for many flavors. The constants for the flavors, counts and state data structure definitions are expected to be in the header file <machine/thread_status.h>. The sizes of these machine-specific data structures must be multiples of 4 bytes. The cmdsize reflects the total size of the thread_command and all of the sizes of the constants for the flavors, counts and state data structures.
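The flavor/count/state triples can be walked without knowing any particular flavor. The sketch below assumes the caller passes the 32-bit words that follow the cmd and cmdsize fields, that is (cmdsize - 8) / 4 words in host byte order; count_thread_states is a hypothetical function name.

```c
#include <stdint.h>

/* Hypothetical helper: counts the (flavor, count, state) triples in the
 * `nwords` 32-bit words that follow a thread_command header.
 * Returns -1 if a triple runs past the end of the command. */
static int count_thread_states(const uint32_t *words, uint32_t nwords)
{
    uint32_t i = 0;
    int states = 0;

    while (i < nwords) {
        uint32_t count;

        if (nwords - i < 2)
            return -1;          /* no room for flavor and count */
        /* words[i] is the flavor; words[i + 1] is the state size in words. */
        count = words[i + 1];
        if (count > nwords - i - 2)
            return -1;          /* state data would overrun the command */
        i += 2 + count;         /* skip flavor, count and state data */
        states++;
    }
    return states;
}
```

Interpreting the state data itself requires the flavor constants and structures from <machine/thread_status.h>, which this sketch deliberately leaves out.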

tlv_descriptor
struct tlv_descriptor

Sections of type S_THREAD_LOCAL_VARIABLES contain an array of tlv_descriptor structures.

twolevel_hint
struct twolevel_hint

The entries in the two-level namespace lookup hints table are twolevel_hint structs. These provide hints to the dynamic link editor about where to start looking for an undefined symbol in a two-level namespace image. The isub_image field is an index into the sub-images (sub-frameworks and sub-umbrellas list) that made up the two-level image that the undefined symbol was found in when it was built by the static link editor. If isub_image is 0, the symbol is expected to be defined in the library and not in the sub-images. If isub_image is non-zero, it is an index into the array of sub-images for the umbrella, with the first index in the sub-images being 1. The array of sub-images is the ordered list of sub-images of the umbrella that would be searched for a symbol that has the umbrella recorded as its primary library. The table of contents index is an index into the library's table of contents. This is used as the starting point of the binary search or a directed linear search.

twolevel_hints_command
struct twolevel_hints_command

The twolevel_hints_command contains the offset and number of hints in the two-level namespace lookup hints table.

uuid_command
struct uuid_command

The uuid load command contains a single 128-bit unique random number that identifies an object produced by the static link editor.

version_min_command
struct version_min_command

The version_min_command contains the min OS version on which this binary was built to run.

Unions

lc_str
union lc_str

A variable length string in a load command is represented by an lc_str union. The strings are stored just after the load command structure and the offset is from the start of the load command structure. The size of the string is reflected in the cmdsize field of the load command. Once again any padded bytes to bring the cmdsize field to a multiple of 4 bytes must be zero.
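Resolving such a string amounts to adding the stored offset to the start of the load command and checking that the result stays inside cmdsize. A minimal sketch, assuming the caller already has the whole command in memory; lc_string is a hypothetical helper name, not part of the header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the string stored at `offset` bytes from
 * the start of the load command `cmd`, or NULL if the offset lies outside
 * the command or the string is not NUL-terminated within `cmdsize`. */
static const char *lc_string(const unsigned char *cmd, uint32_t cmdsize,
                             uint32_t offset)
{
    const char *s;

    if (offset >= cmdsize)
        return NULL;
    s = (const char *)cmd + offset;
    if (memchr(s, '\0', cmdsize - offset) == NULL)
        return NULL;
    return s;
}
```

For a command such as rpath_command, the offset passed in would be the value read from its lc_str field.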

Meta

Version

Initial created: Feb 20, 2010-2018

Authors

Jacob Carlborg