Razor Design
The cornerstone of razor is the package set on-disk data structure / file format. It is how razor represents the packages installed on the system and it’s what razor downloads from upstream servers to find out what’s available. It’s a simple binary file format, somewhat inspired by the ELF binary format. It has a sorted list of all packages and a sorted list of all unique properties (requires/provides/conflicts/obsoletes). Each package has a list of the properties associated with it (a list of indices into the list of all properties) and each property has a list of packages that it belongs to (as a list of indices into the package list). Strings are stored in a string pool, and referred to by their byte index in the pool.
Much of this is still changing, but as of June 13th, this write-up from Dan Winship is reasonably accurate.
The repo starts with a header, containing some number of sections, terminated by a section with type 0:
struct razor_set_header {
uint32_t magic;
uint32_t version;
struct razor_set_section sections[0];
};
struct razor_set_section {
uint32_t type;
uint32_t offset;
uint32_t size;
};
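As a rough illustration of how the section table can be walked, here is a minimal sketch. The helper name find_section() is hypothetical (it is not taken from the razor sources), and the sketch uses a C99 flexible array member where the listing above uses sections[0].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct razor_set_section {
    uint32_t type;
    uint32_t offset;
    uint32_t size;
};

struct razor_set_header {
    uint32_t magic;
    uint32_t version;
    struct razor_set_section sections[];  /* flexible array member */
};

/* Hypothetical helper: scan the section table for the given type.
 * The table is terminated by a section whose type is 0, so type 0
 * itself can never be looked up. Returns NULL if nothing matches. */
static const struct razor_set_section *
find_section(const struct razor_set_header *header, uint32_t type)
{
    const struct razor_set_section *s;

    assert(header != NULL && type != 0);

    for (s = header->sections; s->type != 0; s++) {
        if (s->type == type)
            return s;
    }
    return NULL;
}
```

A caller would typically check the magic and version fields before trusting the table; that step is left out here because their expected values are not given above.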
razor_set_open() mmaps the repo file, and creates a struct razor_set:
struct razor_set {
struct array string_pool;
struct array packages;
struct array properties;
struct array files;
struct array package_pool;
struct array property_pool;
struct array file_pool;
struct razor_set_header *header;
};
It does this by finding the sections with the IDs listed below and creating “struct array”s pointing to the right places in the mmap()ed data. (This is the only processing needed when reading in the file; everything else is used exactly as-is.)
- RAZOR_STRING_POOL
- RAZOR_PACKAGES
- RAZOR_PROPERTIES
- RAZOR_FILES
- RAZOR_PACKAGE_POOL
- RAZOR_PROPERTY_POOL
- RAZOR_FILE_POOL
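The binding step might look roughly like the sketch below. Several things here are assumptions rather than facts from the razor sources: the layout of struct array (only its name appears above), the helper name array_bind(), and the reading that a section's offset is counted in bytes from the start of the mapped file. The bounds check is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct razor_set_section {
    uint32_t type;
    uint32_t offset;
    uint32_t size;
};

/* Assumed shape of "struct array": a pointer plus a size in bytes. */
struct array {
    void  *data;
    size_t size;
};

/* Hypothetical helper: point an array at one section of the mapping
 * without copying anything. Assumes section->offset is a byte offset
 * from the start of the mapped file. Returns 0 on success, or -1 if
 * the section does not fit inside the mapping. */
static int
array_bind(struct array *array, void *base, size_t map_size,
           const struct razor_set_section *section)
{
    assert(array != NULL && base != NULL && section != NULL);

    if (section->offset > map_size ||
        section->size > map_size - section->offset)
        return -1;

    array->data = (unsigned char *)base + section->offset;
    array->size = section->size;
    return 0;
}
```

Because the array only points into the mapping, nothing is parsed or copied, which matches the remark that this is the only processing needed when reading the file.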
Note that the exact layout of bits involves some historical accidents, in particular the fact that the “name” field in most structs loses its high bits to a flags field.
struct list_head {
uint list_ptr : 24;
uint flags : 8;
};
struct list {
uint data : 24;
uint flags : 8;
};
These two structs are used to store lists of package, property, or file IDs. “struct list_head” stores the head of the list, which points to one or more “struct list”s in the appropriate “pool” section. (“struct list” should probably be called “struct list_item”.)
“list_first(&head, &pool)” returns a “struct list *” pointing to the first element of the list (or NULL for an empty list), and “list_next(list)” will return successive elements, until NULL is returned. Each “list->data” contains the index of a package, property, or file in the corresponding section of the set.
Peeking underneath the abstraction, a list_head’s “flags” is 0xff if the list is empty, 0x80 if it contains a single element, or 0x00 if it contains more than one element. In the single-element case, that element is actually stored in the list_head directly rather than being stored in a pool (and so list_first() just casts the list_head* to a list* and returns it). For multi-element lists, list_ptr is the index in the pool of the first element of this list; the list continues through successive elements of the pool until one with non-zero flags is reached, indicating the end of the list.
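The three encodings can be illustrated with a small decoder. This is a sketch, not razor's list_first()/list_next(): the helper list_collect() and the LIST_HEAD_* names are hypothetical, and instead of casting the head to a list item it reads the single element out of list_ptr directly. It also assumes that the pool element carrying non-zero flags is itself the last element of the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
    unsigned int list_ptr : 24;
    unsigned int flags    : 8;
};

struct list {
    unsigned int data  : 24;
    unsigned int flags : 8;
};

/* Head flag values as described in the text (names are made up). */
enum {
    LIST_HEAD_MANY   = 0x00,  /* list_ptr indexes into the pool     */
    LIST_HEAD_SINGLE = 0x80,  /* list_ptr *is* the only element     */
    LIST_HEAD_EMPTY  = 0xff   /* no elements at all                 */
};

/* Hypothetical helper: copy at most max IDs from the list into out
 * and return how many were copied. */
static size_t
list_collect(const struct list_head *head, const struct list *pool,
             uint32_t *out, size_t max)
{
    size_t n = 0;

    if (head->flags == LIST_HEAD_EMPTY || max == 0)
        return 0;

    if (head->flags == LIST_HEAD_SINGLE) {
        out[0] = head->list_ptr;
        return 1;
    }

    /* Multi-element list: walk the pool until an item with non-zero
     * flags, which is assumed to be the final item of the list. */
    assert(pool != NULL);
    const struct list *item = &pool[head->list_ptr];
    while (n < max) {
        out[n++] = item->data;
        if (item->flags != 0)
            break;
        item++;
    }
    return n;
}
```

Storing a single element inline in the head means that one-element lists, which are likely common, take no space in the pool at all.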
struct razor_package {
uint name : 24;
uint flags : 8;
uint version : 32;
struct list_head properties;
struct list_head files;
};
name and version are indexes into string_pool. properties is a list of all of the package’s properties, and files is a list of its files. flags is currently only used during razor_set merging, to keep track of which set a package came from.
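Resolving a package's strings is then just pointer arithmetic on the string pool. The sketch below assumes the pool holds NUL-terminated strings (the text only says strings are referenced by byte index); package_label() is a hypothetical helper, and the "name-version" output format is chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct list_head {
    unsigned int list_ptr : 24;
    unsigned int flags    : 8;
};

struct razor_package {
    unsigned int name  : 24;
    unsigned int flags : 8;
    uint32_t version;
    struct list_head properties;
    struct list_head files;
};

/* Hypothetical helper: write "name-version" for a package into buf.
 * Assumes string_pool contains NUL-terminated strings and that name
 * and version are byte offsets into it. Returns what snprintf()
 * returns: the length the full label needs, excluding the NUL. */
static int
package_label(const struct razor_package *pkg, const char *string_pool,
              char *buf, size_t size)
{
    assert(pkg != NULL && string_pool != NULL);

    return snprintf(buf, size, "%s-%s",
                    string_pool + pkg->name,
                    string_pool + pkg->version);
}
```

Since name is only 24 bits wide, a string used as a name has to start within the first 16 MiB of the pool, while version can use the full 32-bit range.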
struct razor_property {
uint name : 24;
uint flags : 6;
uint type : 2;
uint relation : 32;
uint version : 32;
struct list_head packages;
};
name and version are indexes into string_pool. type is an enum razor_property_type (e.g., RAZOR_PROPERTY_REQUIRES), and relation is an enum razor_version_relation (e.g., RAZOR_VERSION_GREATER_OR_EQUAL). packages is a list of the packages that have this property. flags is currently unused.
struct razor_entry {
uint name : 24;
uint flags : 8;
uint start : 32;
struct list_head packages;
};
name is an index into string_pool, giving the basename of the file. start is either 0, or an index pointing to another razor_entry that is the first child of this entry (for a non-empty directory). (Entry 0 is always the root of the tree, so no entry could have entry 0 as a child.) flags is 0x80 (RAZOR_ENTRY_LAST) if an entry is the last entry in its directory. Otherwise it is 0.
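To make the directory encoding concrete, here is a sketch of looking up a child by name. It assumes that the entries of one directory are stored contiguously in the files array, beginning at start and ending at the entry flagged RAZOR_ENTRY_LAST, and that the string pool holds NUL-terminated strings. entry_find_child() is a hypothetical helper, not a razor function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RAZOR_ENTRY_LAST 0x80

struct list_head {
    unsigned int list_ptr : 24;
    unsigned int flags    : 8;
};

struct razor_entry {
    unsigned int name  : 24;
    unsigned int flags : 8;
    uint32_t start;
    struct list_head packages;
};

/* Hypothetical helper: find the child of dir whose basename is name.
 * entries is the whole files array and pool is the string pool.
 * Returns NULL if dir has no children or none of them matches. */
static const struct razor_entry *
entry_find_child(const struct razor_entry *entries, const char *pool,
                 const struct razor_entry *dir, const char *name)
{
    const struct razor_entry *e;

    assert(entries != NULL && pool != NULL && dir != NULL && name != NULL);

    if (dir->start == 0)
        return NULL;            /* a plain file or an empty directory */

    for (e = &entries[dir->start]; ; e++) {
        if (strcmp(pool + e->name, name) == 0)
            return e;
        if (e->flags & RAZOR_ENTRY_LAST)
            return NULL;        /* reached the end of this directory */
    }
}
```

Resolving a full path such as /usr/bin/ls would repeat this lookup once per path component, starting from entry 0.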
Note that given a pointer to a struct razor_entry (e.g., from a package’s “files” list), there is no way to reconstruct its full name without walking the entire files array up to that point. Because of this and other problems (fix_file_map()), it seems like razor_entry should be modified to include a pointer to its parent. (Storing full paths instead of just basenames would also fix this problem, but that would use much more memory.)
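The cost of the missing parent pointer shows up in a sketch like the following, which recovers the full path of an entry by searching the tree from the root. Both functions are hypothetical and rest on the same assumptions as the format description suggests: siblings are contiguous and end at RAZOR_ENTRY_LAST, entry 0 is the root, and the string pool holds NUL-terminated strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define RAZOR_ENTRY_LAST 0x80

struct list_head {
    unsigned int list_ptr : 24;
    unsigned int flags    : 8;
};

struct razor_entry {
    unsigned int name  : 24;
    unsigned int flags : 8;
    uint32_t start;
    struct list_head packages;
};

/* Depth-first search below directory dir for entry index target.
 * buf[0..used) holds the path of dir; on success buf holds the full
 * path of target. Requires used < size. */
static int
find_path(const struct razor_entry *entries, const char *pool,
          uint32_t dir, uint32_t target,
          char *buf, size_t size, size_t used)
{
    uint32_t i = entries[dir].start;

    if (i == 0)
        return 0;               /* a plain file or an empty directory */

    for (;; i++) {
        int n = snprintf(buf + used, size - used, "/%s",
                         pool + entries[i].name);

        /* Only descend if this component fit into the buffer. */
        if (n >= 0 && (size_t)n < size - used) {
            if (i == target ||
                find_path(entries, pool, i, target,
                          buf, size, used + (size_t)n))
                return 1;
        }
        if (entries[i].flags & RAZOR_ENTRY_LAST)
            break;
    }
    buf[used] = '\0';           /* undo the components tried here */
    return 0;
}

/* Hypothetical entry point: write the full path of entry index
 * target into buf. Returns 1 on success, 0 if it was not found or
 * did not fit. */
static int
entry_path(const struct razor_entry *entries, const char *pool,
           uint32_t target, char *buf, size_t size)
{
    assert(entries != NULL && pool != NULL && buf != NULL && size >= 2);

    if (target == 0) {
        buf[0] = '/';
        buf[1] = '\0';
        return 1;
    }
    buf[0] = '\0';
    return find_path(entries, pool, 0, target, buf, size, 0);
}
```

In the worst case this visits every entry in the files array, whereas a parent index in razor_entry would let the path be rebuilt by following one link per path component.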