A hash table is very universal data structure. It does most of it’s operations in O(1) average time. The library contains a header to generate hash tables suiting your needs.
They are generic data structures.
Mandatory macros
-
HASH_NODE — a data type where a node dwells. It is usually a structure.
-
HASH_PREFIX(x) — the name generating macro.
-
Key type and name. Must be one of following.
-
HASH_KEY_ATOMIC — the key (node\->HASH_KEY_ATOMIC) is an atomic type which can be compared using ==.
-
HASH_KEY_STRING — the key is a zero-terminated string, allocated separately from the rest of the node.
-
HASH_KEY_ENDSTRING — a zero-terminated string which lives at the end of node (it is allocated together with the node). It should be declared as char key[1].
-
HASH_KEY_MEMORY — the node\->HASH_KEY_MEMORY is to be compared using memcmp() function. In this case, you need to provide HASH_KEY_SIZE macro as well, to specify the length of the key.
-
HASH_KEY_COMPLEX(x) — the key is compound of more than one component. The macro should expand to x key1, x key2, ..., x kn. Furthermore, you need to provide a HASH_KEY_DECL macro. It is used to define function parameters. Therefore it should expand to type1 key1, type2 key2, ..., typen keyn. And HASH_GIVE_HASHFN and HASH_GIVE_EQ are mandatory for this key type.
-
Optional function switches
You can define any of these macros and provide corresponding functions to customize the behaviour. The macros are:
-
HASH_GIVE_HASHFN — the table will use uint HASH_PREFIX(hash)(key) to calculate hash of key. There is a sensible default for integers and strings. In the case of HASH_KEY_COMPLEX, it is mandatory to provide this macro and function.
-
HASH_GIVE_EQ — tells the table to use int HASH_PREFIX(eq)(key1, key2) function to decide if key1 and key2 are equal. Default for atomic types is == and strcmp() or strcasecmp() for strings (depends on HASH_NOCASE switch). It is mandatory when you use HASH_KEY_COMPLEX.
-
HASH_GIVE_EXTRA_SIZE — function int HASH_PREFIX(extra_size)(key) returns how many bytes after the node should be allocated. It defaults to 0 or to length of key in case of HASH_KEY_ENDSTRING.
-
HASH_GIVE_INIT_KEY — function void HASH_PREFIX(init_key)(node *, key) is used to initialize key in newly created node. The default is assignment for atomic keys and static strings (HASH_KEY_ATOMIC, HASH_KEY_STRING) and strcpy() for HASH_KEY_ENDSTRING.
-
HASH_GIVE_INIT_DATA — function void HASH_PREFIX(init_data)(node *) is used to initialize the rest of node. Useful if you use HASH_PREFIX(lookup())
-
HASH_GIVE_ALLOC — you need to provide void \*HASH_PREFIX(alloc)(uint size and void HASH_PREFIX(free)(void \*) to allocate and deallocate the nodes. Default uses xmalloc() and xfree(), [mempool routines] or [eltpool routines], depending on HASH_USE_POOL, HASH_AUTO_POOL, HASH_USE_ELTPOOL and HASH_AUTO_ELTPOOL switches.
-
[HASH_GIVE_TABLE_ALLOC] — you need to provide void \*HASH_PREFIX(table_alloc)(uint size and void HASH_PREFIX(table_free)(void \*) to allocate and deallocate the table itself. Default uses xmalloc() and xfree() or the functions from HASH_GIVE_ALLOC depending on [HASH_TABLE_ALLOC] switch.
Optional parameters
You can customize the hash table a little more by these macros:
-
HASH_NOCASE — use case-insensitive comparison for strings.
-
HASH_DEFAULT_SIZE — use approximately this many elements when creating the hash table.
-
HASH_CONSERVE_SPACE — use as little space as possible.
-
HASH_FN_BITS — the hash function only provides this many significants bits.
-
HASH_ATOMIC_TYPE — the type of atomic key (HASH_KEY_ATOMIC) is not int, but this type.
-
HASH_USE_POOL — tells to use mempool allocation to allocate the nodes. You should define it to the name of mempool variable to be used for this purpose.
-
HASH_AUTO_POOL — like above, but it creates it’s own mempool. Define it to the block size of the pool.
-
HASH_USE_ELTPOOL — tells to use eltpool allocation to allocate the nodes. You should define it to the name of eltpool variable to be used for this purpose.
-
HASH_AUTO_ELTPOOL — like above, but it creates it’s own mempool. Define it to the number of preallocated nodes in each chunk of memory.
-
HASH_ZERO_FILL — initialize new nodes to all zeroes.
-
HASH_TABLE_ALLOC — allocate the table the same way as nodes. If not provided, xmalloc() is used.
-
HASH_TABLE_GROWING — never decrease the size of allocated table of nodes.
-
HASH_TABLE_DYNAMIC — By default, only one global hash table is used. With this macro defined, all functions gain new first parameter of type HASH_PREFIX(table) * to allow them work with multiple hash tables.
-
HASH_TABLE_VARS — extra variables to be defined at head of HASH_PREFIX(table) * structure. It can be useful in combination with [HASH_TABLE_DYNAMIC] to access per-table custom variables from macros or function switches before you include the generator.
-
HASH_LOOKUP_DETECT_NEW — the prototype for lookup is changed to node *lookup(key, int *new_item), new_item must not be NULL and returns 1 whether lookup just created a new item in the hashtable or 0 otherwise.
Functionality switches
Each of these macros enables some of the functionality the table has.
-
HASH_WANT_CLEANUP — HASH_PREFIX((cleanup()
-
HASH_WANT_FIND — HASH_PREFIX((find()
-
HASH_WANT_FIND_NEXT — HASH_PREFIX((find_next()
-
HASH_WANT_NEW — HASH_PREFIX((new()
-
HASH_WANT_LOOKUP — HASH_PREFIX((lookup()
-
HASH_WANT_DELETE — HASH_PREFIX((delete()
-
HASH_WANT_REMOVE — HASH_PREFIX((remove()
Generated functions
These are the function that the header file generates for you. The strange first parameter of each function is a place where the HASH_PREFIX(table) * resides when you define HASH_TABLE_DYNAMIC. If you do not, the parameter is empty.
static void HASH_PREFIX(init)(TA);
Initializes the hash table. This one is available no matter what HASH_WANT_ macros you defined or not.
static void HASH_PREFIX(cleanup)(TA);
Deallocates the hash table, including the nodes. It is available if you defined HASH_WANT_CLEANUP.
static HASH_NODE* HASH_PREFIX(find)(TAC HASH_KEY_DECL);
Finds a node with given key (specified in the HAS_KEY_DECL parameter). If it does not exist, NULL is returned.
Enabled by the HASH_WANT_FIND macro.
static HASH_NODE* HASH_PREFIX(find_next)(TAC P(node) *start);
Finds next node with the same key. Returns NULL if it does not exist.
Enabled by the HASH_WANT_FIND_NEXT macro.
static HASH_NODE * HASH_PREFIX(new)(TAC HASH_KEY_DECL);
Generates a new node with a given key.
Enabled by the HASH_WANT_NEW macro.
static HASH_NODE* HASH_PREFIX(lookup)(TAC HASH_KEY_DECL, int *new_item);
Finds a node with a given key. If it does not exist, a new one is created. It is strongly recommended to use HASH_GIVE_INIT_DATA.
This one is enabled by the HASH_WANT_LOOKUP macro. The new_item argument is available only if HASH_LOOKUP_DETECT_NEW was given.
static int HASH_PREFIX(delete)(TAC HASH_KEY_DECL);
Removes a node with the given key from hash table and deallocates it.
Success is returned.
This one is enabled by HASH_WANT_DELETE macro.
static void HASH_PREFIX(remove)(TAC HASH_NODE *n);
Removes a given node and deallocates it. It differs from HASH_PREFIX(delete)() in its type of parameter — this one deletes a specific node, that one looks for it by a key.
Enabled by HASH_WANT_REMOVE macro.
Iterator
You can use the HASH_FOR_ALL iterator macro to run trough all the nodes. Lets say your HASH_PREFIX(x) macro is defined as prefix_##x. Then you would do something like:
HASH_FOR_ALL(prefix, node_variable) { do_something_with_node(node_variable); } HASH_END_FOR;
If you use HASH_TABLE_DYNAMIC, use HASH_FOR_ALL_DYNAMIC(prefix, table, node_variable) instead.
You may not modify the table inside the block. Use HASH_BREAK and HASH_CONTINUE instead of break and continue statements.