The Berkeley DB Btree implementation maximizes the number of keys that can be stored on an internal page by storing only as many bytes of each key as are necessary to distinguish it from adjacent keys. The prefix comparison routine determines this minimum number of bytes (that is, the length of the unique prefix) that must be stored. A prefix comparison function for the Btree can be specified by calling DB->set_bt_prefix.
The prefix comparison routine must be compatible with the overall comparison function of the Btree, since what distinguishes any two keys depends entirely on the function used to compare them. This means that if a prefix comparison routine is specified by the application, a compatible overall comparison routine must also have been specified.
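To make "compatible" concrete: the example prefix routine later in this section counts bytes up to the first difference and treats a shorter key as sorting before any longer key that extends it. A comparison routine that orders keys the same way might be sketched as follows. The key_view struct and the name compare_bytes are illustrative assumptions, not part of Berkeley DB; a real routine would take DBT arguments and be registered with DB->set_bt_compare.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two DBT fields a comparison routine may read. */
struct key_view {
    const void *data;
    size_t size;
};

/*
 * Bytewise lexicographic comparison: returns <0, 0, or >0.
 * When one key is a prefix of the other, the shorter key sorts first.
 */
int
compare_bytes(const struct key_view *a, const struct key_view *b)
{
    size_t len = a->size < b->size ? a->size : b->size;
    /* Skip memcmp for empty keys so a NULL data pointer is never passed. */
    int cmp = len == 0 ? 0 : memcmp(a->data, b->data, len);

    if (cmp != 0)
        return (cmp);
    if (a->size == b->size)
        return (0);
    return (a->size < b->size ? -1 : 1);
}
```

Under this ordering, the first byte at which two keys differ (or the first byte past the end of the shorter key) is enough to decide which key is greater, which is the property a byte-counting prefix routine relies on.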
Prefix comparison routines are passed pointers to keys as arguments. The keys are represented as DBT structures. The prefix comparison function must return the number of bytes of the second key argument that are necessary to determine whether it is greater than the first key argument. If the keys are equal, the length of the second key should be returned. The only fields that the routines may examine in the DBT structures are the data and size fields.
An example prefix comparison routine follows:
    u_int32_t
    compare_prefix(DB *dbp, const DBT *a, const DBT *b)
    {
        size_t cnt, len;
        u_int8_t *p1, *p2;

        cnt = 1;
        len = a->size > b->size ? b->size : a->size;
        for (p1 = a->data, p2 = b->data; len--; ++p1, ++p2, ++cnt)
            if (*p1 != *p2)
                return (cnt);
        /*
         * They match up to the smaller of the two sizes.
         * Collate the longer after the shorter.
         */
        if (a->size < b->size)
            return (a->size + 1);
        if (b->size < a->size)
            return (b->size + 1);
        return (b->size);
    }
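To see what the routine returns for concrete keys, the same logic can be exercised outside Berkeley DB. The sketch below restates it against a hypothetical key_view struct standing in for DBT (only the data and size fields are modeled); prefix_len and key_of are illustrative names, not library functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DBT: only the data and size fields are modeled. */
struct key_view {
    const void *data;
    size_t size;
};

/* Same logic as compare_prefix above, restated for the stand-in type. */
size_t
prefix_len(const struct key_view *a, const struct key_view *b)
{
    const unsigned char *p1 = a->data, *p2 = b->data;
    size_t len = a->size < b->size ? a->size : b->size;
    size_t cnt;

    /* Count bytes up to and including the first one that differs. */
    for (cnt = 1; cnt <= len; ++cnt, ++p1, ++p2)
        if (*p1 != *p2)
            return (cnt);

    /* One key is a prefix of the other, or the keys are equal. */
    if (a->size < b->size)
        return (a->size + 1);
    if (b->size < a->size)
        return (b->size + 1);
    return (b->size);
}

/* Convenience helper: wrap a C string, excluding its terminating NUL. */
struct key_view
key_of(const char *s)
{
    struct key_view k = { s, strlen(s) };
    return (k);
}
```

With these definitions, prefix_len for the keys "apple" and "apricot" is 3, because the bytes "apr" are enough to place the second key after the first. For "abc" and "abcd" it is 4, and for two copies of "abc" it is 3, the full length of the second key.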
The usefulness of this functionality is data-dependent, but for some data sets it can produce significantly smaller trees and faster searches.