Pincaster comes in handy for geo data or as a persistent memcache that speaks HTTP. But it also has other uses.
An often overlooked feature is that keys are always kept in order, so prefix matching is fast. This makes it a perfect candidate for autocompletion.
The Skyrock Spot iPhone app supports phonetic instant search on account and gang names. These names are weird beasts, usually a mix of decorative characters, redundant characters, and digits or isolated characters that have to be pronounced as-is.
Names are normalized with Francotone, a library dedicated to normalizing text-message-style French.
Then, names are just stored in Pincaster as keys composed of the normalized (phonetic) name, followed by a separator and by the original name.
For example, user “Franck142” would be stored as a key whose name is “frankunkatrde|Franck142”.
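As a rough illustration of that key layout, here is a minimal Python sketch. The helper names `make_key` and `split_key` are hypothetical, and the phonetic form is passed in as a plain string rather than computed with Francotone. It also assumes the normalized form never contains the separator, which the scheme needs in order to split keys back apart.

```python
SEPARATOR = "|"  # separator used in the example key above


def make_key(phonetic_name, original_name):
    """Build a storage key: phonetic form, separator, then the original name."""
    # Assumption: the normalizer never emits the separator character.
    if SEPARATOR in phonetic_name:
        raise ValueError("phonetic form must not contain the separator")
    return phonetic_name + SEPARATOR + original_name


def split_key(key):
    """Recover (phonetic form, original name) from a storage key."""
    # Split on the first separator only, so the original name may contain "|".
    phonetic_name, original_name = key.split(SEPARATOR, 1)
    return phonetic_name, original_name


if __name__ == "__main__":
    key = make_key("frankunkatrde", "Franck142")
    print(key)
    print(split_key(key))
```

Because the phonetic form comes first, keys that sound alike end up next to each other in the ordered key space, which is what makes prefix lookups useful here.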
In addition to hashes, Pincaster supports a “void” type. A key mapped to a void type doesn’t store any data besides the key itself and requires very little memory. It might be a good pick if all you need is autocompletion. But adding a couple of key/value pairs is also totally worth it if you need to go beyond autocompletion and display basic search results.
Minus PHP’s terrible signal/noise ratio, indexing a new user is as simple as:
_name_to_ft() just uses Francotone to get a phonetic transcription of the user name.
Gangs are indexed in a similar way, but include more key/value pairs with scores, ranks, members and basically everything we need to display as a search result.
Finding candidates for a query only requires two prefix searches, both on the normalized form of the query: one with the separator appended (frankunkatrde| + wildcard), and one without it (frankunkatrde + wildcard). The first covers any user name that sounds like “frankunkatrde”; the second covers any user name that begins with something that sounds like “frankunkatrde”.
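The two lookups can be sketched in Python against an in-memory sorted list. This stands in for the ordered key space and is not Pincaster's actual HTTP API; `prefix_search` and `find_candidates` are hypothetical names, and the sample keys are made up for illustration.

```python
from bisect import bisect_left

SEPARATOR = "|"


def prefix_search(sorted_keys, prefix):
    """Return all keys starting with `prefix`, given keys in sorted order."""
    # Keys sharing a prefix are contiguous in a sorted list, so we jump to the
    # first possible match and stop at the first key that no longer matches.
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


def find_candidates(sorted_keys, phonetic_query):
    """Run both prefix searches and return original names, exact sounds first."""
    sounds_like = prefix_search(sorted_keys, phonetic_query + SEPARATOR)
    begins_like = prefix_search(sorted_keys, phonetic_query)
    names, seen = [], set()
    for key in sounds_like + begins_like:
        if key not in seen:  # the second search also returns the first's hits
            seen.add(key)
            names.append(key.split(SEPARATOR, 1)[1])
    return names


if __name__ == "__main__":
    keys = sorted([
        "frankunkatrde|Franck142",
        "frankunkatrde|Frank142",
        "frankunkatrdeu|Franck1422",  # made-up phonetic form
        "marti|Marty",                # made-up phonetic form
    ])
    print(find_candidates(keys, "frankunkatrde"))
```

Each lookup costs one binary search plus the number of matches, which is why an ordered key space makes this kind of autocompletion cheap.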
Here’s the current code, shared by anything indexed this way:
Candidate user names are then sorted by their Levenshtein distance to the query.
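A minimal Python sketch of that final ranking step follows. It assumes the distance is measured between the query as typed and each original name, and that the comparison ignores case. Neither detail is stated above, and `rank_candidates` is a hypothetical name.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def rank_candidates(query, names):
    """Sort candidate names by edit distance to the query, closest first."""
    # Assumption: case-insensitive comparison. sorted() is stable, so names
    # at the same distance keep their incoming order.
    return sorted(names, key=lambda name: levenshtein(query.lower(), name.lower()))


if __name__ == "__main__":
    print(rank_candidates("franck142", ["Franck1422", "Frank142", "Franck142"]))
```

Since the prefix searches already narrow the candidates down to a short list, running this quadratic-time distance on each of them stays inexpensive.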