📍 mnemonic-place-encoder

29 May 2025 • 12 min read

like w3w for UPRN, but open and with 4 words

geoplace conference was great. the keynote included this pearl of wisdom:

It is hard to compare addresses
It is easy to compare UPRNs

A UPRN (unique property reference number) is a permanent 12-digit standard location identifier for addressable places in Britain. This machine-readable identifier can be linked ad hoc (with help from human-readable lookups) or post-hoc (matching algorithms). UPRN is a human-computer data bridge for places.

For example, consider tax and voting datasets. Using a left join to identify households that are contributing financially to democracy but not electorally, it becomes simple to post those households an invitation to vote! Now imagine trying to join those datasets on addresses recorded in different systems in different ways?!

🏛️ 📍 🗳️
SELECT * FROM Taxpayers
LEFT JOIN Voters
ON Taxpayers.UPRN = Voters.UPRN
WHERE Voters.UPRN IS NULL
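
To try the query above without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The table contents, the `name` column and the two toy UPRNs are made up for illustration; only the join on UPRN comes from the query itself.

```python
import sqlite3

# In-memory database with two hypothetical tables keyed on UPRN.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Taxpayers (UPRN TEXT, name TEXT);
    CREATE TABLE Voters (UPRN TEXT, name TEXT);
    INSERT INTO Taxpayers VALUES ('000000000001', 'household a');
    INSERT INTO Taxpayers VALUES ('000000000002', 'household b');
    INSERT INTO Voters VALUES ('000000000001', 'household a');
""")

# Left join keeps every taxpayer; rows with no matching voter have NULL voter columns.
rows = conn.execute("""
    SELECT Taxpayers.UPRN, Taxpayers.name
    FROM Taxpayers
    LEFT JOIN Voters ON Taxpayers.UPRN = Voters.UPRN
    WHERE Voters.UPRN IS NULL
""").fetchall()

# UPRNs that pay tax but do not appear on the (toy) electoral roll.
unregistered = [uprn for uprn, _name in rows]
print(unregistered)
```

Because both tables share the same identifier, the join needs no address cleaning or fuzzy matching at all.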

Inspiration strikes

Like all good conferences, geoplace brings ideas together, and sometimes they collide and forge new ones: some good, some bad, and some (as in this case) playful.

The overriding theme of the event is driving adoption of location standards across public services with the objective of enabling integration. Public service records very often contain place data, since:

Everything that happens, happens somewhere

Standardising that "somewhere" helps services help each other, since locations are frequently the best bridge between them. Definitive shared place identifiers become the crown jewels of public service integration.

The conference had a workshop session where delegates had quickfire topical discussions, each with its own brief. My favourite: "Best kept secret" considered how to popularise standards, and went down an interesting accessibility route. This discussion brought a couple of fun ideas to mind:

Branding

Since the British Standards kitemark looks a bit like a loveheart, use it to invoke a call to action:

Love locations?
Love location standards!
<💗>
…well… trademarks being what they are, this campaign is a complete non-starter; so let's proceed to the next idea:

Mnemonic Encoding/Decoding

Someone mentioned using what3words to rendezvous in a crowded place and I thought of w3w for UPRN. Yes, w3w has been found to produce similar looking/sounding encodings for different places. Furthermore, applying word encoding to UPRN numbers is pointless since, like phone numbers, they are easily looked up in the definitive list; just as we don't need to remember phone numbers.

Still… it was fun playing with dictionary encoding. After rummaging through some linguistics corpora and finding lots of tricky words, I asked Mistral to generate a neutral and familiar 1000-word list. It said:

Creating a list of 1000 words manually would be quite extensive […] Let's generate this list using some Python code. We'll start with a basic list and expand it to meet your requirements.
It then ran some Python to pad a 732-word list to 1000 with prefixes and suffixes.

prefixes = ["Super", "Hyper", "Mega", 
            "Ultra", "Mini", "Micro", 
            "Macro", "Multi"]
suffixes = ["-like", "-ish", "-esque", "-ful", "-less", "-ness", "-able", "-er"]
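
How Mistral's script actually combined these lists with the base words isn't shown, so the following is only a guess at the padding step. It uses prefixes alone (matching words like SuperCarpet in the examples further down), and `pad_wordlist` and the three-word base list are hypothetical stand-ins.

```python
from itertools import product

prefixes = ["Super", "Hyper", "Mega", "Ultra", "Mini", "Micro", "Macro", "Multi"]


def pad_wordlist(base_words, target, prefixes):
    """Extend base_words to `target` unique entries by prepending prefixes."""
    words = list(dict.fromkeys(base_words))  # de-duplicate, keep original order
    seen = set(words)
    # Walk prefix/word combinations until the list is long enough.
    for prefix, word in product(prefixes, list(words)):
        if len(words) >= target:
            break
        candidate = prefix + word
        if candidate not in seen:
            words.append(candidate)
            seen.add(candidate)
    if len(words) < target:
        raise ValueError("not enough prefix/word combinations to reach target")
    return words


base = ["Apple", "Banana", "Carpet"]  # stand-in for the 732 base words
padded = pad_wordlist(base, 8, prefixes)
print(padded)
```

Every padded entry shares a stem with a base word, which is exactly why this kind of padding invites lookalike pairs such as Carpet and SuperCarpet.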

This is an optimising trick, and we know from w3w that plurals and similar looking/sounding words introduce ambiguities between locations. The honing of this dictionary will involve finding and replacing ambiguities, and testing for ambiguities is going to be a sidequest involving encoding all the UPRNs and comparing them for similarity using some computational linguistics.

So far there's been an attempt to compute Levenshtein distance between all the possible combinations using the rapidfuzz package, but quadrillions of comparisons is a bit computationally expensive (41,196,104² = 1,697,118,984,778,816). I wonder how the person who found ambiguity in w3w did it.
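
One cheaper first pass, offered as an assumption rather than what was actually tried, is to compare the dictionary words with each other instead of every encoded UPRN: 1000 words give only 1000 × 999 / 2 = 499,500 unordered pairs. The sketch below uses a plain-Python Levenshtein function so it runs without installing anything; `close_pairs` and the sample words are hypothetical.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def close_pairs(words, max_distance=1):
    """Return word pairs whose case-insensitive edit distance is small."""
    return [(a, b) for a, b in combinations(words, 2)
            if levenshtein(a.lower(), b.lower()) <= max_distance]


sample = ["Carpet", "Carpets", "Pillow", "Willow", "Submarine"]
pairs = close_pairs(sample)
print(pairs)

n = 1000
pair_count = n * (n - 1) // 2  # unordered pairs in a 1000-word dictionary
print(pair_count)
```

This only catches lookalike spellings inside the dictionary; soundalikes would need a phonetic comparison on top.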

Meanwhile… A 1000-word dictionary is enough to encode every possible combination of three digits (10 to the power of 3), so a 12-digit UPRN, such as the one for 10 Downing St, can be dictionary encoded into four words:

        ---
        title: "Encoding the UPRN for 10 Downing St against a 1000 word dictionary"
        config:
            packet:
                showBits: false
                bitsPerRow: 40
                rowHeight: 75
        ---
        packet-beta
        0-39: "10 Downing Street, London, SW1A 2AA"
        40-79: "100023336956"
        80-89: "100"
        90-99: "023"
        100-109: "336"
        110-119: "956"
        120-129: "Cucumber"
        130-139: "Lasagne"
        140-149: "Submarine"
        150-159: "SuperCarpet"
        160-199: "1000 word dictionary"
    

if i wanted three words instead of four, each word would have to cover four digits, so the dictionary would have to be ten-to-the-power-of-four words long, i.e. 10,000 - and even getting to 1,000 introduced potential for ambiguity.
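
The chunking itself can be sketched in a few lines of Python. This is not the tool's code: `encode`, `decode` and the placeholder dictionary are hypothetical, and only the six word positions that can be read off the examples in this post (Apple = 000, Banana = 001, Lasagne = 023, Cucumber = 100, Submarine = 336, SuperCarpet = 956) are filled in.

```python
def encode(uprn, words, digits_per_word=3, total_digits=12):
    """Zero-pad the UPRN and map each group of digits to a dictionary word."""
    if len(words) != 10 ** digits_per_word:
        raise ValueError("dictionary must hold 10 ** digits_per_word words")
    number = int(uprn)
    if number < 0:
        raise ValueError("UPRN must not be negative")
    padded = str(number).zfill(total_digits)
    if len(padded) > total_digits:
        raise ValueError("UPRN has too many digits")
    chunks = [padded[i:i + digits_per_word]
              for i in range(0, total_digits, digits_per_word)]
    return [words[int(chunk)] for chunk in chunks]


def decode(quad, words, digits_per_word=3):
    """Map each word back to its zero-padded index and join the digits."""
    index = {word: i for i, word in enumerate(words)}
    return "".join(str(index[word]).zfill(digits_per_word) for word in quad)


# Placeholder dictionary; only the positions known from the post are real words.
words = [f"word{i:03d}" for i in range(1000)]
for position, word in {0: "Apple", 1: "Banana", 23: "Lasagne",
                       100: "Cucumber", 336: "Submarine",
                       956: "SuperCarpet"}.items():
    words[position] = word

downing_st = encode("100023336956", words)
print(" ".join(downing_st))
print(decode(downing_st, words))
```

Passing `digits_per_word=4` with a 10,000-word dictionary gives the three-word variant.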

Mistral vibe-coded the below tool to encode/decode UPRNs to/from word quads, in addition to providing a 1000-strong dictionary. I preferred this list to a few I had found on the web, such as most common British words, as these sometimes included negative sounding words like fear and loathing, although real places have difficult names too, like "Cape Wrath Lighthouse". Mistral seemed to stay neutral in its word selection. The tool is shared below, as is my rambling conversation with Le Chat. Here are a few UPRNs to consider:

1 or 000000000001
Bristol City Council, City Hall, College Green, City Centre, Bristol, BS1 5TR
Apple Apple Apple Banana
10022990231
The Angel Of The North, Low Eighton, Lamesley, Gateshead, NE9 7UA
Pineapple Spaghetti MegaUmbrella Pillow
10010457355
Stonehenge Stone Circle, Winterbourne Stoke, SP4 7DD
Pineapple Pineapple Viola Hovercraft
130102430
Cape Wrath Lighthouse, Durness, IV27 4QQ
Apple Mushroom Potato Skeleton
10034781602
First And Last Turning Shop, Lizard Point, The Lizard, TR12 7NU
Pineapple Turkey HyperLemonade Admiral
100023336956
Prime Minister & First Lord Of The Treasury, 10 Downing Street, London, SW1A 2AA
Cucumber Lasagne Submarine SuperCarpet

Mnemonic place tool

Four Word Locations from UIDs

Encode location identifier to dictionary

Decode dictionary to location identifier


TODO

Better words
Instead of padding out a shortlist with prefixes and suffixes, use linguistics to create a suitable dictionary.
Test for ambiguities
Encode all UPRNs (circa 50,000,000) and compare the word quads for similarity.


🕊️ fosdem 2025 and 🌐 gen_site

16 Mar 2025 • 7 min read

🕊️ since 2019 i’ve been going to fosdem, it’s become the part of the year where foss utopia holds a big celebration.

part one 🐃 the importance of yak shaving

😴 it's taken a while to get round to writing this because at the point i attempted to revive 11ty from its long slumber, it did that thing where a frontend technology which has been gathering dust for a while tries to rebuild itself and melts into a flood of errors.

🦘 my patience for jumping through frontend engineering hoops has worn thin, but i initially thought: "oh ok, before i begin to commit some words, let's rebuild this house of cards that turns markdown into markup" … only to find that the beautiful yet fragile 11ty template I'd been using had been deprecated, understandably, by its probably quite bored maintainer.

🐇 after casting around a bit, and briefly trying zola, which turned out to be yet another engineering rabbit hole (albeit a cool rust one), i got chatting on mastodon about a recent post by froos: "you're doing computing wrong" which contains a compelling description of marking up websites by hand as a minimally engineered pathway to the web.

🧵 during some of these discussions on mastodon, it seemed there might be a way of combining ideas from froos (writing markup) and from static site generators (setting structure, automating repetitive tasks) to create a tiny, friendly, dependency-minimal site generator?

🛠️ as luck would have it, this conversation gained the interest of kartik agaram from merveilles.town who created an amazing "freewheeling app" by distilling requirements down to three neat scripts:

gen_pages
📃 generate pages from frontmatter and markup
gen_index
🗓️ generate an index of those pages
gen_feeds
📩 generate an rss feed
📩 generate an html feed compatible with journal.miso.town

💾 gen_site - the finished software, has the strapline "extremely simple static site generator" and you are now reading a gen_site blog!

Read more…

🙅🏻‍♀️ computing is too important to be left to men

12 Feb 2024 • 10 min read

🕊️ since 2019 i’ve been going to fosdem, it’s become part of the year, the part that gives hope for the future. hackers love a tribal gathering, and at around 10K strong, fosdem is the mother of all tribal gatherings. in terms of scale, chaos congress might give it a run for its money, but the fosdem organisers consistently claim the title of world’s biggest foss conference.

🚀 the event has grown beyond the university space over the last quarter-century, and so on arriving in brussels, the first thing myself and my fellow travellers did was visit one of the increasing number of fringe events. our first “ofdem” experience was a matrix gathering held at the brussels hackerspace in a disused medicine factory in anderlecht. there we enjoyed our first taste of club mate (rocketfuel for berliners). as newbies we were recommended to try the granat flavour, and reminded we were enjoying the hospitality of the matrix foundation. We then joined the openDesk huddle where it was announced that matrix was part of the project, with joint support from the french and german governments in building a sovereign workplace for public servants. i was instantly besotted, but better still, they mentioned there was going to be a live demo, and a live demo is the sort of thing that can elevate a plain old presentation to the level of performance art.

Read more…

👋🏻 hello twenty four

1 Jan 2024 • 4 min read

Hello twenty-four. I last blogged in 2011 during the heady days of london's hyperlocal scene which was a fun self-organised online network of local blogs that eventually gave way to social media.

as a public servant running a public-service website for the neighbourhoods i was discussing online, i was advised to separate business and friendship, lower the potential for any conflicts of interest, and keep my own counsel online. Having already witnessed a few people getting into difficulties online, beginning with friends reunited, this seemed like sage advice. From then on, maintaining an interest in local matters meant frequenting analog networks such as the pub, the school, the community centre.

Read more…