Building better short urls

To mix things up a bit I, Wesley Beary (aka geemus), a Software Engineer, will be taking the blog for a spin of a more technical nature.

At Plinky we needed short url support for our Twitter integration. I wasn’t quite satisfied with the ruby implementations I looked through, though.

Their basic approach was:
1. Create a random token for the url of some arbitrary length.
2. Do a db lookup to ensure this token is not already in use.
3. Record the record, subsequent lookups depend on this token being indexed.

I didn’t like the arbitrary token length and arbitrary mapping to records. You end up with longer tokens and eventually exhaust those available and get tied up in collisions. I wanted shorter links that grew as needed, even if it disallowed arbitrary tokens.

I decided to create links using the next shortest available token and use a natural mapping between links and tokens. Drawing inspiration from base conversion I came up with this:

module Base
	module_function
	BASE_CHARACTERS = '0123456789BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz'

	def encode(int)
		str = ''
		while int > 0
			character = int % BASE_CHARACTERS.length
			int = int / BASE_CHARACTERS.length
			str = BASE_CHARACTERS[character..character] + str
		end
		str
	end

	def decode(str)
		int = 0
		str.reverse.split('').each_with_index do |char, i|
			int += BASE_CHARACTERS.index(char) * (BASE_CHARACTERS.length ** i)
		end
		int
	end
end

Easier to read copy can be found at gist.

To get a link’s token, you call Base.encode(link_id), and get a tokens link id you call Base.decode(token).

Links will be as short as they can be (we left out the vowels and special characters, for prettier urls and to prevent english words as tokens), map clearly, and use the available auto-incrementing id.

It took a couple near misses to get here, but I’ve been happy with it so far. Let me know if you have questions, if I’m overlooking anything or if you reimplement this in the wild.

This entry was posted in engineering, Plinky. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s