Implementing Gravatar Properly

The other day a good friend of mine suggested I implement Gravatar on my website. So I started checking how it works and found it was incredibly easy. All I'd have to do use put an img element with a link to an md5 hash of the commenter's email. Like this: <img src="http://www.gravatar.com/avatar/205e460b479e2e5b48aec07710c08d50" />

MD5's can be Sensitive Information

The commenter's email hash is visible to all visitors, robots/spiders, etc etc. Gravatar says it's okay because you can't crack the MD5 hash to retrieve the email. Indeed, for that you would probably need a database with emails and their MD5 hash to figure out what email is behind each hash.

There are 2 issues with this:

  • Without figuring out the email, you can still find other user's posts on other sites. Indeed, all you need is to search for the MD5 hash. Perhaps the Gravatar user is okay with this, perhaps not.
  • If you are an administrator of an email database, you can search for MD5 hashes and easily find out what and where your users have been posting comments.

Other Issues

  • Non Gravatar user's can be tracked on the web too

    Even if you are not a Gravatar user, many websites will submit your email's MD5 hash to Gravatar and show that hash to the visitor. This means that even non-Gravatar users are now Gravatar users. There is nothing stopping Gravatar from storing this and nothing stopping people you know from finding your posts. Yes, anyone you know can go insane (like many employers who demand your social media credentials) and search the web for your email's md5 hash.

  • Gravatar can haz your blog statistics

    Every time someone visits a Gravatar enabled website, Gravatar gets some of the website's user statistics: visitor's IP, browser/OS and the page visited.

  • Gravatar Knows Where You Have Been

    Of course, because of the above, Gravatar can know about all the posts made by their users on Gravatar enabled sites. Maybe they don't gather that info, but technically it's totally possible.

  • Websites that use Gravatar deliver content from third party sources

    This can be a problem when your website uses HTTPS, using Gravatar means some of your content is no longer encrypted, unless you use Gravatar's https version. But using Gravatars HTTPS version means asking your visitors to trust their SSL certificate, which is issued by GoDaddy !

    I know it is very common practice to have many bits of websites hosted behind many different URLs, but it's always good to limit that where possible. For example, embedding a Youtube video is understandable as it is actual content and generally users can see where this comes from. Pulling avatars, icons and such from all over the web isn't so cool.

    It also means losing control over what parts of your site are actually getting delivered to your visitors and how they are getting delivered. You cannot know if your visitor's connection to Gravatar is broken or altered.

    On a non-privacy insane perspective there could be performance issues, don't forget visitors now have yet another domain name to resolve. Reducing the amount of DNS queries can help what they call "the user experience".

How can we Fix This ?

  • Give your commenter the choice of using Gravatar's service

    Instead of just hashing everyone's email "de force", why not let the commenter chose to have their email hash posted on the Internet ? Perhaps even a Gravatar user may want to make a comment without linking it to their Gravatar profile ?

    I'll stress this a tiny bit more, because so many sites use Gravatar but don't even inform their users in the slightest way. If you want to use Gravatar for every comment, why not, but you should at least inform your users.

  • Not show the email's MD5 hash in the first place

    Why not just make the request to the Gravatar avatar from the website and then deliver that to the visitors ?

    The technical howto in a nutshell is to replace the Gravatar image link with a script and pass a get variable to it, like the comment id. The script then figures out the md5 hash (if the user agreed), requests an image from Gravatar and shows that to the visitor.

    This also helps reduce the amount of DNS queries your visitors will make, instead your website/webserver will do all the work. And your webserver should probably have better bandwidth than your average visitor.

I think this probably extends to many more services than just Gravatar. And Gravatar are probably nice people with pure intentions... . It's not the end of the world, but it would be nice if webmasters put more thought into this sort of thing. The Interweb is still an experimental place, we should still be actively thinking about how we build it not just lazily and passively do things the way they've always been done.

Gravatar Enabled

Starting today, on this website, if you post a comment you can chose to have your email's md5 submitted to Gravatar to see if you have an avatar there I can use. Your email's MD5 hash will not be visible to other users.

This is what the img element that displays the G/avatars looks like on this website:

<img src="/blah/modules/gravatar/gravatar_img.php?id=1" />