
Roll your own captcha.

Added 28 May 2016, 6:12 p.m. edited 18 Jun 2023, 7:52 p.m.

[Note from future me: I've since superseded this with a Python version after moving from WordPress to Django.]

We all know that posting an email address on a web site results in an amazingly rapid deluge of (usually lurid) junk to wade through. A common solution is to protect a mailing form with a captcha. The downside is that, with the leaps and bounds AI is making, the captcha arms race can result in "tests" that are, if not difficult to pass, then a complete nuisance! And that's before we even think of accessibility issues.

With this in mind I decided to roll my own captcha. With the submission being an HTML form, I decided the best way to get round the stateless nature of HTTP and secure any verification data was to include blobs of encrypted data in the form itself; that way there would be no readable client-side data that could be compromised. Another consideration was not to rely on distorted images, as some captchas are so distorted they are literally indecipherable. Fortunately, using PHP it's quite trivial to create a single image from a number of different images. It's important to create a single image on the fly, because any reference to sub-images in the HTML code could very easily be picked up by a spam bot.

My solution is split into three files (plus a collection of 16 tiny sub-images):

captcha.php, index.php, process.php

index.php is just a simple HTML form, basically presented. It contains three fields (name, email and message) and includes the captcha itself, which adds some hidden fields to the form. It should be noted that hidden fields in a form are absolutely no protection at all: they are merely hidden from the presentation, but fully accessible in the HTML.

process.php creates the feedback shown to the user after the form is posted. It relies on captcha.php to do the processing.

captcha.php is responsible for all the heavy lifting and works in a number of different modes depending on a POST variable. The first mode we will look at creates the security part of the form. We first decide which sub-images we are going to use. It's a little convoluted, but it creates an array of which images to use: for example, 2 apples and a banana would create a stack of selections apple, apple, banana. This is then shuffled so that the resulting image has a jumble of different sub-images. This information, along with the number of each item you'll be asked about, is then encrypted into the form's hidden fields. Here, for example, is encrypting just the stack of image selections:
$cryptstack = binurl_encode(mcrypt_encrypt(MCRYPT_RIJNDAEL_256, $key, serialize($stack), MCRYPT_MODE_ECB, $iv));
We'll look at binurl_encode in a minute. The variable to encrypt is serialized, which gives us a storable representation of the variable that we can put into the form and later recover into a complete PHP array. We do have to be a little careful about what actual data values are transmitted, so for ease I made a couple of convenience routines.
function binurl_encode($val) {
    return rawurlencode(base64_encode($val));
}

function binurl_decode($val) {
    return base64_decode(rawurldecode($val));
}
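To tie the two helpers above to the encrypted hidden fields, here is a minimal round-trip sketch. It is not the code from captcha.php: mcrypt was removed from PHP in 7.2, so this swaps in the OpenSSL extension (AES-256-CBC plus an HMAC) and uses JSON in place of serialize. The names build_stack, seal and unseal, the key string and the file names are all made up for illustration.

```php
<?php
// Same helpers as in the post, repeated so the sketch runs on its own.
function binurl_encode($val) {
    return rawurlencode(base64_encode($val));
}

function binurl_decode($val) {
    return base64_decode(rawurldecode($val));
}

// Hypothetical helper: turn counts such as ['apple.png' => 2] into a shuffled stack.
function build_stack(array $counts) {
    $stack = [];
    foreach ($counts as $name => $n) {
        for ($i = 0; $i < $n; $i++) {
            $stack[] = $name;
        }
    }
    shuffle($stack);
    return $stack;
}

// Hypothetical helper: encrypt, sign, then make the blob form safe.
// Assumption: AES-256-CBC + HMAC-SHA256 stands in for the post's mcrypt call.
function seal($data, $secret) {
    $encKey = hash_hmac('sha256', 'enc', $secret, true);
    $macKey = hash_hmac('sha256', 'mac', $secret, true);
    $iv = random_bytes(openssl_cipher_iv_length('aes-256-cbc'));
    $ct = openssl_encrypt(json_encode($data), 'aes-256-cbc', $encKey, OPENSSL_RAW_DATA, $iv);
    $mac = hash_hmac('sha256', $iv . $ct, $macKey, true);
    return binurl_encode($mac . $iv . $ct);
}

// Hypothetical helper: reverse of seal(); returns null if the blob was tampered with.
function unseal($blob, $secret) {
    $encKey = hash_hmac('sha256', 'enc', $secret, true);
    $macKey = hash_hmac('sha256', 'mac', $secret, true);
    $raw = binurl_decode($blob);
    if (!is_string($raw) || strlen($raw) < 64) {
        return null; // 32 bytes MAC + 16 bytes IV + at least one cipher block
    }
    $mac = substr($raw, 0, 32);
    $iv = substr($raw, 32, 16);
    $ct = substr($raw, 48);
    if (!hash_equals(hash_hmac('sha256', $iv . $ct, $macKey, true), $mac)) {
        return null;
    }
    $plain = openssl_decrypt($ct, 'aes-256-cbc', $encKey, OPENSSL_RAW_DATA, $iv);
    return $plain === false ? null : json_decode($plain, true);
}
```

Something like seal(build_stack(['apple.png' => 2, 'banana.png' => 1]), $key) would then give a string that can go straight into a hidden field, and unseal() with the same key gets the array back. The HMAC is there so that a blob edited by the client is rejected rather than decrypted into garbage.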
base64, while not especially efficient for obvious reasons, lets us encode arbitrary binary data in ASCII characters, which we can then make safe with PHP's rawurlencode to be absolutely sure it's "form safe".

When creating the form we also store an encrypted time stamp. We do this because potentially, after possibly more work than it's worth, some idiot could repeatedly reuse the same form data with the same supplied answers over and over again... You want the expiry to be easily long enough for a user to write and post the message (including time for the captcha!) but not long enough to make it exploitable (like days!).

The form also calls captcha.php in img mode, with the selection list of images in the URL; from this PHP uses GD to create the actual captcha image. One thing to note is in the loop that actually creates the image:
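    // $im (the output image), $src (an array), $l, $x, $y and $div4 are set up before the loop
    foreach ($stack as $i) {
        // load each distinct sub image only once and reuse it from $src
        if (!array_key_exists($i, $src)) $src[$i] = @imagecreatefrompng("img/" . $i);
        imagecopy($im, $src[$i], $x, $y, 0, 0, 24, 24);
        $l++; $x = $x + 24;
        // after $div4 sub images, wrap to the start of the next row
        if ($l == $div4) { $l = 0; $x = 0; $y = $y + 24; }
    }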
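For anyone wanting to see the loop in context, here is a rough self-contained sketch of the same idea. The function names tile_positions and compose_captcha, the $perRow parameter (playing the role of $div4) and the $loadTile callback are my own inventions for the example, not names from captcha.php.

```php
<?php
// Hypothetical helper: top-left pixel position of each tile, filling rows left to right.
function tile_positions($count, $perRow, $size = 24) {
    $positions = [];
    for ($n = 0; $n < $count; $n++) {
        $positions[] = [($n % $perRow) * $size, intdiv($n, $perRow) * $size];
    }
    return $positions;
}

// Hypothetical helper: build one GD image from a stack of tile names.
// $loadTile is any callable that returns a GD image for a given name.
function compose_captcha(array $stack, $perRow, callable $loadTile, $size = 24) {
    $stack = array_values($stack);
    $rows = max(1, (int) ceil(count($stack) / $perRow));
    $im = imagecreatetruecolor($perRow * $size, $rows * $size);
    $cache = []; // one image object per distinct name, as with $src in the post
    foreach (tile_positions(count($stack), $perRow, $size) as $n => $pos) {
        list($x, $y) = $pos;
        $name = $stack[$n];
        if (!isset($cache[$name])) {
            $cache[$name] = $loadTile($name);
        }
        imagecopy($im, $cache[$name], $x, $y, 0, 0, $size, $size);
    }
    return $im;
}
```

In a real script $loadTile could simply wrap imagecreatefrompng, and the result would be sent with header('Content-Type: image/png') followed by imagepng($im). Passing the loader in as a callback is only there so the layout can be tried out without the 16 image files.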
The reason for the $src array is to cache creation of the actual image objects: you only need one "apple" image object even if you want, say, 5 apples in the resulting image.

The "only" other function required (check) decrypts the previously saved information (like the number of apples in the captcha image, for example). Once the form inputs are checked against the correct answers, the time is checked for expiry. It's up to the front-end code in process.php to feed back a problem or use the form data to build and send an email.

While maybe not the best organised example of PHP, you can find the source here; you should find it very easy to modify for your own needs.
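As a rough idea of what the check step boils down to once the hidden fields have been decrypted, here is a small sketch. The function name captcha_check, its parameters, the return strings and the 15 minute default are assumptions for the example; the real check function in captcha.php may well be organised differently.

```php
<?php
// Hypothetical check. $expected (item => count) and $issuedAt (Unix time) are
// assumed to come from the decrypted hidden fields, $answers from the posted form.
function captcha_check(array $expected, array $answers, $issuedAt, $now, $maxAge = 900) {
    // First compare the answers, as described in the post.
    foreach ($expected as $item => $count) {
        if (!isset($answers[$item])
            || preg_match('/^\d+$/D', (string) $answers[$item]) !== 1
            || (int) $answers[$item] !== (int) $count) {
            return 'wrong';
        }
    }
    // Then reject forms that are too old (or claim to come from the future).
    $age = $now - $issuedAt;
    if ($age < 0 || $age > $maxAge) {
        return 'expired';
    }
    return 'ok';
}
```

process.php could then call it with time() as $now and pick its feedback message from the returned string, only sending the email on 'ok'.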