WGet Gotcha When Used with Crontab
It's pretty common in web apps to use crontab to check certain thresholds at regular intervals and send notifications when a threshold is crossed. Typically I expose a secret URL and use WGet to invoke that URL from a cron job.
The following cron will invoke the secret_url every hour.
0 */1 * * * /usr/bin/wget "http://nareshjain.com/cron/secret_url"
Since this runs from cron, we don't want any output, so we can add the -q and --spider command line parameters:
0 */1 * * * /usr/bin/wget -q --spider "http://nareshjain.com/cron/secret_url"
The --spider command line parameter is very handy: it performs a dry run, i.e. it checks whether the URL actually exists without downloading anything. This way you don't need to do things like:
wget -q "http://nareshjain.com/cron/secret_url" -O /dev/null
But when you run this command from your terminal (without -q, so that the output is visible):
wget --spider "http://nareshjain.com/cron/secret_url"
Spider mode enabled. Check if remote file exists.
--2013-10-09 09:05:25--  http://nareshjain.com/cron/secret_url
Resolving nareshjain.com... 223.228.28.190
Connecting to nareshjain.com|223.228.28.190|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
You use the same URL in your browser and sure enough, it actually works. Why is it not working via WGet then?
The catch is that --spider sends an HTTP HEAD request instead of a GET request.
You can check your access log:
my.ip.add.ress - - [09/Oct/2013:02:46:35 +0000] "HEAD /cron/secret_url HTTP/1.0" 404 0 "-" "Wget/1.11.4"
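If the log is busy, a small filter can make these entries easier to spot. This is only a rough sketch: it assumes the log uses the same layout as the line above (method in the sixth whitespace-separated field, path in the seventh, status in the ninth), and the function name head_404s is made up for illustration.

```shell
# List the paths of HEAD requests that were answered with 404.
# Assumes the access log layout shown above:
#   $6 = opening quote plus method ("HEAD), $7 = path, $9 = status code.
# Reads the files given as arguments, or stdin if none are given.
head_404s() {
    awk '$6 == "\"HEAD" && $9 == 404 { print $7 }' "$@"
}
```

You would call it with the path to your own access log, which depends on your server setup, for example head_404s /path/to/access.log.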
If your secret URL points to an actual file (like secret.php), it does not matter and you should not see any error. However, if you use a framework to define your routes, you need to make sure the route has a handler for HEAD requests, not just for GET. Alternatively, drop --spider and use the -O /dev/null form shown earlier, which sends a regular GET.
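If the route only answers GET, one way around the problem is to skip --spider and throw the response body away instead. Below is a minimal sketch of a wrapper that a cron entry could call. The function name ping_url and the WGET override variable are my own inventions, not part of wget; the -q and -O /dev/null options are the same ones used earlier in this post.

```shell
#!/bin/sh
# Hypothetical cron helper: request a URL with a real GET and stay silent
# unless the request fails. WGET may be overridden; the default matches
# the path used in the crontab entries above.
WGET="${WGET:-/usr/bin/wget}"

ping_url() {
    url="$1"
    # -q suppresses wget's own output; -O /dev/null discards the body.
    # Without --spider, wget sends a normal GET request.
    if ! "$WGET" -q -O /dev/null "$url"; then
        # Print to stderr only on failure, so a successful run produces no output.
        echo "cron ping failed: $url" >&2
        return 1
    fi
}
```

Because the function prints only on failure, a cron setup that mails job output (a common default, but check yours) would notify you just when the URL cannot be fetched.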