Reducing website vulnerability to hacking

There is no such thing as a hack-proof website, but those that are hand-coded are more difficult to crack.  Hackers find websites that use a Content Management System (CMS) much easier to penetrate because all CMSs have standard structures, details of which are very easy to find - just by downloading one and examining its contents.

Hand-coded sites usually have much simpler content and are more difficult to hack, mainly because their structure is only known to the person who coded them.

Why these slime-buckets bother to do this escapes me, but we live in a dangerous world inhabited by some very strange people.

The vulnerability comes through use of PHP - a server-side programming language.
If misused, it can do a lot of damage.
Used correctly, PHP is very secure - but there is too much sloppy programming out there...

I run three websites - this one (stevegs.com) is entirely hand-coded.
I run the others on behalf of organisations, which means (among other things) others must be able to make changes to content of these websites - so the only feasible way is to use a CMS.

One of these is for a Theatre: it lists our programme of events and allows punters to buy tickets.  It uses Wordpress and, having been kept up-to-date throughout its life, can run the latest version (V4.6.1 at the time of writing).

The other is an online local news site that uses Joomla.  This has been going since 2007 and includes a significant amount of historically interesting material.  For some reason best known to Joomla, the format of the database (which all CMSs use to store their user content) changed from V1.5 to V1.6.  Those running it at the time did not have the resources to make the old database compatible with the new system, so it limped on with an outdated version of Joomla.  The best I could do when I inherited it was to get it to the latest version possible, which was V1.5.26.

Wordpress and Joomla are probably the most common CMSs around, so they are the most targeted by hackers.  The more up-to-date a CMS is, the less vulnerable it becomes, but it is always a pain having to keep one step ahead of the hackers.

There are several other things you can do to make a hacker's life more difficult - doing so is like putting better locks on your house - if they find it more difficult to break in, they'll go elsewhere.

  1. Use a strong password to get into the back end (management side) of your website.  There are plenty of password generating websites about, but some of the results these produce are so cryptic you need to write them down - another vulnerability.  Instead, do something like removing the vowels from a memorable word; inserting capital letters part-way through; spelling it backwards; inserting a memorable number; replacing some letters with numbers [like e → 3 or o → 0 (zero)]; including at least one non-alphanumeric character [eg. /].
  2. The default user name for most CMSs is 'admin'.
    Don't use it - choose something else that isn't related to your website (eg. 'wombat' would not be good for wombat.com)
  3. Another way in is via FTP (File Transfer Protocol).  Once in, a hacker could delete or modify your files.
    FTP is somewhat insecure these days (it sends your user name and password unencrypted) - SFTP (SSH File Transfer Protocol) is better, but not all web hosting providers offer it.
    So, make your user name and password for FTP sufficiently obscure as well.

However, you need to do other things to combat vulnerabilities in a CMS.  Notable ones include:

Wordpress:  Whether a bug or intended as a feature, anyone can find the user names of Wordpress administrators.  The default administrator user name in Wordpress is 'admin', and all users have numbers, starting at 1.
The way to find administrator user names on any Wordpress site is to type in the URL bar: www.hackablesite.com/?author=1

If the Wordpress default has been kept, the URL bar will change to: www.hackablesite.com/author/blog/admin/
- which is the giveaway.

Even if they used 'wombat' instead of 'admin', it will come up with: www.hackablesite.com/author/blog/wombat/
- crass negligence on Wordpress's part IMNSHO.

Now they've got the user name, all they need to do is blast your site with multiple guesses at your password - which is dead easy to do with readily available software.  The more obscure your password is, the longer it will take, but they're likely to succeed eventually.  This can be mitigated to some extent by using a 'limit login attempts' plugin.  As its name implies, if anyone (yourself included) gets the password wrong a set number of times (say 4), they are locked out for (say) 20 minutes.
If they get locked out more than (say) 4 times, they can be locked out for much longer - preferably at least a week.
However, this is all done by monitoring the hacker's IP address - but if the hacker is smart enough, he will have loads of IP addresses at his disposal and simply cycle through them....
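
The lockout rules just described can be sketched in a few lines of PHP.  This is only an illustration of the idea, not the code of any real plugin: the function names, the constants and the layout of the $log array are all made up for the example, and a real plugin would keep its log in the database.

```php
<?php
// Hypothetical sketch: none of these names come from a real plugin.
const MAX_TRIES     = 4;        // wrong guesses allowed before a lockout
const SHORT_LOCKOUT = 1200;     // 20 minutes, in seconds
const MAX_LOCKOUTS  = 4;        // short lockouts allowed before the long one
const LONG_LOCKOUT  = 604800;   // one week, in seconds

/* Record a failed login from $ip at time $now and return the updated log.
   $log maps IP => array('tries' => n, 'lockouts' => n, 'until' => timestamp). */
function record_failure(array $log, $ip, $now)
{
	if (!isset($log[$ip]))
		$log[$ip] = array('tries' => 0, 'lockouts' => 0, 'until' => 0);

	$log[$ip]['tries'] += 1;
	if ($log[$ip]['tries'] >= MAX_TRIES) {
		$log[$ip]['tries'] = 0;
		$log[$ip]['lockouts'] += 1;
		// More than MAX_LOCKOUTS short lockouts earns the long one
		$length = ($log[$ip]['lockouts'] > MAX_LOCKOUTS) ? LONG_LOCKOUT : SHORT_LOCKOUT;
		$log[$ip]['until'] = $now + $length;
	}
	return $log;
}

/* True if $ip is still locked out at time $now. */
function is_locked_out(array $log, $ip, $now)
{
	return isset($log[$ip]) && $now < $log[$ip]['until'];
}
```

Note that everything is keyed on the IP address, which is exactly the weakness mentioned above: a request from a fresh address starts with a clean sheet.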

So the first step is to create a new administrator account on your site - with a non-obvious user name and strong password.
If there are no other users on your site [eg. editors (who can post blogs but have no administrative rights)], your new user will be number 2.  [Note: your new user cannot have the same email address as the old one - just put anything in for now.]

Now log out and log in as the new administrator.  Go to the User account and delete the old admin account.
You can (and must) now edit your new one to put in a meaningful email address.
So you think you are now secure?  Not really!

If they now type: www.hackablesite.com/?author=1 the URL bar won't change to reveal anything
- but if they type: www.hackablesite.com/?author=2 they will get:
www.hackablesite.com/author/blog/your_new_user_name/

So all they need to do is keep on increasing the author=(number) until they get something.
You need to do more, which I will show below.

BTW - did you notice what appeared in the active tab on your browser?  When your 'admin' was active, it read (Your real name) | (Your site name).  (Your real name) might have just been 'admin', but if you entered your real name when you first set up the account, it will appear here.
Wordpress very conveniently (for the hacker) puts this information in the page's title info.

There is another vulnerability if your site has a password-protected members' section.
If anyone types  www.hackablesite.com/?s=search-word
'search-word' will be found in the supposedly protected section if it exists, listing the entire article.

So, if you don't want the Great Unwashed coming to your society's barbecue, use the code below to disable this other Wordpress silly.  If you don't need the inbuilt search engine, you should really disable it (a good start is to modify your theme's 404.php file to remove any statement like  get_search_form()  ).
I can't see why the Wordpress search facility is still there - Google does a better job...
[But Google can still find your private information unless you hide it.
This is best done by using a   robots.txt   file in your root folder.  More on this at the end of this article.]

Joomla:  A serious vulnerability that has existed ever since Joomla! was! first! released! was! identified! mid-December! 2015!  (Sorry for the exclamation marks - Joomla clearly has the Yahoo(!) disease!)  This enabled hackers to inject malicious code via the URL bar and also via the browser information (the 'User Agent' string) that many sites use to serve appropriate pages for a particular browser (eg. a mobile version of the site if the browser says it's an iPhone).  The latter is done by falsifying this information.  All versions of Joomla up to and including 3.4.5 are vulnerable.

Joomla immediately offered an update and, because this was so serious, they also offered updated versions of the affected file (session.php) for V1.5 and V2.5.  Our news site was obviously a target, so I took it down immediately and updated its session.php - but I felt I had to do more.

Hence the following code, which catches any attempt to inject code this way, and also addresses the Wordpress 'admin' vulnerability.  It is all in PHP, and should appear in your site's root folder.
To do this, copy all the following and paste it into Notepad (or similar).
Save the file as (eg.) chackit.php.  [I chose this name from check it for hacking.  'Chackit' is also Scots slang for 'drunk'!]

Note: in PHP, anything enclosed within /*....*/ is a comment, as is anything following // on a line.  In the following, these say what each part does.  Although the following might seem quite long (it's about 4k bytes), it won't noticeably increase the load time of your pages.

<?php
/**********  Written by Steve Glennie-Smith  31/10/2016.  ************
Catch any attempt to inject malware through User Agent or URI strings.
Die if found.
This file should be PHP include(ed) right at the beginning of the root and any admin index.php files.
Note: It just checks these strings and does NOT destroy their contents.
Also catch the glaring security risks in Wordpress that
	1) fire up the search engine ( /?s=search_text ), which might find something we don't want it to, and
	2) divulge user names ( /?author=nn ).
Allow an override [eg. if (wombat == 6264)].
Choice of 'death': simulate error 403 (forbidden), 404 (not found) or whatever you like - or nothing!
**********************************************************************/

$err403 = '<div style="font-family:times, serif;"><h2>Error 403</h2>
	<h3>Forbidden: You do not have permission to access this file on this server.</h3></div>';

$err404 = '<div style="font-family:times, serif;"><h2>Not Found</h2>
	<h3>The requested URL was not found on this server.<br />
	Additionally, a 404 Not Found error was encountered while trying to
	use an ErrorDocument to handle the request.</h3></div>';

//	No need to check for known bad IPs - this is done in .htaccess

/*	Wordpress only, but does no harm in Joomla.  If they try to get the
	administrator name or search for something in 'private' pages, give them
	an Error 403 - Forbidden access.  Allow this information with a code,
	so you can get at it by typing:  yoursite.com/?author=2&wombat=6264 */

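//	Use isset() so a normal request (with none of these set) raises no PHP notices
$ckauthor = isset($_GET['author']) ? $_GET['author'] : '';
$cksearch = isset($_GET['s'])      ? $_GET['s']      : '';
$ckbypass = isset($_GET['wombat']) ? $_GET['wombat'] : '';
if (($ckauthor !== '' || $cksearch !== '') && $ckbypass !== '6264')
	die($err403);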

/*	Check length of User Agent string: usually 80 to 90 chars long,
	but Crapple phones can be over 130.
	Block if > 256 chars by dying...   zzzzz....  */
$htua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';	// some bots send none
if (strlen($htua) > 256)
	die('zzzzz....');

//	Similar for URI string...
$rqur = $_SERVER['REQUEST_URI'];
if (strlen($rqur) > 256)
	die('zzzzz....');

//	Concatenate them to include a spacer known to us for further manipulation...
$checkit = $htua .  '&####;' .  $rqur;

/*	In a URL, characters that must not appear literally (eg. spaces) are shown as %hh,
where hh is the hexadecimal representation of the ASCII code for that character - eg. a space is %20
and a 'percent' is %25.  So a '%' can be multiple escaped, eg. %2536 reduces to %36, ie.  the number 6.
Remove all these (%25 --> %) */
$count = 0;
while (strpos($checkit, '%25') !== false)	{
	$checkit = str_replace('%25', '%', $checkit);
	$count += 1;
	if ($count > 50) die('Stuck');	// get out if we're stuck in a loop
}

/*	Remove all real and URL escaped whitespace, including non-printing chars
(last two entries on first line below are: tab - char 0x09 and &nbsp; - char 0xA0,
written as double-quoted escapes so they survive copying and pasting).
This is necessary because PHP ignores whitespace so, if they've sneaked in an eval<tab>(),
we can trap it later by looking for 'eval('  */
$whitespace1 = array (' ', '+', '%A0', "\t", "\xA0",
	'%00', '%01', '%02', '%03', '%04', '%05', '%06', '%07',
	'%08', '%09', '%0A', '%0B', '%0C', '%0D', '%0E', '%0F',
	'%10', '%11', '%12', '%13', '%14', '%15', '%16', '%17', '%7F',
	'%18', '%19', '%1A', '%1B', '%1C', '%1D', '%1E', '%1F', '%20', );
$checkit = str_ireplace($whitespace1, '', $checkit);

//	And decode the remainder, which will be printing chars...
$checkit = urldecode($checkit);

/*	If they try encoding printing chars in octal, decode them here.
Note use of single-quotes in $octs because any target octal number will be
embedded as a 4-character string representing a PHP escape sequence. 
Using double-quotes would (eg.) translate "\041" to the octal number 041,
ie. a decimal value of 33 */

$octs = array ( '\041', '\042', '\043', '\044', '\045', '\046', '\047',
	'\050', '\051', '\052', '\053', '\054', '\055', '\056', '\057',
	'\060', '\061', '\062', '\063', '\064', '\065', '\066', '\067',
	'\070', '\071', '\072', '\073', '\074', '\075', '\076', '\077',
	'\100', '\101', '\102', '\103', '\104', '\105', '\106', '\107',
	'\110', '\111', '\112', '\113', '\114', '\115', '\116', '\117',
	'\120', '\121', '\122', '\123', '\124', '\125', '\126', '\127',
	'\130', '\131', '\132', '\133', '\134', '\135', '\136', '\137',
	'\140', '\141', '\142', '\143', '\144', '\145', '\146', '\147',
	'\150', '\151', '\152', '\153', '\154', '\155', '\156', '\157',
	'\160', '\161', '\162', '\163', '\164', '\165', '\166', '\167',
	'\170', '\171', '\172', '\173', '\174', '\175', '\176', );

$chas = array ( '!', '"', '#', '$', '%', '&', '\'',
	'(', ')', '*', '+', ',', '-', '.', '/',
	'0', '1', '2', '3', '4', '5', '6', '7',
	'8', '9', ':', ';', '<', '=', '>', '?',
	'@', 'A', 'B', 'C', 'D', 'E', 'F', 'G',
	'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
	'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
	'X', 'Y', 'Z', '[', '\\', ']', '^', '_',
	'`', 'a', 'b', 'c', 'd', 'e', 'f', 'g',
	'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o',
	'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
	'x', 'y', 'z', '{', '|', '}', '~', );

$checkit = str_replace($octs, $chas, $checkit);

/*	Remove all hex and octal escaped whitespace, including non-printing chars
	Cannot remove \xH yet where H is a single HEX digit in case they've included
	(eg.) \x61 (= 'a')  */
$whitespace2 = array(' ', '+', '\xA0', '\240', "\t", "\xA0", '\177',
	'\010', '\011', '\012', '\013', '\014', '\015', '\016', '\017',
	'\020', '\021', '\022', '\023', '\024', '\025', '\026', '\027',
	'\030', '\031', '\032', '\033', '\034', '\035', '\036', '\037', '\040',
	'\00', '\01', '\02', '\03', '\04', '\05', '\06', '\07',
	'\10', '\11', '\12', '\13', '\14', '\15', '\16', '\17',
	'\20', '\21', '\22', '\23', '\24', '\25', '\26', '\27',
	'\30', '\31', '\32', '\33', '\34', '\35', '\36', '\37', '\40',
	'\0', '\1', '\2', '\3', '\4', '\5', '\6', '\7',
	'\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07',
	'\x08', '\x09', '\x0A', '\x0B', '\x0C', '\x0D', '\x0E', '\x0F',
	'\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17', '\x7F',
	'\x18', '\x19', '\x1A', '\x1B', '\x1C', '\x1D', '\x1E', '\x1F', '\x20',
	'\a', '\e', '\t', '\v', '\f', '<br>', '<br />', '<br/>', '\n', '\r', );
$checkit = str_ireplace($whitespace2, '', $checkit);

/*	In case they slipped in PHP hex escape codes of any printing characters,
	get them back as URL (%hh) codes and decode them	*/
$checkit = urldecode(str_ireplace('\x', '%', $checkit));

/*	Then convert any remaining % chars back to \x to remove single char \x sequences */
$checkit = str_ireplace('%', '\x', $checkit);
$whitespace3 = array(
	'\x0', '\x1', '\x2', '\x3', '\x4', '\x5', '\x6', '\x7',
	'\x8', '\x9', '\xA', '\xB', '\xC', '\xD', '\xE', '\xF', );
$checkit = str_ireplace($whitespace3, '', $checkit);


/*	Catch attempts to inject eval(base64()) scripts through the user agent or URI.  */

if ( (stripos($checkit,'base64_') !== false) || (stripos($checkit,'chr(') !== false)
  || (stripos($checkit,'eval(') !== false)   || (stripos($checkit,'eval\c') !== false)
  || (stripos($checkit,'sqli') !== false)    || (stripos($checkit,'$_SER') !== false)
  || (stripos($checkit,'strrev') !== false)  || (stripos($checkit,'rot13') !== false)
//	These have been tried before - don't look kosher...
  || (stripos($checkit,'Sqworm') !== false)  || (stripos($checkit,'link114.cn') !== false) )
	die($err404);

/******* End of malware checking *************/
?>
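
To see why the checker keeps unwinding %25 rather than decoding just once, here is a small stand-alone sketch.  The function name unwind_percent() is made up for this example and does not appear in chackit.php; it simply repeats the same loop in isolation.

```php
<?php
/* Hypothetical helper (not part of chackit.php): collapse nested
   percent-escapes by turning %25 back into % until none remain. */
function unwind_percent($text, $limit = 50)
{
	for ($i = 0; $i < $limit && strpos($text, '%25') !== false; $i++)
		$text = str_replace('%25', '%', $text);
	return $text;
}

$twice = 'eval%2528';      // '(' escaped, then its '%' escaped again
$deep  = 'eval%25252528';  // ...and escaped twice more

echo urldecode($twice), "\n";                   // eval%28  (one decode is not enough)
echo urldecode(unwind_percent($twice)), "\n";   // eval(
echo urldecode(unwind_percent($deep)), "\n";    // eval(
```

A single urldecode() only peels off one layer, so a multiply-escaped 'eval(' would slip past the later search for that string.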

Now you have created your  chackit.php  file and uploaded it to your site's root folder,
you will have to modify your root  index.php  file thus:-

<?php

/* Joomla or Wordpress introductory blurb */

/* Insert the following before any code in this file */
include 'chackit.php';

/* Leave the rest of the code alone...  */

You should also modify your admin  index.php  file similarly.
This will normally appear in folder /wp-admin (Wordpress) or /administrator (Joomla):-

<?php

/* Joomla or Wordpress introductory blurb */

/* Insert the following before any code in this file.
NB: Since the file to be include(ed) is in the root, you need to use ../ (up one level) to show where it lives */
include '../chackit.php';
/* Leave the rest of the code alone...  */

NB.  If you update your CMS (which you should, as soon as an update becomes available), your modified  index.php  files will be overwritten (but  chackit.php,  being an extra file that your CMS doesn't know about, will be left alone).  So keep copies of your modified  index.php  files named (say)  index_modified.php  in the appropriate folders, to write back as necessary.
NB2.  The admin and root  index.php  files will almost certainly be different.  Don't overwrite with the wrong one!
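
One thing to watch: if PHP cannot find the file, a plain include only issues a warning and the page carries on unprotected.  The following is a hedged sketch of a stricter alternative, assuming PHP 5.3 or later (for __DIR__).  The helper name load_guard() is made up for this example; it builds on the absolute path of the folder the calling file lives in and tells you whether the checker was actually found.

```php
<?php
/* Hypothetical helper: include the checker by absolute path and report
   whether it was found, so the caller can refuse to carry on without it. */
function load_guard($path)
{
	if (!is_file($path))
		return false;
	require_once $path;
	return true;
}

/* In wp-admin/index.php this might be called as:
       if (!load_guard(__DIR__ . '/../chackit.php')) die('Guard missing');
   and in the root index.php with '/chackit.php' instead. */
var_dump(load_guard(__DIR__ . '/no_such_file.php'));   // bool(false)
```

Because the checker is pulled in from inside a function, its working variables stay out of the CMS's global scope, while $_GET and $_SERVER remain visible to it as usual.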

Done?   Not really....

Some persistent hackers can still gain access by creating a new supervisor account.  How they can do this without successfully logging in as an existing supervisor is beyond me - unless your hosting provider isn't running a properly secured server (which I bet mine doesn't, though they won't admit it!).  The problem is, few will offer anything other than standard FTP.  If you want SSH or SFTP, most will try to sell you a VPS (virtual private server) - and charge megabucks for it.

A further line of defence is   .htaccess   - a file that is usually present on any Apache server (which, fortunately, most are).  Both Wordpress and Joomla need this file (containing some of their own code) to be present in the root folder, but any sub-folder may also contain a .htaccess file.  Just be careful how you modify it, otherwise your entire site could crash!

Here is my Wordpress root folder   .htaccess   file.  Modifying a Joomla .htaccess file is similar - just add the section beyond the Wordpress-specific section.

Note 1: It is not advisable to remove standard Wordpress (or Joomla) files.
It is better just to deny access as below to those you don't want the Great Unwashed to see.

Note 2: Anything following the # sign is a comment but, unlike PHP, the # character must be the first on the line.

############### Start of standard Wordpress entries #############
AddType application/x-httpd-php56 .php .php5

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress
############### End of standard Wordpress entries #############

### The following standard Wordpress files can provide hackers
### a back door into your site.  Don't allow direct access.
<files wp-config.php>
order deny,allow
deny from all
</files>

# This one is used for blogs.
# If you don't use blogs, don't allow access...
<files xmlrpc.php>
order deny,allow
deny from all
</files>

<files wp-login.php>
# Only allow login to back end from UK IPs
# This MUST tally with  wp-admin/.htaccess
# (A new file you must create)
order allow,deny
# This means Deny from all except the following:
# (must NOT precede with 'deny from all')
# If you have a fixed IP address and only you have access to WP's
# back end, only allow from that and disregard the following.
# Otherwise, these are common UK addresses - most hackers are from
# outside the UK.  The number following a / gives a range of IP
# addresses as a 'CIDR' code.  Plenty of references to that on Google...
allow from 2.16.0.0/12
allow from 2.96.0.0/13
allow from 2.120.0.0/13
allow from 2.216.0.0/13
allow from 62.24.128.0/17
allow from 77.67.
allow from 78.128.0.0/9
allow from 80.77.0.0/20
allow from 80.77.80.0/20
allow from 80.77.240.0/20
allow from 84.13.0.0/16
allow from 86.128.0.0/10
allow from 89.240.0.0/14
allow from 92.0.0.0/11
# Can have exceptions within any block above:
# deny from 78.145.97.140
deny from 146.0.0.0/9
</files>	

order allow,deny
allow from all
# Block entire site from these...
# This one has hacked my site - it's listed as either in Canada or
# Hong Kong, so no real harm done if a legitimate user from either
# country can't see this UK theatre's website
deny from 47.80.0.0/12

### The following will redirect any attempt to look at pages to which
### you don't want the public to have direct access...
# Redirect any attempt made by outsiders to comment
# (Comments are a source of spam as well as hacking -
# if you don't need them, don't allow them!)
Redirect 301 /wp-comments-post.php http://www.a_theatre.com/error
Redirect 301 /wp-trackback.php http://www.a_theatre.com/error

# Likewise 'events' or anything directed from it
Redirect 301 /event/ http://www.a_theatre.com/error
Redirect 301 /comments/feed/ http://www.a_theatre.com/error
Redirect 301 /feed/ http://www.a_theatre.com/error

# Or to look at files like the WP licence etc....
Redirect 301 /readme.html http://www.a_theatre.com/error
Redirect 301 /license.txt http://www.a_theatre.com/error

Finally - the   robots.txt   file.  This must live in your site's root folder. The format is as follows:
(NB. Comments are allowed in this file: anything following a # sign is ignored.)

User-agent: *
Disallow: /my*/
Disallow: /private_folder/
Disallow: /folder1/something.htm
User-agent: googlebot
Disallow: /folder2/private_files/

What this does is: The first line tells all search engines to do as directed in the next three lines.
Wildcards (*) are allowed: ie. line 2 says don't crawl any folder starting 'my', eg. /my_photos/ or /my_car/
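The fifth line tells Google (only) not to crawl the folder on line 6 (which is two layers down).  Note that a crawler obeys only the most specific group that names it, so Googlebot will follow lines 5 and 6 and ignore lines 1 to 4 - repeat under line 5 any rules you want it to keep.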

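Note 1. Most search engines comply with this file, so if you don't want your barbecue appearing on Google, 'Disallow' the relevant folder.  However, there is no guarantee all search engines will comply.  Bear in mind, too, that robots.txt is itself publicly readable, so it tells anyone who looks which folders you would rather keep quiet.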

Note 2. When excluding any folder, all its sub-folders are also excluded.
However, since line 4 refers to a specific file, other files in that folder and all sub-folders will be crawled.
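
The matching rules in these notes can be sketched in PHP as follows.  This is only a rough model of how a well-behaved crawler might apply the Disallow lines (each rule is treated as a prefix, with * standing for any run of characters).  The function name is_disallowed() is made up for the example, and real crawlers differ in the details.

```php
<?php
/* Hypothetical sketch: test a path against a list of Disallow rules.
   Each rule is a prefix; * matches any run of characters. */
function is_disallowed($path, array $rules)
{
	foreach ($rules as $rule) {
		if ($rule === '')
			continue;   // an empty Disallow blocks nothing
		// Escape regex characters, then turn the escaped \* back into .*
		$pattern = str_replace('\*', '.*', preg_quote($rule, '#'));
		if (preg_match('#^' . $pattern . '#', $path))
			return true;
	}
	return false;
}

$rules = array('/my*/', '/private_folder/', '/folder1/something.htm');

var_dump(is_disallowed('/my_photos/beach.jpg', $rules));       // bool(true)
var_dump(is_disallowed('/private_folder/sub/a.htm', $rules));  // bool(true)
var_dump(is_disallowed('/folder1/other.htm', $rules));         // bool(false)
```

The second call shows sub-folders being swept up by a folder rule, and the third shows that a rule naming one file leaves its neighbours crawlable.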

####################### ends ########################