PHP -- forms

Review new PHP filtering routines and update this page.
Client-Side validation
Server-Side validation

For a quick start, here's a prefab form with light-weight javascript processing.

Click in text area, then Ctrl-A, Ctrl-C .
<!DOCTYPE html>
<html>
<head>
<style type="text/css">
body { padding: 2em; }
form { border: 1px solid #ddd; box-shadow: 0 0 15px #ccc; padding: 2em; }
fieldset { width: 250px; }
#scriptdiv { /* this DIV created by javascript */
	margin-top: 1em;
	border-top: 1px dashed #888;
	padding-top: 1em;
	font: .8em verdana, "trebuchet ms", arial, helvetica, FreeSans, sans-serif;
}
</style>
</head>
<body>

<form id="myform" name="myform" method="post" action="" >

	<p><label for="mytext">Name</label> <br />
	<input type="text" id="mytext" name="mytext" size="20" maxlength="50" /></p>

	<p> <label for="mytextarea">Enter text below</label> <br />
	<textarea id="mytextarea" name="mytextarea" cols="40" rows="8"></textarea> </p>

	<p><input type="checkbox" id="chk0" name="chk0" /> <label for="chk0">checkbox 1</label></p>
	<p><input type="checkbox" id="chk1" name="chk1" /> <label for="chk1">checkbox 2</label></p>
	<p><input type="checkbox" id="chk2" name="chk2" /> <label for="chk2">checkbox 3</label></p>

	<fieldset name="myfieldset" id="myfieldset" value="caption">
	<legend name="mylegend" id="mylegend">A group of options</legend>
	<p>
	<input type="radio" id="r0" name="myradio" /> <label for="r0">radio button 1</label><br />
	<input type="radio" id="r1" name="myradio" /> <label for="r1">radio button 2</label><br />
	<input type="radio" id="r2" name="myradio" /> <label for="r2">radio button 3</label>
	</p>
	</fieldset>

	<p><label for="myselect">Select from dropdown</label> <br />
	<select id="myselect" name="myselect">
		<option value="" selected="selected"> </option>
		<option value='dog'>dog</option>
		<option value='cat'>cat</option>
		<option value='hippo'>hippo</option>
	</select> </p>

	<p><label for="mymulti">(Ctrl or Shift for multiple selections)</label> <br />
	<select multiple size="5" id="mymulti" name="mymulti">
		<option value="" selected="selected"> </option>
		<option value='apple'>apple</option>
		<option value='orange'>orange</option>
		<option value='banana'>banana</option>
		<option value='peach'>peach</option>
	</select> </p>

	<p>
	<input type="submit" id="OK" name="OK" value="OK" default="yes" onclick="return checkForm('myform');" />
	<input type="submit" id="cancel" name="cancel" value="cancel" onclick="return cancelButton();" />
	</p>
	<p><a href='' onclick="showscript('samplescript','myform'); return false;"> show javascript</a>
	(illustrates simple DOM manipulation) </p>
</form>

<script id="samplescript">
function cancelButton() {
	alert('You clicked Cancel.');
	// to load another page, uncomment next line and insert real page address
	// window.location.href="somepage.html";
	return false; // prevents form from being submitted
}

function checkForm(formID) {
	// require data for each field, focus cursor on first field missing value
	var form = document.getElementById(formID);
	var e = form.elements; // array of FORM elements
	var bad_field = null, v = '', i=0, j=0;
	for ( i=e.length-3; i >= 0; --i ) { // skip cancel and OK buttons
		switch (e[i].type) {
			case 'select-multiple' :
				v='';
				for ( j = 0; j < e[i].length; ++j ) { // look for selected options
					if ( e[i].options[j].selected ) v += e[i].options[j].value;
				}
				if (v.length==0) {
					bad_field = i; // save index of empty input
				}
				break;
			case 'checkbox' : // might be OK to be blank, otherwise treat like radio
				break;
			case 'radio' :
				v = null;
				while ( e[i].type == 'radio') {
					if (e[i].checked) { v='Y'; }
					--i;
				}
				++i;
				if (null==v) { bad_field = i; }
				break;
			case 'fieldset' : // don't care about this
				break;
			default: // regular text or text area type
				if (e[i].value.length==0) { bad_field = i; } // trim spaces?
		}
	}
	if (null != bad_field) {
		alert("Need something for each field.");
		e[bad_field].focus();
		return false;
	} else {
		return showValues(formID);
	}
}

function showValues(formID) { // DEMO show form fields
	var form = document.getElementById(formID);
	var e = form.elements; // array of FORM elements
	var s ="Showing field \nTYPE - NAME : VALUE\n--------\n", v = '', i, j;
	for ( i=0; i <e.length-2; ++i ) { // skip cancel and OK inputs
		switch (e[i].type) {
			case 'select-multiple' :
				for ( j = 0; j < e[i].length; ++j ) {
					if ( e[i].options[j].selected ) v += "\n\t" + e[i].options[j].value;
				}
				break;
			case 'checkbox' :
			case 'radio' :
				v = (e[i].checked) ? 'Y' : '';
				break;
			case 'fieldset' :
				v = 'no value for this tag';
				break;
			default: // text, textarea, password, hidden
				v = e[i].value;
		}
		s += e[i].type + " - " + e[i].name + " : " + v + "\n";
		v='';
	}
	alert(s);
	return false; // to submit: form.submit()
}

function showscript( s, d ) { // source, destination element IDs
	var src = document.getElementById( s ).innerHTML; // get the source string
	var dst = document.getElementById( d ); // get the target element
	var div = document.createElement( "div" ); // create a new DIV
	div.setAttribute( 'id', 'scriptdiv' ); // give CSS an ID to select
	var pre=document.createElement( 'pre' ); // create a new PRE
	pre.appendChild( document.createTextNode( src ) ); // create a text node and add it to PRE
	div.appendChild( pre ); // add PRE to the new DIV
	dst.appendChild( div ); // add the new DIV to destination element
}
</script>
</body>
</html>

LABELS

They're not shown on this page, to save space, but assumed throughout.

<label for="lname">Last name:</label> <input id="lname" name="lname" type="text" size="30" />

The "FOR" attribute associates the label with a field ID. Clicking a label focuses input on the field.

And they're handy for aligning fields, as you can see below.

/* create fixed-width on floated labels so fields line up to the right */
label {
	width: 10em;
	float: left;
	clear: left;
	padding-right: 5px; /* add space between the label and its field */
	text-align: right;
}

IE6, the polyp of web browsers, will often screw up floats. You may need to declare the containing element as position "relative" if it is inside a complex structure.

The "ACCESSKEY" attribute is supposed to work in LABEL, too, but I'm not getting any joy from it.

See Matteo Penzo's 2006 eyetracking study for guidelines about label placement in forms — which is why the rule aligns text right, not left.

FORM structure

This demonstrates the bare tags. Click any tag to jump to explanations and examples.

<form>

<input type="hidden" /> [The field here is hidden.]

<input type="checkbox" /> checkbox 1

<input type="checkbox" /> checkbox 2

<input type="checkbox" /> checkbox 3

<input type="radio" name="r" id="r1" /> radio button 1 // NAME vs ID

<input type="radio" name="r" id="r2" /> radio button 2

<input type="radio" name="r" id="r3" /> radio button 3

</form>

FORM

<form name="myform" id="myform" action="thispage.php" method="post">
	. . . 
</form>

ACTION specifies where form data go. METHOD, how they're sent — via GET or POST.

ACTION is a silly attribute. It functions exactly like HREF in a standard anchor tag. Any resource a browser can reach is legitimate as ACTION's value.

Don't trust $_SERVER['PHP_SELF'] .
It merely reflects the client's location bar.

An attacker can construct a malicious link, appending his own script to your page's URI.

yourform.php/"><script>alert('DANGER');</script>

He can offer the rigged link in an e-mail or on a malicious web site.

When a user clicks the link, echo $_SERVER['PHP_SELF'] blithely inserts the attacker's script into your page, where it executes under the user's credentials.

Converting $_SERVER['PHP_SELF'] to HTML entities disables the script. It causes the browser to display, not interpret, the characters.

action="somepage.html"  // an explicit reference
action="" // an implicit reference to SELF (now accepted, due to wide-spread use)
action="?name1=value1&name2=value2" // an implicit reference with arguments

References to self using PHP variables. Sanitize them.

action="<?php echo htmlentities($_SERVER['PHP_SELF'] , ENT_QUOTES , 'UTF-8'); ?>" // script path, plus any extra path text in the request
action="<?php echo htmlentities($_SERVER['SCRIPT_NAME'] , ENT_QUOTES , 'UTF-8'); ?>" // page name only
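
To see what the encoding buys you, here is a minimal sketch. The helper name safe_self() and the variable $rigged are made up for illustration; $rigged simply copies the rigged link shown earlier instead of reading the real $_SERVER array.

```php
<?php
// Hypothetical helper: entity-encode a path before it goes into ACTION.
function safe_self($path) {
	return htmlentities($path, ENT_QUOTES, 'UTF-8');
}

// Stand-in for what $_SERVER['PHP_SELF'] would hold after a rigged link.
$rigged = 'yourform.php/"><script>alert(\'DANGER\');</script>';

// Quotes and angle brackets come out as entities, so the browser
// treats the whole string as part of the attribute value.
echo '<form action="' . safe_self($rigged) . '" method="post">' . "\n";
```

The encoded string contains no raw quote or angle bracket, so it can't close the ACTION attribute or open a SCRIPT tag.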

A browser requests the ACTION resource when a form's SUBMIT event occurs. Typically, a user triggers this event, but it can be scripted, too.

document.getElementById('myform').submit();

The browser includes the form's data as part of the request. How it does this is determined by METHOD.

A GET request sends data needed to retrieve information (ex: a name to look up).
A POST request submits data to be processed (and usually stored in a database or file).

A web browser's main job is to issue GET requests and display the results.

When a user types in a web address, selects a bookmark, or clicks a regular link, the browser issues an HTTP GET request.

GET somepage.html HTTP/1.1

When a user submits a form whose method="get", the browser also issues a GET request. But first, it appends the form's data to the URI listed in the ACTION attribute.

GET somescript.php?name1=value1&name2=value2 HTTP/1.1

If method="post", the browser creates a POST request, instead. And it puts the form's data inside the body of the HTTP request.

POST somescript.php HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 25

name1=value1&name2=value2
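
The encoding a browser applies to form data can be reproduced in PHP, which is handy for checking what a request should look like. This is only a sketch; the $fields array is a made-up stand-in for a form's name/value pairs.

```php
<?php
// Hypothetical form data: field NAME => value.
$fields = array('name1' => 'value1', 'name2' => 'value2');

// Encode the pairs the way a browser does when it submits a form.
$encoded = http_build_query($fields, '', '&');

echo "GET somescript.php?" . $encoded . " HTTP/1.1\n";  // GET: data ride in the URI
echo "Content-Length: " . strlen($encoded) . "\n";       // POST: data form the body

// parse_str() reverses the encoding, much as PHP does when it fills $_GET or $_POST.
parse_str($encoded, $decoded);
print_r($decoded);
```

The same encoded string serves both methods; only its position in the request changes.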

These differences have big implications. See w3schools Methods: GET vs. POST.

GET request	POST request
never use with passwords	a little safer from casual attacks
makes data public in URI	doesn't display data
data are stored in browser history or server logs	data not stored in history or logs
can only transmit ASCII text	can transmit binary data, too
limited by URI length (often about 2,000 characters)	no fixed size limit (the server may set one)
can be bookmarked	can't be bookmarked
back navigation/reload is harmless	causes data to be reposted

PHP and FORM data

PHP puts data it receives via GET or POST into a $_GET or $_POST array.
(It also copies both into a general-purpose $_REQUEST array.
But it's best to know how values reach your script, so avoid $_REQUEST.)

$_GET['lname'] // holds value from <input type="text" name="lname" /> when method="get"
$_POST['lname'] // holds value from <input type="text" name="lname" /> when method="post"

NOTE: the array key, 'lname', corresponds to the NAME attribute of the field, not to its ID.

See below for examples of processing forms.


TEXT

<input type="text" id="xxx" name="xxx" size="12"  />
<input type="text" id="xxx" name="xxx" value="some value" size="12" maxlength="15"  />


Can include javascript events as attributes (eg: onchange, onmouseover, onmouseout, etc.)

Below, PHP and javascript manipulate value and onchange attributes:

<script language="JavaScript" type="text/javascript">

	function ucase(myfield) {
		// "value" is a text object with its own set of methods
		myfield.value = myfield.value.toUpperCase();  
	}
</script>

<input type="text" size="12" maxlength="15" id="xxx" name="xxx" 
	value="<?php echo $xxxval; ?>" onchange="ucase(this)" />

// OR, w/out calling a separate function (and lowercase)

<input type="text" size="12" maxlength="15" id="xxx" name="xxx" 
	value="<?php echo $xxxval; ?>" onchange="this.value=this.value.toLowerCase()" />

PASSWORD

<input type="password" id="xxx" name="xxx" size="12" maxlength="25" />

Like text, but browser displays asterisks instead of text.

HIDDEN

<input type="hidden" id="xxx" name="xxx" />

Hidden elements are not secure. Anyone can look at the page's source and see and alter values stored there.
They can be convenient holders for values used by scripts or other pages, just don't trust them more than other form elements.

CHECKBOX

<input id="chk_1" name="chk_1" type="checkbox" checked="checked" />


Note: an UNchecked checkbox generates no POST variable for PHP.
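
Because of this, test for the key itself rather than its value. A minimal sketch follows; checkbox_state() is a hypothetical helper, and $post stands in for $_POST so the code runs without a real request.

```php
<?php
// Hypothetical helper: report a checkbox state from a submitted array.
function checkbox_state(array $post, $name) {
	// A checked box arrives as a key (its value is "on" unless VALUE is set);
	// an unchecked box is simply absent, so test with isset().
	return isset($post[$name]) ? 'Y' : 'N';
}

$post = array('chk_1' => 'on');               // chk_1 checked, chk_2 left blank
echo checkbox_state($post, 'chk_1') . "\n";   // Y
echo checkbox_state($post, 'chk_2') . "\n";   // N
```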

The script below generates checkboxes based upon the value of bits in an integer.

<?php

// Below, PHP's SHIFT operator ( >> ) slides bits to the right. 
// The AND operator ( & ) tests the resulting number against 1.
// AND produces 1 only if the number has bit 1 set ON.
// Assume $f = 115 (in binary, 01110011).  $b will increment from 0 - 15 (to test 16 bits)
// (01110011 >> 0) is 01110011  &  1 = 1
// (01110011 >> 1) is 00111001  &  1 = 1 
// (01110011 >> 2) is 00011100  &  1 = 0
// (01110011 >> 3) is 00001110  &  1 = 0
// (01110011 >> 4) is 00000111  &  1 = 1
// 		etc.  
// In this way, testbit() progressively tests all the bits in the original number.
// A TRUE result means the checkbox in position $b had been checked previously.

	function testbit($b,$f) {  // is bit ON, True/False
		// parentheses matter: == binds tighter than &, so group the AND first
		return ((($f >> $b) & 1) == 1); // does (($flags shifted RIGHT by bit) AND 1) = 1 ?
	}

// The next two functions can process checkboxes from a submitted form,
// preserving their states in an integer.
// Each checkbox value is stored as a single bit, 1 = checked, 0 = blank
	
	function setbit($b,$f) { // turn bit ON
	//OR sets a bit ON, regardless of its current state
		return ((1 << $b) | $f); // 1 shifted LEFT by bit, locates current bit, set with OR
	}

	function clearbit($b,$f) {  // turn bit OFF
		return ((0xFFFF - (1 << $b)) & $f); // locate bit, subtract it  from  $FFFF, clear with AND 
	}

// The FOR loop below checks each checkbox when its corresponding bit is ON.
// $flags is integer whose bits hold the value of each checkbox (16 possible).
// $flag_title array[0..n] is loaded with titles retrieved from mysql_query.

	for ($i=0; $i<count($flag_title); $i++) {
		$checked = (testbit($i,$flags)) ? "checked='checked'" : '';
		echo "<input name='chk_" . $i . "' type='checkbox' $checked /> {$flag_title[$i]}\n";
		// Not shown to save space:
		// $flag_title[$i] would be wrapped in LABEL
		// and the whole line in some containing element: DIV or P, etc.
	}
?>
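
Going the other way, from a submitted form back to the integer, might look like the sketch below. pack_flags() is a hypothetical helper, $post stands in for $_POST, and the chk_N field names follow the loop above.

```php
<?php
// Hypothetical helper: build the flags integer from submitted checkboxes.
// $count is the number of checkboxes the form displayed.
function pack_flags(array $post, $count) {
	$flags = 0;
	for ($i = 0; $i < $count; $i++) {
		if (isset($post['chk_' . $i])) {   // only checked boxes are submitted
			$flags |= (1 << $i);           // turn bit $i ON
		}
	}
	return $flags;
}

// Boxes 0, 1, 4, 5 and 6 checked: binary 01110011, decimal 115.
$post = array('chk_0' => 'on', 'chk_1' => 'on', 'chk_4' => 'on',
              'chk_5' => 'on', 'chk_6' => 'on');
$flags = pack_flags($post, 16);
echo $flags . "\n";                  // 115
echo (($flags >> 4) & 1) . "\n";     // 1, because bit 4 is ON
```

Starting from zero and setting only the submitted bits means unchecked boxes need no special handling.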

RADIO

<input type="radio" name="sex" id="r_male" value="male" checked="checked"> male 
<input type="radio" name="sex" id="r_female" value="female"> female 
<input type="radio" name="sex" id="r_other" value="other"> other

A click on any radio button clears others of the same NAME.
An ID allows a LABEL to be attached to a particular button.

BUTTON

<input type="button" value="My Button" />

Buttons need to be associated with an "event" to be useful.


Here, ONCLICK passes a reference to the entire form, so showname() can access other form fields:

<input type="button" value="button name" onclick="showname(this.form)" />

In this trivial example, showname() might look like this:

function showname(myform) {
	var ln = myform.lastname.value;   // read each field's VALUE, not the element itself
	var fn = myform.firstname.value;
	alert( 'Your full name is : ' + fn + ' ' + ln );
}

RESET

<input type="reset" />


This example renames the button and confirms the action:

<form onReset="return confirm('This will undo your changes.  Proceed?')">
	<input type="reset" value="Start Over">
</form>

An ONCLICK event could analyze other fields or vars to decide whether or not to allow or confirm:

<form . . . >
	<input type="reset" value="Start Over" onclick="return resetform()">
</form>

Some writers worry that RESET might be more confusing than useful. Users might confuse it with SUBMIT and lose work.

TEXTAREA

<textarea name="notes" id="notes" rows="7" cols="65"></textarea>
<textarea name="notes" id="notes" rows="7" cols="65"><?php echo $mytext; ?></textarea>

SELECT OPTIONS: DROP-DOWN LISTS

<form id="myform" name="myform">
	<select id="mylist" name="mylist">
		<option value="" selected="selected"> </option>
		<option value='1' >option 1</option>
		<option value='2' >option 2</option>
		<option value='3' >option 3</option>
	</select>
</form>


To reference value of a selected option:

document.myform.mylist.options[document.myform.mylist.selectedIndex].value

Or, more compactly:

var mylist=document.getElementById('mylist');
mylist.options[mylist.selectedIndex].value;

To populate select options from query results:

// $oldselect might store a previous selection
<select id="mylist" name="mylist">
	<?php
		$sql = "select id, title from t_lists where groupID = $mygroup order by grpsort";
		$r = mysql_query($sql);
		while ($listrow = mysql_fetch_assoc($r)) {
			$selected = ($listrow['id'] == $oldselect) ? "selected='selected'" : '';
			echo "<option value='{$listrow['id']}' $selected>{$listrow['title']}</option>\n";
		}
	?>
</select>

To act on selection change — declare "onchange" event in SELECT itself:

<select id="mylist" name="mylist" onchange="window.location='mypage.php?var=' + 
	document.myform.mylist.options[document.myform.mylist.selectedIndex].value">

Drop-down lists and multiple selections are mutually exclusive.
<select multiple . . .> creates a static list, even if size isn't specified.

SELECT OPTIONS: STATIC LIST (with multiple selections)

<select multiple id="mylist" name="mylist" size="3">
	<option value=''></option>
	<option value='1'> option 1 </option>
	<option value='2'> option 2 </option>
	<option value='3'> option 3 </option>
</select>

Multiple selects require use of CTRL or SHIFT keys.
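
One PHP detail matters here: when several submitted values share a name, PHP keeps only the last one unless the field's NAME ends in square brackets. The sketch below assumes the list has been renamed mylist[] (a change to the markup above) and uses parse_str() as a stand-in for a real submission.

```php
<?php
// parse_str() applies the same rules PHP uses to fill $_GET and $_POST.

// name="mylist": options 1 and 3 are selected, but only the last one survives.
parse_str('mylist=1&mylist=3', $plain);

// name="mylist[]": PHP collects every selected value into an array.
parse_str('mylist[]=1&mylist[]=3', $bracketed);

print_r($plain);
print_r($bracketed);
```

With the bracketed name, $_POST['mylist'] is an array you can loop over with foreach.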


Javascript to test for multiple selections in a list:

// call this function with "searchList('listname')"
function searchList(e) {
	var mylist = document.getElementById(e);
	for(var i=0; i < mylist.length; i++) {
		if (mylist.options[i].selected) {
		    //do something useful
		}
	}
}

SUBMIT

<input type="submit" id="OK" name="OK" value="OK" default="yes" />


In this example, javascript checks for required fields prior to submitting form.

<script language="JavaScript" type="text/javascript">
	function chkfield(f) {
		// verify form field has some length
		var e = document.getElementById(f);
		if (e.value.length <= 0) {
			e.focus();
			return false;
			} else {
			return true;
		}
	}

	function checkFields() {
		var OK = 'Y';
		// check in reverse order so focus winds up on first unfilled field
		if (!chkfield("field2")) { OK = 'N'; }
		if (!chkfield("field1")) { OK = 'N'; }
		if (OK == 'N') {
			alert("Please provide information for each category.");
			return false;
		} else {
			return true;
		}
	}

	// To check ALL form elements
	function validate_form(f) {
		// subtract 1 from form.elements.length (since it's indexed from 0)
		// below, length-3 skips OK and cancel buttons
		var bad_field = null;  // NULL if fields validate, else index # of bad form field
		var e = f.elements;  // elements is an array of form inputs
		for (var i = e.length-3; i>=0; --i ) {  
			if ( e[i].value.length==0 ) bad_field = i;  // preserve index of bad field
		}  // reverse order of loop ensures bad_field points to first field to fail check
		if (null != bad_field) {
			alert('Enter something in each field.');
			e[bad_field].focus();  // position cursor in first bad field
		}
		return (null == bad_field);  // true if all fields are OK
	}

</script>

<input type="submit" id="OK" name="OK" value="OK" default="yes" onclick="return checkFields()" /> 
<input type="submit" id="cancel" name="cancel" value="cancel" onclick="window.close()" />

You can approximate the "submit" type through a standard HREF tag combined with javascript.
In the next example, the HREF is pointed to a non-existent ID (#),
and the browser's normal attempt to locate it is aborted by "return false" — which is appended to whatever scripts are invoked.

<a href="#" onclick="someScript(); return false;"> OK </a>

Processing form values with PHP

Below is a FORM that POSTs its values to the same page — PHP decides whether to process or display.

<?php
	if (isset($_POST['xxx'])) {

	// . . . process variables
	// "check_input" (below) is a function you would create to test $_SERVER variables
		$clean['host'] = ( check_input( $_SERVER['HTTP_HOST'],'host' ) ) ? $_SERVER['HTTP_HOST'] : false;
	// a similar line would test $_SERVER['SCRIPT_NAME'] and add it to $clean['script_name'] 


	// . . . redirect to page in parent folder
		
		if ($clean['host']) { 
			$html['host'] = htmlentities($clean['host'] , ENT_QUOTES , 'UTF-8'); 
			header("Location: http://".$html['host'].dirname($clean['script_name'])."/../apage.html");
			exit;
		} else {
			die;
		}
	} else {
?>
	
// regular HTML goes here, including . . .

	<form id="myform" name="myform" action="" method="post">
		<p>Required Field: <input type="text" id="xxx" name="xxx" size="12"  /></p>
	</form>

<?php
	}
?>


ACTION can add arguments, even if the method is POST.

(Process arguments with GET, form values with POST.)

action="<?php echo "?name1=value1&name2=value2"; ?>" method="post"

If you display error messages as you process form values, output has already begun and PHP can no longer send a Location header, so redirect with javascript instead.

<script type="text/javascript">
	location.href = "../apage.html";   /* go to a page in parent folder */
</script>

CHECK YOUR ASSUMPTIONS

clean input / escape output

Although you might create a single file to display and process a web form, that doesn't mean incoming data were created by your form.

A wily user can analyze your form, and send your page any malicious code he chooses. The phone number you think you're getting might be a specially-crafted bit of script designed to deface your page, break your application, or steal information.

Assume all incoming data are tainted. As the graphic shows, this includes PHP arrays and the results from HTTP requests or MySQL queries. All could contain unexpected, corrupt, or malicious data.

The following is based on suggestions from Chris Shiflett's PHP Security Guide.

A consistent naming convention helps show what you can trust, and what's ready for output. Shiflett suggests 3 arrays for keeping track of variables:

$clean = array(); // stores all filtered data (none of it escaped)
$mysql = array(); // stores escaped data ready for insertion into database
$html = array(); // stores entity encoded data for display or transmittal to another web resource

The idea is simple. No input to your application is trusted until it has been tested and placed inside $clean. $clean is the single source of data for further processing.

And $clean is the only source of data for $mysql and $html. (They never draw data from each other because their data are encoded for different contexts.) They become the sole sources of data for SQL queries or HTML output.

FILTER ALL INPUT

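Use a "whitelist" approach. When possible, allow a set of known values, rather than try to anticipate all bad values.

// is_fruit() is an example name; the switch needs a function around it for "return" to work
function is_fruit($val)
{
	switch ($val)
	{
	case 'apples' :
	case 'oranges' :
	case 'bananas' :
		return true;
	default :
		return false;
	}
}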
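
When the list of allowed values is long or comes from elsewhere, in_array() does the same job as the switch. This is a sketch under assumptions: on_whitelist() is a hypothetical helper and $input stands in for a hypothetical $_POST['fruit'].

```php
<?php
// Hypothetical helper: accept a value only if it is on the list.
function on_whitelist($val, array $allowed) {
	// The third argument makes the comparison strict (no type juggling).
	return in_array($val, $allowed, true);
}

$allowed = array('apples', 'oranges', 'bananas');
$input = 'oranges';   // stand-in for $_POST['fruit']

$clean = array();
$clean['fruit'] = on_whitelist($input, $allowed) ? $input : false;
var_dump($clean['fruit']);
```

The strict flag is worth the extra argument: without it, loosely typed comparisons can let unexpected values through.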

Test input length. If too big, generate an error message or lop it off (as shown here).

$clean['val'] = (strlen($_POST['val'])>$max) ? substr($_POST['val'],0,$max) : $_POST['val'];

Test character type. PHP provides several true/false character type functions. Each returns true only if every character in the string passes. Does the variable contain only:

ctype_alnum() - alphanumeric (letters or digits)
ctype_alpha() - alphabetic (letters)
ctype_cntrl() - control characters (CR/LF, tabs, backspace, etc)
ctype_digit() - digits
ctype_graph() - printable characters except space
ctype_lower() - lowercase
ctype_print() - printable characters
ctype_punct() - punctuation (any printable character NOT a letter, digit, or space)
ctype_space() - whitespace (such as CR/LF, tabs, spaces)
ctype_upper() - uppercase
ctype_xdigit() - valid hexadecimal digits

// keep $_POST['val'] only if it contains nothing but alphanumeric characters
$clean['val'] = ctype_alnum($_POST['val']) ? $_POST['val'] : false;

These can be combined:

// allow letters, numbers, or spaces (spaces are removed before the alphanumeric test)
$clean['val'] = ctype_alnum(str_replace(' ', '', $_POST['val'])) ? $_POST['val'] : false;
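
The all-or-nothing behavior of the ctype functions is easiest to see with a few sample strings. A minimal sketch; the sample values and the helper is_alnum_or_space() are made up for illustration.

```php
<?php
// ctype functions return true only if EVERY character passes the test.
var_dump(ctype_digit('2048'));       // true
var_dump(ctype_digit('20.48'));      // false: the period is not a digit
var_dump(ctype_alnum('Room101'));    // true
var_dump(ctype_alnum('Room 101'));   // false: the space is not alphanumeric
var_dump(ctype_alnum(''));           // false: an empty string always fails

// Hypothetical helper: letters, digits, or spaces, with at least one
// letter or digit. Spaces are removed before the alphanumeric test.
function is_alnum_or_space($s) {
	return ctype_alnum(str_replace(' ', '', $s));
}
var_dump(is_alnum_or_space('Room 101'));   // true
var_dump(is_alnum_or_space('Room #101'));  // false: # is neither
```

Pass these functions strings, as form data always are; they are not meant for integers.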

Weed out potential trouble. PHP can strip tags and script functions.

$clean['val'] = strip_tags($_POST['val'],'<b><i><u>'); // remove all HTML tags, except b, i, & u


For simple string replacements, use STR_REPLACE, which is much faster than EREG_REPLACE (removed in PHP 7) or PREG_REPLACE. The first example illustrates how the contents of one array can replace the substrings listed in another array. (htmlentities() would be a better tool for this particular case.)

$example = "<p>This is a paragraph.</p>";
echo str_replace( array('<','>'), array('&lt;','&gt;'), $example);

You can replace several strings with a single string, using str_replace or regular expressions. Here are 2 examples of replacing common javascript function names with "forbidden." (The 2nd example comes from "guestbook." It has the advantage of ignoring case.)

$stripAttrib = array( "javascript:","onload","onclick","ondblclick","onmousedown","onmouseup","onmouseover",
		"onmousemove","onmouseout","onkeypress","onkeydown","onkeyup","style","class","id" );
$value = str_replace( $stripAttrib, 'forbidden', $value);

// build the pattern by concatenation so no line break or tab ends up inside it
$stripAttrib = 'javascript:|onload|onclick|ondblclick|onmousedown|onmouseup|onmouseover|'
		. 'onmousemove|onmouseout|onkeypress|onkeydown|onkeyup|style|class|id';
$value = preg_replace("/$stripAttrib/i", 'forbidden', $value);

Test input pattern. Regular expressions can check for complex expected patterns. Here's a cheat sheet.

The next code allows 4-28 letters, numbers, or underscores (and ignores case).

$clean['val'] = ( preg_match('/^[\w]{4,28}$/i', $_POST['val']) ) ? $_POST['val'] : false;

Above, preg_match() compares a pattern against $_POST['val'].
If the pattern matches, $clean['val'] receives the contents of $_POST['val'].
Otherwise, it is set to false.

The structure of the pattern is '/pattern/i'.
The trailing i is the "ignore case" modifier.
^ and $ mean the pattern must match from beginning to end.
[\w] means allow only "word" characters.
{4,28} sets the minimum/maximum characters allowed.
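
To check a pattern like this before trusting it, run it against values you expect to pass and fail. The sketch below wraps the same pattern in a hypothetical helper, is_username(); the sample strings are made up.

```php
<?php
// Hypothetical helper wrapping the 4-28 word-character pattern.
function is_username($s) {
	return preg_match('/^[\w]{4,28}$/i', $s) === 1;
}

var_dump(is_username('web_user7'));   // true: 9 word characters
var_dump(is_username('abc'));         // false: shorter than 4
var_dump(is_username('bad name'));    // false: a space is not a word character
var_dump(is_username('<script>'));    // false: angle brackets are rejected
var_dump(is_username("abcd\n"));      // true: $ also matches before a final newline
```

The last case is easy to miss. If a trailing newline matters, add the D modifier ('/^[\w]{4,28}$/iD') so that $ matches only at the very end.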


All patterns your form might generate could be tested in a single function.

The next code tests 'lastname' (sent via POST) against an expected "name" pattern. If the pattern matches, $clean['lastname'] is set to $_POST['lastname'], otherwise to FALSE.

$clean = array();
$clean['lastname']=(checkinput($_POST['lastname'],'name')) ? $_POST['lastname'] : false;  

function checkinput( $s, $type ) {
  switch ($type) {
    case "name" :  // allow any number of a-z, hyphen, space, period, apostrophe (ignore case)
      return preg_match( '/^[a-z-\'\. ]*$/i' , $s );  
    case "int" :  // 1-8 digits between 0 and 9
      return preg_match('/^[0-9]{1,8}$/' , $s);
    case "email" :  // allow dots in account.name, only 2-4 characters in domain (ignore case)
      return preg_match('/^[a-z]{1}[\w]+([.][\w]+)*[@][\w]+([.][\w]+)*[.][a-z]{2,4}$/i', $s );
    case "USphone" :  // US-style phone format (# delimiters, because the pattern contains a slash)
      return preg_match('#^(\()?\d{3}(?(1)\) ?|[-/ \.])\d{3}[- \.]\d{4}$#', $s);
    default :  // unknown type: fail safe
      return false;
  }  // end switch
}

ESCAPE ALL OUTPUT

mysql_real_escape_string() and htmlentities()

A consistent naming convention helps here. If the $clean array never holds escaped data, you know its data should be escaped before leaving your application.

$mysql['val'] = mysql_real_escape_string($clean['val']);
$result = mysql_query( " insert into users set colname = '{$mysql['val']}' " );

$html['val'] = htmlentities($clean['val'] , ENT_QUOTES , 'UTF-8');
echo "<p>{$html['val']}</p>";

PHP's "foreach" construction provides a rapid way to escape all cleaned data:

foreach ($clean as $k => $v) {
	$mysql[$k] = mysql_real_escape_string($v);
}

foreach ($clean as $k => $v) {
	$html[$k] = htmlentities($v , ENT_QUOTES , 'UTF-8');
}

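Both functions encode characters that have special meaning in the context the data are headed for. If PHP lacks a routine for your database server, fall back to addslashes().

mysql_real_escape_string() protects special characters and can guard against some forms of attack. (See the next link for extended examples.) Note that the mysql_* functions used on this page were removed in PHP 7; mysqli_real_escape_string() and prepared statements are the current equivalents.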


Accepting input at face value can lead to situations like this. Assume a user enters his lastname as " O'Rourke ", and your script processes it this way:

$lastname = $_GET['lastname'];
mysql_query("insert into users set lastname='$lastname'");

The resulting SQL statement becomes: " insert into users set lastname='O'Rourke' ".
The apostrophe truncates lastname to 'O', and the trailing Rourke' generates a syntax error.

Mysql_real_escape_string() "escapes" potentially troublesome characters by adding backslashes to them.

$mysql['lastname'] = mysql_real_escape_string($clean['lastname']);
mysql_query("insert into users set lastname='{$mysql['lastname']}'");

The SQL now is: " insert into users set lastname='O\'Rourke' ".
The slash causes the database server to see the apostrophe as a literal character, so the query inserts "O'Rourke" into the users table.
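
The transformation can be watched without a database. In this sketch addslashes() is only a stand-in: mysql_real_escape_string() needs a live connection and handles more characters, but for a lone apostrophe the visible result is the same.

```php
<?php
// addslashes() stands in for the database's own escaping routine here,
// purely to show what happens to the text.
$clean = array('lastname' => "O'Rourke");
$mysql = array('lastname' => addslashes($clean['lastname']));

$sql = "insert into users set lastname='{$mysql['lastname']}'";
echo $sql . "\n";   // insert into users set lastname='O\'Rourke'
```

Note that $clean still holds the unescaped name, which is what you would pass to htmlentities() for display.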

NOTE: The slashes added by mysql_real_escape_string() do not wind up in the table. They just preserve the punctuation your user entered.

More importantly, mysql_real_escape_string() helps ward off some types of SQL injection attacks.

Say your database host allows stacked queries. (MySQL doesn't, but MSSQL does.) You could lose your entire users table if an attacker entered this as his last name: " '; drop table users -- ".

mssql_query("insert into users set lastname='$lastname'");

The SQL now resolves to: " insert into users set lastname=' '; drop table users -- ' ".
Your attacker has injected a 2nd, destructive SQL statement.
The "--" instructs SQL to ignore the rest of the line, so the trailing ' doesn't create an error.

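Mysql_real_escape_string() neutralizes this attack on a MySQL server by escaping the initial single quote. (The examples below keep that function for simplicity. A real MSSQL server expects a quote to be doubled rather than backslashed, so use the routine that matches your server.)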

These lines should provide a dual line of defense:

$mysql['lastname'] = mysql_real_escape_string($clean['lastname']);
mssql_query("insert into users set lastname='{$mysql['lastname']}'");

Your filtering shouldn't allow semi-colons in a lastname, so $clean['lastname'] should be null or false.

function is_name($v) {
	// does $v contain only letters, apostrophes, hyphens, or white space?
	// (the hyphen goes last in the class so it isn't read as a range)
	return (bool) preg_match( "/^[a-zA-Z'\s-]+$/" , $v );  // return true or false
}

$clean['lastname'] = (is_name($_POST['lastname'])) ? $_POST['lastname'] : false;

If your filtering fails, mysql_real_escape_string would cause your SQL to become:
" insert into users set lastname=' \'; drop table users -- ' ".
Lastname would receive a nonsense value, but the users table would be unscathed.

Neither filtering nor escaping can help if you skip them because you trust where the data came from. A doctored POST defeats that trust.

Say your form feeds your script $_POST['selectedname'], when a user chooses a name in a <SELECT> field.
You limited <SELECT> to a subset of names, so you trust $_POST['selectedname'] to set a WHERE condition.

$result = mysql_query( "select * from users where lastname='{$_POST['selectedname']}' ");

Unfortunately, you can't depend on the <SELECT> field to limit your users' choices.
An attacker can construct a doctored form so that $_POST['selectedname'] contains " ' or ''=' ".

The SQL becomes: " select * from users where lastname='' or ''='' ".

Because " or ''='' " will always be true, this will return the entire table to your attacker, from which he might gain hints about how to proceed.

This attack is thwarted if you filter $_POST['selectedname'], along these lines:

$clean['lastname'] = ( is_name($_POST['selectedname']) ) ? $_POST['selectedname'] : false;

if ($clean['lastname']) {
	$mysql['lastname'] = mysql_real_escape_string($clean['lastname']);
	$result = mysql_query(" select * from users where lastname= '{$mysql['lastname']}' ");
} else {
	// handle error here, perhaps:
	$html['error'] = htmlentities("Sorry.  I don't recognize your selection." , ENT_QUOTES , 'UTF-8');
	echo "<p>{$html['error']}</p>";
}

htmlentities protects text in URLs, and is a handy way to display source code.

$clean['msg'] = "<p>Here's the markup.</p>";
$html['msg'] = htmlentities($clean['msg'] , ENT_QUOTES , 'UTF-8');
echo "<p>{$html['msg']}</p>";

If you want to display data about to be committed to the database, or provide a summary after it has been, your code should look something like this:

$mysql['val'] = mysql_real_escape_string($clean['val']);
$html['val'] = htmlentities($clean['val'] , ENT_QUOTES , 'UTF-8');
echo "<p>You are about to insert {$html['val']} into the database.</p>";

More importantly, your database could be compromised and a column holding a comment might be replaced with a malicious script. (This is similar to the note about a form's ACTION attribute near the top of this page.)

Filtering comments is harder than other types of data, because many characters used in scripts might legitimately exist in them. So, if your cleaning routine doesn't catch the compromise, and you echo the comment directly to the screen, the browser could execute the attack.

But htmlentities() would sanitize the script before the "comment" was echoed to the screen. You'd see the script, but it wouldn't be executed.