Understanding PHP data sanitization with an example
In PHP, data sanitization is the process of removing unwanted content from strings to make them safe to use in HTML code. For example, if a user types JavaScript code into a text box and we want to remove it, we can use data sanitization. PHP's preg_replace() function lets us apply regular expressions that remove extra whitespace, images, scripts and stylesheets from a given string and return a string that is safe to display as HTML.
Example:
/* this string has JavaScript code */
$str = 'Username Email <script>alert("hello XSS")</script>';

/* remove newlines, carriage returns and tabs, then collapse runs of whitespace */
$str = preg_replace('/\s{2,}/u', ' ', preg_replace('/[\n\r\t]+/', '', $str));

/* replace images with their alt text; drop images that have no alt text */
$preg = array(
    '/(<a[^>]*>)(<img[^>]+alt=")([^"]*)("[^>]*>)(<\/a>)/i' => '$1$3$5<br />',
    '/(<img[^>]+alt=")([^"]*)("[^>]*>)/i' => '$2<br />',
    '/<img[^>]*>/i' => ''
);
$str = preg_replace(array_keys($preg), array_values($preg), $str);

/* strip stylesheet links, inline styles, scripts, style blocks and HTML comments */
$regex = '/(<link[^>]+rel="[^"]*stylesheet"[^>]*>|'
    . '<img[^>]*>|style="[^"]*")|'
    . '<script[^>]*>.*?<\/script>|'
    . '<style[^>]*>.*?<\/style>|'
    . '<!--.*?-->/is';
$str = preg_replace($regex, '', $str);

echo $str;
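Output Before Sanitization (the raw string, which a browser would execute as script):
Username Email <script>alert("hello XSS")</script>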
Output After Sanitization:
Username Email
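The same steps can be wrapped in a function so that one set of rules is applied to every input. The following is a minimal sketch, not a definitive implementation: the function name sanitize_html and the second test string are assumptions made for illustration, and the trim() call is an addition that removes the space left behind when a trailing tag is stripped.

```php
<?php
// Hypothetical helper (name chosen for illustration) that wraps the
// regex steps from the example above in a reusable function.
function sanitize_html(string $str): string
{
    // Drop newlines, carriage returns and tabs, then collapse whitespace runs.
    $str = preg_replace('/\s{2,}/u', ' ', preg_replace('/[\n\r\t]+/', '', $str));

    // Replace images with their alt text; remove images that have none.
    $preg = array(
        '/(<a[^>]*>)(<img[^>]+alt=")([^"]*)("[^>]*>)(<\/a>)/i' => '$1$3$5<br />',
        '/(<img[^>]+alt=")([^"]*)("[^>]*>)/i' => '$2<br />',
        '/<img[^>]*>/i' => '',
    );
    $str = preg_replace(array_keys($preg), array_values($preg), $str);

    // Strip stylesheet links, inline styles, scripts, style blocks and comments.
    $regex = '/(<link[^>]+rel="[^"]*stylesheet"[^>]*>|'
        . '<img[^>]*>|style="[^"]*")|'
        . '<script[^>]*>.*?<\/script>|'
        . '<style[^>]*>.*?<\/style>|'
        . '<!--.*?-->/is';
    $str = preg_replace($regex, '', $str);

    // Assumption: trimming is not part of the original steps; it only
    // removes leading and trailing whitespace left after stripping tags.
    return trim($str);
}

// The string from the example above.
echo sanitize_html('Username Email <script>alert("hello XSS")</script>'), PHP_EOL;
// prints: Username Email

// An illustrative string with an image, an HTML comment and a script.
echo sanitize_html('Profile <img src="a.png" alt="Logo"><!-- note --><script>alert(1)</script>'), PHP_EOL;
// prints: Profile Logo<br />
```

Keeping the patterns inside one function means a later change to any rule applies everywhere the function is called.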