Today I ran into an interesting problem: HTMLPurifier (http://htmlpurifier.org/) was truncating all of my output except for what sat between the PHP tags. This was strange behavior that I had not seen before.
Here was my view:
<div class='groom_log_content'>
<fieldset class='border-fields'>
<legend class='bold'><?php echo $groom['content_name']; ?></legend>
<p class='editable_textarea fix_space' id='<?php echo $groom['content_id']; ?>'>
<?php echo $this->cleaner->purify($content); ?>
</p>
</fieldset>
</div>
The content passed into this view was user-submitted text from a support chat, and it looked something like this:
- The customer posts their MySQL connection string in chat with obvious errors and the agent tells them it looks correct.
5:17:52am Name:
Connecting to MySQL with PHP
5:18:18am Name: is that the correct information to input so i can locate the database
5:19:13am Name: That looks to be correct.
Suggestion: Look at this line, it is the important one: $conn = mysql_connect('host', 'user', 'pass', 'db');
Though you cannot diagnose the code, you could immediately correct two issues.
You must look for a valid cPanel database, database username, and address.
Here, neither the database name nor the database username would be simply 'user', and
if this is connecting to a database locally you would use localhost instead of the IP address.
When this input was passed to the purifier, the returned content was:
<?php
$db_host = 'localhost';
$db_user = 'user';
$db_pass = 'pass';
$conn = mysql_connect('host', 'user', 'pass', 'db');
if(! $conn )
{
die('Could not connect: ' . mysql_error());
}
echo 'Connected successfully';
mysql_close($conn);
?>
This was peculiar to me because I had never seen HTMLPurifier strip the content before and after PHP code, so I started my investigation there. After playing with htmlspecialchars() and stripping out the PHP code altogether, I was able to determine that the problem was with the actual HTML tags at the beginning and end of the pasted code.
HTMLPurifier does not support the html, head, and body tags, so it was simply stripping out all of the content outside them.
To resolve this, I stripped those tags out before passing the content to the purifier:
// Remove the document-level tags that HTMLPurifier does not support.
$content = str_replace(array('<html>', '<head>', '<title>', '<body>', '</body>', '</html>'), '', $content);
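One caveat with exact-match replacements like the ones above: str_replace() is case-sensitive, so it would miss variants such as <BODY> or <body class="x">. A minimal sketch of a more tolerant version follows, assuming a regular expression is acceptable here. The helper name strip_document_tags() is hypothetical and is not part of HTMLPurifier. Like the original fix, it removes only the tags themselves, so any text inside a title element is left in place.

```php
<?php
// Hypothetical helper (not part of HTMLPurifier): removes document-level
// wrapper tags so that only fragment markup reaches the purifier.
function strip_document_tags($content)
{
    // Matches opening and closing html, head, title, and body tags,
    // case-insensitively, with or without attributes. The \b keeps it
    // from matching longer tag names such as <header>.
    $pattern = '#</?(?:html|head|title|body)\b[^>]*>#i';
    return preg_replace($pattern, '', $content);
}

$sample = 'before <HTML><body class="x"><p>hi</p></body></html> after';
echo strip_document_tags($sample); // prints: before <p>hi</p> after
```

In the view, the cleaned string would then be passed along as before, for example $this->cleaner->purify(strip_document_tags($content)).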