2010-01-28

HTML+JS injection hidden gotcha

When you create websites that let visitors upload text, you always need to escape special characters (tags, ampersands, quotes, etc.), but this is quite common knowledge. Let's go a bit deeper... For this example I will use PHP, JavaScript and XHTML fragments (PHP has nothing to do with this bug).

Let's say we need to create a simple list of links that change the text in a div container depending on which link you click. This text was uploaded to our page by visitors some time ago, so it might contain any imaginable crap...

What we might do to solve this is create a JS function in our XHTML script block that takes a single text argument and changes the text in that div container when called. Something like this:

function showText( text ) {
    var el = document.getElementById( 'div_id' );
    // innerHTML parses the string as markup, not as plain text
    el.innerHTML = text;
}

Everything seems reasonable so far. Now we need to call this function from an HTML link, so we put an onclick handler there:

.... onclick="showText( ---TEXT WILL BE HERE--- );" ....

The text we want to put between those brackets is stored in our database, so we use PHP to get it out of there, and with a template engine or inline PHP code we place it between the brackets like this:

.... showText( '' ); ....

Note that the text is passed through PHP's htmlspecialchars() function before it lands between the quotes, to get rid of dangerous HTML symbols. We expect our text to be escaped, and if we open the website in a browser and look at the source dump, we will see that the text really has been filtered. Note that if you view the source in Firefox or some other popular browsers, you might see modified source code. In that case it is better to use curl or a similar program to see the actual source. I have found that the Konqueror browser does not modify the source either, so you may try using it.

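Everything seems just fine until... until we click the link. The text that appears in the div container is no longer escaped! Why? Well, the browser decodes the HTML entities in the onclick attribute value before it runs the JS code inside it, so the function receives the original, unescaped text and innerHTML happily parses it as markup. Cool, huh?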
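To make the round trip visible, here is a rough sketch of it in plain JS. Both helper functions are my own stand-ins, not real APIs: escapeHtml() imitates what htmlspecialchars() with ENT_QUOTES produces, and decodeAttributeValue() is a heavily simplified model of what the browser does to an attribute value (real browsers decode many more entities). The visitor text is made up for the example.

```javascript
// Hypothetical stand-in for PHP's htmlspecialchars() with ENT_QUOTES.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Simplified model of the entity decoding a browser applies to an
// attribute value before the script in it runs. "&amp;" is decoded
// last so that "&amp;lt;" becomes "&lt;" and not "<".
function decodeAttributeValue(value) {
  return value
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#039;/g, "'")
    .replace(/&amp;/g, '&');
}

// Made-up visitor text containing markup.
const visitorText = '<img src=x onerror=alert(1)>';

// What ends up in the page source: looks perfectly safe.
const inSource = escapeHtml(visitorText);
console.log(inSource); // &lt;img src=x onerror=alert(1)&gt;

// What the onclick code actually receives: the original markup again.
const seenByScript = decodeAttributeValue(inSource);
console.log(seenByScript === visitorText); // true
```

So the escaping is real, it just gets undone one step before the text reaches innerHTML.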

Now, why is this dangerous, you might ask. Well, it opens the way to inject links, make your XHTML (HTML) invalid, or inject JS that does nasty things on your website... Such JS may even steal your visitor's session ID and help an attacker gain control over the user's account...
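The same decoding also lets a visitor break out of the quoted string itself. The snippet below is only an illustration with a made-up payload: it starts from text that is already escaped the way htmlspecialchars() with ENT_QUOTES would escape it (a single quote becomes &#039;), and the replace() call is a simplified stand-in for the browser's entity decoding.

```javascript
// Made-up visitor text, already escaped: each ' was turned into &#039;.
const escaped = '&#039;); alert(document.cookie); (&#039;';

// The onclick attribute value as it appears in the page source.
const attribute = "showText( '" + escaped + "' );";

// Simplified stand-in for the browser decoding the attribute value
// before handing it to the script engine.
const handlerCode = attribute.replace(/&#039;/g, "'");

console.log(handlerCode);
// showText( ''); alert(document.cookie); ('' );
```

What was meant to be one function call is now three statements, and the middle one belongs to the visitor.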

I don't know whether this is a bug in many browsers or a feature (every bug that cannot be fixed is a feature in this case), but I would recommend not stepping into this sh...
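One possible way around it, as a sketch and not a complete defence: encode the text for the JS context first and for the HTML context second, and write it into the page as plain text instead of markup. The names escapeHtml() and buildOnclick() are hypothetical helpers of mine; in the setup from this post the same two encoding steps would happen on the PHP side.

```javascript
// Hypothetical stand-in for PHP's htmlspecialchars() with ENT_QUOTES.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Builds the onclick attribute value in two layers:
// 1. JSON.stringify() turns the text into a JS string literal, so
//    quotes inside the text cannot end the string early.
// 2. escapeHtml() makes the whole handler safe to place in an attribute.
function buildOnclick(text) {
  return escapeHtml('showText(' + JSON.stringify(text) + ');');
}

// Browser-side handler: textContent treats the argument as plain
// text, so any tags in it are displayed instead of being parsed.
function showText(text) {
  var el = document.getElementById('div_id');
  el.textContent = text;
}

// Made-up visitor text that tries both tricks at once.
const visitorText = "'); alert(1); ('<b>bold</b>";
console.log(buildOnclick(visitorText));
```

When the browser decodes the attribute, what comes out is a single showText() call with a properly quoted string, and the div shows the tags as text.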

Also, please comment on this post and correct me if I am wrong.

2010-01-25

New CMS obfuscator

It seems that the CMS obfuscator will have to be rewritten from scratch. The main reasons for this are poor code decomposition and high complexity. Nearly every new CMS solution needs lots of work on the obfuscation mechanism to make it work, and that is not acceptable.

That doesn't mean the old obfuscator is worthless. It was something of a test project, and it generated a huge number of great ideas. While the code is not in its best shape, those ideas can and will be used.

These ideas cover:

  • What will be obfuscated
  • How will it be obfuscated
  • What will be ignored

Before writing any actual code, I will have to document the answers to these questions in order not to make the same mistakes again. Also, I have already started to document my own PHP coding standard. As this standard is being written in English, it will probably be published here on this blog.

Update: The new obfuscator has already been documented. Implementation work will start soon.

2010-01-20

CMS todo

At this point I am planning to continue working on my CMS project. There are lots of things left to be done and I don't want to waste any more time doing nothing...

Here is the plan for the next few days:

  • Adopt CMS fully for PDocs (done)
  • Export CMS content tree to a separate block for multiple reuse (seems impossible due to high number of very custom elements)
  • Fix content tree element movement issues (done)
  • Remember scroll position after refresh (done)
  • Review obfuscator and fix bugs (full rewrite is planned)