thraxil.org:

Validation, meet Unit Testing. Unit Testing, meet Validation.

by anders pearson Tue 20 Sep 2005 23:50:12

[cross posted from the WaSP to take comments]

Are you test infected? Do you work on dynamic sites and wish there was an automated way to run the output through the W3C validator? Do you wish it was integrated nicely with your unit testing framework?

Scott Raymond has come up with a nice bit of code to add automated validation to the unit tests for a Ruby on Rails application.

If you're not on Rails, the technique should be pretty straightforward to adapt to your preferred language/framework. Just make a POST request to http://validator.w3.org/check with two parameters: fragment (your page, URL-encoded) and output=xml. Then check the response for a header called x-w3c-validator-status to see if it says Valid. If so, your test passed.
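For readers on the JVM, the request described above might look roughly like the following. This is only a sketch: the endpoint, the two parameters, and the header name are taken from the description in this post, while the class and method names (W3cValidatorCheck, buildFormBody, isValidStatus, validate) are made up for illustration. It assumes Java 11 or later for java.net.http, and it has not been checked against how the validator service responds today.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; none of these names come from the original post.
class W3cValidatorCheck {
    // Endpoint as given in the post.
    static final String VALIDATOR_URL = "http://validator.w3.org/check";

    // Builds the form-encoded POST body: the page goes in "fragment",
    // and "output=xml" asks for the machine-readable response.
    static String buildFormBody(String page) {
        return "fragment=" + URLEncoder.encode(page, StandardCharsets.UTF_8)
                + "&output=xml";
    }

    // Interprets the value of the x-w3c-validator-status header.
    static boolean isValidStatus(String headerValue) {
        return headerValue != null && headerValue.trim().equalsIgnoreCase("Valid");
    }

    // Sends the page to the validator and reports whether it was judged valid.
    // A missing header is treated as a failure.
    static boolean validate(String page) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(VALIDATOR_URL))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(buildFormBody(page)))
                .build();
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        String status = response.headers()
                .firstValue("x-w3c-validator-status")
                .orElse(null);
        return isValidStatus(status);
    }
}
```

In a unit test you would render the page to a string, pass it to validate, and assert that the result is true. Keeping the body building and the header check in their own methods means those parts can be tested without touching the network.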

TAGS: validation web standards extreme programming unit testing

comments

Isn't this more of a job for an XML parser? What do you do with errors that come back in the HTTP response? Do you have to parse it? Here's a quick JAXP version:

    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    builderFactory.setNamespaceAware(true);
    builderFactory.setValidating(true);
    DocumentBuilder builder = builderFactory.newDocumentBuilder();
    builder.setErrorHandler(new DefaultErrorHandler());
    builder.parse(input);

where input can be an input stream or a file. DefaultErrorHandler sends parse errors to System.out or an external logger. With a catalog in place to map public IDs locally, this validation can be done quickly and completely offline. I've got something similar in all of my JUnit tests that need to validate XHTML.
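The DefaultErrorHandler mentioned here appears to be the commenter's own class, and its source isn't shown. As a rough illustration of the same idea, here is a hypothetical handler that collects validity errors into a list instead of printing them, so a unit test can assert on them. The names CollectingErrorHandler and DtdValidation are invented for this sketch, and it assumes the document carries its own DTD inline, which sidesteps the catalog setup described above rather than demonstrating it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical stand-in for the commenter's DefaultErrorHandler:
// it records problems instead of logging them.
class CollectingErrorHandler implements ErrorHandler {
    final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        // Warnings are ignored in this sketch.
    }

    @Override
    public void error(SAXParseException e) {
        // Validity errors (for example, an undeclared element) arrive here.
        problems.add("error at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        // Well-formedness errors stop the parse; rethrow and let the caller record it.
        throw e;
    }
}

class DtdValidation {
    // Parses the document with DTD validation on and returns every problem found.
    // An empty list means the document is valid against its own DOCTYPE.
    static List<String> validate(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(handler);
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (SAXException e) {
            handler.problems.add("fatal: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("could not set up or run the parser", e);
        }
        return handler.problems;
    }
}
```

A JUnit test could then assert that DtdValidation.validate(page) returns an empty list, and print the collected messages when it does not.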

If you've got a validating parser locally, sure, you're probably better off using that. The W3C validator will also validate plain HTML, though (if you aren't using XHTML).

Your point about HTML (as opposed to XHTML) is a good one. I forgot that others are still stuck with that. You're out of luck with an XML parser if you're trying to validate HTML.

I agree, but do you envision a program in the future that might do this? Using the W3C validator is really easy, which is why I like it, but I always like to have options.

Love the JAXP version, can't believe I had never thought of doing that before!

Very cool, I am going to show this off in class. I am glad it was noted that it does not work the same with HTML. That would be an embarrassing thing to show and not be able to explain!

Adding to what Justin said, you can also walk the DOM once you've parsed it and check for things beyond validation, like document structure. For example, are the first and second child elements of body div#skip and div#logo, and so on?
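A structure check like the one described in this comment could be sketched as below. The class and method names (StructureCheck, childElements, bodyChildSignatures) are invented for the example. It parses without validation and assumes the page is well-formed XHTML with no external DOCTYPE to resolve, which keeps the sketch offline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical helper for asserting on page structure after parsing.
class StructureCheck {
    // Returns only the element children of a node, skipping text and comments.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Describes each child element of body as "name" or "name#id", in document order.
    static List<String> bodyChildSignatures(String xhtml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse page", e);
        }
        for (Element child : childElements(doc.getDocumentElement())) {
            if ("body".equals(child.getLocalName())) {
                List<String> signatures = new ArrayList<>();
                for (Element e : childElements(child)) {
                    String id = e.getAttribute("id"); // empty string when absent
                    signatures.add(id.isEmpty()
                            ? e.getLocalName()
                            : e.getLocalName() + "#" + id);
                }
                return signatures;
            }
        }
        throw new IllegalArgumentException("no body element found");
    }
}
```

In a test, you would compare the first two entries of the returned list against "div#skip" and "div#logo".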

Do people ever get you confused with fossil man? I thought you were someone famous for a minute :-)

Thanks to Scott for writing this code! It worked like a charm the first time we tried it. Now we will actually run a check every so often, rather than just putting it off...

