Does That Look 100x?

I came across an interesting open source project called XCF (XML Communication Framework), which has benchmarked VTD-XML against XOM, Xerces DOM, Xerces SAX, and a binary data format. The results consistently show that VTD-XML is not just faster, but faster by a huge margin. In fact, any performance-sensitive XML application that does not use VTD-XML is pretty much unusable. That kind of makes you wonder what kind of performance one gets from buying all those expensive Progress, Oracle and IBM ESBs.
[Image: testcase0_6]

For the complete test analysis and results, please visit

https://code.ai.techfak.uni-bielefeld.de/trac/xcf/wiki/Evaluation
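For readers who would rather try a comparison themselves than take any chart on faith, here is a minimal sketch of the kind of measurement involved. It is not the XCF harness and it does not use VTD-XML; it relies only on the DOM and SAX parsers bundled with the JDK. The class name `ParseSketch`, the sample invoice document, and the document size are all made up for illustration. A single timed run like this is only a rough indication (no JIT warm-up, no repetition), so treat the numbers as a starting point rather than a benchmark.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class ParseSketch {

    // Hypothetical sample document: one <invoices> root with n <invoice> children.
    public static byte[] sampleXml(int n) {
        StringBuilder sb = new StringBuilder("<invoices>");
        for (int i = 0; i < n; i++) {
            sb.append("<invoice id=\"").append(i).append("\">42</invoice>");
        }
        return sb.append("</invoices>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // DOM: builds the whole document tree in memory, then counts its elements.
    public static int countElementsDom(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getElementsByTagName("*").getLength();
    }

    // SAX: streams through the document and counts start-element events.
    public static int countElementsSax(byte[] xml) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        count[0]++;
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = sampleXml(100_000); // arbitrary size, chosen for illustration

        long t0 = System.nanoTime();
        int dom = countElementsDom(xml);
        long t1 = System.nanoTime();
        int sax = countElementsSax(xml);
        long t2 = System.nanoTime();

        System.out.printf("DOM: %d elements in %d ms%n", dom, (t1 - t0) / 1_000_000);
        System.out.printf("SAX: %d elements in %d ms%n", sax, (t2 - t1) / 1_000_000);
    }
}
```

Both methods should report the same element count for the same input; only the time and memory they need to get there differ. To include VTD-XML in the comparison you would add a third method using its own API, which is left out here because it is not part of the JDK.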

10 comments so far

  1. Artur on

    Just to check that I got it right:
    1) Parsing XML is very slow
    2) Parsing binary is fast
    3) If you put more data on the wire, you can send parsing shortcuts and get a nice speedup at the cost of message size

    I would like to see how it fares against Hessian serialization of POJOs, for example, in both speed and bandwidth usage. The moment you stop sending pure XML and add custom binary data, you might as well stop using XML completely and try something more optimized, because you need a custom library on the other end anyway.

  2. jimmyzhang on

    Parsing XML is not slow. Parsing XML using DOM is slow,
    because DOM allocates a lot of objects, and object allocation and deallocation are slow.

    The trade-off (more data for higher performance) is acceptable in most cases because parsing is what slows things down. Using bandwidth to compensate for slow parsing is the logical thing to do.

  3. Artur on

    Jimmy, if it were only a problem with DOM, then SAX parsers should be fast. In the original article you can see that while they are faster, they are still way behind VTD.

    • jimmyzhang on

      Parsing XML using SAX is slow because SAX allocates a lot of objects (strings etc.)… right?

  4. Developer Dude on

    It is not just the parsing, it is also the verbosity. Try sending a thousand invoice records vs. a million invoice records, discard the marshalling/unmarshalling times, and then compare XML to binary. The difference may not be orders of magnitude, but it will be significant enough that you wouldn’t choose XML if you didn’t have to. Then there is the issue of binary objects encoded within the XML.

    The sad fact is that if you want to provide ‘interoperable’ and standardized data/services, then XML is probably your best bet.

    If you have control over both the consumer and the producer of data/services (and preferably use one language), then that often opens up many other possibilities.

    If you are going to use some proprietary XML-like format/protocol, then why not go right to binary, period?

  5. IReadALot on
    • jimmyzhang on

      I hardly think he knows what he is talking about. You should ask him which part of VTD-XML is not conformant and why it is a critical flaw…

      In short, don’t believe any claims (including this blog)… the best and surest way is to give it a try and let the results speak for themselves.

  6. IReadALot on

    Well, I tried it (VTD-XML for Java) and it is fast, but what about XML Schema and DTD support?

    • jimmyzhang on

      We don’t support DTD validation because it is essentially deprecated. Schema support is on the roadmap, and we are working on it.

      • IReadALot on

        Well, that’s nice to hear. IMO a lot of projects still use DTD, but XML Schema support coming sounds nice, so I’ll keep watching this!

