
What are the optimizations that VTD-XML has implemented?

April 26, 2017 | Tags: implemented, optimizations, vtd-xml


1 Answer


Every object that is created must eventually be garbage-collected, so there is a round-trip penalty. Every time a document is taken apart for a small change, everything has to be put back together, so there is another round-trip penalty. Every time the entire document is decoded (e.g., from UTF-8 to UCS-2) for a small change, it has to be re-encoded when it is written back to disk, so there is yet another round-trip penalty. With all these overheads combined, XML processing performance probably isn’t going to be very good. VTD-XML is designed from the ground up to overcome these overheads. The first thing VTD-XML does is keep the document intact in memory, un-decoded. Tokenization is done by recording only each token’s starting offset and length. Next, VTD-XML represents tokens as 64-bit integers (VTD records). Because VTD records are constant in length, they can be stored in large memory blocks rather than as individual objects, resulting in a very significant memory saving. Finally, VTD-XML’s intern
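To make the offset-and-length idea concrete, here is a minimal Java sketch of packing a token into a single 64-bit integer while the document stays un-decoded in a byte array. The class name `VtdRecordSketch`, its methods, and the bit layout (length in the upper 32 bits, offset in the lower 32 bits) are assumptions made up for illustration. This is not VTD-XML's actual record format or API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: NOT the real VTD-XML record layout or API.
final class VtdRecordSketch {

    // Assumed layout: upper 32 bits hold the length, lower 32 bits hold the offset.
    static long pack(int offset, int length) {
        return ((long) length << 32) | (offset & 0xFFFFFFFFL);
    }

    static int offset(long record) {
        return (int) record;            // low 32 bits
    }

    static int length(long record) {
        return (int) (record >>> 32);   // high 32 bits
    }

    // Decode a token only when it is actually needed; the document bytes are never modified.
    static String text(byte[] doc, long record) {
        return new String(doc, offset(record), length(record), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The document is kept intact as raw bytes.
        byte[] doc = "<a>hello</a>".getBytes(StandardCharsets.UTF_8);

        // Fixed-size records live in one primitive array instead of one object per token.
        long[] records = {
            pack(1, 1),  // element name "a"
            pack(3, 5)   // character data "hello"
        };

        for (long r : records) {
            System.out.println(offset(r) + "+" + length(r) + " -> " + text(doc, r));
        }
    }
}
```

Because each record is a primitive `long`, a whole document's tokens fit in one `long[]`, which avoids allocating (and later garbage-collecting) one object per token.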