VTD-XML 2.10 Released
VTD-XML 2.10 is now available for Java, C#, C and C++. It can be downloaded at
https://sourceforge.net/projects/vtd-xml/files/vtd-xml/ximpleware_2.10/. This release includes a number of new features and enhancements.
- The core API of VTD-XML has been expanded. Users can now perform cut/paste/insert on an empty element.
- This release also adds support for a deeper location cache for parsing and indexing. This feature is useful for tuning application performance across various XML documents.
- The Java version also adds support for processing zip and gzip files. Direct processing of XML referenced by an HTTP URL is enhanced.
- The extended Java version now supports the ISO-8859-10 through ISO-8859-16 encodings.
- A full-featured C++ port is released.
- The C version of VTD-XML now makes use of thread-local storage to address the thread-safety issue in multi-threaded applications.
There are also a number of bug fixes. Special thanks to Jozef Aerts, John Sillers, Chris Tornau and a number of other users for their input and suggestions.
Thread Safety in C Version of VTD-XML
Before 2.10, the C version of VTD-XML made extensive use of global variables for XPath query compilation. The thread-safety problem arises when multiple threads of an application perform XPath compilation at the same time. To resolve this issue, VTD-XML 2.10 replaces all global variables used for XPath compilation with thread-local variables: instead of simply declaring a variable, prepend _thread to the declaration. A thread-local variable is just like a global variable, except that each thread has its own copy, visible only within that thread. The _thread macro is defined in customTypes.h.
How does the use of thread-local variables impact the overall design of your application? Fortunately, very little change is required. The most significant one is the global exception context declaration. The old one looks something like this:
struct exception_context the_exception_context[1];
int main(){
exception e;
Try { // put the code throwing exceptions here
} Catch (e) { // handle exception in here
}
}
From 2.10 onward, the app looks like this:
_thread struct exception_context the_exception_context[1];
int main(){
exception e;
Try { // put the code throwing exceptions here
} Catch (e) { // handle exception in here
}
}
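To make the effect of a thread-local declaration concrete, here is a minimal sketch that is independent of VTD-XML. It assumes a GCC or Clang toolchain with POSIX threads; DEMO_THREAD_LOCAL is a hypothetical stand-in for the library's _thread macro, and compile_count, worker and thread_local_demo are illustrative names, not part of the library.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's _thread macro.
   Assumption: GCC/Clang spelling of thread-local storage. */
#define DEMO_THREAD_LOCAL __thread

/* Declared like a global, but every thread gets its own copy. */
static DEMO_THREAD_LOCAL int compile_count = 0;

/* Worker: increments its private counter n times, then writes
   the final value back through the argument pointer. */
static void *worker(void *arg) {
    int *slot = (int *)arg;
    int n = *slot;
    for (int i = 0; i < n; i++) {
        compile_count++;
    }
    *slot = compile_count;
    return NULL;
}

/* Runs two workers concurrently. Returns 1 if each worker saw only
   its own increments and the calling thread's copy stayed at 0. */
int thread_local_demo(void) {
    pthread_t t1, t2;
    int a = 1000, b = 2000;
    if (pthread_create(&t1, NULL, worker, &a) != 0) {
        return 0;
    }
    if (pthread_create(&t2, NULL, worker, &b) != 0) {
        pthread_join(t1, NULL);
        return 0;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return a == 1000 && b == 2000 && compile_count == 0;
}
```

With a plain global instead of the thread-local declaration, the two workers would race on a single counter and the reported values could not be relied on.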
Location Cache Depth Tuning
Before version 2.10, the location cache depth was fixed at 3. In this version, you can choose either 3 or 5 by simply calling VTDGen’s selectLcDepth() (see the example below). The benefit is that, at the cost of negligible parsing and memory overhead, the random access performance of VTDNav improves, especially for deeply nested XML documents.
VTDGen vg = new VTDGen();
vg.selectLcDepth(5);
Insert Text into Empty Element
In this release, you now have the ability to insert text into an empty element (e.g. <a/>).
- Insert “some text” into <a/>, you get “<a>some text</a>”.
- Insert <b/> into <a/>, you get <a><b/></a>.
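The two cases above can be illustrated with a small string-level sketch. This is not how VTD-XML performs the insertion and it does not use the library's API; expand_empty_element is a hypothetical helper that only handles an attribute-free empty element such as <a/>, and it exists purely to show the expected output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of VTD-XML): rewrites "<name/>" as
   "<name>content</name>" into out. Only attribute-free empty elements
   are handled. Returns 0 on success, -1 on malformed input or if the
   output buffer is too small. */
int expand_empty_element(const char *empty, const char *content,
                         char *out, size_t out_size) {
    size_t len = strlen(empty);
    /* Expect at least "<x/>": a leading '<' and a trailing "/>". */
    if (len < 4 || empty[0] != '<' || strcmp(empty + len - 2, "/>") != 0) {
        return -1;
    }
    size_t name_len = len - 3; /* drop '<' and "/>" */
    int n = snprintf(out, out_size, "<%.*s>%s</%.*s>",
                     (int)name_len, empty + 1, content,
                     (int)name_len, empty + 1);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling it with "<a/>" and "some text" fills the buffer with <a>some text</a>; calling it with "<a/>" and "<b/>" yields <a><b/></a>.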
VTD-XML in C++
Below is a simple app written in C++ using VTD-XML.
#include "everything.h"
//#include "bookMark.h"
using namespace com_ximpleware;
int main(){
FILE *f = NULL;
FILE *fo = NULL;
int i = 0;
Long l = 0;
int len = 0;
int offset = 0;
char* filename = "c:/xml/soap2.xml";
struct stat s;
UByte *xml = NULL; // this is the buffer containing the XML content, UByte means unsigned byte
//VTDGen *vg = NULL; // This is the VTDGen that parses XML
VTDNav *vn = NULL; // This is the VTDNav that navigates the VTD records
AutoPilot *ap = NULL;
char *sm = "\n================\n";
// allocate a buffer, then read in the document content
// assume "c:/xml/soap2.xml" is the name of the file
// open in binary mode so the bytes read match the size reported by stat()
f = fopen(filename,"rb");
fo = fopen("c:/xml/out.txt","w");
stat(filename,&s);
i = (int) s.st_size;
printf("size of the file is %d \n",i);
xml = new UByte[i];
fread(xml,sizeof(UByte),i,f);
VTDGen vg;
try{
vg.setDoc(xml,i);
vg.parse(true);
vn = vg.getNav();
AutoPilot ap;
ap.declareXPathNameSpace(L"ns1",L"http://www.w3.org/2003/05/soap-envelope");
if (ap.selectXPath(L"/ns1:Envelope/ns1:Header/*[@ns1:mustUnderstand]")){
//if (ap.selectXPath(L"/a/b/*")){
ap.printExprString();
ap.bind(vn);
int i=-1;
while((i=ap.evalXPath())!= -1){
//printf("\n hi ==> %d \n",i);
l = vn->getElementFragment();
offset = (int) l;
len = (int) (l>>32);
fwrite((char *)(xml+offset),sizeof(UByte),len,fo);
fwrite((char *) sm,sizeof(UByte),strlen((char*)sm),fo);
}
}
fclose(f);
fclose(fo);
// remember C++ has no automatic garbage collector,
// so vn needs to be deallocated manually.
delete(vn);
}
catch (ParseException &e){
//vg.printLineNumber();
printf(" error ===> %s \n",e.getMessage());
}
catch (...) {
delete (vn);
}
return 0;
}
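In the listing above, the value returned by getElementFragment() is split into an offset (the lower 32 bits) and a length (the upper 32 bits). The sketch below shows that unpacking in plain C, together with the inverse packing for testing. The names fragment_range, decode_fragment and encode_fragment are hypothetical and are not part of VTD-XML; only the bit layout is taken from the listing.

```c
#include <stdint.h>

/* Hypothetical holder for the two halves of a fragment descriptor. */
typedef struct {
    int32_t offset; /* byte offset of the fragment in the XML buffer */
    int32_t length; /* byte length of the fragment */
} fragment_range;

/* Splits a 64-bit descriptor: lower 32 bits are the offset,
   upper 32 bits are the length (the layout used in the listing). */
fragment_range decode_fragment(int64_t l) {
    fragment_range r;
    r.offset = (int32_t)(uint32_t)((uint64_t)l & 0xFFFFFFFFu);
    r.length = (int32_t)(uint32_t)((uint64_t)l >> 32);
    return r;
}

/* Inverse of decode_fragment, useful for exercising the layout. */
int64_t encode_fragment(int32_t offset, int32_t length) {
    return (int64_t)(((uint64_t)(uint32_t)length << 32) |
                     (uint64_t)(uint32_t)offset);
}
```

Decoding the result of encode_fragment(120, 45) gives back an offset of 120 and a length of 45.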
Hi. Do you know how to remove blank space from an XML document?
Example:
…
When I remove the namespaces xmlns:editorial and xmlns:autor I get
…
… two extra blank spaces…
Thanks for answering.
It can be done, but you may have to do a bit of work yourself.
Is this project still supported?
Many thanks.
of course
Thanks for your reply.
The reason I ask is that I have the same issue as the poster here
http://sourceforge.net/mailarchive/forum.php?thread_name=BANLkTikjMoxrZ_Hifeu0o%2BWjscR66sT%3DLA%40mail.gmail.com&forum_name=vtd-xml-users
(unexpected xpath behaviour )
But I didn't see an answer.
Thanks again.
rgds
Neuralrank.com
Hi, the first issue can be solved pretty quickly. The second issue is really about the interpretation of “string value” in VTD-XML compared to Xerces. It seems like a corner case that rarely gets used… Can you provide a use case where the second issue is a big deal?
Hi
Firstly let me say that VTD has directly led to massive performance improvements in my application, several orders of magnitude. So thanks a lot for all the hard work.
I was concerned with part 2 of the referenced post. In my case it's cropped up as an issue simply because the behaviour seems to deviate from the XPath 1.0 standard. I am not a “standards” expert by any means, but I use GUI tools to test and develop XPaths (e.g. the XPATHER plugin for Firefox and the XPath Visualizer tool for Windows).
In these applications, and in Xerces and Saxon, the case in question results in a string representation for all children; in VTD I am getting a blank string. As a result I need to include VTD in the “development” phase for my XPaths, which is a pain.
Many thanks again
Neuralrank
Will look into it and get back.
Hopefully a Delphi release ! Thanx in advance.