VTD-XML 2.10 Released

VTD-XML 2.10 is now released for Java, C#, C and C++. It can be downloaded at
https://sourceforge.net/projects/vtd-xml/files/vtd-xml/ximpleware_2.10/. This release includes a number of new features and enhancements.

  • The core API of VTD-XML has been expanded. Users can now perform cut/paste/insert on an empty element.
  • This release also adds support for a deeper location cache for parsing and indexing. This feature is useful for tuning application performance when processing various XML documents.
  • The Java version also adds support for processing zip and gzip files. Direct processing of HTTP URL-based XML is enhanced.
  • The extended Java version now supports ISO-8859-10~16 encoding.
  • A full-featured C++ port is released.
  • The C version of VTD-XML now makes use of thread-local storage to address the thread-safety issue for multi-threaded applications.

There are also a number of bug fixes. Special thanks to Jozef Aerts, John Sillers, Chris Tornau and a number of other users for their input and suggestions.

Thread Safety in C Version of VTD-XML

Before 2.10, the C version of VTD-XML made extensive use of global variables for XPath query compilation. A thread-safety problem arises when multiple threads of an application perform XPath compilation at the same time. To resolve this issue, VTD-XML 2.10 replaces all global variables used for XPath compilation with thread-local variables: instead of simply declaring a variable, prepend _thread to the declaration. A thread-local variable is just like a global variable, except that each thread has its own copy, visible only within that thread. The “_thread” macro is defined in “customTypes.h.”

How does the use of thread-local variables affect the overall design of your application? Fortunately, very little change is required. The most significant change is the declaration of the global exception context. The old declaration looks something like this:

struct exception_context the_exception_context[1];
int main(){
          exception e;
          Try {  // put the code throwing exceptions here
          } Catch (e) {  // handle exception in here
          }
}

From 2.10 onward, the application looks like this:

_thread struct exception_context the_exception_context[1];
int main(){
          exception e;
          Try {  // put the code throwing exceptions here
          } Catch (e) {  // handle exception in here
          }
}

Location Cache Depth Tuning

Before version 2.10, the location cache depth was fixed at 3. In this version, you can choose either 3 or 5 by calling VTDGen’s selectLcDepth() (see the example below). The benefit is that, at the cost of negligible parsing and memory overhead, the random-access performance of VTDNav improves, especially for deeply nested XML documents.


   VTDGen vg = new VTDGen();

   vg.selectLcDepth(5);

Insert Text into Empty Element

In this release, you can insert text into an empty element (e.g. <a/>).

  • Insert “some text” into <a/>, you get “<a>some text</a>”.
  • Insert <b/> into <a/>, you get <a><b/></a>. 

VTD-XML in C++

Below is a simple app written in C++ with VTD-XML.

#include "everything.h"
//#include "bookMark.h"

using namespace com_ximpleware;

int main(){
 FILE *f = NULL;
 FILE *fo = NULL;
 int i = 0;

 Long l = 0;
 int len = 0;
 int offset = 0;

 const char* filename = "c:/xml/soap2.xml";
 struct stat s;
 UByte *xml = NULL; // this is the buffer containing the XML content, UByte means unsigned byte
 //VTDGen *vg = NULL; // This is the VTDGen that parses XML
 VTDNav *vn = NULL; // This is the VTDNav that navigates the VTD records
 AutoPilot *ap = NULL;
 const char *sm = "\n================\n";

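 // allocate a buffer, then read in the document content
 // the input file name is given by filename above
 f = fopen(filename,"rb"); // binary mode, so the byte count matches st_size
 fo = fopen("c:/xml/out.txt","wb");
 if (f == NULL || fo == NULL){
  printf("unable to open the input or output file \n");
  return 1;
 }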

 stat(filename,&s);

 i = (int) s.st_size; 
 printf("size of the file is %d \n",i);
 xml = new UByte[i];
 fread(xml,sizeof(UByte),i,f);
 VTDGen vg;
 try{
  
  vg.setDoc(xml,i);
  vg.parse(true);
  vn = vg.getNav();
  AutoPilot ap;
  ap.declareXPathNameSpace(L"ns1",L"http://www.w3.org/2003/05/soap-envelope");
  if (ap.selectXPath(L"/ns1:Envelope/ns1:Header/*[@ns1:mustUnderstand]")){
  //if (ap.selectXPath(L"/a/b/*")){
   ap.printExprString();
   ap.bind(vn);
   int i=-1;
   while((i=ap.evalXPath())!= -1){
    //printf("\n hi ==> %d \n",i);
    l = vn->getElementFragment();
    offset = (int) l;
    len = (int) (l>>32);
    fwrite((char *)(xml+offset),sizeof(UByte),len,fo);
    fwrite((char *) sm,sizeof(UByte),strlen((char*)sm),fo);
   }
  }
  fclose(f);
  fclose(fo);
  // remember C++ has no automatic garbage collector,
  // so the VTDNav needs to be deallocated manually.
  delete(vn);
 }
 catch (ParseException &e){
  //vg.printLineNumber();
  printf(" error ===> %s \n",e.getMessage());
 }
 catch (...) {
  delete (vn);
 }
 return 0;
} 

14 comments so far

  1. gabrilogos1985 on

    Hi. do you know how to remove blank space from a XML?
    Example:


    when I remove namespaces xmlns:editorial and xmlns:autor i get

    … two extra blank spaces…
    thanks for answer me

    • jimmyzhang on

      it can be done but you may have to do a bit work urself

  2. neuralrank on

    Is this project still supported?

    Many thanks.

  3. neuralrank on

    Hi

    Firstly let me say that VTD has directly led to massive performance improvements in my application, several orders of magnitude. So thanks a lot for all the hard work.

    I was concerned with part 2 of the referenced post. In my case it's cropped up as an issue simply because the behaviour seems to deviate from the XPath 1.0 standard. I am not a “standards” expert by any means, but I use GUI tools to test and develop XPaths (e.g. the XPather plugin for Firefox and the XPath Visualizer tool for Windows).
    In these applications, and in Xerces and Saxon, the case in question results in a string representation for all children; in VTD I am getting a blank string. As a result I need to include VTD in the “development” phase for my XPaths, which is a pain.

    Many thanks again

    Neuralrank

  4. jimmyzhang on

    will look into it nd get back

  5. menjarazanahary r.r. on

    Hopefully a Delphi release ! Thanx in advance.

  6. neuralrank on

    Hi there any news?

  7. Andrew on

    i am getting an error when adding your library.
    Jul 26, 2012 10:48:08 AM org.apache.catalina.core.ApplicationContext log
    SEVERE: StandardWrapper.Throwable
    org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name ‘com.bsp.ics.core.mvc.controller.OperationsController’ defined in file [C:\Program Files (x86)\Apache Software Foundation\apache-tomcat-6.0.35\webapps\IVC\WEB-INF\classes\com\bsp\ics\core\mvc\controller\OperationsController.class]: Unsatisfied dependency expressed through constructor argument with index 0 of type [com.bsp.ics.core.mvc.service.OperationHandlerService]: : Error creating bean with name ‘com.bsp.ics.core.mvc.service.OperationHandlerService’ defined in file [C:\Program Files (x86)\Apache Software Foundation\apache-tomcat-6.0.35\webapps\IVC\WEB-INF\classes\com\bsp\ics\core\mvc\service\OperationHandlerService.class]: Invocation of init method failed; nested exception is java.lang.UnsupportedClassVersionError: Bad version number in .class file (unable to load class com.ximpleware.extended.IByteBuffer); nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name ‘com.bsp.ics.core.mvc.service.OperationHandlerService’ defined in file [C:\Program Files (x86)\Apache Software Foundation\apache-tomcat-6.0.35\webapps\IVC\WEB-INF\classes\com\bsp\ics\core\mvc\service\OperationHandlerService.class]: Invocation of init method failed; nested exception is java.lang.UnsupportedClassVersionError: Bad version number in .class file (unable to load class com.ximpleware.extended.IByteBuffer)

    i am using myeclipse and i am running java 5

    can you please help this lib will help with my project to read in large xml to do a compare

    • jimmyzhang on

      recompile vtd-xml to your version of jdk the problem should go away

  8. sincang on

    Hi,

    I am thinking of parsing the XML in parallel by dividing the list of elements into group using the position() function. Is it possible to have multiple AutoPilots across threads accessing one VTDNav? Or I need to clone it for each thread?

    Thanks,

    • jimmyzhang on

      yes you can

