About

Welcome to the official blog of VTD-XML. This blog will help you understand why VTD-XML is the single most stunning development in the state of XML processing, and why it is the future of SOA and cloud computing.

12 comments so far

  1. Sangeeth on

    Hi,

    I have been going through your great product and am impressed with its ability to split XML. I was able to split XML files based on the article at http://www.javaworld.com/javaworld/jw-07-2006/jw-0724-vtdxml.html?page=5, but I was wondering whether XML can also be split based on the number of elements?

    I was not able to locate any example of this. For example, I have an XML file with 500 elements and I wish to split it into XML files containing 50 elements each. Can this be achieved using XimpleWare, and if so, could you point me to an example?

    • jimmyzhang on

      By 500 elements, do you mean 500 elements at the same depth, or 500 elements at any depth?

      • Sangeeth on

        They are 500 elements at the same depth only, and I would like to split them based on the number of elements.
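For same-depth elements, the grouping step of such a split can be sketched as follows. This is a minimal sketch under stated assumptions, not an official VTD-XML recipe: each fragment is assumed to be a long with the byte offset in the lower 32 bits and the byte length in the upper 32 bits (the packing that `vn.getElementFragment()` is unpacked with elsewhere on this page), the class name `FragmentChunker` and the wrapper element name are hypothetical, and the wrapper tags are assumed to be valid in the document's encoding (UTF-8 here).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class FragmentChunker {

    /**
     * Groups element fragments into documents of at most chunkSize elements.
     * Each fragment packs a byte offset (lower 32 bits) and a byte length
     * (upper 32 bits) into the original document bytes. Every output document
     * wraps its elements in a hypothetical root element named wrapperName.
     */
    public static List<byte[]> chunk(byte[] xml, long[] fragments,
                                     int chunkSize, String wrapperName) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] open = ("<" + wrapperName + ">").getBytes(StandardCharsets.UTF_8);
        byte[] close = ("</" + wrapperName + ">").getBytes(StandardCharsets.UTF_8);

        List<byte[]> docs = new ArrayList<>();
        for (int start = 0; start < fragments.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, fragments.length);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            out.write(open, 0, open.length);
            for (int k = start; k < end; k++) {
                int offset = (int) fragments[k];         // lower 32 bits
                int length = (int) (fragments[k] >> 32); // upper 32 bits
                out.write(xml, offset, length);          // copy the element bytes as-is
            }
            out.write(close, 0, close.length);
            docs.add(out.toByteArray());
        }
        return docs;
    }
}
```

With 500 fragments and a chunk size of 50, `chunk` would return 10 byte arrays, each of which can be written to its own file. The fragments themselves would be collected by calling `vn.getElementFragment()` once per match in an `evalXPath()` loop.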

  2. Maarten Marx on

    Hiya,

    I am looking for an *operational* definition of document-centric XML, that is, a test I can apply to any XML file and it tells me whether it is document-centric or data-centric. Do you know one or some? Is there an agreed upon definition?

    Any help is greatly appreciated,

    all the best
    Maarten Marx

    • jimmyzhang on

      Hi, as far as I understand, an XML document that encodes fields of a relational database (regular shape, heavy repetition) is data-centric XML. An XML document of irregular shape (literature, a Word document, XHTML) is more or less document-centric. I doubt there is a strict distinction between the two; XML is the most flexible format in large part because of this.

  3. Yanjie Fu on

    Hi Jimmy, I have come across a problem handling a huge XML file (>30 GB). I followed the instructions on this page (https://ximpleware.wordpress.com/2010/05/22/process-huge-xml-documents-with-extended-vtd-xml/). However, I get an error:

    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at com.ximpleware.extended.FastLongBuffer.append(FastLongBuffer.java:209)
        at com.ximpleware.extended.VTDGenHuge.writeVTD(VTDGenHuge.java:3389)
        at com.ximpleware.extended.VTDGenHuge.parse(VTDGenHuge.java:1402)
        at com.ximpleware.extended.VTDGenHuge.parseFile(VTDGenHuge.java:1290)
        at HugeXMLProcessor.main(HugeXMLProcessor.java:9)

    Here is my source code:
    import com.ximpleware.extended.*;

    public class HugeXMLProcessor {
        public static void main(String[] s) throws Exception {
            System.out.println(System.getProperty("sun.arch.data.model"));

            VTDGenHuge vg = new VTDGenHuge();
            // System.out.println(vg.parseFile("wiki.xml", true, VTDGenHuge.MEM_MAPPED));
            if (vg.parseFile("wiki.xml", true, VTDGenHuge.MEM_MAPPED)) {
                VTDNavHuge vnh = vg.getNav();
                AutoPilotHuge aph = new AutoPilotHuge(vnh);

                aph.selectElement("*");
                while (aph.iterate()) // iterate() visits every element
                {
                    // put processing logic here
                    System.out.println("Element name ==> " + vnh.toString(vnh.getCurrentIndex()));
                    int t = vnh.getText();
                    if (t != -1)
                        System.out.println("Text content ==> " + vnh.toNormalizedString(t));
                }
            }
        }
    }

    Platform:
    windows 7-64bit
    eclipse – 64bit
    jre7-64bit

    Some explanation:
    wiki.xml is very large, more than 30 GB. The problem occurs at the statement vg.parseFile("wiki.xml", true, VTDGenHuge.MEM_MAPPED). Could you help me solve this problem?

    • jimmyzhang on

      I think you ran out of JVM heap memory. Can you increase the JVM heap size using an option such as -Xmx ….? Or install more physical memory in the system?
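Before re-running the parse, it can help to confirm that the -Xmx setting actually reached the JVM. When launching from Eclipse, VM arguments are set per run configuration, so a value placed elsewhere may not apply to the program. The following is a minimal sketch using only the standard `Runtime` API; the class name `HeapCheck` is hypothetical.

```java
public class HeapCheck {

    /** Returns the maximum heap the JVM will attempt to use, in mebibytes. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Should reflect the -Xmx value passed to this JVM (approximately).
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        // Same property the original program prints: 32 or 64 on common JVMs.
        System.out.println("Data model: " + System.getProperty("sun.arch.data.model"));
    }
}
```

Running it with the same VM arguments as the failing program shows whether the heap limit is the one you intended.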

  4. Naveen on

    Hi, I am working with a large file (500 MB of XML) compressed with GZip. I am trying to split the file by XML node value (with XPath, of course).

    To load the file I am using the code below:
    boolean isParsable = true;
    VTDGen vg = new VTDGen();

    isParsable = vg.parseGZIPFile(xmlFile, true);
    long loadingTime = System.currentTimeMillis();
    System.out.println("read the file..");
    if (isParsable) {
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("/records/record[bankbalance >100900]");
        int i = -1;
        int j = 0;
        // The document bytes are the same for every match, so fetch them once
        byte[] bytes = vn.getXML().getBytes();
        while ((i = ap.evalXPath()) != -1) {
            // offset in the lower 32 bits, length in the upper 32 bits
            long l = vn.getElementFragment();

            // try-with-resources closes each output file after writing
            try (FileOutputStream fileOutputStream = new FileOutputStream(
                    "C:\\workspace\\TestFunc\\test\\out" + j + ".xml")) {
                fileOutputStream.write(bytes, (int) l, (int) (l >> 32));
            }
            j++;
        }
    }

    With this code, run in Eclipse, I run out of memory and am not able to load the file. My heap settings are:
    -Xms25m -Xmx256m

    Do we need more heap space, or can VTD-XML split the file within this configured heap size?

    I would appreciate your suggestions.

    • jimmyzhang on

      Yes, you need to set the heap size to 1GB to be safe.

      • jimmyzhang on

        Not with VTD-XML; you can try SAX.

  5. Naveen on

    Is there a possibility of processing this in stream mode with little memory, without having to set the heap to 1 GB?
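A streaming pass with SAX, as suggested in the reply above, might look roughly like the sketch below. It uses only the JDK (no VTD-XML) and keeps memory small because it never holds the whole document. The element name `bankbalance` and the threshold come from the XPath in the earlier comment; the class name `StreamingRecordFilter`, the choice to count matches rather than write them out, and the assumption of exactly one `bankbalance` per record are illustrative.

```java
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingRecordFilter {

    /** Counts bankbalance values above a threshold while streaming. */
    static class BalanceHandler extends DefaultHandler {
        private final double threshold;
        private final StringBuilder text = new StringBuilder();
        private boolean inBalance = false;
        long matches = 0;

        BalanceHandler(double threshold) {
            this.threshold = threshold;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("bankbalance".equals(qName)) {
                inBalance = true;
                text.setLength(0); // start collecting this element's text
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several pieces, so accumulate it.
            if (inBalance) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("bankbalance".equals(qName)) {
                inBalance = false;
                try {
                    if (Double.parseDouble(text.toString().trim()) > threshold) {
                        matches++;
                    }
                } catch (NumberFormatException e) {
                    // Non-numeric balance: treated as no match in this sketch.
                }
            }
        }
    }

    /** Streams a gzipped XML document and returns the number of matching balances. */
    public static long countMatches(InputStream gzipped, double threshold) throws Exception {
        BalanceHandler handler = new BalanceHandler(threshold);
        try (InputStream in = new GZIPInputStream(gzipped)) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        }
        return handler.matches;
    }
}
```

The trade-off is that SAX gives no random access: writing each matching record to its own file would mean re-serializing the record from the events, which VTD-XML avoids by copying bytes directly.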

  6. Bibin on

    Hi,
    I am using VTD-XML to parse an XML file using XPath. I would like to know whether VTD-XML supports case-insensitive XPath expressions. I am using version 2.11. If it is possible, could you show me some sample code?
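One common XPath 1.0 idiom for this is to lower-case names with the core `translate()` function before comparing. The sketch below demonstrates the idiom with the JDK's `javax.xml.xpath`, not with VTD-XML. Whether the XPath engine in VTD-XML 2.11 accepts the same expression is an assumption to verify; if it does, the generated string would be passed to `ap.selectXPath(...)`. The class and method names are hypothetical, and only ASCII letters are folded.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CaseInsensitiveXPath {
    private static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String LOWER = "abcdefghijklmnopqrstuvwxyz";

    /**
     * Builds an XPath 1.0 expression that selects every element whose name
     * equals the given name, ignoring ASCII case. The name must not contain
     * a single quote, since it is embedded in a quoted string literal.
     */
    public static String anyCaseElement(String name) {
        return "//*[translate(name(), '" + UPPER + "', '" + LOWER + "') = '"
                + name.toLowerCase(Locale.ROOT) + "']";
    }

    /** Evaluates the expression with the JDK XPath engine and returns the match count. */
    public static int count(String xml, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return nodes.getLength();
    }
}
```

The same `translate()` wrapper can be applied to text or attribute values inside a predicate when the comparison, rather than the element name, needs to ignore case.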

