|
XML File Corpus |
This page provides hyperlinks to the XML binary formats,
tools, and compressors described in the paper:
Chris Augeri*, Barry
Mullins, Dursun Bulutoglu, Rusty Baldwin, and
DISCLAIMER: All links, sample usages, etc., are given as used in the paper; please contact me if you have any questions.
|
Tool |
Output |
|
Clean up XML files |
|
|
Calculate 0-order entropy |
|
|
Reports various statistics, e.g., number of lines, number of characters, etc. |
|
|
The Internet Archive, used to find a version of XML-ZIP |
|
|
XML Schema extractor |
|
|
Statistical analysis |
|
|
The computer series used to run our tests |
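As an illustration of the "Calculate 0-order entropy" entry above, here is a minimal Python sketch of the standard zero-order (memoryless) entropy calculation. It is not the tool used in the paper: the function name `zero_order_entropy` is hypothetical, and treating the file as a stream of bytes (rather than characters or words) is an assumption.

```python
import math
from collections import Counter


def zero_order_entropy(data: bytes) -> float:
    """Return the 0-order entropy of `data` in bits per byte.

    Each byte is treated as an independent symbol (an assumption; the
    paper's tool may use a different symbol alphabet).
    """
    if not data:
        return 0.0
    counts = Counter(data)  # frequency of each byte value
    total = len(data)
    # Shannon entropy: H = -sum(p * log2(p)) over the observed symbols
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    sample = b"<a><b>text</b></a>"
    print(f"{zero_order_entropy(sample):.3f} bits per byte")
```

To apply it to a file, read the file in binary mode (for example `open("foo.xml", "rb").read()`) and pass the result to the function. Multiplying the result by the file length in bytes and dividing by 8 gives the size, in bytes, that an ideal 0-order coder would need.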
Compressors and/or Binary Formats
|
Compressor |
Sample Usage |
|
BZIP2 -k -f -v foo.xml |
|
|
arith -e -t word -m 255 -c 20 foo.xml 1 > foo.cac |
|
|
java -cp FastInfoset.jar
com.sun.xml.fastinfoset.tools.XML_SAX_FI foo.xml foo.fis |
|
|
GZIP -9 -c -f -v foo.xml 1>foo.gzp |
|
|
pasqda -7 foo.paq foo.xml |
|
|
wzzip -ep foo.ppd foo.xml |
|
|
ppmz2 -e foo.xml foo.ppm |
|
|
xml2wbxml -k -o foo.wbx foo.tdy |
|
|
wzzip -en -ee foo.wzp foo.xml |
|
|
java -Dorg.xml.sax.driver=com.bluecast.xml.Piccolo -cp
Piccolo.jar;saxxbis.jar;.; test.RunTest XBIS foo.xml |
|
|
./compress foo.tdy H V A N |
|
|
xmill -f -v -w foo.xml
xmill -f -v -w -p "//(*)" foo.xml
xmill -m 470 -9 -f -v -w foo.xml |
|
|
xmlppm foo.xml foo.xpm |
|
|
XML-ZIP |
java -classpath xml4j.jar;. XMLZip foo.xml 2 |
|
|
|
|
NOT TESTED (links added if available/provided and/or as time permits) |
|
|
Efficient XML (basis of EXI), BOX, MPEG-7 (BiM), XML-Xpress, XCQ, XPRESS, OpenGIS BXML (CWXML), Oracle, IBM DB2, MS SQL Server, Millau, AXECHOP, XCOMP, XCpaqs |
|
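For readers who want a quick baseline before installing the tools listed above, the GZIP and BZIP2 sample usages can be approximated with Python's standard library. This is only a sketch: the function name `compressed_sizes` is hypothetical, and the in-memory `gzip` and `bz2` modules will not reproduce the byte-exact output of the command-line tools (for example, the gzip header written by the CLI includes the file name).

```python
import bz2
import gzip


def compressed_sizes(data: bytes) -> dict:
    """Return the original size and the gzip/bzip2 compressed sizes in bytes."""
    return {
        "original": len(data),
        # compresslevel=9 mirrors the -9 flag in the GZIP sample usage
        "gzip": len(gzip.compress(data, compresslevel=9)),
        # 9 is the default block size of the bzip2 command-line tool
        "bzip2": len(bz2.compress(data, compresslevel=9)),
    }


if __name__ == "__main__":
    # Stand-in for the contents of foo.xml
    sample = b"<a><b>text</b></a>" * 1000
    for name, size in compressed_sizes(sample).items():
        print(f"{name}: {size} bytes")
```

Dividing each compressed size by the original size gives a compression ratio that can be compared across files.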