|
XML File Corpus |
This page provides hyperlinks to the XML binary formats,
tools, and compressors described in the paper:
Chris Augeri*, Barry
Mullins, Dursun Bulutoglu, Rusty Baldwin, and
files /
links: supporting material, paper, slides, proceedings, conference, citations — 1, 2, 3, 4
DISCLAIMER: All links, sample usages, etc., are given as they were used in the paper. Please contact me
if you have any questions.
|
Tool |
Output |
|
Clean up XML files |
|
|
Calculate 0-order entropy |
|
|
Reports various statistics, e.g., the number of lines and characters |
|
|
The Internet Archive, used to find a version of XML-ZIP |
|
|
XML Schema extractor |
|
|
Statistical analysis |
|
|
The computer series used to run our tests |
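
The 0-order entropy tool listed above is not reproduced on this page. As a rough illustration only, the sketch below computes a zero-order (memoryless) Shannon entropy over the bytes of a file, assuming each byte is treated as one symbol. The function name `zero_order_entropy` is hypothetical, and the tool used in the paper may define its symbols differently (e.g., characters or words).

```python
import math
from collections import Counter


def zero_order_entropy(data: bytes) -> float:
    """Return the Shannon entropy in bits per byte.

    Assumption: every byte is an independent symbol (order-0 model),
    so only the byte frequencies matter, not their order.
    """
    if not data:
        return 0.0
    counts = Counter(data)  # frequency of each distinct byte value
    total = len(data)
    # H = sum over symbols of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

To apply it to a file such as `foo.xml`, read the file in binary mode and pass the bytes to the function; multiplying the result by the file length and dividing by 8 gives an approximate lower bound, in bytes, for any compressor that models bytes independently.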
Compressors &/or Binary Formats
|
Compressor |
Sample Usage |
|
BZIP2 -k -f -v foo.xml |
|
|
arith -e -t word -m 255 -c 20 foo.xml 1 > foo.cac |
|
|
java -cp FastInfoset.jar com.sun.xml.fastinfoset.tools.XML_SAX_FI foo.xml foo.fis |
|
|
GZIP -9 -c -f -v foo.xml 1>foo.gzp |
|
|
pasqda -7 foo.paq foo.xml |
|
|
wzzip -ep foo.ppd foo.xml |
|
|
ppmz2 -e foo.xml foo.ppm |
|
|
xml2wbxml -k -o foo.wbx foo.tdy |
|
|
wzzip -en -ee foo.wzp foo.xml |
|
|
java -Dorg.xml.sax.driver=com.bluecast.xml.Piccolo -cp Piccolo.jar;saxxbis.jar;.; test.RunTest XBIS foo.xml |
|
|
./compress foo.tdy H V A N |
|
|
xmill -f -v -w foo.xml
xmill -f -v -w -p "//(*)" foo.xml
xmill -m 470 -9 -f -v -w foo.xml |
|
|
xmlppm foo.xml foo.xpm |
|
|
XML-ZIP |
java -classpath xml4j.jar;. XMLZip foo.xml 2 |
|
|
|
|
NOT TESTED (links added if available/provided &/or as time permits) |
|
|
Efficient XML (basis of EXI), MPEG-7 (BiM), XML-Xpress, XCQ, XPRESS,
OpenGIS BXML (CWXML), Oracle, IBM DB2, MS SQL Server, Millau, AXECHOP,
XCOMP, XCpaqs |
|
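
For the general-purpose compressors in the table above (the `GZIP -9` and `BZIP2` rows), a quick size comparison can also be sketched with Python's standard `gzip` and `bz2` modules. This is an illustration only, not the procedure used in the paper: the function name `compressed_sizes` is hypothetical, and the sizes will differ slightly from the command-line tools because of container headers (for example, the stored file name in a gzip file).

```python
import bz2
import gzip


def compressed_sizes(data: bytes) -> dict:
    """Return the original and compressed sizes of `data` in bytes.

    Assumption: level 9 is used for both codecs, mirroring `GZIP -9`
    and bzip2's default block size of 900k.
    """
    return {
        "original": len(data),
        "gzip -9": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
    }
```

Reading `foo.xml` in binary mode and passing its contents to `compressed_sizes` gives sizes that can be set against the 0-order entropy estimate for the same file.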