С xml library windows

The XML C parser and toolkit of Gnome

libxml

Main Menu
  • Home
  • Reference Manual
  • Introduction
  • FAQ
  • Developer Menu
  • Reporting bugs and getting help
  • How to help
  • Downloads
  • Releases
  • XML
  • XSLT
  • Validation & DTDs
  • Encodings support
  • Catalog support
  • Namespaces
  • Contributions
  • Code Examples
  • API Menu
  • XML Guidelines
  • Recent Changes
Related links
  • Mail archive
  • XSLT libxslt
  • DOM gdome2
  • XML-DSig xmlsec
  • FTP
  • Windows binaries
  • Solaris binaries
  • MacOsX binaries
  • lxml Python bindings
  • Perl bindings
  • C++ bindings
  • PHP bindings
  • Pascal bindings
  • Ruby bindings
  • Tcl bindings
  • Bug Tracker

«Programming with libxml2 is like the thrilling embrace of an exotic stranger.» Mark Pilgrim

Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License. XML itself is a metalanguage to design markup languages, i.e. text language where semantic and structure are added to the content using extra «markup» information enclosed between angle brackets. HTML is the most well-known markup language. Though the library is written in C a variety of language bindings make it available in other environments.

Libxml2 is known to be very portable, the library should build and work without serious troubles on a variety of systems (Linux, Unix, Windows, CygWin, MacOS, MacOS X, RISC Os, OS/2, VMS, QNX, MVS, VxWorks, . )

Libxml2 implements a number of existing standards related to markup languages:

  • the XML standard: http://www.w3.org/TR/REC-xml
  • Namespaces in XML: http://www.w3.org/TR/REC-xml-names/
  • XML Base: http://www.w3.org/TR/xmlbase/
  • RFC 2396 : Uniform Resource Identifiers http://www.ietf.org/rfc/rfc2396.txt
  • XML Path Language (XPath) 1.0: http://www.w3.org/TR/xpath
  • HTML4 parser: http://www.w3.org/TR/html401/
  • XML Pointer Language (XPointer) Version 1.0: http://www.w3.org/TR/xptr
  • XML Inclusions (XInclude) Version 1.0: http://www.w3.org/TR/xinclude/
  • ISO-8859-x encodings, as well as rfc2044 [UTF-8] and rfc2781 [UTF-16] Unicode encodings, and more if using iconv support
  • part of SGML Open Technical Resolution TR9401:1997
  • XML Catalogs Working Draft 06 August 2001: http://www.oasis-open.org/committees/entity/spec-2001-08-06.html
  • Canonical XML Version 1.0: http://www.w3.org/TR/xml-c14n and the Exclusive XML Canonicalization CR draft http://www.w3.org/TR/xml-exc-c14n
  • Relax NG, ISO/IEC 19757-2:2003, http://www.oasis-open.org/committees/relax-ng/spec-20011203.html
  • W3C XML Schemas Part 2: Datatypes REC 02 May 2001
  • W3C xml:id Working Draft 7 April 2004

In most cases libxml2 tries to implement the specifications in a relatively strictly compliant way. As of release 2.4.16, libxml2 passed all 1800+ tests from the OASIS XML Tests Suite.

To some extent libxml2 provides support for the following additional specifications but doesn’t claim to implement them completely:

  • Document Object Model (DOM) http://www.w3.org/TR/DOM-Level-2-Core/ the document model, but it doesn’t implement the API itself, gdome2 does this on top of libxml2
  • RFC 959 : libxml2 implements a basic FTP client code
  • RFC 1945 : HTTP/1.0, again a basic HTTP client code
  • SAX: a SAX2 like interface and a minimal SAX1 implementation compatible with early expat versions

A partial implementation of XML Schemas Part 1: Structure is being worked on but it would be far too early to make any conformance statement about it at the moment.

  • the libxslt page providing an implementation of XSLT 1.0 and common extensions like EXSLT for libxml2
  • the gdome2 page : a standard DOM2 implementation for libxml2
  • the XMLSec page: an implementation of W3C XML Digital Signature for libxml2
  • also check the related links section for more related and active projects.

Hosting sponsored by Open Source CMS services from AOE media.

С xml library windows

Last Update: 2012-02-15.

This page tries to give a comparison of existing free C or C++ XML parser libraries. It includes both full blown as well as lightweight parsers.

This list is of course not exhaustive, but it claims to include all free parser libraries that have a significant user base, that are more or less actively maintained and that cover the most widely used desktop PC platforms, i.e. Win32, Unix, Linux, Mac-OS, BSD. So if you feel i have missed one, please tell me.

I’ll start with a comparison table giving a quick overview of all available libraires. The following section gives some more detail, mostly based on my personal impressions and/or experiences on the Win32 platform. If you find any inaccuracies or want to contribute to this comparison table by providing a feedback with your own experiences, you are welcome to send me an email at lars.ruoff@gmx.net.

This article has been translated to Serbo-Croatian language by Anja Skrba from Webhostinggeeks.com.

Comparison Table

The following table gives an overview of the most popular free C or C++ XML parser libraries. Most of them are multi-platform and come with C/C++ source code. See my personal comments below.

Library Language Features/Interfaces License Platforms/Packaging
Full blown (SAX,DOM,Validating) XML Parser Libraries:
Xerces
http://xml.apache.org/xerces-c/index.html
C++ DOM, SAX2, XML Schema, Namespaces, Validation Apache License, Version 2.0 Precompiled libraries for Windows, UNIX, Linux, etc. Optionally source+makefiles.
Arabica
http://www.jezuk.co.uk/cgi-bin/view/arabica
C++ DOM, SAX2, XPath BSD style C++ source with VC7, VC8 and Unix makefiles
libxml2
http://xmlsoft.org/
C, various language bindings DOM, SAX (limited), XPath, XmlTextReader (pull) MIT License Linux, Unix, Windows, CygWin, MacOS, MacOS X, RISC Os, OS/2, VMS, QNX, MVS, .
libxml++
http://libxmlplusplus.sourceforge.net/
C++ DOM, SAX Gnu LGPL (see libxml2)
MSXML
MSXML at msdn.microsoft.com
C++, COM interface DOM, SAX2, XSLT, XML Schemas Proprietary (Microsoft) MS Windows only. Comes with Windows XP/Vista.
Fast/Low footprint XML Parser Libraries:
expat
http://expat.sourceforge.net/
C, various 3rd party language bindings available Stream oriented. Event based. Various third party SAX/DOM wrappers and language bindings. MIT license C source
TinyXml
http://www.grinninglizard.com/tinyxml/
C++ DOM zlib license C++ source with VC6 and Unix makefiles
Xmlio
http://www.fxtech.com/xmlio/
C++ C++ I/O stream oriented. «pull» model. Gnu LGPL C++ source with VC6 and Unix makefiles
XmlLite
Article at msdn.microsoft.com
C++ Stream oriented, «pull» model. Proprietary (Microsoft) MS Windows only. Part of Vista SDK.
RapidXml
http://rapidxml.sourceforge.net/index.htm
C++ DOM-style Boost 1.0 or MIT License Portable C++, header only.
XMLParser library from Frank Vanden Berghen
http://www.applied-mathematics.net/tools/xmlParser.html
C++ DOM-style Aladdin Free Public License Source code with makefiles and VC6 workspace.
PugiXML
http://code.google.com/p/pugixml/
C++ DOM style, with interface to walk the tree. XPath MIT License C++ source
libroxml
http://www.libroxml.net/
C DOM style LGPL v2.1 Source tar.gz or debian package

My Comments

Xerces

Fully validating XML 1.0 parser. Huge and powerful. Not personally tested.

Size of Windows DLL is 1.8Mb. Said to be relatively slow. Good documentation, examples and tutorial. Compliant to XML 1.0, partially XML 1.1, DOM lv1, DOM lv2 Core, partially DOM lv3, SAX 1.0/2.0, Namespaces, XML Schema.

Arabica

SAX2, DOM, XPath and partial XSLT implementation to be used with expat, libxml, Xerces or MSXML. Not tested.

Arabica is not an XML parser on its own. Instead, it uses an underlying parser like expat, libxml, Xerces or MSXML to do the low level parsing. The objective of Arabica is to provide a nice standard C++ implementation of DOM and SAX2 interfaces on top of these parsers. Hence, Arabica has to be set up and built for one of the underlying parsers before use. This can make the installation a bit fiddly and requires some additional time for setup. Arabica+expat seems to be an interesting combination, both in terms of performance/footprint and ease of use. (Not tested myself though).

Package includes short example code for each API.

libxml2

Gnome’s XML parser library. It’s the standard XML lib of the GNU folks. Can be cumbersome to get up and running on Windows, since it heavily depends on other GNU infrastructure (glib etc.). Included documentation is sparse but there are a lot of external sites providing docs, examples, tutorials.

libxml++

C++ bindings for libxml2. I.e. you have to have libxml2 installed in order to use libxml++. Features a nice standard C++ interface to libxml2. On Windows it is difficult to install and set up due to multiple dependencies on other GNU libs. For example uses Glib::ustring instead of std:strings. A short manual and reference is included.

Warning: For Win32, libxml++1.0 is better suited than libxml++2.x, due to a nasty glibmm dependency.

expat

Very fast, low level XML parser for small or embedded applications or as a basis for higher level parser APIs. Stream oriented. Event based. Various third party SAX/DOM wrappers and language bindings. Especially, see expatpp C++ wrapper for expat. (Not tested).

Said to be robust and well tested. Is used with many open source projects.

TinyXml

Not tested. There is also another project, TiCPP — TinyXML++, which aims to be a more advanced «C++ style» interface to TinyXml.

Xmlio

Fast and tiny. Uses a proprietary stream oriented «pull» model. Limitations:

  • only ASCII encoding right now
  • no namespaces
  • no PI handlers

My tests showed that other than ASCII characters can be read.

XmlLite

High performance, lightweight native C++ XML parser from Microsoft. «The primary goals of XmlLite are ease of use, performance, and standards compliance.» Not tested.

It aims to provide a more lightweight and hassle-free alternative to MSXML with a focus on native C++ development (as opposed to managed/.NET). XmlLite features a simple «pull» programming model with a stream-oriented XmlReader class. It has support for a large set of common character encodings but only limited support for DTD. As usual for a Microsoft product, there are lots of high quality documentation and articles.

RapidXml

It is a DOM-style in-situ parser written in modern C++, which tries to be as fast as possible. Claims to be a «seriously fast and small parser, [with] hassle-free integration». Entire library is contained in a single header file, and requires no building or configuration. A concise online manual including examples is avaialble. Not tested.

You may also like to check Boost.PropertyTree library, which presents a higher level interface, and uses RapidXml as its default XML parser.

XMLParser library from Frank Vanden Berghen

Simple and fast DOM-type parser. Supports XML namespaces, wide range of character sets & encodings. No library dependencies other than . A small tutorial is included on the website. Not tested myself.

PugiXML

«Light-weight, simple and fast XML parser for C++.» Pros and Cons (as advertised):

  • Low memory consumption and fragmentation (compared to other DOM style parsers).
  • Extremely high parsing speed.
  • Standard-conformant (with the exception of DTD related issues).
  • Pretty much error-ignorant.
  • Assumes UTF-8 encoding of the input data. UTF-16/32 not supported.
  • Fully standard compliant C++ code. Multiplatform.
  • High flexibility. You can control many aspects of file parsing and DOM tree building via parsing options.
  • Lacks validation, DTD processing, XML namespaces, proper handling of encoding.
  • Lacks UTF-16/32 parsing.

The website includes a short documentation page including some code samples that illustrate the use of the library. Not tested myself.

libroxml

This library is a minimum, easy-to-use, C implementation for xml file parsing. Libroxml targets mainly embedded software and environments, but you can use it whenever you need to deal with XML since libroxml is ligth and fast.

Info from the developer: The packaging is tar.gz or debian package (not yet in debian repo). It compiles under Linux, MacOSX and Windows (using cmake). No dependencies are used except one mutex (nativ criticalsection under windows or pthread for Linux and osx).

The target is embedded software so it is a DOM like API but with a very low footprint as it only indexes the XML content inside the file instead of loading it all.

API supports XML parsing of file or buffers, navigation inside the tree, a large subset of xpath is handled, tree can be created/modified in RAM and then commited (serialized) to a file or buffer. Namespace is about to be fully supported in next release.

Читайте также:  Arm процессор для linux
Оцените статью