JSON XML String
A bridge between XML and JSON, for round-tripping XML through JSON formats with proper white-space handling.
1. Analysis
JSON has only the primitive String to represent strings.
JSON strings have no additional tagging or metadata capabilities.
XML strings, on the other hand, are much more powerful. In fact, each XML element can be seen as a container for text nodes and sub-elements.
- The text nodes themselves can contain CDATA sections.
- Elements can define white-space handling: the surrounding XML element can define, via the special attribute xml:space, whether the XML-consuming app should decide (attribute is absent or has the value default) or whether the white space should be protected (attribute value preserve).
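To make the white-space rule concrete, here is a minimal Python sketch of a consumer that honors xml:space. It relies on the XML rule that an xml:space value also applies to descendant elements unless they override it. The function name read_texts is hypothetical, and collapsing runs of white space in the default case is only one possible choice, since the default leaves the decision to the application. For brevity the sketch only looks at the text before the first child element.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the predefined xml: prefix with its namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def read_texts(element, inherited="default"):
    """Yield (tag, text, effective xml:space) for an element and its descendants."""
    # An explicit attribute wins; otherwise the setting of the parent applies.
    space = element.get(XML_SPACE, inherited)
    text = element.text or ""
    if space != "preserve":
        # Assumption: with "default", this app collapses and trims white space.
        text = " ".join(text.split())
    yield element.tag, text, space
    for child in element:
        yield from read_texts(child, space)


doc = ET.fromstring(
    "<root>"
    "<a>  one   two  </a>"
    '<b xml:space="preserve">  one   two  <c> x </c></b>'
    "</root>"
)
result = list(read_texts(doc))
```

Here the text of a is collapsed, while b and its child c keep every space because c inherits preserve from b.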
XML in practice is used for two scenarios. The first is structured data:
<products>
  <product category="clothing">
    T-Shirt
  </product>
  <product category="food">
    Strawberry
  </product>
</products>
The second is formatted text, where text and markup are mixed:
<desc>
The <em>sweet</em> strawberry has
the best <b>taste</b> of all berries.
</desc>
In reality, e.g., in GraphML documents, both are combined: structured data (graph, nodes, edges) with formatted text (data, descriptions).
In our example, white-space handling is left at the default, so the content of <desc> could be processed into the JSON string
"The <em>sweet</em> strawberry has the best <b>taste</b> of all berries."
If the element sets xml:space="preserve", the white space must be kept:
<desc xml:space="preserve">
 The <em>sweet</em> strawberry has
 the best <b>taste</b> of all berries.
</desc>
This example could only be processed into this JSON string
"\n The <em>sweet</em> strawberry has\n the best <b>taste</b> of all berries.\n"
When converting back from JSON strings, it makes a difference whether a string is meant to encode XML or is a plain string.
If a plain string is written to XML, the characters < and & need to be escaped as &lt; and &amp;. A string that already encodes XML must be written unchanged.
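The difference between the two kinds of strings can be sketched in a few lines of Python. The helper names escape_plain_text and content_for_xml are hypothetical and only illustrate the rule stated above; they are not part of any existing API.

```python
def escape_plain_text(text: str) -> str:
    """Escape a plain string so it can be used as XML character data."""
    # Replace & first, otherwise the & of a freshly written &lt; would be escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;")


def content_for_xml(value: str, is_xml: bool) -> str:
    """Return the text to write between the start tag and the end tag."""
    # A string that already encodes XML is written as-is; a plain string is escaped.
    return value if is_xml else escape_plain_text(value)


plain = content_for_xml("Tom & Jerry <3", is_xml=False)
markup = content_for_xml("the best <b>taste</b>", is_xml=True)
```

With these inputs, plain becomes Tom &amp;amp; Jerry &amp;lt;3 as written in the XML file, while markup is left untouched so that the b element survives the round trip.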
2. Proposal
We define a JSON XML String as a new primitive value in our JSON APIs. A JSON XML String has two properties:
- xml: The string value. Required.
- xmlSpace: Either default (the default) or preserve. Optional. It encodes the effective XML space setting at the exporting element.
A JSON XML String can be represented as a JSON object that uses only the two properties xml and xmlSpace. The xmlSpace property defaults to default and may be omitted.
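As an illustration of this representation, the following Python sketch models a JSON XML String as a small value class. The class name JsonXmlString, the field name xml_space and the method to_json_object are assumptions made for this example; only the JSON property names xml and xmlSpace come from the proposal.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class JsonXmlString:
    xml: str                    # the string value, required
    xml_space: str = "default"  # "default" or "preserve"

    def __post_init__(self):
        if self.xml_space not in ("default", "preserve"):
            raise ValueError(f"invalid xmlSpace: {self.xml_space!r}")

    def to_json_object(self) -> dict:
        """Build the JSON object form, omitting xmlSpace when it is the default."""
        obj = {"xml": self.xml}
        if self.xml_space != "default":
            obj["xmlSpace"] = self.xml_space
        return obj


compact = {"separators": (",", ":")}
empty_json = json.dumps(JsonXmlString("").to_json_object(), **compact)
preserved_json = json.dumps(JsonXmlString(" a  b ", "preserve").to_json_object(), **compact)
```

Omitting xmlSpace in the default case keeps the common case short, at the cost of readers having to know the default.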
When a JSON XML String with xmlSpace set to preserve is converted back to XML as an element aaa, the expected output is
<aaa xml:space="preserve">
Hello &lt;3
</aaa>
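The conversion back to XML can be sketched as follows. The function to_xml_element is hypothetical. It assumes that the xml value is already well-formed XML content, so it is written verbatim, and that the xml:space attribute is only emitted for preserve.

```python
def to_xml_element(name: str, xml: str, xml_space: str = "default") -> str:
    """Serialize a JSON XML String as the content of an element called name."""
    # Assumption: "default" is expressed by leaving the attribute out.
    attr = ' xml:space="preserve"' if xml_space == "preserve" else ""
    # The xml value is not escaped again because it already encodes XML.
    return f"<{name}{attr}>{xml}</{name}>"


preserved = to_xml_element("aaa", "\nHello &lt;3\n", "preserve")
default = to_xml_element("desc", "The <em>sweet</em> strawberry")
```

Here preserved spans three lines, with the start tag carrying xml:space="preserve", and default is a plain desc element without the attribute.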
In JSON, the empty XML string is {"xml":""}.
We use these JSON XML Strings to round-trip GraphML textual XML, which occurs as <desc>, <key><default> and <data> elements.
3. Practical Advice
In Graphinout, we use a Jackson JSON parser. We inspect each JSON object and report those that have the property xml (and optionally xmlSpace) to the next API layer through a custom JSON API, in which JSON XML Strings are a kind of primitive.
See IJsonXmlString in the graphinout base repository.
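The same detection idea can be sketched with Python's standard json module instead of Jackson. This is not the Graphinout implementation: the names XmlStringValue and detect_json_xml_string are hypothetical, and the rule that an object counts as a JSON XML String only if it has xml and no properties other than xml and xmlSpace is an assumption derived from the proposal above.

```python
import json
from typing import NamedTuple


class XmlStringValue(NamedTuple):
    xml: str
    xml_space: str = "default"


def detect_json_xml_string(obj: dict):
    """object_hook: turn JSON objects that look like JSON XML Strings into a primitive."""
    keys = set(obj)
    if "xml" in keys and keys <= {"xml", "xmlSpace"} and isinstance(obj["xml"], str):
        return XmlStringValue(obj["xml"], obj.get("xmlSpace", "default"))
    return obj  # any other object stays an ordinary dict


text = (
    '{"id":"n1",'
    '"desc":{"xml":"Hello <b>you</b>","xmlSpace":"preserve"},'
    '"label":{"xml":""}}'
)
# json.loads calls the hook for every decoded JSON object, innermost first.
parsed = json.loads(text, object_hook=detect_json_xml_string)
```

After parsing, the outer object is still a dict, while desc and label arrive at the next layer as XmlStringValue instances rather than nested objects.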