Connected JSON Specification

A JSON Format for exchanging connected data (graphs, networks).

Permanent URL: J-S-O-N.org.
Remember it as the connected version of JSON.org
See also: Extended Connected JSON.

Version	5.0.0
Date	2025-07-14
Editor	Dr. Max Völkel
Status	Last Call Working Drafts (Until 2025-09-31)
Spec	this site
Git	https://github.com/Calpano/connected-json.git

1. Introduction

We want a JSON-based document format for exchanging graphs. Graphs contain nodes and edges: undirected edges, directed edges (DAG), typed edges (Hello RDF), weighted edges (Hello flow algorithms) and even hyper-edges (Hello biologists). We want subgraphs (Hello diagrams). We want data attached to nodes and edges (Hello knowledge graphs).

1.1. Goals and Motivation

Yes, we know, graph formats already exist. But the last effort (JGF, the JSON Graph Format) is over 10 years old, and GraphML is over 20 years old by now. Some GraphML features (mixed hyper-edges, nested graphs) are not supported in JGF. In fact, none of the existing JSON graph interchange formats has the same breadth of features as the XML-based GraphML.

Connected JSON aims to be a full GraphML replacement. It supports the semantic capabilities and data representation found in GraphML, while adopting a more flexible, schema-less JSON approach.

This format is intended as a universal interchange format for all kinds of graphs, which can be as complex as what GraphML allows — and that is a lot.

For ways to interpret similar, much more flexible formats unambiguously as Connected JSON, see Extended CJ.

To support streaming of large graphs (> 1 GB) and to make textual diffing of Connected JSON files easy, we also define Canonical Connected JSON.

1.2. Example

Connected JSON Example File

{
  "connectedJson": {
    "versionDate": "2025-07-14",
    "versionNumber": "5.0.0"
  },
  "baseUri": "http://example.org/",
  "graphs": [{
    "nodes": [
      { "id":  "12" },
      { "id":  "a",
        "ports": [
          { "id": "a1"},
          { "id": "a2",
            "ports": [ { "id": "a2-1" }, { "id": "a2-2" } ]
          }]},
      { "id":  "b", "data": {"foo": "bar"} },
      { "id":  "c" },
      { "id":  "d" },
      { "id":  "e" },
      { "id":  "f" }
    ],
    "edges": [
      { "endpoints": [
        { "direction": "in", "node":  "12"},
        { "direction": "out", "node":  "a"}
      ]},
      { "endpoints":  [
        { "direction": "in", "node": "12"},
        { "direction": "out", "node": "a", "port": "a2-1"}
      ]},
      { "endpoints":  [
        { "direction": "in", "node": "12"},
        { "direction": "in", "node": "a", "port": "a2-1" },
        { "direction": "out", "node": "d"},
        { "direction": "out", "node": "e"}
      ]},
      { "endpoints":  [
        { "direction": "in", "node": "12"},
        { "direction": "in", "node": "a"},
        { "direction": "out", "node": "d"},
        { "direction": "out", "node": "e"},
        { "direction": "undir", "node": "f"}
      ]}
    ],
    "data": {
      "hello": ["My data","can be","here"]
    }
  }]
}

1.3. Change Log

2025-09-23: Version 6.0.0

Removed graph.meta properties nodeCountTotal, edgeCountTotal, nodeCountInGraph, and edgeCountInGraph.
Moved canonical from graph.meta to document connectedJson.
Removed graph metadata.

2025-07-14: Version 5.0.0

Split spec into two parts: Connected JSON for writing strict files, where there is always only one option to encode a structure, and Extended CJ, which is much more liberal and flexible in parsing.
Moved edgeDefault to Extended CJ.

2025-07-10: Version 4.0.0

Simplified graph nesting. Now a CJ document is a graph (or array of graphs).

2025-07-03: Version 3.0.0

Renamed all properties with a dash to camelCase form. This makes it easier in practice to represent properties in programming languages as variable names or enum values.
- type-node → typeNode
- type-uri → typeUri
Renamed some lowercase properties to camelCase form. This avoids IDEs and editors complaining about spelling.
- baseuri → baseUri
- edgedefault → edgeDefault

2025-06-26: Version 2.0.0

Multilingual labels (Label): switched from a JSON object with language tags as property keys to a more canonical array-form.

2025-04-30: Version 1.1.0

Clearer ID section
Allow graph inside edge (consistent with diagram and GraphML)

2025-04-08: Version 1.0.0

Initial public release

2. Overview

Suggested MIME type: application/connected+json (not yet registered).

We define two main formats:

Connected JSON (CJ): A strict format for writing. There is always only one option to encode a structure.
Extended CJ (ECJ): A relaxed superset of CJ for reading. It offers many aliases, shortcuts and variants to interpret JSON as a graph. See Extended CJ Specification.

These main formats are refined based on allowing comments (JSON5 adds comments to JSON) and canonicalization:

Table 1. The Connected JSON Formats
Name	Default file extension	Purpose	Allows JSON Comments
Defined in Connected JSON (this specification)
Connected JSON	`.cj` or `.cj.json`	Written by tools	no
Connected JSON	`.cj.json5`	Written by tools, commented by humans.	yes
Canonical Connected JSON	`.cj`	Optimized for streaming and diffing	no
Defined in Extended CJ
Extended Connected JSON	`.json`	Read diverse JSON files	no
Extended Connected JSON	`.json5`	Read diverse JSON files	yes

All formats restrict JSON to the I-JSON subset defined in RFC 7493: no duplicate object properties, UTF-8 encoding, no unpaired surrogates.
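
Many JSON parsers silently accept duplicate object properties and keep the last one. The following is a minimal Python sketch of how a reader could enforce the duplicate-property restriction with the standard `json` module. The helper names `reject_duplicates` and `load_strict` are illustrative and not part of this specification; the encoding and surrogate checks are not covered here.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook: build a dict, raising on a repeated property name."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate property: {key!r}")
        result[key] = value
    return result

def load_strict(text):
    """Parse JSON text and reject any object with duplicate properties."""
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Calling `load_strict('{"id": "a", "id": "b"}')` raises a `ValueError`, while a document without duplicates is returned as ordinary dicts and lists.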

2.1. Conceptual Model

Before diving into JSON structures, it is helpful to describe how Connected JSON sees a graph. In general, Connected JSON supports hyperedges with mixed directionality, like GraphML. It also keeps the node and optional port model from GraphML. It supports two ways of Graph Nesting. Connected JSON allows (multilingual) labels on many elements.

A document contains graphs.
A graph contains nodes and edges.
A node may optionally consist of a hierarchical tree of ports.
An edge refers to nodes via endpoints.
An endpoint defines the direction of each edge-node connection: the node goes into the edge, out of the edge, or has no direction.
An endpoint can connect to a node and optionally fine-tune to a port within that node.

Figure 1. Conceptual Model
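
As a rough illustration of the conceptual model, the containment relations listed above could be mirrored in Python dataclasses. This is a simplified sketch, not a normative binding: the class names are assumptions, and optional properties such as labels, data and nested graphs are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Port:
    id: str
    ports: List["Port"] = field(default_factory=list)  # nested sub-ports

@dataclass
class Node:
    id: str
    ports: List[Port] = field(default_factory=list)

@dataclass
class Endpoint:
    node: str                    # id of the attached node
    port: Optional[str] = None   # optional port id within that node
    direction: str = "undir"     # "in", "out" or "undir"

@dataclass
class Edge:
    endpoints: List[Endpoint] = field(default_factory=list)
    id: Optional[str] = None

@dataclass
class Graph:
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

@dataclass
class Document:
    graphs: List[Graph] = field(default_factory=list)
```

An edge from port `a1` of node `a` to node `b` would then be `Edge(endpoints=[Endpoint("a", "a1", "out"), Endpoint("b", direction="in")])`.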

3. Elements

3.1. Document

Every file is a document.

Table 2. Property Table in Canonical / Streaming Order
Property	Type	Description
`connectedJson`	`object`(Document Metadata)	Optional. Document Metadata
`baseUri`	`string`(URI)	Optional. Is used to fine-tune the Interpretation as RDF.
`data`	`any`	Optional. Allows user-attached Data.
`graphs`	`array`(Graph `[]`)	Default: Empty. See also Graph Nesting.

3.1.1. Document Metadata

A graph may state a connectedJson property, which is only interpreted at root level.


Property	Type	Description
`canonical`	`boolean`	Optional. If `true`, this document is considered a canonical representation: All properties are ordered according to the property tables. Default: `false`.
`versionDate`	`string`	Optional. Version date identifier to define the Connected JSON version used by the document. E.g. `2025-07-10`
`versionNumber`	`string`	Optional. Version number identifier to define the Connected JSON version used by the document. E.g. `4.0.0`

3.2. ID

IDs (identifiers) are used in Connected JSON to address nodes, ports, edges and graphs. Ids are strings.

If an array contains elements with an id (this mechanism is used in graphs, nodes, edges), then the ids must be unique within that array. If an id is used for multiple entries in the array, later entries are interpreted as JSON Merge Patch on the earlier ones and a parse warning MUST be emitted. The merging is done as defined in RFC 7386.
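
The duplicate-id rule can be sketched in Python as follows. This is a minimal illustration, assuming elements are already parsed into dicts. The function names `merge_patch` and `merge_duplicate_ids` are hypothetical, and Python's `warnings` module stands in for whatever warning channel a real parser provides.

```python
import warnings

def merge_patch(target, patch):
    """Apply a JSON Merge Patch to target and return the result."""
    if not isinstance(patch, dict):
        return patch  # non-object patches replace the target entirely
    if not isinstance(target, dict):
        target = {}
    result = dict(target)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null removes the property
        else:
            result[key] = merge_patch(result.get(key), value)
    return result

def merge_duplicate_ids(elements):
    """Merge later entries with a repeated id into the first entry with that id."""
    result = []
    index_by_id = {}
    for element in elements:
        element_id = element.get("id")
        if element_id is not None and element_id in index_by_id:
            warnings.warn(f"duplicate id {element_id!r}: merged into earlier entry")
            position = index_by_id[element_id]
            result[position] = merge_patch(result[position], element)
        else:
            if element_id is not None:
                index_by_id[element_id] = len(result)
            result.append(element)
    return result
```

Elements without an id (for example edges, where the id is optional) are passed through unchanged.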

3.2.1. Identifier Scope

The identifiers for different elements have different scopes in which they must be unique.

Scope	Comment
Document	Node ids, Edge ids and Graph ids are unique per document. Nested graphs do not provide a new id scope.
Node	Port ids are only unique within their corresponding Node.

3.3. Label

Labels are used in Connected JSON to label nodes, ports, edges and graphs. In Connected JSON, labels are multilingual: each label entry is an object with an optional language property and a required value property. The label itself is an array of such label entries.

[
    {"language":"de", "value": "Hallo, Welt"},
    {"language":"en", "value": "Hello, World"},
    // a value without language information is also allowed
    { "value": "Hi"}
]

If a language tag (including the empty one) is used multiple times, later entries are interpreted as JSON Merge Patch on the earlier ones and a parse warning MUST be emitted. The merging is done as defined in RFC 7386.

Table 3. Property Table in Canonical / Streaming Order
Property	Type	Description
`language`	`string`	Optional. Language tag. Usually according to BCP 47.
`value`	`string`	Required. The label value.
`data`	`any`	Optional. Allows user-attached Data.

Multilingual labels in Connected JSON have been modelled similar to labels in JSON-LD 1.1, expanded form.
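
A consumer typically needs one display string from a multilingual label. The sketch below shows one possible lookup in Python. The function name `pick_label` and the fallback order (requested language, then the entry without a language, then the first entry) are assumptions of this example and are not defined by the specification.

```python
def pick_label(label, language):
    """Return the label value for a language, with simple fallbacks."""
    by_language = {}
    for entry in label:
        # Entries without a language are stored under the key None.
        # A repeated language tag overrides the earlier value.
        by_language[entry.get("language")] = entry["value"]
    if language in by_language:
        return by_language[language]
    if None in by_language:
        return by_language[None]
    # Fall back to the first entry, or None for an empty label.
    return next(iter(by_language.values()), None)
```

For the label array shown above, `pick_label(label, "de")` returns `"Hallo, Welt"`, and an unknown language such as `"fr"` falls back to `"Hi"`.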

3.4. Graph

Contains one or more nodes and/or one or more edges.

Table 4. Property Table in Canonical / Streaming Order
Property	Type	Description
`id`	`string`	Optional. Unique identifier for the graph within a Document. See ID.
`label`	`object`	Optional. Label (name) of the graph. See Label.
`data`	`any`	Optional. Allows user-attached Data.
`nodes`	`array`(Node `[]`)	0 to n nodes. Default: Empty.
`edges`	`array`(Edge `[]`)	0 to n edges (which may be bi- or hyperedges). Default: Empty.
`graphs`	`array`(Graph `[]`)	Default: Empty. See Graph Nesting.

3.5. Node

A node is an atom in the graph.

Table 5. Property Table in Canonical / Streaming Order
Property	Type	Description
`id`	`string`	Required. Unique identifier for the node. See ID.
`label`	`object`	Optional. Label (name) of the node. See Label.
`ports`	`array`(Port `[]`)	Optional array of Port.
`data`	`any`	Optional. Allows user-attached Data.
`graphs`	`array` (Graph `[]`)	Optional. Graph(s) nested within the node. This turns the node into a compound node. The edges in a subgraph can refer to nodes higher up in the tree of graphs. See Graph Nesting.

3.6. Port

A port is always a part of a Node. A layout should place a port on the border of the node widget. Ports may be hierarchically nested. This is used in practice in graphical editors, where a port is a connection point on a node.

Table 6. Property Table in Canonical / Streaming Order
Property	Type	Description
`id`	`string`	Required. ID unique within the Node. All ports, even nested ones, share the same ID space per node. See also ID.
`label`	`object`	Optional. Label (name) of the port. See Label.
`ports`	`array`(Port `[]`)	Optional array of sub-ports. Recursively.
`data`	`any`	Optional. Allows user-attached Data.
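
Because all ports of a node share one ID space, a port can be found by a single id even when it is nested. The following Python sketch walks the port tree of a parsed node. The function names `iter_ports` and `find_port` are illustrative only, and ports are assumed to be objects with an `id` property as in the table above.

```python
def iter_ports(ports):
    """Yield every port in a port tree, parents before their sub-ports."""
    for port in ports:
        yield port
        yield from iter_ports(port.get("ports", []))

def find_port(node, port_id):
    """Return the port with the given id within a node, or None."""
    for port in iter_ports(node.get("ports", [])):
        if port["id"] == port_id:
            return port
    return None
```

Collecting `port["id"]` over `iter_ports(...)` and checking the result for repeats is one way to validate the per-node uniqueness requirement.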

3.7. Edge

Uses endpoints to link to nodes. However, simple bi-edges with only two ends have a shortcut syntax.

The structural model for any edge is this:

Figure 2. Edge Model

An edge has n endpoints.
An endpoint defines the direction of the attached node, relative to the edge: the node is incoming, outgoing or undirected (from the perspective of the edge).
A target can be a node or a port attached to a node. A port can also be nested within other ports, forming a kind of recursive port tree. GraphML has this.

Edges have been modelled like GraphML. They have been extended with a type-property, to make it easier to express RDF.

Table 7. Property Table in Canonical / Streaming Order
Property	Type	Description
`id`	`string`	Optional id. Unique per document. See ID.
`label`	`object`	Optional. Label (name) of the edge. See Label.
`type`	`string`	Optional. The kind of edge. Any type defined here applies to all endpoints. Endpoints override this type, if set. See Edge Endpoint and Interpretation as RDF.
`typeUri`	`string`
`typeNode`	`string`
`endpoints`	`array` (Edge Endpoint `[]`)	The endpoints define the nodes to which this edge is attached.
`data`	`any`	Optional. Allows user-attached Data.
`graphs`	`array` (Graph `[]`)	Optional. Graph(s) nested within the edge. This turns the edge into a compound edge. The edges in a sub-graph can refer to edges higher up in the tree of graphs. See Graph Nesting.

Precedence between type, typeUri and typeNode is the same as defined for Edge Endpoint.

3.8. Edge Endpoint

Table 8. Property Table in Canonical / Streaming Order
Property	Type	Description
`node`	`string`	Required. Node id. A `string` containing a single nodeId (ID). This is the id of the Node to which this endpoint is attached.
`port`	`string`	Optional. Port id. Port ids are only unique per node/port. See ID. If a port is referenced, it defines in addition to the node where precisely the endpoint is attached. NOTE: All port ids are unique within a node (see Identifier Scope), so that a single string can address all ports directly.
`direction`	One of: `in`, `out` or `undir`	Optional. Maps to incoming (`in`), outgoing (`out`), or undirected (`undir`). Default is `undir`.
`type`	`string`	Optional. The type of relation from the edge entity to the endpoint node. If a URI is given, use `typeUri` instead. This property states the relation as a string, e.g. `works at` or `knows`. Default is `related`.
`typeUri`	`string`(URI)	Optional. The type of relation from the edge entity to the endpoint node.
`typeNode`	`string`	Uses a node in the graph (referenced by node id, see ID) to define the kind of relation. This is the same strategy that RDF uses: property URIs are themselves RDF resources, which can have a label and other edges attached to them.
`data`	`any`	Optional. Allows user-attached Data.

Edge Type (type, typeUri, typeNode)

Either type, typeUri, or typeNode MAY be used. If several are given, typeUri has precedence, then typeNode, then type. Usually, the type of edge is defined at the Edge level. However, in hyper-edges more complex relations (tuples) may need to be expressed. In this case, endpoint-level typing can be used.
If both edge and endpoint types are given, the endpoint type has precedence. See also Interpretation as RDF.
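
The precedence rules can be sketched as a small Python resolver. This is one reading of the rules above, stated as an assumption: any type given on the endpoint wins over any type given on the edge, and within one element `typeUri` beats `typeNode`, which beats `type`. The function names are hypothetical.

```python
def own_type(element):
    """Return (property name, value) of the strongest type on one element."""
    for key in ("typeUri", "typeNode", "type"):
        if key in element:
            return key, element[key]
    return None

def effective_type(edge, endpoint):
    """Resolve the type of an endpoint, falling back to the edge, then the default."""
    return own_type(endpoint) or own_type(edge) or ("type", "related")
```

For example, an endpoint without any type on an edge with `"typeUri": "http://example.org/knows"` resolves to that URI, while an endpoint with `"type": "works at"` resolves to `works at` regardless of the edge.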

4. Features

4.1. Data

User-defined data can be attached to Document, Graph, Node, Edge, Port and Edge Endpoint via the data property. The value may be any JSON value. An array can be used, together with the OCIF extension mechanism.

This can be used, for example, to attach style data (e.g. line-color), domain data (e.g. population, sales volume), provenance data (e.g. source), or any other relevant information.

4.2. Graph Nesting

Graphs can be nested within other graphs (Graphs In Graphs) or within other nodes and edges (Graphs In Nodes And Edges; a GraphML mechanism). The nesting depth is not limited. This allows for hierarchical, recursive graph structures.

All nodes in a top-level graph, including all nodes nested within subgraphs, recursively, share the same ID space. The same is true for edges. Any edges, including those nested in nested graphs, may link to any node within the top-level graph, including those within nested graphs.

Figure 3. Graph Nesting
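
Since nested graphs do not open a new ID scope, a reader can flatten the nesting before resolving edge endpoints. The Python sketch below walks graphs nested in graphs, nodes and edges of a parsed document and reports endpoints that refer to unknown nodes. The function names are illustrative and not part of the specification.

```python
def iter_graphs(graphs):
    """Yield every graph, including those nested in graphs, nodes and edges."""
    for graph in graphs:
        yield graph
        yield from iter_graphs(graph.get("graphs", []))
        for element in graph.get("nodes", []) + graph.get("edges", []):
            yield from iter_graphs(element.get("graphs", []))

def collect_node_ids(document):
    """Return the ids of all nodes at any nesting depth."""
    ids = set()
    for graph in iter_graphs(document.get("graphs", [])):
        for node in graph.get("nodes", []):
            ids.add(node["id"])
    return ids

def dangling_endpoints(document):
    """Return (edge id or None, node id) for endpoints whose node is unknown."""
    known = collect_node_ids(document)
    missing = []
    for graph in iter_graphs(document.get("graphs", [])):
        for edge in graph.get("edges", []):
            for endpoint in edge.get("endpoints", []):
                if endpoint["node"] not in known:
                    missing.append((edge.get("id"), endpoint["node"]))
    return missing
```

An edge in a top-level graph may therefore point at a node that only appears inside a graph nested in another node, and the check above accepts it.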

4.2.1. Graphs In Graphs

Nesting graphs in graphs partitions nodes and edges into subsets. All nodes and edges are treated as one large graph. Any edge can refer to any node. The subgraph is merely used as a container entity. Its id and label do not contribute to the resulting nodes and edges model.

4.2.2. Graphs In Nodes And Edges

In Connected JSON, like in GraphML, nodes and edges can also contain subgraphs. Those subgraphs additionally turn their container node into a compound node (or their container edge into a compound edge).

In a compound node, the ID and Label of the subgraph(s) are mapped to id and label of synthetic, implied compound node(s). Typically, this is represented in an application by adding synthetic 'contains'-edges from container element to contained elements.

4.3. Streaming

JSON in general is not ideal for streaming data, see also Notes on Streaming JSON. However, Canonical CJ is designed to be streamed efficiently. The property tables are sorted for optimized stream processing. This order is in contrast to RFC 8785 (JSON Canonicalization Scheme, JCS), which defines strict lexicographical order. Canonical CJ requires the order of properties to be followed exactly.

Rationale

Most entities are expected to be reasonably small, so that they can be completely processed in memory. Some entities may occur a large number of times. In general, small properties must come before the large properties (due to values with many child elements).

5. Canonical Connected JSON

Canonical CJ defines a strict order on property keys, compatible with Streaming, so that files can also be used in textual diffs. Canonical CJ is a strict subset of Connected JSON. It forbids using comments (no JSON5). Canonical CJ mandates a strict formatting, described below. Properties in which the value is an empty array should be omitted.

Summary

Mandatory pretty-printing
Mandatory property order

5.1. Formatting

There is no RFC defining JSON pretty-printing, so here is a small spec. We need a compact, well-defined format, so that different CJ tools create the exact same syntax. Also, we need line breaks to make textual diffing work. Canonical CJ compliant tools MUST adhere to these rules:

Indentation

Each level of nesting within an object or array must be indented.
The indentation must consist of two spaces. Tabs must not be used.

Line-Breaks

The line break character is \n.
The opening brace { of an object and the opening bracket [ of an array must be placed on the same line as their corresponding key or at the beginning of the document.
Each key-value pair in an object and each element in an array must be placed on its own line.
The closing brace } or bracket ] must be placed on a new line, aligned with the indentation level of its opening brace or bracket.

Spacing

There must be one space after the colon : in a key-value pair.
No other whitespace (except the indentation spaces and line-breaks) is permitted.

Commas

A comma , must follow every element in an array and every key-value pair in an object, except for the last one.

Example

{
  "connectedJson": {
    "canonical": true,
    "versionDate": "2025-07-14",
    "versionNumber": "5.0.0"
  },
  "baseUri": "http://example.org/",
  "data": {
    "author": "Max Völkel"
  },
  "graphs": [
    {
      "id": "graph1",
      "label": {
        "language": "en",
        "value": "Example Graph"
      },
      "nodes": [
        {
          "id": "node1",
          "label": {
            "language": "en",
            "value": "Node 1"
          }
        }
      ],
      "edges": [
        {
          "id": "edge1",
          "label": {
            "language": "en",
            "value": "Edge from Node 1 to Node 2"
          },
          "endpoints": [
            {
              "node": "node1",
              "direction": "out"
            }
          ]
        }
      ]
    }
  ]
}
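
A writer for Canonical CJ could be sketched in Python as below. The property orders are copied from the property tables in this specification. Everything else is an assumption of this example: the names `ORDER`, `CHILD`, `reorder` and `to_canonical`, keeping unknown properties after the known ones, leaving `data` untouched, and writing non-ASCII characters unescaped. Python's `json.dumps` with `indent=2` happens to produce two-space indentation, one element per line, `\n` line breaks and a single space after each colon.

```python
import json

# Property order per element kind, taken from the property tables.
ORDER = {
    "document": ["connectedJson", "baseUri", "data", "graphs"],
    "metadata": ["canonical", "versionDate", "versionNumber"],
    "label": ["language", "value", "data"],
    "graph": ["id", "label", "data", "nodes", "edges", "graphs"],
    "node": ["id", "label", "ports", "data", "graphs"],
    "port": ["id", "label", "ports", "data"],
    "edge": ["id", "label", "type", "typeUri", "typeNode",
             "endpoints", "data", "graphs"],
    "endpoint": ["node", "port", "direction", "type", "typeUri",
                 "typeNode", "data"],
}

# Which element kind the value of a property holds, per parent kind.
CHILD = {
    "document": {"connectedJson": "metadata", "graphs": "graph"},
    "graph": {"label": "label", "nodes": "node", "edges": "edge", "graphs": "graph"},
    "node": {"label": "label", "ports": "port", "graphs": "graph"},
    "port": {"label": "label", "ports": "port"},
    "edge": {"label": "label", "endpoints": "endpoint", "graphs": "graph"},
}

def reorder(value, kind):
    """Return a copy of value with properties in canonical order."""
    if isinstance(value, list):
        return [reorder(item, kind) for item in value]
    if not isinstance(value, dict) or kind is None:
        return value  # scalars and user data are left untouched
    known = ORDER[kind]
    # Assumption: unknown properties are kept, after the known ones.
    keys = [k for k in known if k in value] + [k for k in value if k not in known]
    ordered = {}
    for key in keys:
        if key != "data" and value[key] == []:
            continue  # empty arrays are omitted
        ordered[key] = reorder(value[key], CHILD.get(kind, {}).get(key))
    return ordered

def to_canonical(document):
    """Serialize a parsed CJ document in canonical order and formatting."""
    return json.dumps(reorder(document, "document"), indent=2, ensure_ascii=False)
```

Two tools that both follow this scheme would emit byte-identical text for the same document, which is what makes line-based diffs meaningful.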

Appendix A: JSON Schema

Download

JSON Schema

Appendix B: Reserved Property Names

The following property names are used by Connected JSON in certain places.

Property	Usage
`baseUri`	Graph base URI for RDF interpretation
`connectedJson`	Document
`canonical`	Document Metadata
`data`	Reserved property for user data. Connected JSON does not interpret this property for any element.
`direction`	Edge Endpoint direction (in/out/undir)
`edges`	Graph edges
`endpoints`	Edge endpoints
`graphs`	Node nested graphs, Edge nested graphs
`id`	Node id, Edge id, Graph id, Port id
`label`	Node, Edge, Graph, Port
`language`	Label
`node`	Edge Endpoint referenced node id
`nodes`	Graph nodes
`port`	Edge Endpoint referenced port id
`ports`	Node ports
`type`	Edge, Edge Endpoint
`typeNode`	Edge, Edge Endpoint
`typeUri`	Edge, Edge Endpoint
`value`	Label
`versionDate`	Document Metadata
`versionNumber`	Document Metadata
