dataset reference nodes
Overview
Dataset reference nodes (or “data nodes” for short) are reference nodes that represent and contain an evolvable "payload" dataset in the form of a data flow.
Because it is evolvable, its intramesh identifier is typed as a DatasetSeries.
Its actual data is kept in a node flow.
Its versions are datasets.
Unlike flow snapshots, which contain concrete data distributions, data nodes serve as conceptual containers that organize and provide identity for data without containing the data directly. That is, a data node contains concrete datasets only indirectly: it contains a data flow (itself abstract), and that flow's snapshots hold the concrete distributions.
Data nodes are physically represented as mesh folders and correspond to namespace segments.
Abstract vs Concrete Data
Abstract Data Concept (data node)
A data node represents the idea or concept represented by a dataset:
- /ns/djradon/bio/ = a biographical dataset about the person djradon
- /ns/census/ = the results of a census
- /ns/weather-stations/ = "the concept of weather station data"
This idea or concept is the referent of the data node's URL.
The data node provides:
- Stable identity: The concept persists even as concrete data changes
- Organizational structure
Data flow (DatasetSeries)
The data flow is the single user data flow for a node, realized by snapshots:
- /ns/monsters/_data-flow/_current/ = the current dataset snapshot
- /ns/weather-stations/_data-flow/_v3/ = version 3 dataset snapshot
Snapshots contain distribution files: the actual data in various formats (e.g., .trig, .jsonld).
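The path pattern in the examples above can be sketched in a few lines of Python. This is only an illustration: the function name `snapshot_path` is hypothetical, and the `_vN` naming for versioned snapshots is an assumption extrapolated from the single `_v3` example.

```python
from pathlib import PurePosixPath
from typing import Optional

DATA_FLOW = "_data-flow"  # folder name taken from the examples above
CURRENT = "_current"      # folder name taken from the examples above


def snapshot_path(node_path: str, version: Optional[int] = None) -> str:
    """Return the mesh path of a data node's snapshot folder.

    version=None selects the current snapshot; an integer N selects `_vN`
    (assumed naming, extrapolated from the `_v3` example).
    """
    if version is not None and version < 0:
        raise ValueError("version must not be negative")
    segment = CURRENT if version is None else f"_v{version}"
    # PurePosixPath drops the trailing slash, so it is added back explicitly.
    return f"{PurePosixPath(node_path) / DATA_FLOW / segment}/"
```

For instance, `snapshot_path("/ns/weather-stations/", 3)` builds the versioned path from the second example, while omitting the version yields the `_current/` folder.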
Required Structure
Every data node must contain:
- metadata flow (_meta-flow/): Administrative metadata about the data concept
- data flow (_data-flow/): dataset data
- Node handle (_node-handle/): Referential indirection for the node
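A minimal sketch of how the required structure could be checked on disk, assuming a data node is an ordinary folder on the local filesystem. The helper name `missing_components` is hypothetical and not part of any mesh tooling; only the three folder names come from the list above.

```python
from pathlib import Path
from typing import List

# Required component folders, as listed above.
REQUIRED_COMPONENTS = ("_meta-flow", "_data-flow", "_node-handle")


def missing_components(node_dir: Path) -> List[str]:
    """Return the required component folders that are absent from node_dir."""
    return [
        name
        for name in REQUIRED_COMPONENTS
        if not (node_dir / name).is_dir()
    ]
```

An empty result means the folder has all three required components. The check looks only at folder presence and says nothing about whether their contents are valid.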
Optional Structure
- Asset trees (_assets/): Attached file collections
- CHANGELOG and README
- Node Config Defaults
Key Characteristics
Not a Dataset
Important: A data node does not refer to a specific RDF graph; it is not itself a (concrete) dataset. It represents the abstract concept of a dataset that may evolve over time:
- data nodes are never versioned (only their components are)
- data nodes serve as stable conceptual anchors
Extensible Container
Like all mesh nodes, data nodes can contain other mesh nodes and components, making them extensible namespace containers.
Examples
Unversioned data node
ns/monsters/
├── _meta-flow/ # metadata about the "monsters" data node
├── _node-handle/ # handle for the data node
└── _data-flow/ # single data flow
    └── _current/ # current dataset snapshot
        ├── monsters.jsonld # concrete distribution of the current snapshot
        └── monsters.trig
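As a rough illustration of how a consumer might reach the concrete data in a layout like this, the following sketch lists the distribution files of a node's current snapshot. It assumes the node is a local folder; `current_distributions` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List


def current_distributions(node_dir: Path) -> List[str]:
    """List distribution file names in the node's current snapshot.

    Returns an empty list when the node has no `_data-flow/_current/` folder.
    """
    snapshot = node_dir / "_data-flow" / "_current"
    if not snapshot.is_dir():
        return []
    # Only plain files count as distributions; subfolders are ignored.
    return sorted(p.name for p in snapshot.iterdir() if p.is_file())
```

The data node folder itself holds no distributions; they are found only by descending through the data flow to a snapshot, which mirrors the abstract versus concrete distinction described earlier.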