dataset reference nodes

Overview

Dataset reference nodes (or “data nodes” for short) are reference nodes that represent and contain an evolvable "payload" dataset in the form of a data flow.

Because it is evolvable, it is typed as a DatasetSeries.

Its actual data is kept in a node flow.

Its versions are datasets.

Unlike flow snapshots, which contain concrete data distributions, data nodes serve as conceptual containers that organize and give identity to data without containing it directly. That is, a data node contains concrete datasets only by virtue of containing a data flow (itself abstract) and that flow's snapshots, which have concrete distributions.

Data nodes are physically represented as mesh folders and correspond to namespace segments.

Abstract vs Concrete Data

Abstract Data Concept (data node)

A data node represents the idea or concept behind a dataset:

  • /ns/djradon/bio/ = a biographical dataset about the person djradon
  • /ns/census/ = the results of a census
  • /ns/weather-stations/ = "the concept of weather station data"

This idea or concept is the referent of the data node's URL.

The data node provides:

  • Stable identity: The concept persists even as concrete data changes
  • Organizational structure

Data flow (DatasetSeries)

The data flow is a node's single user data flow, realized by snapshots:

  • /ns/monsters/_data-flow/_current/ = the current dataset snapshot
  • /ns/weather-stations/_data-flow/_v3/ = version 3 dataset snapshot

Snapshots contain distribution files: the actual data in various formats (e.g., .trig, .jsonld).
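The path layout above can be sketched in a few lines of Python. This is only an illustration, assuming the mesh is checked out as ordinary folders on disk; the helper names `snapshot_dir` and `distributions` are hypothetical and not part of any mesh tooling.

```python
from pathlib import Path


def snapshot_dir(node_dir: Path, snapshot: str = "_current") -> Path:
    """Return the folder of one snapshot inside a data node's data flow."""
    return node_dir / "_data-flow" / snapshot


def distributions(node_dir: Path, snapshot: str = "_current") -> list[Path]:
    """List the distribution files of a snapshot, sorted by file name."""
    snap = snapshot_dir(node_dir, snapshot)
    return sorted(p for p in snap.iterdir() if p.is_file())
```

For example, `distributions(Path("ns/monsters"))` would list the files under `ns/monsters/_data-flow/_current/`, while passing `snapshot="_v3"` would target a versioned snapshot instead.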

Required Structure

Every data node must contain:

  • Metadata flow (_meta-flow/): administrative metadata about the data concept
  • Data flow (_data-flow/): the node's payload dataset
  • Node handle (_node-handle/): referential indirection for the node
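As a rough sketch of how this requirement could be checked, the following Python function reports which required folders are missing from a data node. It assumes the node is an ordinary directory on disk; the function name `missing_components` is hypothetical, and only the three folder names come from the list above.

```python
from pathlib import Path

# Folder names taken from the "Required Structure" list.
REQUIRED_COMPONENTS = ("_meta-flow", "_data-flow", "_node-handle")


def missing_components(node_dir: Path) -> list[str]:
    """Return the required component folders that are absent from node_dir."""
    return [name for name in REQUIRED_COMPONENTS if not (node_dir / name).is_dir()]
```

An empty result means the folder has all three required components; it says nothing about whether their contents are valid.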

Optional Structure

Key Characteristics

Not a Dataset

Important: A data node does not refer to a specific RDF graph; it is not itself a (concrete) dataset. It represents the abstract concept of a dataset that may evolve over time:

  • Data nodes are never versioned (only their components are)
  • Data nodes serve as stable conceptual anchors

Extensible Container

Like all mesh nodes, data nodes can contain other mesh nodes and components, making them extensible namespace containers.

Examples

Unversioned data node

ns/monsters/
├── _meta-flow/                 # metadata about the "monsters" data node
├── _node-handle/               # handle for the data node
└── _data-flow/                 # single data flow
    └── _current/               # current dataset snapshot
        ├── monsters.jsonld     # concrete distribution of the current snapshot
        └── monsters.trig
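
A layout like the one above could be scaffolded with a short script. This is a minimal sketch, assuming a plain filesystem checkout; `scaffold_data_node` is a hypothetical helper, and the distribution files it creates are empty placeholders rather than valid RDF.

```python
from pathlib import Path


def scaffold_data_node(root: Path, name: str) -> Path:
    """Create an unversioned data node folder tree under root/ns/name."""
    node = root / "ns" / name
    current = node / "_data-flow" / "_current"
    # Required components of every data node, plus the current snapshot folder.
    for folder in (node / "_meta-flow", node / "_node-handle", current):
        folder.mkdir(parents=True, exist_ok=True)
    # Placeholder distribution files (empty; real content must be written separately).
    for ext in ("jsonld", "trig"):
        (current / f"{name}.{ext}").touch()
    return node
```

Calling `scaffold_data_node(Path("."), "monsters")` would produce the same folder and file names as the tree shown above.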
