Aggregated Distribution

Note: aggregated distributions are tentative; they may or may not be implemented.

A node's aggregated distribution is a compilation of the data flows of all its contained data nodes (their _data-flow/_current/ snapshots). It sits directly under the parent node and has an intuitive filename such as "nodename.ext".

Both bare nodes and dataset reference nodes can have aggregated distributions. Data nodes include their own data in the aggregation.
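The "nodename.ext" naming convention can be illustrated with a small sketch. The function name aggregated_distribution_path is hypothetical and not part of any existing tooling; it only assumes that the file is named after the last segment of the node's path.

```python
from pathlib import PurePosixPath


def aggregated_distribution_path(node_path: str, ext: str) -> PurePosixPath:
    """Return where an aggregated distribution would sit for a node.

    Hypothetical helper: the file lives directly under the node and is
    named after the node's own last path segment.
    """
    node = PurePosixPath(node_path)
    # Accept both "ttl" and ".ttl" as the extension.
    return node / f"{node.name}.{ext.lstrip('.')}"


example_path = aggregated_distribution_path("/my-ontology/", "ttl")
print(example_path)
```

For the node /my-ontology/ and the extension ttl, this yields /my-ontology/my-ontology.ttl.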

Purpose

Aggregated distributions enable composable semantic data by:

  • Combining contained nodes' data into a single resource
  • Supporting modular ontology and knowledge base construction

Generation Process

During the Weave Process, aggregated distributions are created by:

  1. Scanning contained data nodes recursively within the mesh structure
  2. Collecting _data-flow/_current/ distributions from each flow
  3. Merging content with proper URI resolution and prefix handling
  4. Excluding _config and _meta datasets (data content only)
  5. Generating multiple distributions (.ttl, .rdf, .jsonld) as configured
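
Steps 1, 2, and 4 above can be sketched as a directory walk. This is a minimal illustration, not the Weave Process implementation: the function name collect_current_distributions is hypothetical, and the exact on-disk layout (in particular where _config and _meta folders sit) is an assumption made for the demo.

```python
import tempfile
from pathlib import Path

# Assumption: excluded datasets are recognisable by these folder names.
EXCLUDED = {"_config", "_meta"}


def collect_current_distributions(node_dir: Path, ext: str = ".ttl") -> list[Path]:
    """Recursively collect files sitting directly in a _data-flow/_current/ folder."""
    found = []
    for path in sorted(node_dir.rglob(f"*{ext}")):
        parts = path.relative_to(node_dir).parts
        # Step 4: skip anything that belongs to a _config or _meta dataset.
        if any(part in EXCLUDED for part in parts):
            continue
        # Step 2: keep only files whose parent folders are _data-flow/_current.
        if len(parts) >= 3 and parts[-3:-1] == ("_data-flow", "_current"):
            found.append(path)
    return found


# Demo on a throwaway mesh; all paths below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my-ontology"
    for rel in (
        "my-ontology.ttl",  # an existing aggregated distribution, not a source
        "components/Person/_data-flow/_current/Person.ttl",
        "components/hasName/_data-flow/_current/hasName.ttl",
        "components/Person/_meta/_data-flow/_current/meta.ttl",
    ):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    names = [p.name for p in collect_current_distributions(root)]

print(names)
```

Only the two data-node snapshots are collected; the aggregated file at the node root and the file under _meta are ignored.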

Examples

Composable Ontology

/my-ontology/
├── my-ontology.ttl              ← Aggregated distribution
├── my-ontology.rdf              ← Aggregated distribution  
├── my-ontology.jsonld           ← Aggregated distribution
└── components/
    ├── Person/                  ← data node (class definition)
    ├── hasName/                 ← data node (property definition)
    └── Organization/            ← data node (class definition)

Knowledge Base

/biotech-kb/
├── biotech-kb.ttl               ← Aggregated distribution
├── biotech-kb.jsonld            ← Aggregated distribution
├── companies/
│   ├── genentech/               ← Company data node
│   └── moderna/                 ← Company data node
└── products/
    ├── drug-x/                  ← Product data node
    └── vaccine-y/               ← Product data node

Technical Considerations

Merging logic handles:

  • Relative path resolution - Converting relative URIs to absolute
  • Prefix consolidation - Deduplicating namespace declarations
  • Graph merging - Combining RDF graphs from multiple sources and de-duplicating triples
  • Base URI handling - Ensuring consistent URI resolution
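
The merging concerns listed above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: triples are plain tuples of URI strings (no literals or blank nodes), the function names merge_graphs and consolidate_prefixes are hypothetical, the example.org URIs are invented, and raising an error on conflicting prefixes is one possible policy rather than the documented one.

```python
from urllib.parse import urljoin

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"


def consolidate_prefixes(prefix_maps):
    """Merge prefix declarations, dropping exact duplicates."""
    merged = {}
    for prefix_map in prefix_maps:
        for prefix, namespace in prefix_map.items():
            existing = merged.setdefault(prefix, namespace)
            if existing != namespace:
                # Assumed policy: refuse to guess which binding wins.
                raise ValueError(f"prefix {prefix!r} is bound to two namespaces")
    return merged


def merge_graphs(sources):
    """Union triples from (base_uri, triples) pairs after making URIs absolute."""
    merged = set()  # a set de-duplicates identical triples
    for base_uri, triples in sources:
        for triple in triples:
            # urljoin leaves absolute URIs unchanged and resolves relative ones.
            merged.add(tuple(urljoin(base_uri, term) for term in triple))
    return merged


person_base = "https://example.org/my-ontology/components/Person/"
has_name_base = "https://example.org/my-ontology/components/hasName/"
sources = [
    (person_base, [("", RDF_TYPE, OWL_CLASS)]),
    (has_name_base, [
        ("", RDFS_DOMAIN, "../Person/"),
        ("../Person/", RDF_TYPE, OWL_CLASS),  # duplicate once resolved
    ]),
]
merged = merge_graphs(sources)
prefixes = consolidate_prefixes([
    {"owl": "http://www.w3.org/2002/07/owl#"},
    {"owl": "http://www.w3.org/2002/07/owl#",
     "rdfs": "http://www.w3.org/2000/01/rdf-schema#"},
])
print(len(merged), sorted(prefixes))
```

Resolving each term against its source's base URI before taking the union is what lets the duplicate class declaration collapse into a single triple.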

Use Cases

  • Ontologies - Classes and properties from contained nodes
  • Vocabularies - Terms and definitions from specialized nodes
  • Catalogs - Dataset metadata from multiple sources
  • Knowledge bases - Facts distributed across domain-specific nodes
  • Configuration data - Settings aggregated from component services

Related

  • Data flow - Source datasets for aggregation
  • Weave Process - Process that generates aggregated distributions
  • Flow snapshot - Contains the actual distributions being aggregated
