DPCH05 — Self-Describing
“Rich Metadata Embedded”
What DPCH05 is really asserting
DPCH05 is not asserting that:
“Some metadata fields are filled in.”
It is asserting that:
A Data Product carries enough embedded, product-level information for a consumer (human or machine) to understand its meaning, scope, structure, lineage, and usage context without relying on external explanations.
Self-describing means the product explains itself.
The Essence (HDIP + Data Mesh Interpretation)
A Data Product is self-describing if and only if:
- Meaning is explicit, not implied
- Context travels with the product
- Understanding does not depend on people, systems, or tribal knowledge
If understanding requires any of the following:
- asking the producer,
- reading pipeline code,
- reverse-engineering schemas,
then DPCH05 is not met, even if metadata exists somewhere.
Positive Criteria — When DPCH05 is met
DPCH05 is met when all of the following are true:
1. Business meaning is explicit and authoritative
The product clearly describes:
- what business concept(s) it represents
- how those concepts are defined
- what is in scope and out of scope
Descriptions are written in business language, not system language.
2. Structure is explained, not just exposed
The product includes:
- a clear description of entities, events, and measures
- explanation of key fields (not just names)
- units, time semantics, and aggregation meaning where relevant
A schema without explanation is not self-describing.
3. Lineage and provenance are visible
Consumers can see:
- where the data comes from (at product level)
- major upstream dependencies
- transformation intent (at a conceptual level)
This supports trust without exposing technical pipelines.
4. Usage context and constraints are stated
The product declares:
- intended use cases
- known limitations
- freshness expectations
- quality posture (“best effort”, “regulated reporting”, etc.)
This allows consumers to self-assess fitness for purpose.
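The four criteria above can be pictured as four groups of fields on a product descriptor. The following is a minimal sketch, not a prescribed schema: the class `ProductDescriptor`, its field names, the helper `unmet_criteria`, and the sample order data are all hypothetical. A non-empty field only shows that something was written; whether it is written in business language and is actually sufficient still needs judgment.

```python
from dataclasses import dataclass, field


@dataclass
class ProductDescriptor:
    """Hypothetical product-level metadata, grouped by the four criteria."""
    # 1. Business meaning
    business_concepts: list[str] = field(default_factory=list)
    definitions: dict[str, str] = field(default_factory=dict)
    scope: str = ""
    # 2. Structure explanation (field name -> explanation, units, time semantics)
    field_explanations: dict[str, str] = field(default_factory=dict)
    # 3. Lineage and provenance, at product level
    sources: list[str] = field(default_factory=list)
    transformation_intent: str = ""
    # 4. Usage context and constraints
    intended_uses: list[str] = field(default_factory=list)
    known_limitations: list[str] = field(default_factory=list)
    freshness: str = ""
    quality_posture: str = ""


def unmet_criteria(d: ProductDescriptor) -> list[str]:
    """Return the criteria for which at least one field group is empty."""
    checks = {
        "business meaning": [d.business_concepts, d.definitions, d.scope],
        "structure": [d.field_explanations],
        "lineage": [d.sources, d.transformation_intent],
        "usage context": [d.intended_uses, d.known_limitations,
                          d.freshness, d.quality_posture],
    }
    return [name for name, parts in checks.items() if not all(parts)]


# Illustrative data only: meaning and structure are filled in, the rest is not.
d = ProductDescriptor(
    business_concepts=["Customer order"],
    definitions={"Customer order": "A confirmed request to purchase goods."},
    scope="Confirmed orders only; quotes and carts are out of scope.",
    field_explanations={"order_total": "Order value in EUR, including tax."},
)
print(unmet_criteria(d))  # ['lineage', 'usage context']
```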
Negative Criteria — When DPCH05 is not met
DPCH05 is not met if any of the following are true:
❌ Metadata exists but is technical or opaque
Examples:
- table names and column names only
- system-oriented descriptions
- abbreviations without definitions
This describes storage, not meaning.
❌ Meaning lives outside the product
Examples:
- definitions only in Confluence
- explanations only in slide decks
- “ask the team” as a documentation strategy
This creates a dependency on people and documents outside the product, and it does not scale.
❌ Lineage or quality context is absent
Examples:
- no indication of sources
- no freshness expectations
- no explanation of update behavior
Consumers cannot assess trust or suitability.
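Some of these negative signals can be screened mechanically before a human or agent reads the description in full. As one illustration, the sketch below flags abbreviations that appear in a product description without a definition. The function `undefined_acronyms`, the glossary shape, and the sample text are assumptions made for this example, and the capital-letter heuristic will miss mixed-case abbreviations and may flag ordinary words written in capitals.

```python
import re


def undefined_acronyms(description: str, glossary: dict[str, str]) -> list[str]:
    """Return acronyms used in the description that the glossary does not define."""
    # Heuristic: treat any standalone run of two or more capital letters as an acronym.
    found = re.findall(r"\b[A-Z]{2,}\b", description)
    # Keep first-seen order and drop repeats.
    unique = list(dict.fromkeys(found))
    return [a for a in unique if a not in glossary]


# Illustrative input: NAV is defined alongside the product, GL is not.
text = "Daily NAV per fund, reconciled against the GL. NAV excludes pending trades."
glossary = {"NAV": "Net asset value"}
print(undefined_acronyms(text, glossary))  # ['GL']
```

An empty result does not show that DPCH05 is met; it only removes one of the negative signals listed above.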
Edge Cases (Important Guidance for Agents)
Case 1: “Schema documented, but business meaning unclear”
⚠️ Partial
Rationale:
- Some descriptive effort exists
- Still requires external interpretation
- Common transitional state
Case 2: “Auto-generated metadata only”
❌ Not met
Rationale:
- Automation without semantics is insufficient
- Self-description must include intent, not just structure
Case 3: “Rich product page with business narrative”
✅ Met
Rationale:
- Product explains itself
- Consumers can decide independently
- Meetings become optional, not required
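The three cases can be read as a small decision table. The sketch below encodes them under stated assumptions: the three boolean flags, the `Verdict` enum, and the order of the checks are hypothetical, and the final fallback (combinations the cases do not cover are treated as not met) is a conservative default chosen for this example rather than something the guidance states.

```python
from enum import Enum


class Verdict(Enum):
    MET = "met"
    PARTIAL = "partial"
    NOT_MET = "not met"


def classify(structure_documented: bool,
             business_meaning_explicit: bool,
             auto_generated_only: bool) -> Verdict:
    """Map the three edge cases onto a verdict (assumed flags, assumed order)."""
    # Case 2: automation without semantics is insufficient.
    if auto_generated_only:
        return Verdict.NOT_MET
    # Case 3: the product explains itself in business terms.
    if structure_documented and business_meaning_explicit:
        return Verdict.MET
    # Case 1: schema documented, but meaning still needs outside interpretation.
    if structure_documented:
        return Verdict.PARTIAL
    # Assumption: anything else is treated as not met.
    return Verdict.NOT_MET


print(classify(True, False, False).value)  # partial
print(classify(True, True, True).value)    # not met
print(classify(True, True, False).value)   # met
```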
Evidence Signals an Agent Should Look For
Authoritative evidence:
- Product-level description written in business terms
- Defined entities/events/measures with explanations
- Lineage summary linked to the product
Supporting evidence:
- Glossary term links
- Quality/freshness statements
- Usage examples
Red flags:
- Documentation focused on pipelines or tables
- Acronyms without definitions
- Metadata copied verbatim from systems
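An agent may find it useful to record which of these signals it observed before reaching a verdict. The sketch below is one possible bookkeeping structure, not a scoring method: the signal identifiers and the `summarize` helper are hypothetical names that mirror the three lists above, and the output is a summary for a reviewer rather than a decision.

```python
# Hypothetical identifiers, one per signal in the lists above.
AUTHORITATIVE = {"business_description", "explained_entities", "lineage_summary"}
SUPPORTING = {"glossary_links", "quality_statement", "usage_examples"}
RED_FLAGS = {"pipeline_focused_docs", "undefined_acronyms", "verbatim_system_metadata"}


def summarize(observed: set[str]) -> dict[str, list[str]]:
    """Group observed signals so that gaps and red flags are easy to see."""
    return {
        "authoritative_missing": sorted(AUTHORITATIVE - observed),
        "supporting_present": sorted(SUPPORTING & observed),
        "red_flags": sorted(RED_FLAGS & observed),
    }


# Illustrative observation set for a single product.
observed = {"business_description", "glossary_links", "undefined_acronyms"}
report = summarize(observed)
print(report["authoritative_missing"])  # ['explained_entities', 'lineage_summary']
print(report["red_flags"])              # ['undefined_acronyms']
```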
How an AI Agent Should Decide
Decision rule (simplified):
If a competent consumer cannot understand what the data represents and how it should be used without talking to the producer, DPCH05 is not met.
Why DPCH05 Is Non-Negotiable
Without DPCH05:
- discoverability leads to confusion
- reuse creates risk
- governance cannot scale
- AI consumption becomes dangerous
Self-description is what turns data into a trustworthy product, not just an exposed asset.
Canonical Statement (for BPS)
DPCH05 is satisfied only when a Data Product embeds sufficient business meaning, structure explanation, lineage context, and usage guidance to be understood and evaluated independently by consumers.