Feature #357 » draft-potvin-dwd-pipe-separated-format.md

markdown source - Charles Langlois, 02/09/2026 01:45 AM

 

title: "Data With Direction (DWD) Pipe-Separated Format"
abbrev: "DWD-PSF"
docname: draft-potvin-dwd-pipe-separated-format-00
category: info
ipr: trust200902
area: General
workgroup: Independent Submission
keyword:
  - DWD
  - Data With Direction
  - rules
  - truth tables
  - pipe-separated

author:
  -
    ins: J. Potvin
    name: Joseph Potvin
    org: Xalgorithms Foundation
    email: [email protected]
  -
    ins: C. Langlois
    name: Charles Langlois
    org: Xalgorithms Foundation
    email: [email protected]

normative:
  RFC2119:
  RFC3629:
  RFC3986:
  RFC4122:
  RFC5234:
  RFC8174:

informative:
  RFC5322:
  RFC4287:
  SemVer:
    title: "Semantic Versioning 2.0.0"
    author:
      ins: T. Preston-Werner
    date: 2013-06
    target: https://semver.org/spec/v2.0.0.html
  RFC8259:


--- abstract

This document specifies the Data With Direction (DWD) pipe-separated
file format, a text-based format for encoding rule-based decision
logic. The format represents rules as collections of metadata and
truth tables that define input conditions and output assertions.

The format is designed for human readability and editability,
machine parsing and processing, version control system compatibility,
and exchange between rule management systems.

--- middle

Introduction

Purpose

This document specifies the Data With Direction (DWD) pipe-separated
file format. The format represents rules as collections of metadata and
truth tables that define input conditions and output assertions.

The format is designed for:

  • Human readability and editability
  • Machine parsing and processing
  • Version control system compatibility
  • Exchange between rule management systems

Scope

This specification defines:

  • The syntax and structure of DWD pipe-separated files
  • The interpretation of metadata fields
  • The encoding of truth tables
  • Constraints on field values

This specification does not define:

  • The semantics of specific rule domains
  • Processing algorithms for truth tables
  • Storage or transmission mechanisms
  • User interface representations

Terminology

{::boilerplate bcp14}

Additional terms used in this specification:

DWD Document:
: A file conforming to this specification containing rule metadata
and truth table data.

Record:
: A single line in a DWD document representing either metadata or
truth table data.

Field:
: A data element within a record, separated by pipe characters.

Truth Table:
: A tabular representation of input conditions mapped to output
assertions.

Scenario:
: A specific combination of input conditions identified by a
column in the truth table.

Notational Conventions

This specification uses ABNF {{RFC5234}} for formal syntax definitions.

The pipe character '|' is the field delimiter of the format. In the
ABNF grammar, it is represented as %d124 (its ASCII decimal value)
rather than as a quoted literal.

Document Structure

File Encoding

DWD documents MUST be encoded using UTF-8 {{RFC3629}}.

DWD documents SHOULD use Unix-style line endings (LF, %d10).
Implementations MAY also accept Windows-style line endings (CRLF,
%d13 %d10) for interoperability.

The byte order mark (BOM) MUST NOT be present in DWD documents.
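
The encoding rules above can be sketched as a small decoding step. This is a minimal illustration, not a reference implementation; the function name `decode_dwd_bytes` is hypothetical, and rejecting a BOM outright (rather than skipping it) is an assumption about how strict a reader wants to be.

```python
def decode_dwd_bytes(data: bytes) -> list[str]:
    """Decode raw bytes into lines, applying the encoding rules above."""
    # A UTF-8 byte order mark is not permitted in DWD documents.
    if data.startswith(b"\xef\xbb\xbf"):
        raise ValueError("byte order mark is not allowed in DWD documents")
    # Raises UnicodeDecodeError (a ValueError subclass) on invalid UTF-8.
    text = data.decode("utf-8")
    # Accept CRLF as well as LF; a lone CR is left untouched.
    text = text.replace("\r\n", "\n")
    lines = text.split("\n")
    # A trailing line ending yields one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

The function returns lines without their line endings, so later stages can work on field content only.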

Line Structure

A DWD document consists of a sequence of lines. Each line represents
either a metadata record or a truth table record.

Lines are delimited by line ending characters. The final line of a
document SHOULD be terminated with a line ending. For robustness
against non-conforming producers and legacy or corrupted files,
parsers MAY accept a final line that lacks a line ending.

Blank lines (empty lines or lines containing only whitespace) SHOULD
be ignored by parsers and SHOULD NOT be generated.

Lines SHOULD NOT exceed 1000 characters in length. Implementations
MUST support lines of at least 2000 characters for robustness.

Field Structure

Each line is composed of fields separated by the pipe character
(|, ASCII 124). The pipe character serves as the field delimiter.

Field Structure Rules:

  1. Every line MUST begin with a pipe character. For backward compatibility and robustness, implementations SHOULD accept a line with a missing leading pipe character without functional degradation.
  2. Every line MUST end with a pipe character followed by a line ending. For backward compatibility and robustness, implementations SHOULD accept a line with a missing trailing pipe character without functional degradation.
  3. Fields are the text between pipe delimiters.
  4. Empty fields (consecutive pipes) are valid and represent empty string values.
  5. Leading and trailing whitespace within fields SHOULD be preserved but MAY be trimmed by implementations.

Example field structure:

|field1|field2|field3|

This line contains three fields: "field1", "field2", and "field3".

|field1||field3|

This line contains three fields: "field1", "" (empty), and "field3".
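
The field structure rules can be sketched as a short splitting routine. This is a minimal sketch; the name `split_fields` is hypothetical, and whitespace inside fields is preserved rather than trimmed.

```python
def split_fields(line: str) -> list[str]:
    """Split one DWD line into its fields."""
    body = line.rstrip("\r\n")
    # Tolerate a missing leading or trailing pipe (rules 1 and 2).
    if body.startswith("|"):
        body = body[1:]
    if body.endswith("|"):
        body = body[:-1]
    # Consecutive pipes yield empty strings, as rule 4 requires.
    return body.split("|")
```

Tolerating a missing trailing pipe has a cost: a line such as `|a|` with its final pipe lost cannot be told apart from a line whose last field is empty, so strict producers remain important.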

Record Types

DWD documents contain two main sections:

  1. Metadata Section - Contains descriptive information about the rule
  2. Truth Table Section - Contains the rule's decision logic, expressed as a truth table

Record type is determined by the content of the first field:

  • Records with a first field containing a metadata key, typically a dot-separated identifier (e.g., "metadata.rule.120_title"), are metadata records. (See {{metadata-section}}).
  • Records with a first field starting with "INDEX" are truth table column header records. (See {{truth-table-section}}).
  • Records with a first field that is a row identifier (a letter followed by digits and dots, e.g., "W1.1") or a truth value identifier (e.g., "T_W1.1_W2.1_W3.1") are truth table data records. (See {{truth-table-section}}).
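
A record classifier following these rules might look like the sketch below. The function name and the returned labels are hypothetical. Treating K-rows and "V_" identifiers as truth table data is an assumption drawn from the lookup table material later in this document, and anything that matches no truth table pattern is assumed to be metadata.

```python
import re

# Row identifiers: a letter (W or K) followed by digits and dot-separated digits.
ROW_ID = re.compile(r"[WK][0-9]+(\.[0-9]+)*")


def classify_record(first_field: str) -> str:
    """Return a label for the record type implied by the first field."""
    if first_field.startswith("INDEX"):
        return "column-header"
    if ROW_ID.fullmatch(first_field) or first_field.startswith(("T_", "V_")):
        return "truth-table-data"
    # Assumption: every remaining key is a metadata key.
    return "metadata"
```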

Metadata Section {#metadata-section}

Metadata Fields

Metadata records use a key-value pair structure. The first field
contains the metadata key (a hierarchical identifier), and the
second field contains the value.

Metadata Record Format:

|key|value|

The key uses dot notation to represent hierarchy:

  • Top-level categories (e.g., "rule_id", "version_standard_url")
  • Nested properties (e.g., "metadata.rule.120_title")
  • Array indices (e.g., "metadata.rule.rulemaker_manager.1.name")

Required Metadata Fields:

| Field | Description | Value type |
|-------|-------------|------------|
| rule_id | Unique rule identifier | UUID |
| ruledata_version | Format version (SemVer) | Dotted digit string |

(See {{field-value-constraints}}).

These two fields are required because the file format itself depends on them: rule_id distinguishes rules efficiently, and ruledata_version identifies the version of this specification used to produce the file, which lets parsers handle different versions efficiently.

All other metadata fields are outside the scope of this specification and may evolve independently. A separate specification is expected to define which of those fields are mandatory or optional for semantic purposes, the validation rules and semantics of their values, and any constraints on user-specific metadata.

Nested Metadata

Metadata supports nested structures using dot notation and numeric
indices for arrays.

Object Properties:

|metadata.rule.rulemaker_manager.1.name|John Doe|
|metadata.rule.rulemaker_manager.1.email|[email protected]|

Array Representation:

Arrays are represented using 1-based numeric indices. The index
appears as its own key segment, immediately after the array name.

|metadata.rule.rulemaker_manager.1.name|Manager One|
|metadata.rule.rulemaker_manager.2.name|Manager Two|
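
One way to expand dotted keys into a nested structure is sketched below. The name `nest_metadata` is hypothetical. Keeping array indices as 1-based integer dictionary keys, instead of converting them to 0-based lists, is a simplifying assumption, and the sketch does not handle a key that is both a value and a prefix of another key.

```python
def nest_metadata(pairs: dict[str, str]) -> dict:
    """Expand dotted metadata keys into nested dictionaries."""
    root: dict = {}
    for key, value in pairs.items():
        # Purely numeric segments are array indices; keep them as integers.
        segments = [
            int(s) if s.isascii() and s.isdigit() else s
            for s in key.split(".")
        ]
        node = root
        for segment in segments[:-1]:
            node = node.setdefault(segment, {})
        node[segments[-1]] = value
    return root
```

With the two records above, the manager names end up under the integer keys 1 and 2 of `rulemaker_manager`, while a segment such as `120_title` stays a string because it is not purely numeric.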

Reserved Fields

The following field names are reserved and have special meaning:

  • INDEX - Marks the beginning of the truth table column header row
  • W1, W2... - Truth table row identifiers (World/Condition identifiers)
  • T_... - Truth table cell entries (Transaction/Truth entries)

Reserved field patterns:

  • /^W\d+(\.\d+)*$/ - World/scenario identifiers
  • /^T_W\d+(\.\d+)*_W\d+(\.\d+)*_W\d+(\.\d+)*$/ - Truth table cell identifiers
  • /^INDEX/ - Column header marker

Truth Table Section {#truth-table-section}

Truth Table Structure Overview

DWD truth tables use a hierarchical organization with three main
categories of rows:

W1 (World/Scenario):
: Defines the possible scenarios or cases being considered. Each W1.x
row represents a distinct scenario identifier (e.g., A, B, C, D).

W2 (Function):
: Specifies whether a row represents an input condition or an output
assertion:

  • W2.1: Input Condition - Conditions that must be evaluated
  • W2.2: Output Assertion - Results or actions to be taken

W3 (Expression):
: Contains the actual logical expressions or values associated with
the conditions. These may be:

  • Human-readable descriptions
  • Machine-readable JSON structures
  • References to external data

For Lookup Tables, a similar structure uses K-rows:

K1, K2, K3... (Key Categories):
: Hierarchical key dimensions for multi-dimensional lookup tables.
Each Kx.y represents a specific key value within a category.

The relationship between these components creates a three-dimensional
logical space where:

  • W1 defines the scenario/case dimension
  • W2 defines the functional dimension (input/output)
  • W3 provides the semantic content

Truth table entries (T-rows) connect these dimensions:
T_W1.{x}_W2.{y}_W3.{z} indicates the truth value at the intersection of
scenario W1.x, function W2.y, and expression W3.z.

Row Identifiers

W-Row Identifier Format:

  • W{n} - Category header (e.g., W1, W2, W3)
  • W{n}.{m} - Specific condition within category (e.g., W1.1, W1.2)
  • W{n}.{m}.{o} - Further subdivision (e.g., W1.1.1, W1.1.2)

W-Row Types:

| Type | Pattern | Purpose |
|------|---------|---------|
| W1 | W1, W1.x | Scenario/World identifiers |
| W2 | W2, W2.x | Function type (Input/Output) |
| W3 | W3, W3.x | Expression/Value definitions |

K-Row Identifier Format (for Lookup Tables):

  • K{n} - Key category header (e.g., K1, K2, K3)
  • K{n}.{m} - Specific key value (e.g., K1.1, K1.2)

K-Row Types:

| Type | Pattern | Purpose |
|------|---------|---------|
| K1, K2... | K1, K1.x, etc. | Key dimension categories |
| V_... | V_Kx.y_Kz.w... | Value assertions at key coordinates |

Identifier Constraints:

  • Numbers are 1-indexed (start at 1, not 0)
  • Hierarchical depth is theoretically unlimited
  • Each identifier within a category MUST be unique
  • Labels (second field) provide human-readable descriptions

Column Headers

The truth table section begins with a column header row. This row
defines the columns of the truth table and their indices.

Column Header Format:

|INDEX|DATA|1|2|3|4|5|...|n|

Where:

  • "INDEX" - Identifies this as the column header row
  • "DATA" - Indicates data columns follow
  • "1" through "n" - Column numbers (positive integers)

The column header row establishes the numbering for all subsequent
truth table data rows.
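
A column header record, once split into fields, could be checked along these lines. The function name is hypothetical, and the sketch only verifies that each column number is a positive integer; whether the numbers must be consecutive is not stated here, so that is not enforced.

```python
def parse_column_header(fields: list[str]) -> list[int]:
    """Return the column numbers declared by an INDEX record."""
    if fields[:2] != ["INDEX", "DATA"]:
        raise ValueError("column header must start with INDEX and DATA")
    columns = []
    for text in fields[2:]:
        # Only plain ASCII digits are accepted, and zero is rejected.
        if not (text.isascii() and text.isdigit()) or int(text) < 1:
            raise ValueError(f"invalid column number: {text!r}")
        columns.append(int(text))
    return columns
```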

Data Rows

Truth table data rows represent either:

a) Row headers (labels for conditions or assertions)
b) Truth values (the actual decision logic)

Row Header Format:

|identifier|label|index1|index2|...|indexN|

Examples:

|W1|COLUMNHEADER|1|2|3|4|5|...|
|W1.1|A|1|10|19|28|...|
|W2|Function|1|2|3|4|5|...|

Truth Value Format:

|T_W{row}_W{col}_W{expr}|value|column|

Examples:

|T_W1.1_W2.1_W3.1|01|1|
|T_W1.2_W2.1_W3.1|00|2|
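
A truth value record in this three-field form can be decomposed as in the following sketch. The function name and the dictionary keys ("scenario", "function", "expression") are illustrative choices, not terms defined by this specification.

```python
import re

# One W identifier: "W", digits, then optional dot-separated digits.
_W = r"(W[0-9]+(?:\.[0-9]+)*)"
TRUTH_ID = re.compile(rf"T_{_W}_{_W}_{_W}")


def parse_truth_row(fields: list[str]) -> dict:
    """Decompose the fields of a truth value record."""
    if len(fields) != 3:
        raise ValueError("expected identifier, value, and column")
    identifier, value, column = fields
    match = TRUTH_ID.fullmatch(identifier)
    if match is None:
        raise ValueError(f"invalid truth identifier: {identifier!r}")
    if value not in {"00", "01", "10", "11"}:
        raise ValueError(f"invalid truth value: {value!r}")
    return {
        "scenario": match.group(1),
        "function": match.group(2),
        "expression": match.group(3),
        "value": value,
        "column": int(column),
    }
```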

Expression Field Content

W3 (Expression) rows contain the semantic definitions of conditions
and assertions. These fields MAY contain:

Plain Text:

  • Human-readable descriptions of conditions
  • Natural language statements

JSON Structures:

  • Machine-parseable structured data
  • Semantic triples or property-value pairs
  • References to external vocabularies

Example JSON in expression field:

|W3.1|{"determiner":"The","noun":"box","past_participle_verb":"measured","attribute":"type","predicate_verb":"is","description":"standard"}|

JSON Escaping Rules:

  • Double quotes inside JSON string values MUST be escaped as \"
  • Backslashes MUST be escaped as \\
  • Newlines within JSON SHOULD be avoided; use \n if necessary
  • The entire JSON object MUST be valid JSON when unescaped

Parsers SHOULD:

  • Detect JSON content by attempting to parse the field value
  • Fall back to treating content as plain text if JSON parsing fails
  • Validate JSON structure according to application-specific schemas
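
The detect-then-fall-back behavior can be sketched as follows. The name `parse_expression` is hypothetical. Treating only JSON objects and arrays as structured content, so that plain text such as "42" or "true" stays text, is an assumption; schema validation is left to the application.

```python
import json


def parse_expression(value: str):
    """Return parsed JSON for structured content, else the original text."""
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        # Not JSON: treat the field as plain text.
        return value
    # Assumption: bare scalars are more likely plain text than JSON.
    if isinstance(parsed, (dict, list)):
        return parsed
    return value
```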

Note: The JSON structure shown in examples uses a specific vocabulary
for rule expressions. The exact schema is application-dependent and
outside the scope of this specification.

Truth Value Encoding

DWD uses a four-valued logic system encoded as two-character strings.
This supports classical binary logic as well as reasoning under
uncertainty and paraconsistent scenarios.

| Value | Meaning |
|-------|---------|
| "00" | FALSE / Absent - Condition not met, assertion invalid |
| "01" | TRUE / Present - Condition met, assertion valid |
| "10" | MAYBE / UNKNOWN - Undetermined, uncertain, or requires further evaluation |
| "11" | BOTH / CONTRADICTION - Inconsistent state, both true and false simultaneously |

The four-valued logic allows DWD to represent:

  • Classical binary decisions (00, 01)
  • Uncertainty or incomplete information (10)
  • Inconsistent or contradictory states (11)

Processing Behavior:

  • When evaluating truth tables, implementations SHOULD handle all four values according to the logic semantics appropriate for the application domain.
  • The "10" (unknown) value indicates that the truth of the condition cannot be determined from available data, and MAY trigger additional data collection or default handling.
  • The "11" (contradiction) value indicates a logical inconsistency that SHOULD be flagged for review or resolution.
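
The four encodings map naturally onto an enumeration. The sketch below is illustrative only; the names `TruthValue`, `decode_truth_value`, and `needs_review` are hypothetical, and any cell outside the four defined strings is rejected.

```python
from enum import Enum


class TruthValue(Enum):
    FALSE = "00"
    TRUE = "01"
    UNKNOWN = "10"
    CONTRADICTION = "11"


def decode_truth_value(cell: str) -> TruthValue:
    """Map a two-character cell to its truth value."""
    try:
        return TruthValue(cell)
    except ValueError:
        raise ValueError(f"not a DWD truth value: {cell!r}") from None


def needs_review(value: TruthValue) -> bool:
    """Contradictions are the values flagged for review or resolution."""
    return value is TruthValue.CONTRADICTION
```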

Note: Earlier versions of DWD used "--" to indicate "don't care"
conditions. This is now deprecated in favor of explicit four-valued
encoding.

DWD Storage Formats

DWD supports two complementary storage formats for truth table data:

DWD Array Format (KVa)

The full matrix representation with explicit 00/01/10/11 values in
every cell. This format is human-readable and suitable for direct
inspection and auditing.

Characteristics:

  • Complete truth table matrix with all cells populated
  • Binary patterns clearly visible for validation
  • Identity matrix structure for value assertions
  • Suitable for visualization and debugging

Example (truncated):

|K1.1|Label|01|00|00|01|00|00|
|K1.2|Label|00|01|00|00|01|00|
|V_K1.1_K2.1_K3.1|Value|01|00|00|00|00|00|

DWD Coordinates Format (KVc)

A compressed representation listing only the column indices where
values are present (01). This format is optimized for storage and
transmission.

Characteristics:

  • Lists only column numbers where value is "01"
  • Compact representation suitable for large tables
  • Easily auditable via text search on index numbers
  • Can be expanded to full array format when needed

Format:

|identifier|label|column1|column2|...|columnN|

Where column numbers are 1-indexed positions where the value is "01".

Example:

|K1.1|Label|1|4|7|10|
|K1.2|Label|2|5|8|11|
|V_K1.1_K2.1_K3.1|Value|1|

Conversion:

  • Array to Coordinates: For each row, record column indices where value equals "01"
  • Coordinates to Array: Create matrix with "00" in all cells, then set "01" at specified coordinates
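
The two conversion rules can be sketched for a single row as shown below. The function names are hypothetical, and the total column count (`width`) is assumed to come from the column header row.

```python
def array_to_coordinates(cells: list[str]) -> list[int]:
    """Return the 1-indexed columns whose cell equals "01"."""
    return [i for i, cell in enumerate(cells, start=1) if cell == "01"]


def coordinates_to_array(columns: list[int], width: int) -> list[str]:
    """Build a row of "00" cells, then set "01" at the given columns."""
    cells = ["00"] * width
    for column in columns:
        if not 1 <= column <= width:
            raise ValueError(f"column {column} is outside 1..{width}")
        cells[column - 1] = "01"
    return cells
```

Because the rules record only "01" cells, a row containing "10" or "11" does not survive a round trip through this sketch; those cells come back as "00".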

Implementation Note:

RuleMaker generates the Coordinates format for storage efficiency.
The Array format is useful for visualization, debugging, and direct
human audit. Both formats contain identical semantic information.

Syntax Specification

ABNF Grammar

The following grammar defines the formal syntax of DWD documents
using ABNF {{RFC5234}}.

; Basic Definitions
PIPE           = %d124           ; "|" character
CR             = %d13            ; Carriage return
LF             = %d10            ; Line feed
CRLF           = CR LF           ; Internet standard newline
VCHAR          = %d33-126        ; Visible characters
WSP            = SP / HTAB       ; White space
SP             = %d32            ; Space
HTAB           = %d9             ; Horizontal tab
DIGIT          = %d48-57         ; 0-9
ALPHA          = %d65-90 / %d97-122  ; A-Z / a-z

; Document Structure
dwd-document   = *metadata-record
                 [truth-table-section]  ; optional: metadata-only
                                        ; documents are valid

; Line Ending (Unix-style preferred, Windows accepted)
line-ending    = LF / CRLF

; Metadata Records
metadata-record = PIPE metadata-key PIPE field-value PIPE line-ending

metadata-key   = key-segment *("." key-segment)
key-segment    = 1*(ALPHA / DIGIT / "_" / "-")
               / array-index
array-index    = 1*DIGIT        ; 1-based indexing

; Truth Table Section
truth-table-section = column-header
                      *row-header
                      *truth-value-row

column-header  = PIPE "INDEX" PIPE "DATA" *column-number PIPE line-ending
column-number  = PIPE 1*DIGIT

row-header     = PIPE row-id PIPE row-label *column-reference PIPE line-ending
row-id         = world-id / key-id
world-id       = "W" 1*DIGIT *("." 1*DIGIT)
key-id         = "K" 1*DIGIT *("." 1*DIGIT)
row-label      = *safe-char      ; may be empty or contain spaces
column-reference = PIPE *DIGIT   ; empty references are allowed

truth-value-row = PIPE truth-id PIPE truth-value PIPE column-index PIPE line-ending
truth-id       = "T_" world-id "_" world-id "_" world-id
truth-value    = "00" / "01" / "10" / "11"
column-index   = 1*DIGIT

; Coordinates Format (KVc)
coordinates-row = PIPE coords-id PIPE coords-label *column-number PIPE line-ending
coords-id      = value-id / key-id
value-id       = "V_" key-id *("_" key-id)

; Field Values
field-value    = *safe-char
safe-char      = %d33-123 / %d125-126 / WSP / UTF8-char
                                 ; VCHAR except PIPE
UTF8-char      = %x80-FF        ; Octets of UTF-8 multibyte sequences

; Reserved Identifiers
reserved-id    = "INDEX" / "DATA" / "W" 1*DIGIT / "K" 1*DIGIT / "V_" / "T_"

Field Value Constraints {#field-value-constraints}

Metadata Field Value Constraints:

| Field | Constraints |
|-------|-------------|
| rule_id | UUID format {{RFC4122}} |
| ruledata_version | Semantic Versioning (SemVer 2.0.0) |
| version_standard_url | Valid URL {{RFC3986}} |
| properties.id | UUID format |
| metadata.rule.url | Valid URL |
| linked_rules_or_lookups | JSON array format or empty {{RFC8259}} |

UUID Format:

UUIDs MUST conform to RFC 4122 format:

xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Where x is a hexadecimal digit (0-9, a-f, A-F).

Semantic Versioning:

Versions MUST follow SemVer 2.0.0 format:

MAJOR.MINOR.PATCH
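
The two required fields can be checked with regular expressions, as in this sketch. The function names are hypothetical. The version check covers only the MAJOR.MINOR.PATCH core shown above, with no leading zeros; pre-release and build suffixes permitted by SemVer 2.0.0 are deliberately not handled.

```python
import re

HEX = "[0-9a-fA-F]"
UUID_RE = re.compile(rf"{HEX}{{8}}-{HEX}{{4}}-{HEX}{{4}}-{HEX}{{4}}-{HEX}{{12}}")
NUM = "(0|[1-9][0-9]*)"
SEMVER_CORE_RE = re.compile(rf"{NUM}\.{NUM}\.{NUM}")


def is_valid_rule_id(value: str) -> bool:
    """Check the 8-4-4-4-12 hexadecimal layout of a UUID."""
    return UUID_RE.fullmatch(value) is not None


def is_valid_ruledata_version(value: str) -> bool:
    """Check a MAJOR.MINOR.PATCH version string."""
    return SEMVER_CORE_RE.fullmatch(value) is not None
```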

Examples

Complete Example

The following example demonstrates a complete DWD document with
both metadata and truth table sections.

|rule_id|933e80c7-72d8-4990-8445-97ea6799322d|
|rulereserve_nodes|*|
|version_standard_url|https://semver.org/|
|ruledata_version|0.0.0|
|properties.id|933e80c7-72d8-4990-8445-97ea6799322d|
|metadata.rule.120_title|Test Rule|
|metadata.rule.240_summary|Example summary text|
|metadata.rule.960_explanation|Detailed explanation of rule logic|
|metadata.rule.rule_group|test-group|
|metadata.rule.rule_criticality|experimental|
|metadata.rule.url|https://example.com/rule|
|metadata.rule.rulemaker_manager.1.name|John Doe|
|metadata.rule.rulemaker_manager.1.email|[email protected]|
|linked_rules_or_lookups|[]|
|in_effect.1.country|US|
|in_effect.1.subcountry|US-CA|
|in_effect.1.timezone|2025-07-07T11:49:51-05:00|
|INDEX|DATA|1|2|3|4|5|
|W1|COLUMNHEADER|1|2|3|4|5|
|W1.1|A|1|2|3|4|5|
|W1.2|B|6|7|8|9|10|
|W2|Function|1|2|3|4|5|
|W2.1|Input Condition|1|2|3||
|W2.2|Output Assertion||4|5||
|W3|Expression|1|2|3|4|5|
|W3.1|{"noun":"test"}|1|2||4|5|
|T_W1.1_W2.1_W3.1|01|1|
|T_W1.2_W2.1_W3.1|00|2|
|T_W1.1_W2.2_W3.1|01|4|

Metadata Only Example

A DWD document containing only metadata:

|rule_id|a1b2c3d4-e5f6-7890-abcd-ef1234567890|
|ruledata_version|1.0.0|
|version_standard_url|https://semver.org/|
|properties.id|a1b2c3d4-e5f6-7890-abcd-ef1234567890|
|metadata.rule.120_title|Simple Rule|
|metadata.rule.240_summary|A rule with only metadata|

Security Considerations

This section discusses security considerations when processing DWD
documents.

Input Validation

Parsers MUST validate input to prevent:

  • Buffer overflow attacks via excessively long lines
  • Memory exhaustion via deeply nested structures
  • Injection attacks through field values

Implementations SHOULD enforce reasonable limits on:

  • Maximum line length (recommended: 10,000 characters)
  • Maximum number of fields per line (recommended: 10,000)
  • Maximum file size (recommended: 100 MB)
  • Maximum nesting depth for metadata keys (recommended: 10 levels)
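
Per-line limits of this kind could be enforced before any further parsing, as in the sketch below. The function name and parameter names are hypothetical; the defaults mirror the recommended values, and file size is assumed to be checked separately when the file is opened.

```python
def check_line_limits(
    line: str,
    max_line_length: int = 10_000,
    max_fields: int = 10_000,
    max_key_depth: int = 10,
) -> None:
    """Raise ValueError if a line exceeds any of the configured limits."""
    if len(line) > max_line_length:
        raise ValueError("line exceeds maximum length")
    body = line[1:] if line.startswith("|") else line
    if body.endswith("|"):
        body = body[:-1]
    fields = body.split("|")
    if len(fields) > max_fields:
        raise ValueError("line has too many fields")
    # Applied to every first field for simplicity; a real parser might
    # restrict this check to metadata keys.
    if fields[0].count(".") + 1 > max_key_depth:
        raise ValueError("key is nested too deeply")
```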

Field Value Sanitization

Field values MAY contain arbitrary text including:

  • HTML/XML markup
  • JSON data structures
  • SQL fragments
  • Script code

Applications processing DWD documents MUST NOT execute field values
as code without proper sanitization and validation.

External References

Metadata fields such as "metadata.rule.url" and "version_standard_url"
contain external references. Applications SHOULD:

  • Validate URL schemes (allow only http, https)
  • Implement timeouts for external resource fetching
  • Cache external resources to prevent repeated requests
  • Avoid dereferencing URLs automatically without user consent
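
The scheme check in the first item can be sketched with the Python standard library. The function name `is_allowed_url` is hypothetical, and requiring a non-empty host is an added assumption beyond the scheme rule.

```python
from urllib.parse import urlsplit


def is_allowed_url(value: str) -> bool:
    """Accept only http and https URLs that name a host."""
    try:
        parts = urlsplit(value)
    except ValueError:
        return False
    # urlsplit lowercases the scheme, so "HTTP" and "http" compare equal.
    return parts.scheme in {"http", "https"} and bool(parts.netloc)
```

This check only filters what an application is willing to fetch; timeouts, caching, and user consent still have to be handled at the point where the request is made.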

Privacy Considerations

DWD documents may contain personally identifiable information (PII)
in metadata fields such as:

  • Rule maker names and email addresses
  • Organization-specific identifiers
  • Jurisdictional information

Implementations SHOULD:

  • Allow redaction or anonymization of PII fields
  • Implement access controls for sensitive rule documents
  • Log access to documents containing PII

IANA Considerations

This document has no IANA actions.

--- back

Acknowledgments

The authors would like to thank the Xalgorithms Foundation and all
contributors to the Data With Direction Specification.

Special thanks to Wayne Cunneyworth for pioneering Table Driven Design,
and to all members of the Xalgorithms Alliance who have contributed to
the development and refinement of the DWD format.

Appendix A. Comparison with Related Formats

This appendix provides a non-normative comparison between the DWD
pipe-separated format and similar data formats.

CSV (Comma-Separated Values)

| Aspect | CSV | DWD Pipe-Separated |
|--------|-----|--------------------|
| Delimiter | Comma (,) | Pipe (\|) |
| Quoting | Double quotes | No quoting required |
| Header row | Single header | Multiple header types |
| Metadata | Not standardized | Native support |
| Nested data | Flat structure | Dot notation support |

TSV (Tab-Separated Values)

Similar to CSV but uses tab characters as delimiters. DWD format
differs by using visible pipe characters and supporting structured
metadata alongside tabular data.

JSON

JSON provides hierarchical data representation but lacks the
human-editable tabular format of DWD. DWD truth tables can be
converted to JSON arrays, and DWD documents often have equivalent
JSON representations.

Appendix B. Implementation Guidelines

Parsing Strategy

Implementations are RECOMMENDED to use a two-pass parsing approach:

  1. First pass: Identify line types (metadata vs. truth table)
  2. Second pass: Parse fields according to line type

Error Handling

Parsers SHOULD provide detailed error information including:

  • Line number where error occurred
  • Type of error (syntax, validation, constraint violation)
  • Suggested correction if applicable
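
An error type carrying these three pieces of information might be sketched as follows. The class name `DwdParseError` and its attribute names are hypothetical.

```python
class DwdParseError(Exception):
    """A parse error that records where and why parsing failed."""

    def __init__(self, line_number, kind, message, suggestion=None):
        super().__init__(message)
        self.line_number = line_number  # 1-based line number
        self.kind = kind                # e.g. "syntax", "validation"
        self.message = message
        self.suggestion = suggestion    # optional hint for the author

    def __str__(self):
        text = f"line {self.line_number}: {self.kind} error: {self.message}"
        if self.suggestion:
            text += f" (suggestion: {self.suggestion})"
        return text
```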

Round-Trip Preservation

Implementations SHOULD preserve the following when reading and
writing DWD documents:

  • Field order within records
  • Whitespace within field values
  • Presence or absence of optional fields
  • Comment lines (if supported)