Support #96

open

A few sentences about RuleReserve

Added by Joseph Potvin almost 2 years ago. Updated almost 2 years ago.

Status:
New
Priority:
Normal
Assignee:
Start date:
05/12/2022
Due date:
05/12/2022 (about 24 months late)
% Done:

0%

Estimated time:
0.20 h

Description

Don, Could you please make corrections to this short text?

DRAFT segment:

Don has prototyped RuleReserve 3.0 in Ruby with an SQLite ‘localhost’ database. Once the operational details are worked out, he will rewrite the application in Rust, with a choice of SQLite or Cassandra for data staging and ‘sifting’, while relying on IPFS for whole-network storage. The RuleReserve Superset table will eventually become very large and will be hosted on a few general-purpose high capacity, high resilience nodes. Each independently-controlled RuleReserve Subset table will be whatever size the manager of the node determines is needed to hold a table with all the rows of rules that are ‘in effect’ and ‘applicable’ to that manager’s requirements.

Actions #1

Updated by Don Kelly almost 2 years ago

Joseph Potvin wrote:

with a choice of SQLite or Cassandra for data staging and ‘sifting’

These prototypes will only use SQLite. Not certain when Cassandra would be introduced. That might be an implementation detail for non-reference implementations or it might be worth doing as an experiment to see how the operation of this reference implementation might change.

while relying on IPFS for whole-network storage.

This doesn't depend on porting to Rust. I'll likely run this experiment using the existing Ruby prototype.

Actions #2

Updated by Joseph Potvin almost 2 years ago

Could somebody's Subset RR be too large for SQLite? What are SQLite's practical upper limits? If a Subset RR is large enough, one could reasonably migrate to Cassandra, no?

RE: " doesn't depend on porting to Rust"

I know that.

RE: " I'll likely run this experiment using the existing Ruby prototype"

Oh, okay, good.

Actions #3

Updated by Joseph Potvin almost 2 years ago

A new draft of that paragraph. This is for the dissertation, so I'd like to have it correct / concise / lasting...

Don has prototyped RuleReserve 3.0 in Ruby with an SQLite ‘localhost’ for data staging and ‘sifting’. Once the operational details are worked out, he will add the ability to pull rulereserve.dwd files from IPFS network storage. The rulereserve-superset.dwd file is planned to be very large and will be hosted on a few general-purpose high-capacity, high-resilience nodes. But each independently-controlled rulereserve-subset.dwd will be whatever size the manager of the node determines is needed for a table with all the rows of rules that are ‘in effect’ and ‘applicable’ to that manager’s requirements.

Actions #4

Updated by Joseph Potvin almost 2 years ago

Here is the description I put together following our discussion on the tech call yesterday. Please let me know your suggestions:

IPFS is a general-purpose content delivery network (CDN), that’s to say, a geographically distributed network of servers choreographed to provide fast delivery of Internet content.[1] It is not used for queries.

Over a period of five years as the DWDS design was emerging, Don Kelly and I stepped through multiple iterations of the general architecture and implementation possibilities for an Internet of Rules data processing pipeline. Initially Kelly implemented a working version of my early design that relied on MongoDB to store our immutable records, and Cassandra to stage and compute (map/reduce) dynamic data. However we both felt that this client/server approach imposed an unnecessarily heavy set-up and maintenance overhead, so we made repeated efforts to reduce the technology requirements to the absolute minimum essential to achieve the intended functions. In the course of thinking through the end-to-end system by modeling it through a set of sequence diagrams in November 2020, I came up with the idea of SupersetRR and SubsetRR nodes. And then in the summer of 2021 I re-designed nearly from scratch the current data ‘sifting’ method. The sift process first occurs within each RuleReserve node, using rule metadata to sift for ‘in effect’ and ‘applicable’; and then sifting occurs separately within RuleTaker nodes, using the logic gates to identify which parts of ‘in effect’ and ‘applicable’ rules are ‘invoked’. Wayne Cunneyworth and Bill Olders responded by saying this resulted in a more efficient method than their own design for centralized deployment, which I was initially attempting to mimic in a decentralized way. And Don Kelly found that he could now meet all the data staging and processing requirements with the SQLite ‘localhost’ embedded serverless database engine, while leaving IPFS to store immutable records. At that point, the design was some minimalist that together he and I determined that the default deployment for every DWDS node could include all three elements: RuleMaker, RuleReserve and RuleTaker.

Kelly has now created a reference implementation of RuleReserve 3.0 in Ruby with an SQLite ‘localhost’. Shortly he will add the ability to pull rulereserve.dwd files from IPFS. Once this is implemented, each Superset RR node will listen for updated versions of the rulereserve.dwd file subscription on IPFS – every immutable version has a different CID (content identifier). Each Superset RR downloads the latest rulereserve.dwd file and compares it with the previous version it already has, to create a ‘diff’ file. It can then automatically broadcast the differences to all of its subscribed Subset RRs. The default scheduled frequency of these updates can be just once per day. However any number of independent specialized methods can separately perform urgent updates, on any commercial or not-for-profit basis as market participants prefer.

Anyone may write their own RuleReserve implementation based on the specification using other components. For example the operator of a Superset RuleReserve may prefer to use Cassandra instead of SQLite. However in our assessment SQLite is adequate for the job because the sift process I designed is exceedingly simple. (i.e. Contrary to what the critique suggests, it is not at all complex.) The DWDS sift method requires no JOIN statements; there are only some rudimentary SELECT and WHERE statements. No indexing is required because the sift procedure is limited to comparing atomic data. The most complicated comparison it performs is between dates.


[1] The initial suggestion for our design to use IPFS came from Calvin Hutcheon, and the choice to employ it as our persistent storage method was made jointly with Don Kelly.

Actions #5

Updated by Don Kelly almost 2 years ago

design was some minimalist

design was so minimal ?

Actions #6

Updated by Don Kelly almost 2 years ago

Kelly has now created a reference implementation of RuleReserve 3.0 in Ruby with an SQLite ‘localhost’. Shortly he will add the ability to pull rulereserve.dwd files from IPFS. Once this is implemented, each Superset RR node will listen for updated versions of the rulereserve.dwd file subscription on IPFS – every immutable version has a different CID (content identifier). Each Superset RR downloads the latest rulereserve.dwd file and compares it with the previous version it already has, to create a ‘diff’ file. It can then automatically broadcast the differences to all of its subscribed Subset RRs. The default scheduled frequency of these updates can be just once per day. However any number of independent specialized methods can separately perform urgent updates, on any commercial or not-for-profit basis as market participants prefer.

I'm not sure about this design. I think it should work, but I'd have to actually play with IPFS to verify how notifications work. Also, there's a moment where the two RRs need to communicate in order to exchange the CID of these indexes. If we intend that two RRs never speak directly, we'll need to design something to take the place of this initial exchange of CIDs. However that's done, I'd like it to be peer-to-peer and avoid adding some kind of central registrar.

Actions #7

Updated by Don Kelly almost 2 years ago

No indexing is required because the sift procedure is limited to comparing atomic data.

I may've misspoken at the last synchronous meeting. I meant to say partitioning or sharding. The SQLite DB would be indexed on the columns that we use for selecting during the IN EFFECT and APPLICABLE sifting.

Actions #8

Updated by Joseph Potvin almost 2 years ago

Revised as follows:

IPFS is a general-purpose content delivery network (CDN), that’s to say, a geographically distributed network of servers choreographed to provide fast delivery of Internet content.[1] It is not used for queries.

Over a period of five years as the DWDS design was emerging, Don Kelly and I stepped through multiple iterations of the general architecture and implementation possibilities for an Internet of Rules data processing pipeline. Initially Kelly implemented a working version of my early design of an Internet of Rules using MongoDB to store our immutable records, and using Cassandra to stage and compute dynamic data (with the map/reduce design pattern). However we both felt that this client/server approach imposed an unnecessarily heavy set-up and maintenance overhead, so we made repeated efforts to reduce the technology requirements to the absolute minimum essential to achieve the intended functions.

In the course of thinking through the end-to-end system by modelling a set of sequence diagrams in November 2020, I came up with the idea of SupersetRR and SubsetRR nodes for distributed pre-selection. And then in the summer of 2021 I re-designed nearly from scratch the data selection method, in consultation with Wayne Cunneyworth and Bill Olders. Kelly and I resolved to call this process “data sifting”, and we determined that the sift process would first occur within each RuleReserve node, using rule metadata to sift for ‘in effect’ and ‘applicable’; and then a second round of sifting would occur within RuleTaker nodes, using the logic gates to identify which parts of ‘in effect’ and ‘applicable’ rules are ‘invoked’.

The re-design was a success. Wayne Cunneyworth and Bill Olders responded by saying this resulted in a more efficient method than their own design for centralized deployment on mainframes, which I was initially just attempting to mimic in a decentralized way. And Don Kelly found that he could now meet all the data staging and processing requirements using the SQLite ‘localhost’ embedded serverless database engine. IPFS could then be relied upon to store the immutable rulereserve.dwd versioned records. We also realized that our set-up and maintenance overhead had been reduced to the point that the default deployment for every DWDS node could include all three elements: RuleMaker, RuleReserve and RuleTaker, or could use just any one or two of these as would meet the end user’s requirements. (e.g. IoT devices would not require RM, and they can share an external SubsetRR.)

Kelly has now created a reference implementation of RuleReserve 3.0 in Ruby with an SQLite ‘localhost’. Shortly he will add the ability for SupersetRR nodes to pull rulereserve.dwd files from IPFS, as well as to exchange updates on a peer-to-peer basis. Some details of exactly how this will handle version updates await his implementation effort to learn how IPFS notifications work. What we intend is that each SupersetRR node would monitor for updated versions of the rulereserve.dwd that it ‘subscribes’ to on IPFS – every immutable version has a different CID (content identifier). When a SupersetRR downloads the latest rulereserve.dwd file, it would compare it with the previous version it already has, create a ‘diff’ file, and automatically broadcast the differences to all of its subscribed SubsetRRs. The various SupersetRR peers would also validate among each other that the differences they each detect between any two versions match exactly – as a useful integrity check. The default scheduled frequency of these rulereserve.dwd updates can be just once per day. However any number of independent specialized methods can separately perform urgent updates to any SubsetRR, on any commercial or not-for-profit basis as market participants prefer.
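The update step described above could be sketched roughly as follows. This is only an illustration under loud assumptions: the layout of a rulereserve.dwd file is not given in this thread, so each version is modelled here as a plain dictionary of rule_id to row; the function names (content_id, diff_versions, apply_diff) are hypothetical; and the SHA-256 digest is merely a stand-in for an IPFS CID, which is a different, multihash-based identifier. No IPFS calls are made.

```python
import hashlib
import json


def content_id(version):
    """Stand-in for a CID (assumption): digest of the canonical JSON content.

    Any change to the content yields a different identifier.
    """
    canonical = json.dumps(version, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def diff_versions(previous, latest):
    """Compare two versions, each modelled as a dict of rule_id -> row dict."""
    added = {k: latest[k] for k in latest.keys() - previous.keys()}
    removed = sorted(previous.keys() - latest.keys())
    changed = {
        k: latest[k]
        for k in latest.keys() & previous.keys()
        if latest[k] != previous[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def apply_diff(current, diff):
    """Return a new version with a broadcast diff applied to `current`."""
    updated = dict(current)
    for rule_id in diff["removed"]:
        updated.pop(rule_id, None)
    updated.update(diff["added"])
    updated.update(diff["changed"])
    return updated


# Hypothetical sample data: two consecutive versions of the rule table.
VERSION_1 = {
    "rule-a": {"jurisdiction": "CA-ON", "effective_to": "2022-12-31"},
    "rule-b": {"jurisdiction": "CA-QC", "effective_to": "2022-06-30"},
}
VERSION_2 = {
    "rule-a": {"jurisdiction": "CA-ON", "effective_to": "2023-12-31"},
    "rule-c": {"jurisdiction": "CA-BC", "effective_to": "2024-01-31"},
}
```

In this sketch, two SupersetRR peers that compute diff_versions over the same pair of versions get identical results, which is the kind of integrity check the paragraph mentions. How peers would actually exchange CIDs and diffs is left open, as it is in the discussion above.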

Anyone may write their own RuleReserve implementation based on the DWDS specification using other components. For example the operator of a Superset RuleReserve may want to use a Cassandra database instead of SQLite. In our assessment SQLite is adequate for the job because the sift process I designed is exceedingly simple. (i.e. Contrary to what the critique suggests, it is not at all complex.) The most complicated comparison the RR sift procedure performs is between dates. No partitioning or sharding is required because the sift procedure is limited to comparing atomic data. The sift method requires no JOIN statements; there are only some rudimentary SELECT and WHERE statements. SQLite indexes the rulereserve.dwd table on the columns used to select for ‘in effect’ and ‘applicable’ rows.
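As a rough illustration of how small such a sift can be, here is a minimal sketch using Python's built-in sqlite3 module. It is not the Ruby reference implementation, and the schema is an assumption: the table name, the column names, and the use of a single jurisdiction column as the ‘applicable’ test are all hypothetical stand-ins for whatever metadata the real rulereserve.dwd table carries.

```python
import sqlite3


def build_rule_reserve(rows):
    """Create an in-memory table of rule metadata and index the sift columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rules ("
        " rule_id TEXT PRIMARY KEY,"
        " jurisdiction TEXT NOT NULL,"    # hypothetical 'applicable' criterion
        " effective_from TEXT NOT NULL,"  # ISO 8601 dates compare correctly as text
        " effective_to TEXT NOT NULL)"
    )
    # Index the columns used by the 'in effect' and 'applicable' selection.
    conn.execute(
        "CREATE INDEX idx_rules_sift"
        " ON rules (jurisdiction, effective_from, effective_to)"
    )
    conn.executemany("INSERT INTO rules VALUES (?, ?, ?, ?)", rows)
    return conn


def sift(conn, jurisdiction, on_date):
    """Select rules that are applicable and in effect: one SELECT, no JOIN."""
    cursor = conn.execute(
        "SELECT rule_id FROM rules"
        " WHERE jurisdiction = ?"
        " AND effective_from <= ? AND effective_to >= ?"
        " ORDER BY rule_id",
        (jurisdiction, on_date, on_date),
    )
    return [row[0] for row in cursor]


# Hypothetical sample rows: (rule_id, jurisdiction, effective_from, effective_to)
SAMPLE_ROWS = [
    ("rule-a", "CA-ON", "2021-01-01", "2023-12-31"),
    ("rule-b", "CA-ON", "2019-01-01", "2020-12-31"),
    ("rule-c", "CA-QC", "2021-01-01", "2023-12-31"),
]
```

With these sample rows, sifting for "CA-ON" on 2022-05-12 would keep only rule-a: rule-b is no longer in effect on that date and rule-c belongs to another jurisdiction. The only non-trivial comparison is the date range, which fits the description above.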


[1] The initial suggestion for our design to use IPFS came from Calvin Hutcheon, and the choice to employ it as our persistent storage method was made jointly with Don Kelly.
