Logical structure of a transaction
HYPERLEDGER Fabric: Pluggable/Queryable State Database (Ledger v1)

Blockchain (File system)
• Transactions (Txn), each with Reads[] and Writes[]
• Immutable source of truth

State Database
• Latest written key/values for use in transaction simulation (current v1)
• 'Index' of the blockchain for runtime queries

Key document
{
  "asset_name": "marble1",
  "owner": "jerry",
  "date": "9/6/2016",
  "version": "2:1"
}

Transaction documents can be saved for historical reporting (proposed)

Transaction document
{
  "txId": "tx000000000000002",
  "transaction_height": "2:1",
  "function": "UpdateAsset('marble1','jerry')",
  "date": "9/6/2016",
  "rwset": {
    "reads": [
      {"key": "marble1",
       "version": "1:1"}
    ],
    "writes": [
      {"key": "marble1",
       "value": {
         "asset_name": "marble1",
         "owner": "jerry",
         "date": "9/6/2016"
       }
      }
    ]
  }
}
2
State Database - Queryability
• In a key/value database such as RocksDB, the content is a blob and only queryable by key
– Does not meet chaincode, auditing, reporting requirements for many use cases
• In a document database such as CouchDB, the content is JSON and fully queryable
– Meets a large percentage of chaincode, auditing, and simple reporting requirements
– For deeper reporting and analytics, replicate to an analytics engine such as Spark
– Compatible with existing chaincode programming model, no changes required if chaincode models key/value as JSON
• SQL data stores are also possible, but require a more complicated relational transformation layer, as well as schema management.
3
v1 Transaction Lifecycle
(with state database plugin)
Components: SDK, Endorsing Peer, Ordering Service, Committing Peer. Each transaction carries Reads[] and Writes[].
1) SDK: Submit proposal
2) Endorsing Peer: Simulate proposal in peer
• Plugin queries State DB
• Build RWSet in Peer
3) SDK: Submit transaction with simulation results
4) Committing Peer: Receive batches of transactions from Ordering Service
5) Committing Peer: Validate transaction and commit
• Plugin queries State DB during MVCC check
• Plugin pushes WriteSet to State DB
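Step 5 can be illustrated with a minimal sketch. It assumes an in-memory map in place of the pluggable state database; the type and function names (StateDB, ValidateAndCommit, and so on) are hypothetical and are not Fabric APIs. The committer compares each read version against the current state and pushes the write set only if every version still matches.

```go
package main

import "fmt"

// VersionedValue is a hypothetical state DB record: a value plus its version.
type VersionedValue struct {
	Value   string
	Version string
}

// Read and Write mirror the Reads[] / Writes[] entries of a transaction.
type Read struct{ Key, Version string }
type Write struct{ Key, Value string }

// Transaction carries the simulation results produced by the endorser.
type Transaction struct {
	ID     string
	Reads  []Read
	Writes []Write
}

// StateDB stands in for the pluggable state database (in-memory here).
type StateDB map[string]VersionedValue

// ValidateAndCommit performs the MVCC check, then applies the write set.
// newVersion is the version stamped on written keys, e.g. "block:position".
func ValidateAndCommit(db StateDB, tx Transaction, newVersion string) bool {
	for _, r := range tx.Reads {
		if db[r.Key].Version != r.Version {
			return false // key changed since simulation: transaction is invalid
		}
	}
	for _, w := range tx.Writes {
		db[w.Key] = VersionedValue{Value: w.Value, Version: newVersion}
	}
	return true
}

func main() {
	db := StateDB{"marble1": {Value: `{"owner":"tom"}`, Version: "1:1"}}
	tx := Transaction{
		ID:     "tx000000000000002",
		Reads:  []Read{{Key: "marble1", Version: "1:1"}},
		Writes: []Write{{Key: "marble1", Value: `{"owner":"jerry"}`}},
	}
	fmt.Println(ValidateAndCommit(db, tx, "2:1")) // read version matches: committed
	fmt.Println(ValidateAndCommit(db, tx, "3:1")) // replay: marble1 is now at 2:1
}
```

A real plug-in would issue the version lookup and the write as database calls; the control flow is the part this sketch is meant to show.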
4
Ledger Query - Options
Option 1 – Leverage investment of existing databases (focus of this chartdeck)
• Keep Blockchain file storage.
• Leverage existing databases for state database/queries.
– Pluggable model to support various database engines.
Option 2 – Custom query engine
• Keep Blockchain file storage.
• Keep RocksDB embedded key/value state database.
– Build indexes and query engine on top of RocksDB.
Option 3 – Custom blockchain database
• Build a custom database optimized for a converged Blockchain/State Database
– Single copy of data (blocks, transactions, keys, history… all natively linked and queryable)
(Slide arrow spanning the options, labeled: less modular, more effort.)
6
Pluggable state database – How delivered?
• Golang does not support dynamic link libraries. Therefore, the peer needs to be re-compiled to plug in a new database.
– This is something a vendor/provider would likely do
• Ledgernext code will be refactored so that it is obvious which interfaces need to be implemented.
7
Pluggable state database with query – How used?
• Provide a new SDK API for query execution (outside chaincode), e.g. lookups, reporting, auditing
– SDK API can be secured with an ACL
– Note: remove the 'query' chaincode API in order to discourage building reporting applications within chaincode
• Within 'invoke' chaincode, no changes are required if normal GetState/PutState operations are used
– Golang has native struct JSON marshaling
– Stub.PutState("marble1", marble1_json)
• Within 'invoke' chaincode, rich query can be used to identify keys (documents) to update. Two limitations to be aware of:
– Cannot query/write and then re-query in the same transaction, since the simulation results are not in the DB yet.
– The endorser/committer architecture cannot prevent phantom reads, so this solution can only be used by applications that are not sensitive to phantom reads
• A phantom read occurs when the result set at simulation time does not match the result set at commit time, due to in-flight transactions
• Example: Update/transfer all assets owned by 'tom'. A new asset for 'tom' may arrive between the simulation and commit phases, and would be missed. This cannot be solved unless we re-query at commit time and compare result sets.
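A minimal sketch of the struct-to-JSON point above. The Marble struct follows the key document used in these charts; StateWriter and memStub are hypothetical stand-ins for the chaincode stub so that the example runs without a peer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Marble follows the key document shown in the slides.
type Marble struct {
	AssetName string `json:"asset_name"`
	Color     string `json:"color"`
	Size      int    `json:"size"`
	Owner     string `json:"owner"`
}

// StateWriter is a hypothetical stand-in for the chaincode stub.
type StateWriter interface {
	PutState(key string, value []byte) error
}

// memStub records writes in memory so the sketch runs without a peer.
type memStub map[string][]byte

func (m memStub) PutState(key string, value []byte) error {
	m[key] = value
	return nil
}

// SaveMarble marshals the struct to JSON and stores it under its asset name.
func SaveMarble(stub StateWriter, m Marble) error {
	b, err := json.Marshal(m)
	if err != nil {
		return err
	}
	return stub.PutState(m.AssetName, b)
}

func main() {
	stub := memStub{}
	marble := Marble{AssetName: "marble1", Color: "blue", Size: 35, Owner: "jerry"}
	if err := SaveMarble(stub, marble); err != nil {
		panic(err)
	}
	fmt.Println(string(stub["marble1"]))
}
```

Because the stored value is already JSON, a document database can index and query its fields without any change to the chaincode.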
8
CouchDB Example
• Use existing Marbles chaincode (uses JSON data model already)
• Ledgernext plug-in implementation for CouchDB. PutState() persists to CouchDB.
{
  "asset_name": "marble1",
  "color": "blue",
  "size": 35,
  "owner": "jerry",
  "version": "2:1"
}
Scenario (Tom, Jerry)
• Create marble owned by Tom
• Transfer marble to Jerry
Query ledger using CouchDB 2.0 query language
Query marble1 current state
POST /marbles_app/_find
{"selector":{"asset_name":"marble1"}}
Query for all Jerry's marbles
POST /marbles_app/_find
{"selector":{"owner":"jerry"}}
Query full history/provenance of marble1
POST /marbles_app/_find
{
  "selector": {
    "rwset.writes": {
      "$elemMatch": {
        "value.asset_name": "marble1" } } }
}
Query full history of Jerry's transactions
POST /marbles_app/_find
{
  "selector": {
    "rwset.writes": {
      "$elemMatch": {
        "value.owner": "jerry" } } }
}
Full set of query operators, filtering, sorting also available as of CouchDB 2.0
9
Pluggable state database – More details
10
Pluggable state database - Objectives
Enrich Query API for Blockchain
• Leverage state-of-the-art databases to extend query capabilities against current state and transactions
• Both SQL and NoSQL flavors
• Ensure API supports plugging in different state databases, for example by a vendor building on top of Fabric
• Query support opportunities
• Current state
• Historical point in time
• Provenance
• Geo-location
• Text
Speed delivery and quality by leveraging investments in existing database engines
• Allow new R&D to focus on blockchain specific capabilities
• Do not re-invent the wheel
• To the degree possible, embed database and leverage capabilities within fabric, rather than requiring DBA skills
Most likely, only the state database would be pluggable,
not the actual block storage that resides on file system in ledgernext.
11
Pluggable state database - The Challenge
• How to support the v1 endorsement/simulation model, when most databases do not support simulation result sets?
– That is, how to make uncommitted updates to an arbitrary database and determine the ReadWriteSet that is required for endorsement and commit validation? Not possible with most databases…
• Proposed Solution:
– Query database for key values during simulation, using the database's rich query language
– Perform simulation updates in a private workspace (peer memory) using normal chaincode APIs, e.g. PutState()
– Get ReadWriteSet from endorser simulation (Reads come from DB queries, Writes come from the simulation private workspace)
– Endorsement, Consensus, Validation use transaction ReadWriteSet Simulation Results as normal
– Push Writes to database during Commit phase
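The proposed solution can be sketched as a simulator that reads from committed state and buffers writes in peer memory. This is an illustration under assumptions: the Simulator type and its methods are hypothetical, and a plain map stands in for the state database.

```go
package main

import "fmt"

type Versioned struct{ Value, Version string }
type Read struct{ Key, Version string }
type Write struct{ Key, Value string }

// Simulator reads from committed state and buffers writes in memory.
type Simulator struct {
	committed map[string]Versioned // stand-in for the state database
	reads     []Read
	writes    map[string]string // private workspace, not pushed during simulation
	order     []string          // keys in first-write order
}

func NewSimulator(committed map[string]Versioned) *Simulator {
	return &Simulator{committed: committed, writes: map[string]string{}}
}

// GetState queries committed state and records the version that was read.
func (s *Simulator) GetState(key string) string {
	v := s.committed[key]
	s.reads = append(s.reads, Read{key, v.Version})
	return v.Value
}

// PutState only touches the private workspace.
func (s *Simulator) PutState(key, value string) {
	if _, seen := s.writes[key]; !seen {
		s.order = append(s.order, key)
	}
	s.writes[key] = value
}

// RWSet returns the simulation results used for endorsement and validation.
func (s *Simulator) RWSet() ([]Read, []Write) {
	ws := make([]Write, 0, len(s.order))
	for _, k := range s.order {
		ws = append(ws, Write{k, s.writes[k]})
	}
	return s.reads, ws
}

func main() {
	db := map[string]Versioned{"marble1": {`{"owner":"Tom"}`, "1:1"}}
	sim := NewSimulator(db)
	_ = sim.GetState("marble1")
	sim.PutState("marble1", `{"owner":"Jerry"}`)
	reads, writes := sim.RWSet()
	fmt.Println(reads, writes)
	// A second read still returns Tom's document: the write is not in the DB yet.
	fmt.Println(sim.GetState("marble1"))
}
```

The last line of main shows why a transaction cannot write a key and then re-read its own update during simulation.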
12
Pluggable state database – CouchDB (NoSQL document DB)
• Assumption – start with single-node CouchDB as the first pluggable NoSQL document database
• CouchDB is a JSON document store. The JSON document will serve as the chaincode application key's value
– Support for complex objects
– Chaincode manipulates JSON document
– JSON document gets persisted in ledger (block's WriteSet as well as state database)
– Use CouchDB 2.0's rich query language, or as needed, use CouchDB's map/reduce views.
• These charts will use a simple asset transfer scenario
13
Pluggable database – transaction lifecycle for asset transfer example
Get current key state in chaincode, using normal chaincode API, e.g. GetState()
Stub.GetState("marble1")
Database plug-in generates database query to retrieve JSON value, e.g. GET /<ChaincodeID>/marble1
RESPONSE: { "_id":"marble1", "asset_name":"marble1", "owner":"Tom", "date":"9/5/2016", "txId":"f81d4fae-7dec-11d0-a765-00a0c91e6bf6", "_rev":"5" }
Database plug-in removes 'internal' fields before passing to chaincode ("_id", "txId", "_rev")
OldValue = { "asset_name":"marble1", "owner":"Tom", "date":"9/5/2016" }
Simulate the transaction updates as normal in chaincode
NewValue = { "asset_name":"marble1", "owner":"Jerry", "date":"9/6/2016" }
Stub.PutState("marble1", []byte(NewValue))
Simulation results in a ReadWriteSet as normal
"Reads":[{"key":"marble1", "version":"5"}]
"Writes":[{"key":"marble1", "value":{ "asset_name":"marble1", "owner":"Jerry", "date":"9/6/2016" } }]
Endorsement, Consensus, and Committer Validation as normal
During final commit, the database plug-in converts the read set into a query for MVCC check, and the write set into a database update, and re-adds 'internal' fields for persistence ("_id", "txId", "_rev")
PUT /<ChaincodeID>/marble1
{ "_id":"marble1", "asset_name":"marble1", "owner":"Jerry", "date":"9/6/2016", "txId":"0b1f4cc8-75d6-11e6-8b77-86f30ca893d3", "_rev":"5" }
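The removal and re-adding of the 'internal' fields could look roughly like the following. StripInternal and AddInternal are hypothetical helper names, and the field list is taken from the example above; an actual plug-in may track these fields differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalFields are the plug-in's bookkeeping fields from the example above.
var internalFields = []string{"_id", "_rev", "txId"}

// StripInternal removes internal fields from a stored document and returns
// the chaincode-visible value plus the version (_rev) for the read set.
func StripInternal(stored []byte) (value []byte, version string, err error) {
	var doc map[string]interface{}
	if err = json.Unmarshal(stored, &doc); err != nil {
		return nil, "", err
	}
	version, _ = doc["_rev"].(string)
	for _, f := range internalFields {
		delete(doc, f)
	}
	value, err = json.Marshal(doc) // map keys are emitted in sorted order
	return value, version, err
}

// AddInternal re-adds the internal fields before the document is persisted.
func AddInternal(value []byte, id, rev, txID string) ([]byte, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(value, &doc); err != nil {
		return nil, err
	}
	doc["_id"], doc["_rev"], doc["txId"] = id, rev, txID
	return json.Marshal(doc)
}

func main() {
	stored := []byte(`{"_id":"marble1","asset_name":"marble1","owner":"Tom",` +
		`"date":"9/5/2016","txId":"f81d4fae-7dec-11d0-a765-00a0c91e6bf6","_rev":"5"}`)
	value, version, err := StripInternal(stored)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(value), version)
}
```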
14
Querying for multiple keys
Example: Transfer all of Tom’s assets to Jerry
Stub.GetStateMultipleKeysUsingQuery(`{"selector": {"owner": "Tom"}}`)
Database plug-in generates database query, e.g.
POST /<ChaincodeID>/_find
{"selector": {"owner": "Tom"}}
3 documents returned and placed in simulation read-set:
{"_id":"marble1", "asset_name":"marble1", "owner":"Tom", "date":"9/5/2016", "txId":"0b1f4cc8-75d6-11e6-8b77-86f30ca893d3", "_rev":"5"}
{"_id":"marble2", "asset_name":"marble2", "owner":"Tom", "date":"8/6/2016", "txId":"2a1f4cc8-75d6-11e6-8b77-86f30ca893b2", "_rev":"8"}
{"_id":"marble3", "asset_name":"marble3", "owner":"Tom", "date":"8/22/2016", "txId":"5d1f4cc8-75d6-11e6-8b77-86f30ca893e1", "_rev":"10"}
Plug-in needs to specify which fields to use for primary key, version, and transaction id
• _id, _rev, txId in above example
Then in chaincode, use SetStateMultipleKeys() as normal, to change the owner of all assets to 'Jerry', within the respective JSON values (documents)
Note – Risk of phantom reads: for example, if a fourth asset is transferred to Tom between simulation time and commit time, this asset will be missed. This cannot be solved unless we re-query at commit time and compare result sets.
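The re-query-and-compare idea mentioned in the note can be sketched as follows. This is not part of the proposal in these charts, only an illustration of what such a check might involve; Doc, queryByOwner, and PhantomKeys are hypothetical names, and a slice stands in for the state database.

```go
package main

import (
	"fmt"
	"sort"
)

// Doc is a minimal stand-in for a state database document.
type Doc struct{ Key, Owner, Version string }

// queryByOwner plays the role of the rich query; it returns key -> version.
func queryByOwner(db []Doc, owner string) map[string]string {
	out := map[string]string{}
	for _, d := range db {
		if d.Owner == owner {
			out[d.Key] = d.Version
		}
	}
	return out
}

// PhantomKeys returns the keys whose presence or version differs between the
// simulation-time result set and the commit-time result set.
func PhantomKeys(simulated, atCommit map[string]string) []string {
	var diff []string
	for k, v := range atCommit {
		if sv, ok := simulated[k]; !ok || sv != v {
			diff = append(diff, k)
		}
	}
	for k := range simulated {
		if _, ok := atCommit[k]; !ok {
			diff = append(diff, k)
		}
	}
	sort.Strings(diff)
	return diff
}

func main() {
	db := []Doc{{"marble1", "Tom", "5"}, {"marble2", "Tom", "8"}, {"marble3", "Tom", "10"}}
	simulated := queryByOwner(db, "Tom")
	// An in-flight transaction gives Tom a fourth marble before commit.
	db = append(db, Doc{"marble4", "Tom", "11"})
	fmt.Println(PhantomKeys(simulated, queryByOwner(db, "Tom"))) // [marble4]
}
```

A non-empty result would mean the transaction's query no longer sees the same documents and should be treated as invalid.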
15
Transaction documents
Write a transaction document to shadow the blockchain ledger, for reporting/audit queries only.
Blockchain (File system)
• Transactions (Txn), each with Reads[] and Writes[]

State Database (RocksDB by default)
• Latest written key/values for use in transaction simulation (current v1)

Key document
{
  "asset_name": "marble1",
  "owner": "Jerry",
  "date": "9/6/2016",
  "txId": "tx000000000000002"
}

Transaction documents can be saved for reporting (future)

Transaction document
{
  "txId": "tx000000000000002",
  "block": 2,
  "block_position": 1,
  "function": "UpdateAsset('marble1','jerry')",
  "state": "completed",
  "date": "9/6/2016",
  "rwset": {
    "reads": [
      {"key": "marble1",
       "version": "tx000000000000001"}
    ],
    "writes": [
      {"key": "marble1",
       "version": "tx000000000000001",
       "value": {
         "asset_name": "marble1",
         "owner": "jerry",
         "date": "9/6/2016"
       }
      }
    ]
  }
}
Rich queries
With the history of transactions in CouchDB, we can also create views and perform interesting state, point in time, provenance, and audit queries, for example:
• Show full history for marble1
• Show full history for all Jerry's assets and transactions
• Which transactions was marble1 involved in?
• Which blocks contain those transactions?
• What assets and owners were involved in transaction 0b1f4cc8-75d6-11e6-8b77?

Key document
{
  "_id": "marble1",
  "asset_name": "marble1",
  "owner": "Jerry",
  "date": "9/6/2016",
  "txId": "0b1f4cc8-75d6-11e6-8b77-86f30ca893d3",
  "_rev": "5"
}

Transaction document
{
  "txId": "0b1f4cc8-75d6-11e6-8b77-86f30ca893d3",
  "block": 1,
  "block_position": 1,
  "function": "UpdateAsset('marble1','Jerry')",
  "state": "completed",
  "rwset": {
    "reads": [
      {
        "key": "marble1",
        "version": "5"
      }
    ],
    "writes": [
      {
        "key": "marble1",
        "version": "5",
        "value": {
          "asset_name": "marble1",
          "owner": "Jerry",
          "date": "9/6/2016"
        }
      }
    ]
  },
  "date": "9/6/2016"
}
17
Pluggable database – RDBMS
• Could provide a generic Go "database/sql" plug-in
• Solution-specific tables custom-defined alongside chaincode
– Customized and optimized for use case and expected queries
– These charts will use a simple asset transfer scenario and ASSETS table as an example
• Option to deliver ledger tables in database as well (BLOCK, BLOCK_TRANSACTIONS)
• Either as shadow tables to query the blockchain ledger, or perhaps even as the primary blockchain ledger
18
Pluggable database – transaction lifecycle for asset transfer
Get current key state in chaincode, using query language specific to the database plug-in
Stub.GetStateUsingQuery("123", "Select asset_id, owner, date, version from assets where asset_id = ?", "123")
Database plug-in is responsible for converting the result set into a JSON 'value' for use in chaincode; could also use a table API metaphor

asset_id | owner | date     | Version (txId)
123      | Tom   | 9/5/2016 | f81d4fae-7dec-11d0-a765-00a0c91e6bf6

OldValue = {
  "asset_id": "123",
  "owner": "Tom",
  "date": "9/5/2016"
}
Simulate the transaction updates as normal in chaincode
NewValue = {
  "asset_id": "123",
  "owner": "Jerry",
  "date": "9/6/2016"
}
Stub.PutState("123", []byte(NewValue))
Simulation results in a ReadWriteSet as normal
"Reads":[{"key":"123", "version":"f81d4fae-7dec-11d0-a765-00a0c91e6bf6"}]
"Writes":[{"key":"123", "value":{ "asset_id":"123", "owner":"Jerry", "date":"9/6/2016" } }]
Endorsement, Consensus, and Committer Validation as normal
During final commit, a generator provided by the plugin is used to convert the Read set into a query for MVCC check, and the Write set into the data manipulation language of the specific database
update assets
set owner = 'Jerry', date = '9/6/2016',
    version = '0b1f4cc8-75d6-11e6-8b77-86f30ca893d3'
where asset_id = '123'

asset_id | owner | date     | Version (txId)
123      | Jerry | 9/6/2016 | 0b1f4cc8-75d6-11e6-8b77-86f30ca893d3
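A generator of the kind described above might be sketched like this. BuildUpdate and its parameters are hypothetical; the table, key column, and version column are assumed to come from plug-in configuration, and column names taken from the JSON value would need to be checked against the table schema before being placed in SQL text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// BuildUpdate turns one write-set entry into a parameterized UPDATE statement.
// table, keyColumn, and versionColumn are assumed plug-in configuration values.
func BuildUpdate(table, keyColumn, versionColumn, key, txID string, value []byte) (string, []interface{}, error) {
	var fields map[string]interface{}
	if err := json.Unmarshal(value, &fields); err != nil {
		return "", nil, err
	}
	delete(fields, keyColumn) // the primary key belongs in WHERE, not SET
	cols := make([]string, 0, len(fields))
	for c := range fields {
		cols = append(cols, c)
	}
	sort.Strings(cols) // same statement text on every committer
	sets := make([]string, 0, len(cols)+1)
	args := make([]interface{}, 0, len(cols)+2)
	for _, c := range cols {
		sets = append(sets, c+" = ?")
		args = append(args, fields[c])
	}
	sets = append(sets, versionColumn+" = ?")
	args = append(args, txID, key)
	stmt := fmt.Sprintf("UPDATE %s SET %s WHERE %s = ?",
		table, strings.Join(sets, ", "), keyColumn)
	return stmt, args, nil
}

func main() {
	value := []byte(`{"asset_id":"123","owner":"Jerry","date":"9/6/2016"}`)
	stmt, args, err := BuildUpdate("assets", "asset_id", "version",
		"123", "0b1f4cc8-75d6-11e6-8b77-86f30ca893d3", value)
	if err != nil {
		panic(err)
	}
	fmt.Println(stmt)
	fmt.Println(args...)
}
```

Using placeholders rather than inlined literals keeps the written values out of the SQL text; the statement and argument list could then be passed to a database/sql Exec call.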
19
Querying for multiple keys
Example: Transfer all of Tom’s assets to Jerry
Stub.GetStateMultipleKeysUsingQuery("Select asset_id, owner, date, version from assets where owner = ?", "Tom")

asset_id | owner | date      | Version (txId)
123      | Tom   | 9/5/2016  | 0b1f4cc8-75d6-11e6-8b77-86f30ca893d3
456      | Tom   | 8/6/2016  | 2a1f4cc8-75d6-11e6-8b77-86f30ca893b2
789      | Tom   | 8/22/2016 | 5d1f4cc8-75d6-11e6-8b77-86f30ca893e1

Solution needs to specify which result set columns to use for primary key and version (asset_id, version columns in above example)
Then in chaincode, use SetStateMultipleKeys() as normal, to change the owner of all assets to 'Jerry' (within the respective JSON values)
20
Tables after asset transfer committed
ASSETS table (latest state, custom table)
Asset table custom defined by chaincode developer.

asset_id | owner | date     | Version (txId)
123      | Jerry | 9/6/2016 | 0b1f4cc8-75d6-11e6-8b77-86f30ca893d3

ASSET_HISTORY table (write once, custom table)
Asset history table custom defined by chaincode developer. (populated by db trigger on asset table?)

asset_id | owner | date     | Version (txId)
123      | Tom   | 9/5/2016 | f81d4fae-7dec-11d0-a765-00a0c91e6bf6
123      | Jerry | 9/6/2016 | 0b1f4cc8-75d6-11e6-8b77-86f30ca893d3

BLOCKS (write once)

block | hash       | prior_block_hash | date     | block_blob
1     | bbbbbbbb   | aaaaaaaa         | 9/5/2016 | binary_data
2     | cccccccccc | bbbbbbbb         | 9/6/2016 | binary_data

BLOCK_TRANSACTIONS (write once)

tx_id                                | block | block_position | function                   | RWSet                   | date
f81d4fae-7dec-11d0-a765-00a0c91e6bf6 | 1     | 1              | CreateAsset('123','Tom')   | Reads:[…] Writes:[…]    | 9/5/2016
0b1f4cc8-75d6-11e6-8b77-86f30ca893d3 | 2     | 1              | UpdateAsset('123','Jerry') | Reads:[…] Writes:[…]    | 9/6/2016

Block and transaction tables delivered with database plug-in.
Commit() will update asset state, history, and block tables in one atomic transaction.
21
Queries
With asset state and history in database, we can now perform interesting state, point in time, and provenance queries.

Show full history for asset 123:
select * from ASSET_HISTORY
where asset_id = '123'
order by date desc

Show full history for Jerry:
select * from ASSET_HISTORY
where owner = 'Jerry'
order by date desc

Easily correlate assets, transactions, and blocks for other audit queries, e.g.
- Which transactions was asset 123 involved in?
- Which blocks contain those transactions?
- What assets and owners were involved in transaction 0b1f4cc8-75d6-11e6-8b77?
Improve query performance with traditional table indexes.

ASSETS table (latest state, custom table)

asset_id | owner | date     | Version (txId)
123      | Jerry | 9/6/2016 | 0b1f4cc8-75d6-11e6-8b77-86f30ca893d3

ASSET_HISTORY table (write once, custom table)

asset_id | owner | date     | Version (txId)
123      | Tom   | 9/5/2016 | f81d4fae-7dec-11d0-a765-00a0c91e6bf6
123      | Jerry | 9/6/2016 | 0b1f4cc8-75d6-11e6-8b77-86f30ca893d3

BLOCKS (write once)

block | hash       | prior_block_hash | date     | block_blob
1     | bbbbbbbb   | aaaaaaaa         | 9/5/2016 | binary_data
2     | cccccccccc | bbbbbbbb         | 9/6/2016 | binary_data

BLOCK_TRANSACTIONS (write once)

tx_id                                | block | block_position | function                   | RWSet                   | date
f81d4fae-7dec-11d0-a765-00a0c91e6bf6 | 1     | 1              | CreateAsset('123','Tom')   | Reads:[…] Writes:[…]    | 9/5/2016
0b1f4cc8-75d6-11e6-8b77-86f30ca893d3 | 2     | 1              | UpdateAsset('123','Jerry') | Reads:[…] Writes:[…]    | 9/6/2016
22
Restrictions
• Cannot Insert/Update/Delete using database syntax
– Must use normal PutState() methods in order to build the ReadWriteSet required by the v1 endorsement/simulation model
• Normal Read/Write transactions only.
– Cannot Read/Write/Read/Write the same key multiple times in one transaction (since the simulated updates are not in the database yet).
Alternate options with this approach
• Instead of having write SQL generated on the committer side based on the JSON key value, SQL could potentially be an output of chaincode simulation in a new 'ExecutionSet' that is included in the endorsed action. Each of the committers would simply execute the SQL that comes out of simulation.
• It is assumed that the BLOCKS and BLOCK_TRANSACTIONS tables shadow the actual blockchain ledger for query purposes. But if the database provides blockchain attributes such as immutable tables and atomic updates, the primary blockchain ledger could be in the database, ensuring that the blockchain ledger and state database are always in sync.
24
Other approaches to investigate
1) Explore data source(s) where it is possible to intercept at different stages of simulation/validation/commit, for example read the database's uncommitted transaction log to achieve simulation.
2) Custom build a richer data model on top of existing key/value data store
• A JSON data model and a simple query language on JSON data model
• RWSet to be the delta change in JSON records
• Include primary key in every RWSet
• Use additional column families for indexing secondary fields
DatamodelDef
{
  "schemaName": "personalDetails",
  "schema": {
    "name": {
      "firstName": "string",
      "lastName": "string"
    },
    "address": "string",
    "age": "int"
  },
  "PrimaryKey": ["name.firstName", "name.lastName"],
  "indexOn": ["age"]
}
updateQuery examples
{
  "operation": "update",
  "schemaName": "personalDetails",
  "data": {
    "address": "New City"
  },
  "query": {
    "type": "PrimaryKey",
    "PK": {
      "name": {
        "firstName": "abc",
        "lastName": "xyz"
      }
    }
  }
}
{
  "operation": "insert",
  "schemaName": "personalDetails",
  "data": {
    "name": {
      "firstName": "abc",
      "lastName": "xyz"
    },
    "age": 30
  }
}
searchQuery example
{
  "type": "RangeQuery",
  "field": "age",
  "range": [20, 40]
}
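How a secondary index could serve the RangeQuery above is sketched below. This is an in-memory illustration only: Person, ageEntry, BuildAgeIndex, and RangeQuery are hypothetical names, a sorted slice stands in for an index column family, and treating the range bounds as inclusive is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Person mirrors the personalDetails schema above.
type Person struct {
	FirstName, LastName, Address string
	Age                          int
}

// ageEntry is one row of a hypothetical secondary index: age -> primary key.
type ageEntry struct {
	Age int
	PK  string
}

// BuildAgeIndex builds an index sorted by age; the primary key is
// "firstName/lastName", following the PrimaryKey definition above.
func BuildAgeIndex(people []Person) []ageEntry {
	idx := make([]ageEntry, 0, len(people))
	for _, p := range people {
		idx = append(idx, ageEntry{p.Age, p.FirstName + "/" + p.LastName})
	}
	sort.Slice(idx, func(i, j int) bool {
		if idx[i].Age != idx[j].Age {
			return idx[i].Age < idx[j].Age
		}
		return idx[i].PK < idx[j].PK
	})
	return idx
}

// RangeQuery returns primary keys with lo <= age <= hi (bounds assumed inclusive).
func RangeQuery(idx []ageEntry, lo, hi int) []string {
	// Binary search for the first entry at or above the lower bound.
	start := sort.Search(len(idx), func(i int) bool { return idx[i].Age >= lo })
	var keys []string
	for i := start; i < len(idx) && idx[i].Age <= hi; i++ {
		keys = append(keys, idx[i].PK)
	}
	return keys
}

func main() {
	people := []Person{
		{"abc", "xyz", "New City", 30},
		{"def", "uvw", "Old Town", 45},
		{"ghi", "rst", "New City", 20},
	}
	fmt.Println(RangeQuery(BuildAgeIndex(people), 20, 40)) // [ghi/rst abc/xyz]
}
```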