Concepts
Barter System - where goods and/or services are used in exchange for each other.
Credit Based System - where debt is used i.e. IOU/favours.
Cash Based System - where physical cash is used to buy/sell goods or services. This system needs to be bootstrapped with initial cash or trading to occur. This system allows assigning a numeric value to items.
Credit Intermediaries - like PayPal, allow buyers to provide their credit card details to them without revealing to sellers.
Key Characteristics of Cash Based System -
- direct interaction between buyer and seller
- neither party reveals their personal details (privacy)
- post transaction the parties cannot trace each other (anonymous)
- works offline i.e. no electronic components
- user-to-user and user-to-merchant mode
- money stolen is lost forever
Key Characteristics of Intermediary Based System -
- direct interaction between buyer and seller
- buyer does not reveal credit card details to seller
- requires an online system (network of devices) to be available
- user-to-merchant mode
- security against stolen passwords
Key Characteristics of Credit Card Based System -
- direct interaction between buyer and seller
- buyer has to reveal credit card details to seller
- post transaction its possible for parties to trace each other
- requires an online system (network of devices) to be available
- user-to-merchant mode
- security against stolen cards
Both credit card and intermediary based systems are backed by cash i.e. money in your savings account.
Digital Cash - is converting physical cash in to electronic cash, usually the conversion ratio is 1:1. Digital cash and spending from savings account are the same concepts.
Digital Currency - is a currency which is not backed by physical currency and does not have equivalent physical notes. It may be convertible to another currency (digital or physical) at a conversion ratio which is market determined.
Minting Digital Currency - Printing physical currency requires an initial large fixed cost to setup machinery, the cost of printing additional notes is small compared to this. Control over amount of physical currency printed keeps a check on the exchange rate and inflation. Similarly, controls over minting digital currency prevent unlimited production. One way to do this is to require that a puzzle e solved to mint a unit of currency, the complexity of solving the puzzle is high enough to limit production. Each puzzle requires a certain amount of time to be spent in solving it, solving one puzzle doesn't imply that following puzzles can be solved in lesser time and over time the complexity of puzzles will increase (since computing power increases year on year, time spent will reduce unless complexity accounts for this).
Ledger - Or Blockchain is a central repository of all transactions. At a high level, the concept involves registering a transaction in the ledger, the timestamp of the registration is recorded, the transaction and timestamp are signed and appended to a list of (already) registered transactions. On registrations, a transaction contains its own data, a hash to the previous transaction, a pointer to the previous transaction and collectively these three items are signed. The hash to previous transaction ensures that any attempt to modify history would require all subsequent transactions to be modified i.e. the hash value needs to be updated and each subsequent transaction needs to be signed again i.e. a high cost.
Hash Function - is a mathematical function which has three properties -
- it can accept an input of any size e.g. 1000 bit string
- it produces a fixed size output e.g. 128 bit output
- it is efficiently computible e.g. for a n-bit input the function takes O(n) time
Cryptographic Hash Function - is a hash function with three additional properties -
- collision resistance
- hiding
- puzzle friendly
Collision Resistance
A collision is when two distinct input strings when processed by the hash function lead to the same output i.e. for hash function H(), if x != y and H(x) = H(y).
The universe of inputs is infinite (as input strings can be of any length) whereas the universe of output is finite (as output size is fixed), thus there will be collisions for any hash function. However the property of collision resistance requires that nobody should be able to find the collisions.
Hash functions are chosen such that the effort required to find a collision is so large that no one would ever succeed in a reasonable amount of time. When collisions are detected for accepted hash functions, they are deprecated e.g. MD5.
Message Digest - a useful case for hash functions i.e. for a large input if the hash output is a fixed size string of much smaller size, then the hash value of the input is used to represent the input. This is known as message digest. Example users are when downloading large files, the published hash value is used to compare with the hash of the file downloaded, thus allowing the user to verify that the file downloaded has not been corrupted during the transfer.
Hiding
Given y = H(x), if y is known, it should be infeasible to determine x i.e. given an output it should not be possible to determine the input to the hash function. This is possible if the input strings are spread out in the universe of input values i.e. high min-entropy. If the input universe is not spread out, then the input can be appended to another secret input which would make the final input value difficult to determine i.e. y = H(x||r) where r is the secret input.
Note, there should be not be another input x' and secret r' such that x != x but H(x||r) == H(x'||r') i.e. collision resistance should hold.
The secret value r is also referred to as Nonce. A nonce is randomly chose once and not repeated for other inputs.
Puzzle Friendly
A hash function is puzzle friendly if H(k||x) = y, where y is n-bits long, k is chosen from high min-entropy distribution and it is infeasible to find x within time less than 2^n.
Merkle-Damgard Transform
Is a method of converting a hash function which accepts a fixed length input into one which accepts a variable length input. In this technique, the hash function is known as the compression function. If the compression function is collision resistant then the overall hash function which accepts variable length input is also collision resistance.
https://upload.wikimedia.org/wikipedia/commons/thumb/e/ed/Merkle-Damgard_hash_big.svg/640px-Merkle-Damgard_hash_big.svg.png
The fixed sized input length is denoted by m, output of hash function is denoted by n. The variable length input is divided into blocks of length m-n. The first block is paired with an Initialization Vector and fed to the hash function, subsequent blocks are paired with the output of the hash function from the previous block and fed back into the hash function. If the last block is of length less than m-n, it is padded with zeros.
Hash Pointer
A pointer to data along with a cryptographic hash of the value of data. Hash pointers are useful in any acyclic data structure.
Block Chain
A linked list where the pointers are hash pointers.
Genesis Block
The head of the list (block chain) is known as the Genesis Block. Each block in a block chain contains a hash pointer to the previous block, the head of the block chain is stored in a special place where it cannot be tampered with. This prevents anyone from tampering with the block chain i.e. if someone attempted to alter the contents of block (t), they will have to alter the hash pointer in the subsequent block (t+1), and this would cause a change of data for this block and thus they would then have to alter the hash pointer of the subsequent block (t+2)... and so on, all the way up to the head of the chain, which is stored in a safe location and cannot be altered, thus preventing the tampering of the original block (t).
Merkle Tree
Is a binary tree with hash pointers.The leave nodes are data nodes, each node above a leaf node contains a hash pointer to both leaf nodes, and the nodes at the third level contain hash pointers pointing to the nodes below it... and so on till the root of the tree is reached. Merkle trees are useful for proof of membership, the applicant needs to show the sequence of nodes up to the root for which this node is a member. This is simpler than Block Chain as the number of items that need to be show are log(n).
Sorted Merkle Tree
Where the data blocks at the bottom are sorted in an order e.g. numerical, alphabetical, etc. With sorted merkle trees, the proof of non-membership is easier to perform by establishing that the node previous and the subsequent node are consecutive without any space between them.
Digital Signatures
Three properties are required to digitally sign anything - public key, secret key, signature. Public key is shared with everyone whereas the secret key is kept private with the person who signs. The signature is an algorithm which accepts a 'message' and signs it with the secret key. Anyone can verify the signature using a generic 'validate' algorithm which accepts the signature, message and public key.
sign = signature(message, secret key)
bool v = validate(sign, message, public key)
validate(signature(message, secret key), message, public key) == true
It is also possible to sign the hash of the message. Since hash output is fixed versus variable length message, this reduces the length of input to be provided to the signature algorithm.
Signing hash pointers is equivalent to signing the structure pointed to by the pointer. Thus signing the head of a block chain is equivalent to signing the entire block chain.
ECDSA
Bitcoin uses Elliptic Curve Digital Signature Algorithm as its digital signature scheme. Key sizes -
Private Key 256 bits
Public Key Uncompressed 512 bits
Public Key Compressed 257 bits
Message to be signed 256 bits
Signature 512 bits
The message size of 256 bits is not a limitation, as the hash value of the message is signed.
Bitcoin Addresses
The process of generating a pair of public and secret keys is well known and anyone can generate the keys for themselves. The public key or a hash of the public key serves as an identify of the user. This identity is known as address in Bitcoin system. One person can generate multiple pairs of keys i.e. identities for themselves. The intent is to have a decentralized system for creating and managing identities, thus providing anonymity to the people.
Barter System - where goods and/or services are used in exchange for each other.
Credit Based System - where debt is used i.e. IOU/favours.
Cash Based System - where physical cash is used to buy/sell goods or services. This system needs to be bootstrapped with initial cash or trading to occur. This system allows assigning a numeric value to items.
Credit Intermediaries - like PayPal, allow buyers to provide their credit card details to them without revealing to sellers.
Key Characteristics of Cash Based System -
- direct interaction between buyer and seller
- neither party reveals their personal details (privacy)
- post transaction the parties cannot trace each other (anonymous)
- works offline i.e. no electronic components
- user-to-user and user-to-merchant mode
- money stolen is lost forever
Key Characteristics of Intermediary Based System -
- direct interaction between buyer and seller
- buyer does not reveal credit card details to seller
- requires an online system (network of devices) to be available
- user-to-merchant mode
- security against stolen passwords
Key Characteristics of Credit Card Based System -
- direct interaction between buyer and seller
- buyer has to reveal credit card details to seller
- post transaction its possible for parties to trace each other
- requires an online system (network of devices) to be available
- user-to-merchant mode
- security against stolen cards
Both credit card and intermediary based systems are backed by cash i.e. money in your savings account.
Digital Cash - is converting physical cash in to electronic cash, usually the conversion ratio is 1:1. Digital cash and spending from savings account are the same concepts.
Digital Currency - is a currency which is not backed by physical currency and does not have equivalent physical notes. It may be convertible to another currency (digital or physical) at a conversion ratio which is market determined.
Minting Digital Currency - Printing physical currency requires an initial large fixed cost to setup machinery, the cost of printing additional notes is small compared to this. Control over amount of physical currency printed keeps a check on the exchange rate and inflation. Similarly, controls over minting digital currency prevent unlimited production. One way to do this is to require that a puzzle e solved to mint a unit of currency, the complexity of solving the puzzle is high enough to limit production. Each puzzle requires a certain amount of time to be spent in solving it, solving one puzzle doesn't imply that following puzzles can be solved in lesser time and over time the complexity of puzzles will increase (since computing power increases year on year, time spent will reduce unless complexity accounts for this).
Ledger - Or Blockchain is a central repository of all transactions. At a high level, the concept involves registering a transaction in the ledger, the timestamp of the registration is recorded, the transaction and timestamp are signed and appended to a list of (already) registered transactions. On registrations, a transaction contains its own data, a hash to the previous transaction, a pointer to the previous transaction and collectively these three items are signed. The hash to previous transaction ensures that any attempt to modify history would require all subsequent transactions to be modified i.e. the hash value needs to be updated and each subsequent transaction needs to be signed again i.e. a high cost.
Hash Function - is a mathematical function which has three properties -
- it can accept an input of any size e.g. 1000 bit string
- it produces a fixed size output e.g. 128 bit output
- it is efficiently computible e.g. for a n-bit input the function takes O(n) time
Cryptographic Hash Function - is a hash function with three additional properties -
- collision resistance
- hiding
- puzzle friendly
Collision Resistance
A collision is when two distinct input strings when processed by the hash function lead to the same output i.e. for hash function H(), if x != y and H(x) = H(y).
The universe of inputs is infinite (as input strings can be of any length) whereas the universe of output is finite (as output size is fixed), thus there will be collisions for any hash function. However the property of collision resistance requires that nobody should be able to find the collisions.
Hash functions are chosen such that the effort required to find a collision is so large that no one would ever succeed in a reasonable amount of time. When collisions are detected for accepted hash functions, they are deprecated e.g. MD5.
Message Digest - a useful case for hash functions i.e. for a large input if the hash output is a fixed size string of much smaller size, then the hash value of the input is used to represent the input. This is known as message digest. Example users are when downloading large files, the published hash value is used to compare with the hash of the file downloaded, thus allowing the user to verify that the file downloaded has not been corrupted during the transfer.
Hiding
Given y = H(x), if y is known, it should be infeasible to determine x i.e. given an output it should not be possible to determine the input to the hash function. This is possible if the input strings are spread out in the universe of input values i.e. high min-entropy. If the input universe is not spread out, then the input can be appended to another secret input which would make the final input value difficult to determine i.e. y = H(x||r) where r is the secret input.
Note, there should be not be another input x' and secret r' such that x != x but H(x||r) == H(x'||r') i.e. collision resistance should hold.
The secret value r is also referred to as Nonce. A nonce is randomly chose once and not repeated for other inputs.
Puzzle Friendly
A hash function is puzzle friendly if H(k||x) = y, where y is n-bits long, k is chosen from high min-entropy distribution and it is infeasible to find x within time less than 2^n.
Merkle-Damgard Transform
Is a method of converting a hash function which accepts a fixed length input into one which accepts a variable length input. In this technique, the hash function is known as the compression function. If the compression function is collision resistant then the overall hash function which accepts variable length input is also collision resistance.
https://upload.wikimedia.org/wikipedia/commons/thumb/e/ed/Merkle-Damgard_hash_big.svg/640px-Merkle-Damgard_hash_big.svg.png
The fixed sized input length is denoted by m, output of hash function is denoted by n. The variable length input is divided into blocks of length m-n. The first block is paired with an Initialization Vector and fed to the hash function, subsequent blocks are paired with the output of the hash function from the previous block and fed back into the hash function. If the last block is of length less than m-n, it is padded with zeros.
Hash Pointer
A pointer to data along with a cryptographic hash of the value of data. Hash pointers are useful in any acyclic data structure.
Block Chain
A linked list where the pointers are hash pointers.
Genesis Block
The head of the list (block chain) is known as the Genesis Block. Each block in a block chain contains a hash pointer to the previous block, the head of the block chain is stored in a special place where it cannot be tampered with. This prevents anyone from tampering with the block chain i.e. if someone attempted to alter the contents of block (t), they will have to alter the hash pointer in the subsequent block (t+1), and this would cause a change of data for this block and thus they would then have to alter the hash pointer of the subsequent block (t+2)... and so on, all the way up to the head of the chain, which is stored in a safe location and cannot be altered, thus preventing the tampering of the original block (t).
Merkle Tree
Is a binary tree with hash pointers.The leave nodes are data nodes, each node above a leaf node contains a hash pointer to both leaf nodes, and the nodes at the third level contain hash pointers pointing to the nodes below it... and so on till the root of the tree is reached. Merkle trees are useful for proof of membership, the applicant needs to show the sequence of nodes up to the root for which this node is a member. This is simpler than Block Chain as the number of items that need to be show are log(n).
Sorted Merkle Tree
Where the data blocks at the bottom are sorted in an order e.g. numerical, alphabetical, etc. With sorted merkle trees, the proof of non-membership is easier to perform by establishing that the node previous and the subsequent node are consecutive without any space between them.
Digital Signatures
Three properties are required to digitally sign anything - public key, secret key, signature. Public key is shared with everyone whereas the secret key is kept private with the person who signs. The signature is an algorithm which accepts a 'message' and signs it with the secret key. Anyone can verify the signature using a generic 'validate' algorithm which accepts the signature, message and public key.
sign = signature(message, secret key)
bool v = validate(sign, message, public key)
validate(signature(message, secret key), message, public key) == true
It is also possible to sign the hash of the message. Since hash output is fixed versus variable length message, this reduces the length of input to be provided to the signature algorithm.
Signing hash pointers is equivalent to signing the structure pointed to by the pointer. Thus signing the head of a block chain is equivalent to signing the entire block chain.
ECDSA
Bitcoin uses Elliptic Curve Digital Signature Algorithm as its digital signature scheme. Key sizes -
Private Key 256 bits
Public Key Uncompressed 512 bits
Public Key Compressed 257 bits
Message to be signed 256 bits
Signature 512 bits
The message size of 256 bits is not a limitation, as the hash value of the message is signed.
Bitcoin Addresses
The process of generating a pair of public and secret keys is well known and anyone can generate the keys for themselves. The public key or a hash of the public key serves as an identify of the user. This identity is known as address in Bitcoin system. One person can generate multiple pairs of keys i.e. identities for themselves. The intent is to have a decentralized system for creating and managing identities, thus providing anonymity to the people.