SC-402
Applications of Cryptography on the Blockchain
Prof. Manish K. Gupta

Rahi Krishna 201901446 201901446@daiict.ac.in
Prerak Modia 201901198 201901198@daiict.ac.in
Rohit Dubey 201901446 201901180@daiict.ac.in
Krish Vaghela 201901293 201901293@daiict.ac.in
Devesh Lakwal 201901249 201901249@daiict.ac.in

May 15, 2022

Contents
1 Introduction
2 Infrastructure
3 Hash and Block Structure
4 Public Key System and Bitcoin
5 Digital Signature and Currency Trading
6 Blockchain Consensus Mechanism
7 Security Issues in Blockchain
8 Conclusions

Abstract

Blockchain can be viewed as a distributed, decentralized, traceable, tamper-resistant, secure and reliable database that practically guarantees user information security. The development of cryptographic techniques such as encryption and hash functions has accelerated the growth of blockchain. This paper outlines the blockchain infrastructure and the principles of the cryptography behind it, and shows how extensively cryptography is used in the blockchain. Existing security problems are analysed and possible future improvements are touched upon.

1 Introduction

A blockchain is a distributed, decentralized database that is shared among the nodes of a computer network. Blockchains store information electronically in digital format and are best known for the crucial role they play in cryptocurrency systems (such as Ethereum and Bitcoin), where they maintain a secure record of transactions. The innovation of a blockchain is that it guarantees the fidelity and security of a record of data and generates trust without the need for a third party.

One key difference between a typical database and a blockchain is the data structure. A blockchain collects information together in groups, known as blocks, that hold sets of information. Blocks have a certain storage capacity and, when filled, are closed and linked to the previously filled block, forming a chain of data known as the blockchain.
All new information that follows the freshly added block is compiled into a newly formed block, which is in turn added to the chain once it is filled.

Blockchain integrates a P2P (Peer-to-Peer) protocol, digital encryption technology and a consensus mechanism. Such systems are maintained jointly by multiple users, so that information is supervised by multiple parties, which ensures the credibility and integrity of the data. The blockchain stores all user transaction information on the internet, which places high demands on its security. In such systems, nodes do not need to trust each other; they are independent of one another and there is no central node. Transaction information must be sent securely over unsecured channels, which requires strong cryptographic techniques.

We briefly introduce a few cryptographic techniques: hash algorithms, asymmetric encryption algorithms and digital signatures. We also describe the blockchain structure and explain in detail how cryptography protects privacy and maintains transactions in the blockchain.

2 Infrastructure

Blockchain has progressed from a simple combination of technologies, as represented by currencies such as Bitcoin, to a platform that can transfer digital assets securely in the Ethereum era. Typical applications of blockchain include cryptocurrencies and Hyperledger. The blockchain platform can be divided into five distinct layers: the network layer, the consensus layer, the data layer, the contract layer and the application layer.
Platform Level      Bitcoin           Ethereum               Hyperledger
Application Layer   Bitcoin trading   Ethereum trading       Enterprise blockchain
Network Layer       TCP-based P2P     TCP-based P2P          HTTP/2-based P2P
Contract Layer      Script            Solidity/Script EVM    Go/Java Docker
Consensus Layer     PoW               PoW/PoS                PBFT/SBFT
Data Layer          Merkle tree       Merkle-Patricia tree   Merkle-Bucket tree

Data Layer

The data layer mainly uses the 'block' data structure to ensure the integrity of stored data. Each node in the network encapsulates the transactions completed over a certain period of time into a time-stamped data block (an entry in the ledger), and then links the block to the current longest main chain for storage. This involves the use of Merkle trees and hashing algorithms to protect the integrity of the data in the block.

Consensus Layer

The consensus layer mainly comprises a consensus mechanism, which enables each node to confirm the validity of the blockchain in the decentralized system. This layer enforces the network rules that describe what nodes should do to reach consensus about broadcast transactions. Consensus mechanisms include PoW (Proof-of-Work) and PoS (Proof-of-Stake), which regulate the creation of blocks and the state of the blockchain and delegate control of the network to its participants. PBFT (Practical Byzantine Fault Tolerance) and SBFT (Simplified Byzantine Fault Tolerance) are used to relieve the performance bottlenecks in the transmission and storage of blocks.

Network Layer

The network layer includes various data transmission protocols and verification mechanisms. The blockchain is a typical P2P network. All nodes are connected in a flat topology with no central node. Any two nodes can trade freely, and any node can join or leave the network at any time. The P2P protocol in the blockchain is mainly used for transmitting information between nodes.
Application Layer

Smart contracts, chain code and decentralized applications make up the application layer. The application layer protocols are further subdivided into the application and the execution layers. This layer comprises the programs that end users employ to communicate with the blockchain network. Scripts, application programming interfaces (APIs), user interfaces and frameworks are all part of it. This includes cryptocurrency systems and Hyperledger.

Contract Layer

The contract layer works with the contract itself. Since a poorly defined or poorly executed contract layer has financial repercussions, great care must be taken to ensure that the contract is issued and executed correctly and is free of potential weaknesses. The software also needs to be verifiable, secure and reliable.

3 Hash and Block Structure

A hash algorithm is a function that converts a message of arbitrary length into a shorter, fixed-length value. The hashes used in blockchain are one-way: the input cannot feasibly be recovered from the hash value. Hash functions ensure data integrity, because it is easy to verify whether data has been tampered with. Even in an unsafe (public) environment, the integrity of the data can be checked against its hash value: when the data changes, so does its hash value. The properties of cryptographic hash functions underpin the blockchain consensus mechanism. A cryptographic hash acts as a digest and digital fingerprint for a given quantity of data.

SHA (Secure Hash Algorithm) is a family of cryptographic hash functions with the general characteristics described above. The SHA256 algorithm is part of the SHA-2 family and generates a 256-bit message digest. Its calculation involves two main phases: message preprocessing and the main loop.
In the message preprocessing stage, padding bits and the message length are appended to the message, irrespective of its size. The padded message is then divided into 512-bit message blocks. In the main loop phase, each message block is processed by a compression function. The input of the current compression function includes the output of the previous compression function, and the output of the last compression function is the hash value of the original message.

As with the SHA series, the first and most important stage of RIPEMD is message padding, and the padding method is similar to that of the SHA series algorithms. The core of RIPEMD is the compression function, which is organized in rounds of 16 steps each. Each round uses a different primitive logic function, and the processing is split into two parallel lines, with the five primitive logic functions applied in reverse order in the second line. After all 512-bit blocks have been processed, the resulting 160-bit output is the hash value of the original message.

Hash functions can be used to verify the integrity of blocks and transactions. In a blockchain, the header of each block contains the hash value of the previous block. Any user can compare the calculated hash value with the stored hash value, which verifies the integrity of the previous block. Similarly, hash functions are used when deriving addresses from public keys.

A data structure called a hash pointer contains a reference to some data together with a cryptographic hash of that data. A hash pointer is used to verify that the information has not been tampered with. As shown in Figure 1, the blockchain is a list linked by hash pointers, each block being connected to its predecessor through a hash value. The hash value makes it possible to verify whether the data contained in a block has changed, thereby ensuring the integrity of the block information.
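The hash-pointer idea can be illustrated with a minimal sketch. The block layout (`prev_hash`, `payload`), the JSON serialization and the helper names below are assumptions chosen for illustration; they are not the format used by Bitcoin or any other real blockchain.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def block_hash(block: dict) -> str:
    # Serialize with sorted keys so the same block always hashes identically.
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return sha256_hex(encoded)


def build_chain(payloads: list) -> list:
    """Link blocks so that each one stores the hash of its predecessor."""
    chain = []
    prev = "0" * 64  # placeholder pointer for the first block (assumption)
    for payload in payloads:
        block = {"prev_hash": prev, "payload": payload}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain: list):
    """Return the index of the first block whose stored pointer does not
    match the recomputed hash of its predecessor, or None if all match."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["A pays B 5", "B pays C 2", "C pays D 1"])
print(first_broken_link(chain))  # None: every pointer matches

chain[0]["payload"] = "A pays B 500"  # tamper with the first block
print(first_broken_link(chain))  # 1: block 1 no longer points at block 0
```

Note that the check only covers blocks that have a successor; the most recent block is protected only once another block is linked to it.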
Figure 1: Block Chain Structure

The blocks contain all the data of the whole network. Each block is composed of a header containing metadata and a body containing the transaction data. The header encapsulates the hash of the previous block, the nonce (the random number that solves the current block), the Merkle root, the timestamp and the difficulty target of the current block. The block body contains the list of transactions.

Previous Hash

The block hash is a key element of the blockchain. The previous-hash field holds the hash of the previous block's data, so that all blocks are connected sequentially and form a main chain from the first block to the current block. Using the previous-block hash, each block also makes it possible to verify the data contained in the block before it.

Nonce

The header of each block contains a random number (nonce) whose initial value is 0. A node running a Bitcoin miner repeatedly performs a SHA256 operation on the overall data of the block. When the SHA256 value calculated with the current nonce does not meet the requirement, the nonce is increased by one and the SHA256 operation is repeated. Once the SHA256 value is less than the target value of the current block, a new block is generated and the P2P network accepts it. The process of generating a new block is therefore a process of calculating SHA256 values and comparing them with the target value. This process of Bitcoin block generation is called Proof-of-Work.

Timestamp

The time at which the block data was written is stored in the timestamp of the block header. Blocks on the main chain are ordered chronologically. The timestamp can serve as proof of the existence of the block data, helping to create a database that is tamper-proof and unforgeable.
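The header fields and the nonce search described above can be sketched as follows. This is an illustrative toy, not Bitcoin's actual header serialization: the field layout, the helper names (`double_sha256`, `merkle_root`, `mine`), the fixed timestamp and the very easy target are all assumptions. The sketch uses the double SHA256 that Bitcoin applies, and pairs transaction hashes upward to obtain a Merkle root, duplicating the last hash when a level has an odd count.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block and transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list) -> bytes:
    """Hash transactions pairwise, level by level, until one root remains."""
    if not transactions:
        return double_sha256(b"")  # convention for an empty list (assumption)
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def mine(prev_hash: bytes, root: bytes, timestamp: int, target: int,
         max_nonce: int = 2**32):
    """Try nonces 0, 1, 2, ... until the header hash is <= target.

    Returns (nonce, digest), or None if no nonce below max_nonce works.
    """
    for nonce in range(max_nonce):
        # Toy header: previous hash, Merkle root, timestamp, nonce.
        header = prev_hash + root + struct.pack("<II", timestamp, nonce)
        digest = double_sha256(header)
        if int.from_bytes(digest, "big") <= target:
            return nonce, digest
    return None


transactions = [b"A pays B 5", b"B pays C 2", b"C pays D 1"]
root = merkle_root(transactions)
easy_target = 2**244  # about 12 leading zero bits; far easier than Bitcoin
result = mine(b"\x00" * 32, root, 1652572800, easy_target)
if result is not None:
    nonce, digest = result
    print("nonce:", nonce)
    print("hash :", digest.hex())
```

Lowering `easy_target` shrinks the set of acceptable hashes, so the expected number of attempts grows in proportion, which is the sense in which the target controls difficulty.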
Target

The target is used to match the difficulty to the computing power of the entire network, so that approximately one block is generated every 10 minutes. The target is automatically re-evaluated based on the results of the past 2 weeks. A block is valid only if the SHA256 value of its header falls within the range set by the target; the target is raised or lowered to make blocks easier or harder to find.

Merkle Root

The Merkle tree was first introduced to verify the integrity of large-scale data. The Merkle tree of a block typically covers the transaction data of the block, the root hash stored in the block header, and all branches leading from the underlying block data to the root hash. The tree is built by grouping the data of the block, hashing each group, and inserting the newly generated hashes into the tree. This is repeated until a single root hash is left, which is recorded as the Merkle root in the block header. Bitcoin uses a double SHA256 hash function: two SHA256 operations are applied in succession to original data of arbitrary length, and the resulting 256-bit value is used for uniform storage and identification.

Figure 2: Merkle Tree

Transaction List

The transaction list contains the details of each transaction record, including the transaction number, the time of each transaction, the bitcoin amount, the payer and other information. The sending and receiving of each bitcoin are recorded together, so each bitcoin can be traced back.

4 Public Key System and Bitcoin

Cryptography in blockchain technology mainly involves two types of encryption: symmetric and asymmetric. Symmetric encryption uses a single key that must be shared among all the people who need to read the message. Asymmetric encryption, also called public key encryption, uses a pair consisting of a public key and a private key to encrypt and decrypt messages. Asymmetric encryption is prevalent today, even though it is comparatively slower and offers less strength per key bit than symmetric encryption.
In a blockchain, each user generates a private key with a built-in algorithm; data signed with that private key can be checked only with the matching public key, which is available to the users who need to verify it.

Elliptic Curve Cryptography is a common public key algorithm. Its security depends on the difficulty of the elliptic curve discrete logarithm problem. The elliptic curve used for public key cryptography in the Bitcoin blockchain is SECP256K1, which is defined over a finite field. Due to its special construction, an optimized implementation can achieve a 30 percent performance improvement over other curves.

Mechanisms such as SHA256, SECP256R1, SECP256K1 and KECCAK-256 (Ethereum) can effectively reduce the possibility of backdoors. Breaking these schemes by brute force would require unimaginable amounts of time and energy.

The key pair (private key, public key) is generated by the public key algorithm. In the payment step of a Bitcoin transaction, the recipient's address, called the Bitcoin address, is derived from a public key and identifies the payee. As shown in Figure 3, the private key is a number, usually randomly selected, and the public key is derived from the private key by elliptic curve multiplication. A one-way cryptographic hash function is then used to generate the Bitcoin address from the public key.

Figure 3: Bitcoin Address Generation

Elliptic Curve Cryptography (ECC) is a key-based technique for encrypting data. ECC relies on pairs of public and private keys for the encryption and decryption of web traffic. ECC is frequently discussed alongside the Rivest–Shamir–Adleman (RSA) cryptographic algorithm. RSA protects things like emails, data and software using a one-way function based on the difficulty of factoring the product of large primes. In contrast to RSA, ECC bases its approach to public key cryptography on the algebraic structure of elliptic curves over finite fields.
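As an illustration of how a public key is derived from a private key by elliptic curve multiplication, here is a minimal pure-Python sketch on SECP256K1. The constants are the published domain parameters of the curve; the function names are hypothetical, and the code is an unoptimized, non-constant-time teaching sketch that should not be used to protect real keys.

```python
import secrets

# Published SECP256K1 domain parameters: field prime, group order, base point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def on_curve(point) -> bool:
    """Check y^2 = x^3 + 7 (mod P); None stands for the point at infinity."""
    if point is None:
        return True
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


def point_add(p1, p2):
    """Add two curve points using the standard chord-and-tangent rule."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its inverse is the point at infinity
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k: int, point=G):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def generate_keypair():
    """Pick a random private key in [1, N-1] and derive the public key."""
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, scalar_mult(private_key)


private_key, public_key = generate_keypair()
print("private key:", hex(private_key))
print("public key x:", hex(public_key[0]))
print("on curve:", on_curve(public_key))  # True
```

Going from `private_key` to `public_key` takes a few hundred point operations, whereas no efficient method is known for the reverse direction; this asymmetry is what the key pair relies on.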
Because of this algebraic structure, ECC creates keys that are mathematically more difficult to crack. For this reason, ECC is considered to be the next-generation implementation of public key cryptography and, at equal key size, more secure than RSA. It also makes sense to adopt ECC to maintain high levels of both performance and security, because ECC is increasingly widely used as websites strive simultaneously for greater security of customer data and better mobile optimization. As more sites use ECC to secure data, understanding elliptic curve cryptography becomes more important.

When a user wishes to generate a public key from their private key, they multiply the Generator Point, a defined point on the SECP256K1 curve, by their private key, which is a large number. Because the elliptic curve discrete logarithm problem is hard, it is infeasible to recover the private key from the public key and the Generator Point.

The elliptic curves used here are given by equations of the following form:

$y^2 = x^3 + ax + b$

For SECP256K1, $a = 0$ and $b = 7$. Because the $y$ term of the equation is squared, the curve is symmetric about the x-axis, and for each valid value of $x$ there are two values of $y$, one of which is odd while the other is even. This allows a public key to be identified by its x-coordinate and the parity of its y-coordinate alone, which saves a significant amount of data on the blockchain.

5 Digital Signature and Currency Trading

A digital signature system consists of a signature algorithm and a verification algorithm. The signature algorithm generates a digital signature on a message under the control of the signature key, which the signer keeps secret. The verification algorithm checks the signature. The verification algorithm and its verification key are public, so anyone who needs to verify the signature can easily do so.

In a cryptocurrency system, whose underlying technology is the blockchain, the owner of the digital currency hashes the content of the previous transaction of the currency together with the address of the next owner.
The data, signed with the owner's private key, is appended to the end of the transaction list and sent to the recipient, who verifies the received information and then verifies the owner of the transaction. Each transaction in the blockchain records the previous owner, the current owner and the next owner of the currency. This makes it possible to trace money back and to avoid double payment, false transactions and other discrepancies.

Figure 4: Signing and Verification of the transaction

In Figure 4, user 2 performs a transaction with user 3. Suppose user 2 pays 50 bitcoins to user 3; the amount and the source of the bitcoins are recorded on the transaction slip. The 50 bitcoins that user 2 holds came from user 1, so to complete the transaction from user 2 to user 3 it is necessary to record the source of the bitcoins, the amount of the payment and the digital signature of user 2.

The signature is produced by the payer, who hashes the data of the previous transaction to obtain its hash value and then encrypts this hash value with their own private key. The encrypted hash serves as the digital signature and is sent to the recipient together with the previous transaction data. After receiving the information, the recipient hashes the received transaction data with the same hash function to obtain one hash digest. The payer's public key is then used to decrypt the attached digital signature to obtain a second hash digest. Validity is established by comparing the two digests: if they are the same, the recipient can confirm that the order is valid.

6 Blockchain Consensus Mechanism

The accounting nodes in the blockchain network confirm transaction information, and the consensus mechanism determines how they keep the data consistent. Proof of Work (PoW) was the mechanism adopted by the early Bitcoin blockchain.
This mechanism relies heavily on the computing power of the nodes to keep the distributed accounting of the Bitcoin network consistent. Each node in the PoW mechanism uses its own computing power to solve a SHA256 puzzle: it must find a suitable random number (nonce) such that the SHA256 hash value of the block header does not exceed the difficulty target set in the block header:

$H(n \| h) \le t$

Here $H$ is the SHA256 hash function; $n$ is the nonce; $h$ is the block header data, mainly including the previous block hash, the Merkle root, etc.; and $t$ is the difficulty target. The smaller the value of $t$, the harder it is to find a suitable $n$. The node that finds it first obtains the accounting rights for the new block.

The PoW consensus process in the blockchain is as follows:

1. Every new transaction is broadcast to all nodes in the blockchain network.
2. Each node gathers all the transactions received since the previous block was generated and calculates the Merkle root of the block header from these transactions in order to create a new block. The nonce of the block header is incremented by 1, starting from 0, until the double SHA256 hash value of the block header is less than or equal to the target value.
3. The calculations are performed simultaneously by all nodes of the network. The node that first finds a correct nonce gains the accounting rights for the new block and the mining reward, and broadcasts the block to the entire network.
4. After receiving the block, the other nodes verify the validity of the transactions and of the nonce encapsulated in the block. If they are correct, the block is added to the local blockchain and the next block is built on top of the current block.

With continuing development in this field and the emergence of new cryptocurrencies, researchers have proposed various mechanisms that reach consensus without being restricted by computing power.
For example, PoS and DPoS, as well as distributed consistency algorithms such as Raft and PBFT, each have their own pros and cons and suit different application scenarios.

7 Security Issues in Blockchain

Any node that joins the blockchain network can obtain a complete copy of the global ledger, which stores the transaction information publicly. By analyzing transaction records, potential attackers threaten both the transaction privacy and the identity privacy of users. For example, the fund balance and transaction details of a specific account, the flow of funds and so on can easily be read from the public transaction records. The identity privacy threat means that an attacker can obtain the identity of a trader by combining analysis of the transaction data with some background knowledge.

CoinJoin, ring signatures, zero-knowledge proofs and similar techniques are used to counter these privacy attacks. Dash uses coin mixing, in which a mixing node consolidates multiple transactions into a single transaction, hiding the relationship between the payment address and the payer's details. However, the effect of coin mixing depends on the number of users participating in the mix, and a high number of transactions makes the system susceptible to threats. The plain mixing mechanism is also easy to crack: an analyst can recover the private information using certain methods, which prevents coin mixing from producing the desired results. Therefore, a cryptographic mechanism is added to ensure the security of mixed currencies. A ring signature hides the identity of the sender: the signature can be verified without knowing who in the group produced it. A zero-knowledge proof demonstrates that a transaction is valid without leaking the transaction itself or any additional information.

Data security and privacy protection in the blockchain are severely challenged. Advanced cryptographic techniques can effectively address such problems, but there are still weak links.
The private key is generated by the random number generator of the computer system, which is pseudorandom: its output has a certain regularity and is therefore at risk of being cracked. No effective attack on the SHA-2 algorithms is currently known, but if one were found, the privacy and security of all data in the blockchain would cease to exist. Future research needs to develop coin-mixing mechanisms protected by cryptography while minimizing their performance requirements. More secure and reliable cryptographic algorithms may be needed to improve and guarantee the security of the blockchain.

8 Conclusions

This study introduces the main applications of cryptography in the blockchain and analyzes a few existing problems. Starting from the blockchain infrastructure, blockchain technology is presented in simplified form and analysed. Cryptographic techniques are introduced to explain the blockchain, and the existing security problems in the blockchain are examined. We observe that digital encryption technology runs through the backbone of the blockchain and is the core technology of the system. This paper emphasizes that research in cryptography plays a decisive role in the development of blockchain, and outlines future research directions for blockchain technology.