Prediction of Ethereum Blockchain ERC-20 Token Standard Smart Contract Vulnerabilities Using Source Code Metrics: An Ensemble Learning Approach

Nemitari Ajienka, Richard Ikechukwu Otuka
DOI: 10.4018/978-1-6684-3855-8.ch006

Abstract

In this study, firstly, a dataset of 10,476 annotated vulnerable ERC-20 standard token smart contracts (belonging to a set of 33 common smart contract vulnerabilities) was collected from a publicly available repository. Secondly, using the SolMet smart contract metrics measurement tool, the object-oriented software attributes (i.e., metrics) of each smart contract's source code were extracted. Lastly, using the source code metrics and the vulnerability annotations (i.e., labels) as the input to supervised machine learning (classification) algorithms, the accuracy of each individual algorithm was evaluated against the accuracy of an ensemble classifier (namely voting). The model accuracies demonstrate the feasibility of identifying and prioritising smart contracts for further inspection prior to deployment to the blockchain network. The ensemble classifier (accuracy = 0.79) performed better than each classifier used individually.

Introduction

Ethereum can be viewed as a huge transaction-based state machine whose state is updated after every transaction. A Smart Contract (SC) is a program deployed and stored in the Ethereum blockchain by a contract-creation transaction. Given the immutable nature of the blockchain, once an SC has been deployed it cannot be modified or updated. In addition, many identified bugs and vulnerabilities in smart contracts have led to significant financial losses (e.g., $50 million in the case of the DAO hack), which raises serious concerns about smart contract security. As such, there is a need to better maintain smart contract code and ensure its high reliability prior to deployment, especially for smart contracts that are used to create and hold millions of US dollars' worth of digital currencies. A popular standard for such smart contracts on the Ethereum blockchain network is the ERC-20 token standard. The security of these smart contracts is the focus of this chapter, which uses source code quality measurements of the contracts as the input to supervised machine learning models that predict the vulnerability contained in each smart contract's source code.

Based on this premise, this chapter has the following objectives:

  1. Collect, via the Etherscan blockchain explorer API, a sample of smart contract source code that has already been deployed to the public Ethereum blockchain network.

  2. Extract the source code attributes/metrics for each smart contract.

  3. Append the most prevalent identified vulnerability or bug in each smart contract to the software metrics dataset.

  4. Use the source code metrics and the dependent variable (the vulnerability labels) as the input to supervised machine learning algorithms to create models that can predict a smart contract's vulnerability based on its source code attributes.

  5. Compare the accuracy obtained when the machine learning algorithms are used independently with the accuracy obtained when they are combined to form an ensemble classifier.
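The comparison in objectives 4 and 5 can be illustrated with a minimal sketch of hard (majority) voting. Everything in it is an assumption made for illustration: the three metrics, the thresholds, the two-class labels and the toy data are invented, the threshold rules merely stand in for trained classifiers, and this preview does not state whether the chapter's voting classifier uses hard or soft voting.

```python
from collections import Counter

# Hypothetical toy data, invented for illustration only:
# (lines of code, number of functions, nesting depth) per contract.
contracts = [
    (120, 8, 2), (450, 30, 5), (90, 5, 1), (600, 25, 6),
    (300, 12, 2), (350, 22, 5), (150, 24, 2), (520, 28, 3),
]
labels = ["safe", "vulnerable", "safe", "vulnerable",
          "safe", "vulnerable", "safe", "vulnerable"]


def threshold_rule(feature_index, threshold):
    """Build a stand-in classifier: 'vulnerable' if one metric exceeds a threshold."""
    def predict(metrics):
        return "vulnerable" if metrics[feature_index] > threshold else "safe"
    return predict


# Three deliberately weak stand-ins for trained models (assumed thresholds).
classifiers = {
    "loc_rule": threshold_rule(0, 400),
    "functions_rule": threshold_rule(1, 20),
    "nesting_rule": threshold_rule(2, 3),
}


def majority_vote(votes):
    """Hard voting: return the label chosen by the most classifiers."""
    return Counter(votes).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of predictions that match the annotated labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Accuracy of each classifier used on its own.
individual_scores = {
    name: accuracy([clf(c) for c in contracts], labels)
    for name, clf in classifiers.items()
}

# Accuracy of the voting ensemble built from the same classifiers.
ensemble_predictions = [
    majority_vote([clf(c) for clf in classifiers.values()]) for c in contracts
]
ensemble_score = accuracy(ensemble_predictions, labels)

print(individual_scores)
print("voting ensemble:", ensemble_score)
```

The toy data are constructed so that each rule misclassifies a different contract, which is the situation in which voting helps: the two correct votes outweigh the single wrong one. On real metrics the gain depends on how strongly the individual classifiers' errors overlap.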

Background

Blockchain technology is at the core of the decentralised Bitcoin payment system (with Bitcoin as its native cryptocurrency) as well as the Ethereum blockchain platform (with Ether as its native cryptocurrency). Compared to Bitcoin, Ethereum additionally permits the development and deployment of decentralised and distributed applications called smart contracts, which are self-executing pieces of code that run on the Ethereum Virtual Machine (EVM); Solidity is the main programming language for smart contract development. Once deployed via a smart contract creation transaction, these smart contracts cannot be altered or modified, because the immutable nature of blockchain technology does not permit the modification of transactions (Li et al., 2017). This immutability prevents, for example, double spending: a Party C whose initial balance is 10 bitcoins transfers those 10 bitcoins to Party A and then sends the same 10 bitcoins to Party B in another transaction. As such, software with blockchain smart contracts at its core requires an effective software development process with security as a top priority.
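The double-spending scenario can be made concrete with a small sketch. It is a deliberately simplified account-balance model written for illustration: the `Ledger` class and its method names are hypothetical, and real blockchains enforce this rule through network consensus rather than a single in-memory object.

```python
class Ledger:
    """Toy append-only ledger that rejects transfers exceeding the sender's balance."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.history = []  # accepted transactions are only ever appended

    def transfer(self, sender, recipient, amount):
        """Apply the transfer and return True, or return False if it is invalid."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            return False  # invalid amount or insufficient funds
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount
        self.history.append((sender, recipient, amount))
        return True


# Party C starts with 10 coins and tries to spend them twice.
ledger = Ledger({"C": 10})
first = ledger.transfer("C", "A", 10)   # accepted: C holds 10 coins
second = ledger.transfer("C", "B", 10)  # rejected: C's balance is now 0

print(first, second, ledger.balances)
```

Because the first transfer is recorded and cannot be rewritten, the second one is checked against C's updated balance and fails.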

Various use cases for smart contracts, or decentralised applications (as they are called due to the decentralised nature of the blockchain network), have been seen over the years, including escrow smart contracts, decentralised microinsurance and decentralised finance (with lending and repayments). A particularly popular use case is the creation of alt coins (alternate coins): digital currencies built on top of the Ethereum blockchain network as alternatives to the native Ether and created via smart contracts. Some of these have also been used to crowdfund blockchain-based startups in what is popularly referred to as an Initial Coin Offering (ICO). Just like Ether, these currencies can hold value and can be transferred from one party to another. There are various smart contract standards created by the Ethereum community for creating these alt coins (Liu et al., 2021). A popular one is the ERC-20 token standard, which sets out a specific list of functionalities any ERC-20 alt coin must implement, i.e., three optional (token name, symbol, decimals up to 18) and six compulsory (totalSupply, balanceOf, transfer, transferFrom, approve, allowance) functionalities.
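To show how the six compulsory functionalities fit together, the following is a rough Python model of an ERC-20 style token. It is a sketch under stated assumptions, not Solidity and not a reference implementation: the class name `ERC20Model` is hypothetical, the caller is passed explicitly (a Solidity contract would read it from `msg.sender`), failures return False where a contract would typically revert, and the events defined by the standard are omitted. The method names keep the standard's spelling rather than Python naming conventions.

```python
class ERC20Model:
    """Simplified in-memory model of the ERC-20 functions (illustrative only)."""

    def __init__(self, name, symbol, decimals, initial_supply, owner):
        # The three optional attributes.
        self.name, self.symbol, self.decimals = name, symbol, decimals
        self._total_supply = initial_supply
        self._balances = {owner: initial_supply}
        self._allowances = {}  # (owner, spender) -> remaining approved amount

    def totalSupply(self):
        return self._total_supply

    def balanceOf(self, account):
        return self._balances.get(account, 0)

    def allowance(self, owner, spender):
        return self._allowances.get((owner, spender), 0)

    def approve(self, caller, spender, amount):
        """Let `spender` withdraw up to `amount` from the caller's balance."""
        self._allowances[(caller, spender)] = amount
        return True

    def transfer(self, caller, to, amount):
        """Move tokens from the caller's own balance."""
        return self._move(caller, to, amount)

    def transferFrom(self, caller, owner, to, amount):
        """Move tokens out of `owner`'s balance using the caller's allowance."""
        remaining = self.allowance(owner, caller)
        if remaining < amount or not self._move(owner, to, amount):
            return False
        self._allowances[(owner, caller)] = remaining - amount
        return True

    def _move(self, src, dst, amount):
        # Internal helper (hypothetical): checked balance update.
        if amount < 0 or self.balanceOf(src) < amount:
            return False
        self._balances[src] = self.balanceOf(src) - amount
        self._balances[dst] = self.balanceOf(dst) + amount
        return True


token = ERC20Model("Example", "EXM", 18, 1000, owner="alice")
token.transfer("alice", "bob", 100)                # alice pays bob directly
token.approve("alice", "carol", 50)                # alice lets carol spend 50
token.transferFrom("carol", "alice", "dave", 30)   # carol spends 30 of that
print(token.balanceOf("alice"), token.allowance("alice", "carol"))
```

Separating `approve` and `transferFrom` from `transfer` is what lets a third party, such as an exchange contract, move tokens on a holder's behalf within an agreed limit.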

Key Terms in this Chapter

Decentralised Autonomous Organisation (DAO): A new type of self-governing organisation that leverages smart contracts on the Ethereum blockchain. In return for their early support, participants receive DAO tokens that allow them to vote on important decisions.

Cryptocurrency: A digital currency whose security is guaranteed through the use of cryptography.

ERC-20 Token Standard: A technical standard for implementing tokens in smart contracts on the Ethereum blockchain; it provides a list of rules that conforming Ethereum-based tokens must follow.

Decentralised: No central authority.

Public Blockchain Network: A blockchain that grants read access and ability to create transactions to all users.

Smart Contract: A contractual agreement built on computer protocols, whose terms can be executed automatically.

Alt Coin: A cryptocurrency other than bitcoin or ether built on top of the Ethereum blockchain network via smart contract source code.

Ethereum: A blockchain that runs smart contracts using its own cryptocurrency, the Ether.

Ether: The cryptocurrency used on the Ethereum blockchain.
