PageRank and HodgeRank on Ethereum Transactions: A Measure for Social Credit

Mathematical ranking plays a critical role in the era of the internet and big data. Google's PageRank is well known as a trillion-dollar algorithm, and algorithmic ranking frameworks are found in every search engine. In this paper, we investigate how PageRank can be applied in the blockchain space to build reliable and verifiable social credit and reputation systems. The approach is expected to provide a measure of credibility that is complementary and parallel to FICO, which is not applicable to individuals lacking credit information in financial institutions. Moreover, it offers an unbiased method of interpreting and measuring real social interaction and reputation ranking on a blockchain network. The authors envision a future of payment based on cryptocurrencies (especially stablecoins) and digital fiat currencies; the proposed credit scoring framework should therefore be helpful for P2P credit and lending networks, and possibly for decentralized finance (DeFi) applications.

• We present a new approach to interpreting and measuring real social interaction and reputation ranking on a public blockchain network.
• For the first time, ranking algorithms (PageRank and HodgeRank) are investigated and implemented in the blockchain space to build up a reliable and verifiable social credit and reputation system.
• Finally, we design an efficient data pipeline and compute process to analyze public blockchain transaction data for future studies.
The rest of this paper is organized as follows. The conceptual background and related works are described in detail in Section II. Section III presents the methodology and experimental results. Finally, Section IV gives the conclusions and the future study.

CONCEPTUAL BACKGROUND AND RELATED WORKS

Credit Score and Reputation Systems
Credit score is a critical concept and an attractive object of research and application in the financial industry. The first model for scoring personal credit is FICO, introduced in the U.S. in 1989 by Fair Isaac Corporation. It has been the most popular framework for decades and is used by a vast group of American financial institutions (Shweta Arya et al., 2013).
The FICO score gives lenders a hint of "how likely a person is to repay a loan". The higher the FICO score, the better the credibility and hence the lower the risk. We do not describe the FICO model in detail but summarize it as follows (fig. 1).
• Payment history (35%) shows how a person has paid his accounts over the length of his credit. Bankruptcy, judgments, charge-offs, and late payments will lower the FICO score.
• Debt size (30%) refers to the total debt but is measured precisely by several metrics, including the debt-to-limit ratio, the number of accounts with balances, the amount owed across different types of accounts, and the amount paid down on installment loans.
• Length of credit history (15%) evaluates the average age of the accounts on a report and the age of the oldest account.
• Credit mix (10%) considers the history of managing different types of credit, for example installment, revolving, consumer finance, and mortgage.
• New credit (10%) considers credit accounts opened within a short period of time, because these expose greater risk.
FICO (table 1) has drawn many criticisms, although it has been widely used in the U.S. for a long time. The credit score may have discriminatory effects between favored and disfavored groups and broaden the moral and economic gaps between rich and poor people. Technically, the exact formulas used to compute the scores are not a good indicator of risk. In addition, if a person has not used any credit, or has no payment history recorded in the banking systems, his credit score will be nearly zero. Beyond these criticisms, FICO and similar credit scoring systems have several limitations.
• Data breach: Equifax, one of the three biggest credit bureaus in the U.S., suffered a massive data breach affecting nearly 148 million customers. Attackers could exploit the stolen credit card information for fraudulent spending or other malicious purposes.
• Limited access: Credit scoring is not designed for unbanked people at all, nor even for bank account holders with no credit spending or payment history.
• Limited data range: Credit scoring is based on credit reports only, ignoring a lot of valuable information such as personal profiles, properties, or assets owned.
The Chinese social credit system, in place since 2014, takes a much different approach (McWilliams et al., 2020). It intends to standardize the assessment of citizens' economic and social reputation (or social credit) as comprehensively as possible. The system gathers and analyzes many types of behavioral data on individuals, then applies punishments for fraudulent behavior that reach from government services and hospitality to education and banking services (e.g. credit rating, lower or higher loan interest rates). Experts have voiced concern that the Chinese social reputation system may harm citizens' privacy rights. Moreover, its centralized database and servers are, by nature, exposed to cybersecurity issues and are targets for attacks.

Blockchain Application in Credit Industry
Colendi (2019) builds an application for the credit and micro-lending industry. Colendi Core is based on a distributed identity and credit scoring protocol. The firm acquires data from telcos (mobile phone data), social media, and trusted partners, then scores users' credit ranking and offers the Colendi Score on its platform, which connects lenders and borrowers. The algorithm process is depicted in figure 2.

Score range | Rating | Description
670-739 | Good | The score is near or slightly above the average of consumers, and most lenders accept this as a good score.
740-799 | Very Good | The score is above the average, indicating a dependable borrower.

Figure 2. The algorithm process of Colendi credit scoring system
The protocol investigates $W_{ij}$, the "social connectivity strength" between two nodes $i$ and $j$, to define social relation strength and contribution to social credibility. It is a compound of the scalar relation strength between the two nodes, denoted by $W_{ij}^{relation}$, and the dependent bilateral social relation activity, denoted by $W_{ij}^{activity}$, combined through a weighting factor $b$. The Colendi Score of a node $i$ is computed from the following quantities: $d$, the damping factor; $N$, the total number of users (or nodes); $a$, the normalizing coefficient; $S_j^i$, the Colendi Score of the $j$th neighbor of node $i$; and $S_{max}$, the maximum Colendi Score of all nodes.

PageRank
The mathematics of PageRank, which originated in the work of L. Page et al. (1999), is entirely general and applies to any graph or network in any domain. Thus, PageRank is now regularly used in bibliometrics, social and information network analysis, and link prediction and recommendation. It is even used for system analysis of road networks, as well as in biology, chemistry, neuroscience, and physics. PageRank was the first ranking algorithm to use a mathematical method to order search results, in contrast to Yahoo's hierarchical search, and it laid the foundation of Google. Well known as a trillion-dollar algorithm, it has been an extensive and vibrant research area among scientists and technologists for decades, with wide applications in search engines and other fields.
PageRank aims to give universal ranking scores (as nonnegative real values) to the sites in a Web graph. It is based on the idea that the more links refer to a website, the more important it seems to be: among millions of websites, the relative importance of a site is evaluated and ranked by the hyperlinks referring to it. PageRank mathematically models web pages as vertices and hyperlinks as edges of a directed graph. Let $A$ be an adjacency matrix among $N$ pages in a Web graph. Then PageRank is a vector $R$ that assigns real rating values to the pages based on the adjacency. Mathematically, $R = cAR$, i.e. $R$ is an eigenvector associated with the largest eigenvalue $c$. In practice, the PageRank model is computed on a matrix $H$ that combines $A$ with a teleportation matrix $E$ through a damping factor $d$ (usually $d = 0.85$). The damping factor $d$ is the probability that a random surfer moves from the current page to another by following a link, and $E$ ensures that $H$ defines a Markov chain (i.e. is a stochastic matrix), hence the algorithm that estimates $R$ is guaranteed to converge.
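The iteration described above can be illustrated with a minimal power-iteration sketch. It assumes the commonly used formulation in which every node receives a uniform teleportation share $(1 - d)/N$ and nodes without outgoing links spread their score uniformly; the function name `pagerank` and the toy graph are illustrative and are not taken from the paper's code.

```python
def pagerank(edges, d=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a directed graph.

    edges: iterable of (source, target) pairs; duplicates and self-loops are ignored.
    Assumed model: score = (1 - d) / N + d * (score flowing in over links),
    with dangling nodes (no outgoing links) spreading their score uniformly.
    """
    edge_set = {(u, v) for u, v in edges if u != v}
    nodes = sorted({x for e in edge_set for x in e})
    n = len(nodes)
    out_links = {u: [] for u in nodes}
    for u, v in edge_set:
        out_links[u].append(v)

    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes is redistributed to every node.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {u: (1.0 - d) / n + d * dangling / n for u in nodes}
        for u in nodes:
            for v in out_links[u]:
                new_rank[v] += d * rank[u] / len(out_links[u])
        change = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Toy graph: nodes 1, 2 and 3 all send to node 0; node 0 sends to node 1.
scores = pagerank([(1, 0), (2, 0), (3, 0), (0, 1)])
top_node = max(scores, key=scores.get)
```

In this toy graph the node that receives the most links ends up with the highest score, while nodes 2 and 3, which receive nothing, keep only the teleportation share.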

HodgeRank
HodgeRank was introduced by Jiang et al. (2011) as a promising tool for the statistical analysis of ranking, especially for datasets with incomplete and imbalanced information. The method requires data in the form of pairwise comparisons, meaning that each voter rates items in pairs. By adopting a relative measure, pairwise comparisons avoid the arbitrariness of a rating scale and are therefore a natural, unbiased method (Dym et al., 2002). They have been popular in psychology, management science, social choice theory, and statistics. HodgeRank yields an orthogonal decomposition of the edge flows of the pairwise graph into three subspaces: a gradient subspace, a harmonic subspace, and a curl subspace. The combinatorial Hodge decomposition reads

$$C^1(G) = \operatorname{im}(\operatorname{grad}) \oplus \ker(\Delta_1) \oplus \operatorname{im}(\operatorname{curl}^*),$$

$$\ker(\operatorname{curl}) = \operatorname{im}(\operatorname{grad}) \oplus \ker(\Delta_1), \qquad \ker(\operatorname{div}) = \ker(\Delta_1) \oplus \operatorname{im}(\operatorname{curl}^*),$$

where $C^1(G)$ is the space of edge flows on a graph $G$. The subspace $\operatorname{im}(\operatorname{grad})$ consists of the pairwise rankings that are gradient flows of score functions; this gradient subspace comprises the globally consistent, or acyclic, pairwise rankings. The subspace $\ker(\operatorname{div})$ consists of the divergence-free pairwise rankings, whose total in-flow equals total out-flow for each alternative. The subspace $\ker(\operatorname{curl})$ consists of the curl-free pairwise rankings, with zero flow-sum along any triangle, which correspond to locally consistent pairwise rankings. The subspace $\ker(\Delta_1)$ consists of the harmonic pairwise rankings. The subspace $\operatorname{im}(\operatorname{curl}^*)$ consists of the locally cyclic pairwise rankings, which have non-zero curls along triangles.
To apply HodgeRank in the context of the Ethereum transaction network, we describe the transaction graph (fig 3) in terms of pairwise comparisons. Each node on the network is a candidate for ranking. The direction and the amount of ETH of each transaction give the edge direction and the edge weight of the pairwise comparison graph. When performing actual computations, we generally store these graphs as matrices; the graph is used mainly to illustrate the general principles. HodgeRank then projects our data onto the gradient subspace, which contains no intransitive comparisons, resulting in a global ranking of the network. The harmonic and curl subspaces contain the residual components, which capture the inconsistencies and intransitive relations in the data; a large residual indicates a less reliable rating.
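The projection onto the gradient subspace can be sketched as a small least-squares problem. The sketch below is only an illustration under stated assumptions: the net ETH amount sent from $i$ to $j$ is read as a pairwise flow value "j ranks above i by that amount", all edges carry unit weight, and the graph is connected. The function name `hodgerank` and the toy flows are hypothetical and do not come from the paper's code.

```python
def hodgerank(n, flows):
    """Least-squares global scores from pairwise flows (unit edge weights assumed).

    flows: dict {(i, j): y} meaning a net flow y from node i to node j,
    read as "j ranks above i by y". Nodes are integers 0..n-1.
    Returns (scores, residual) with residual[(i, j)] = y - (scores[j] - scores[i]).
    """
    # Normal equations L s = b: L is the graph Laplacian, b the net in-flow.
    L = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for (i, j), y in flows.items():
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        b[j] += y
        b[i] -= y

    # Scores are defined up to a constant: pin node 0 to zero and solve the
    # reduced system by Gaussian elimination (assumes a connected graph).
    m = n - 1
    M = [L[r + 1][1:] + [b[r + 1]] for r in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= f * M[col][c]
    s = [0.0] * n
    for r in range(m - 1, -1, -1):
        acc = M[r][m] - sum(M[r][c] * s[c + 1] for c in range(r + 1, m))
        s[r + 1] = acc / M[r][r]

    mean = sum(s) / n
    s = [v - mean for v in s]  # centre the scores at zero
    residual = {(i, j): y - (s[j] - s[i]) for (i, j), y in flows.items()}
    return s, residual


# Consistent flows: a global score explains them exactly, residual is zero.
consistent_scores, consistent_res = hodgerank(3, {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0})
# Purely cyclic flows: no global ranking exists, everything stays in the residual.
cyclic_scores, cyclic_res = hodgerank(3, {(0, 1): 1.0, (1, 2): 1.0, (2, 0): 1.0})
```

The two toy inputs show the extremes: fully transitive flows are captured entirely by the scores, whereas a pure cycle yields equal scores and a residual equal to the original flows.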

Dataset and Preprocessing
With the hype around blockchain technologies, and particularly the Ethereum blockchain, the number of statements, actions, and transactions in the network is increasing quickly, and many big data challenges arise. However, transactions are raw data and cannot be used directly for further analysis. Fortunately, Google provides the BigQuery Ethereum public datasets, which contain the Ethereum blockchain data (Allen et al., 2018). The datasets cover all Ethereum entities, including blocks, transactions, contract messages, high-value data, derivatives-token transfers, smart contract methods, etc. The BigQuery structure allows users to access the Ethereum blockchain via SQL and find meaningful insights, which can be fed into further analysis using graph databases, visualization tools, and machine learning frameworks (fig 4). The Ethereum datasets in BigQuery contain all historical data of the Ethereum public blockchain network and are updated daily. For the sake of the experiment, we limited the sample dataset to around 34153 addresses, with the transaction value, timestamp, and spent gas fee of each transaction. The dataset was extracted in mid-2021, when DeFi applications were developing rapidly and the addresses and activities on the Ethereum network were diverse and therefore more meaningful for survey and evaluation.
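As an illustration of the SQL access mentioned above, the following sketch builds a query against the public `bigquery-public-data.crypto_ethereum.transactions` table. The date range, the row limit, and the helper name `build_transactions_query` are hypothetical choices for the example; the exact query used for the sample dataset is not given in the paper.

```python
def build_transactions_query(start_date, end_date, limit):
    """Return SQL selecting sender, receiver, value (in wei), gas used and timestamp."""
    return f"""
    SELECT from_address, to_address, value, receipt_gas_used, block_timestamp
    FROM `bigquery-public-data.crypto_ethereum.transactions`
    WHERE DATE(block_timestamp) BETWEEN '{start_date}' AND '{end_date}'
      AND to_address IS NOT NULL
    LIMIT {limit}
    """


# Hypothetical sampling window and size, chosen only for illustration.
sql = build_transactions_query("2021-06-01", "2021-06-30", 50000)

# With the google-cloud-bigquery client installed and credentials configured,
# the query could be run roughly as:
#   from google.cloud import bigquery
#   rows = bigquery.Client().query(sql).result()
```

Filtering on `to_address IS NOT NULL` leaves out contract-creation transactions, which have no receiver and therefore no edge in the transaction graph.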
Inspired by the Colendi protocol, the credit scoring of a node is based on its social connections and activities on the blockchain-based network. The Ethereum transaction graph consists of nodes (addresses) and edges (connections) formed by the transactions among them. Each transaction, for example a node sending ETH or any token to another, adds a corresponding edge to the graph in the direction of the transaction flow. Multiple edges between two nodes in the same direction are treated as a single edge. In addition, all transactions from an address to itself (self-loops) are omitted.
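The two preprocessing rules above (collapsing parallel edges and dropping self-loops) can be sketched in a few lines. The function name `build_graph` and the shortened addresses are illustrative only.

```python
def build_graph(transactions):
    """Collapse raw transactions into a set of directed edges.

    transactions: iterable of (sender, receiver) address pairs.
    Parallel transactions in the same direction become one edge and
    transfers from an address to itself are dropped.
    """
    edges = set()
    for sender, receiver in transactions:
        if sender != receiver:
            edges.add((sender, receiver))
    nodes = {address for edge in edges for address in edge}
    return nodes, edges


txs = [("0xa", "0xb"), ("0xa", "0xb"), ("0xb", "0xa"), ("0xc", "0xc")]
nodes, edges = build_graph(txs)
# "0xc" only ever sends to itself, so it does not appear in the graph.
```

Note that opposite directions are kept as two distinct edges, since only edges with the same direction are merged.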
We applied some classic graph analysis algorithms to get meaningful insights into the datasets. Several indicators are computed and used as a basis for comparison with PageRank scores: the number of received (in) transactions (NoR), the number of transferred (out) transactions (NoT), the total number of in and out transactions (node degree), the gas spent on received transactions, the gas spent on sent transactions, and the total spent gas. Fig 5 shows the top 100 nodes by total number of transactions (node degree). The majority of transactions are concentrated in a small number of major nodes. These nodes are likely to be exchanges or dapps that play an important role in the network and can thus achieve high credit scores. In fact, across the more than 30000 nodes sampled, the majority perform only a few transactions. Fig 6.a depicts the distribution of the dataset according to node degree. We divide the dataset into 4 groups with mean transaction numbers of 193, 15, 4, and 1, containing 100, 200, 700, and 29000 nodes respectively.
In addition, the amount of ETH transferred is collected for the HodgeRank investigation. As mentioned above, addresses on the Ethereum transaction network are treated as vertices of the pairwise comparison graph, and the amount of ETH is the weight of the edges. The global ranking vector of the ETH transaction pairwise graph is found by solving the least squares problem associated with the graph Laplacian (Jiang et al., 2011). However, due to the computational overhead of the batch algorithm, the Hodge dataset is limited to 5000 transactions and 6315 addresses. Fig 6.b depicts the distribution of the dataset according to the amount of ETH transferred. We also split this dataset into 5 groups with average ETH amounts of 95.23, 2.0, 0.25, 0.01, and 4.6e-6, containing 100, 200, 700, 1000, and 4315 nodes respectively.
The distribution shows that a large amount of ETH transferred is concentrated in a small number of addresses, which possibly contributes to the high value and reputation of the network.
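The baseline indicators used for comparison (NoR, NoT, node degree, and spent gas) can be computed in a single pass over the transactions. The sketch below assumes the counts are taken over raw transactions rather than deduplicated edges, which the text does not specify; the function and key names are illustrative.

```python
from collections import defaultdict


def node_indicators(transactions):
    """Per-address baseline indicators from raw transactions.

    transactions: iterable of (sender, receiver, gas_used) tuples.
    Returns {address: {"NoR", "NoT", "degree", "gas_in", "gas_out", "gas_total"}}.
    """
    stats = defaultdict(lambda: {"NoR": 0, "NoT": 0, "gas_in": 0, "gas_out": 0})
    for sender, receiver, gas_used in transactions:
        if sender == receiver:
            continue  # self-loops are omitted, as in the graph construction
        stats[sender]["NoT"] += 1
        stats[sender]["gas_out"] += gas_used
        stats[receiver]["NoR"] += 1
        stats[receiver]["gas_in"] += gas_used
    for s in stats.values():
        s["degree"] = s["NoR"] + s["NoT"]
        s["gas_total"] = s["gas_in"] + s["gas_out"]
    return dict(stats)


indicators = node_indicators([
    ("0xa", "0xb", 21000),
    ("0xc", "0xb", 50000),
    ("0xb", "0xa", 21000),
])
```

Sorting addresses by any of these values gives the indicator ranks that are later compared with the PageRank ranking.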
One key factor that affects PageRank performance is the number of connected clusters in the graph and the number of nodes in these clusters. Nodes in isolated clusters with few nodes should have smaller credit scores. However, these clusters can act as sinks that absorb PageRank score, resulting in poor ranking results. The damping factor d is included to prevent sinks from absorbing the PageRank of the nodes connected to them: with probability 1 - d, the random walk restarts at a random node. Removing the tiny, isolated clusters containing few nodes will not significantly affect the entire transaction network, and selecting an appropriate damping factor can give a better ranking result.

Figure 6. Distribution of (a) group of the dataset according to the number of transactions, and (b) group of the dataset according to the amount of ETH transferred
Fig 7 shows the result of the clustering algorithm, describing the distribution of isolated clusters and of the nodes they contain over the entire dataset. As shown in the figure, isolated clusters with more than 10 nodes account for only 5% of the total number of clusters but contain 83% of the nodes in the entire network. Therefore, conducting experiments on clusters with a large number of nodes makes more sense for credit scoring. The number of nodes in a cluster can be considered an indicator for investigating and evaluating the efficiency of PageRank.
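Finding the isolated clusters and filtering out the small ones can be sketched with a union-find structure. This assumes that an "isolated cluster" means a weakly connected component of the transaction graph (edge direction is ignored when grouping nodes); the function names are illustrative, and the clustering algorithm actually used is not named in the text.

```python
from collections import Counter


def weak_components(edges):
    """Map each node to a representative of its weakly connected cluster."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    return {node: find(node) for node in list(parent)}


def filter_small_clusters(edges, min_size):
    """Keep only the edges whose cluster has at least min_size nodes."""
    label = weak_components(edges)
    size = Counter(label.values())
    return [(u, v) for u, v in edges if size[label[u]] >= min_size]


toy_edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
kept = filter_small_clusters(toy_edges, min_size=3)
```

In the toy example the two-node cluster {4, 5} is removed while the three-node cluster is kept; a threshold such as 10 nodes would be applied in the same way.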

Results
We applied PageRank to the preprocessed Ethereum transaction dataset to examine the ranking performance. The PageRank results are compared with the basic indicators mentioned above, such as NoT, NoR, node degree, and spent gas. We also examine the damping factor and the isolated clusters to find the parameter values that give the best ranking performance. Finally, based on the PageRank scores, the high-ranking nodes with high credit scores are identified to check the validity of the ranking experiment.

Kendall's Tau Correlation
Kendall's tau correlation coefficient, developed by Maurice Kendall, is a nonparametric measure of the strength and direction of the association between two variables measured on at least an ordinal scale (Abdi, 2007). It is usually used as a measure of rank correlation. Intuitively, the Kendall correlation between two variables is high when observations have similar ranks in the two variables and low when they have dissimilar ranks. The PageRank algorithm was run on the chosen Ethereum data sample with 34153 nodes (addresses) and 29742 edges (transactions). The damping factor is initially set to 0.85. Kendall's tau scores between PageRank and the NoT rank, NoR rank, node degree rank, and spent gas ranks are given in Table 2.

Table 2

Ranking vs Indicator | Kendall's tau correlation score
PageRank-NoR rank | 0.8975
PageRank-NoT rank | -0.082
PageRank-Node Degree rank | 0.416
PageRank-receiving spent gas rank | 0.894
PageRank-Total spent gas rank | -0.349

Table 2 shows that the PageRank ranking correlates most strongly with the NoR rank and with the rank of gas spent on received transactions. That is, the more transactions a node receives and the more gas is spent on those transactions, the higher its score.
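For readers unfamiliar with the coefficient, a direct pairwise implementation may help. The sketch below computes the tau-b variant, which adjusts for ties; which variant was used for the reported scores is not stated, so this choice is an assumption, and the function name `kendall_tau_b` is illustrative.

```python
def kendall_tau_b(x, y):
    """Kendall's tau-b between two equally long score lists (O(n^2) pairs).

    Raises ZeroDivisionError if either list is constant.
    """
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both lists: counted in neither total
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = ((untied + ties_x) * (untied + ties_y)) ** 0.5
    return (concordant - discordant) / denom


same_order = kendall_tau_b([1, 2, 3, 4], [10, 20, 30, 40])
reversed_order = kendall_tau_b([1, 2, 3, 4], [4, 3, 2, 1])
one_swap = kendall_tau_b([1, 2, 3], [1, 3, 2])
```

Identical orderings give a score of 1, reversed orderings give -1, and partial agreement falls in between; a quadratic loop is adequate for a sketch but a sort-based method would be preferable for tens of thousands of nodes.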
Kendall's tau correlation results for HodgeRank are shown in Table 3. On the Ethereum transaction network, many transactions transfer zero ETH. This happens because some smart contracts are invoked without transferring any ETH. These transactions lead to zero weights on the edges of the graph. To overcome this problem, we fill the zero values with the mean value. The results of the HodgeRank algorithm on the original dataset and on the modified dataset are called HodgeRank (1) and HodgeRank (2) respectively.
We also compare the ranking results with basic indicators such as the amount of ETH sent and the amount of ETH received. As shown in Table 3, the correlation between HodgeRank and these basic indicators is fairly low, so HodgeRank can be considered a separate indicator for evaluating the importance of a node in the graph. However, because batch HodgeRank is computationally expensive, only a small number of entities are examined. Moreover, the identification of high-ranking nodes with HodgeRank gave poor results, so in this study we focus on optimizing the parameters of the PageRank algorithm.

Damping Factor Investigation
The experiment with d = 0.85 on the entire dataset shows that some nodes achieving high ranking scores have few social connections and belong to small clusters. The large number of tiny, isolated clusters can be considered sinks that absorb ranking score. With a large damping factor, the probability of jumping to a random node is low, so nodes in the sinks achieve high ranking scores. To obtain a better PageRank result, we vary the damping factor from 0.1 to 0.9. In addition, the tiny isolated clusters are eliminated, so that PageRank is performed only on large clusters, which are more meaningful for the network. The PageRank results are compared with the NoR rank through Kendall's tau correlation.
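A short worked calculation may clarify why a large damping factor favors tiny clusters. It assumes the common PageRank recursion in which every node receives the same baseline share $\beta$ from teleportation (for example $\beta = (1 - d)/N$ when no node lacks outgoing edges); this formulation is an assumption, not a formula quoted from the paper. Consider an isolated cluster of two nodes $a$ and $b$ that only transact with each other, and a node $u$ that receives no transactions at all:

$$p_a = \beta + d\,p_b, \qquad p_b = \beta + d\,p_a \;\;\Longrightarrow\;\; p_a = p_b = \frac{\beta}{1 - d}, \qquad p_u = \beta,$$

$$\frac{p_a}{p_u} = \frac{1}{1 - d} \approx 6.67 \ \text{for } d = 0.85, \qquad \frac{p_a}{p_u} = 2 \ \text{for } d = 0.5 .$$

Under these assumptions, each node of the two-node cluster holds almost seven times the baseline score at $d = 0.85$ despite having a single incoming edge, and only twice the baseline at $d = 0.5$, which is consistent with lowering the damping factor to reduce the weight of tiny clusters.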

Table 3. Ranking vs Indicator: Kendall's tau correlation scores for HodgeRank (1) and HodgeRank (2)

Kendall's tau scores between PageRank and the NoR rank were computed according to the damping factor and the filtering of the isolated tiny clusters. The results show that the smaller the damping factor, the greater the correlation score. In fact, with a small damping factor, the high-ranking nodes can also be identified better: nodes with few connections that belong to small clusters are gradually removed from the high-ranking group. In particular, for clusters with more than 100 nodes and d < 0.5, the correlation score becomes more stable, showing little change in the ranking results. Changing the damping factor and filtering out tiny isolated clusters thus helps to achieve better ranking results. Based on this experiment, we choose d = 0.5 and filter out clusters with fewer than 10 nodes when performing PageRank to identify nodes with high credit scores.

High Ranking Nodes Identification
The higher the rank of a node, the higher its credit score. Such nodes can be considered entities that play a crucial role in the network; they can often be identified and trusted to perform transactions such as deposits, loans, and other complex financial services. Nodes with low credit scores will have fewer benefits when using financial systems. After obtaining PageRank results with reasonable parameters, we identify the high-ranking nodes. Table 4 shows the identification results for the top 20 nodes.
The results show that most of the high-ranking nodes are identifiable. They are of different types, such as exchanges, dapps, NFT marketplaces, and protocol contracts. This indicates that the algorithm is effective in calculating credit scores for entities on the network and is able to identify nodes with important and trusted roles.

CONCLUSION AND FUTURE STUDY
This study is modest compared to the broad landscape of the blockchain space and the credit industry. We have approached the problem and given some experimental results on ranking entities (addresses) based on transactions on the Ethereum blockchain. The research has potential applications in social credit and reputation systems in the blockchain and crypto space. The corresponding author previously published an investigation (Thuat Do et al., 2019) of a blockchain ranking framework that applied PageRank and HodgeRank to compute ranking scores based on transactions. Those results were proposed as the basis of a reputation mechanism added on top of Proof of Stake consensus. This paper continues that research but focuses more deeply on the PageRank algorithm and on transactions on Ethereum.
In the future, we shall extend the research to a larger dataset and to multiple blockchains, and examine several mathematical ranking algorithms (e.g. online PageRank, online HodgeRank). As batch HodgeRank is expensive, new directions such as online HodgeRank (Xu et al., 2013) may make it possible to process more transaction data and improve the performance of HodgeRank in future studies.
A public GitHub repository containing the sample dataset and the code is available at (Do, 2022).