Introduction to The Graph — Decentralizing Data

Jun 10, 2021

This is the introduction to my report on The Graph Protocol, which was published by Messari. A Twitter thread version is also available if you prefer that format.

Solving The Problem of Accessing On-Chain Data

One advantage of open and decentralized protocols is that they are public: everyone can see what’s happening on the networks. However, accessing and using this public data can be more complex than it appears. Today we have two options: going through all the history of the blockchain by yourself (e.g. running a full node), or querying a block explorer like Etherscan.

Option one is resource-intensive: it takes time, you need to store a copy of the entire blockchain, and you have to stay connected at all times. Option two relies on a centralized third party, so it isn’t trustless.
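To give a feel for what option one involves, here is a minimal sketch of how you might walk a contract's history through a node's JSON-RPC interface. It only builds the `eth_getLogs` request payloads, one per block range; the function name, the chunk size of 2,000 blocks, and the placeholder address are my own assumptions, and actually sending the requests would require a node you run or trust.

```python
from typing import Dict, Iterator


def get_logs_requests(
    address: str, start_block: int, end_block: int, step: int = 2_000
) -> Iterator[Dict]:
    """Yield one eth_getLogs JSON-RPC payload per block range.

    The chunk size `step` is an illustrative assumption; nodes usually
    cap how many blocks or logs a single call may cover.
    """
    request_id = 0
    for from_block in range(start_block, end_block + 1, step):
        to_block = min(from_block + step - 1, end_block)
        request_id += 1
        yield {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "eth_getLogs",
            "params": [
                {
                    "address": address,
                    # JSON-RPC encodes block numbers as hex strings.
                    "fromBlock": hex(from_block),
                    "toBlock": hex(to_block),
                }
            ],
        }


if __name__ == "__main__":
    # Placeholder address, not a real contract.
    placeholder_address = "0x" + "00" * 20
    batches = list(get_logs_requests(placeholder_address, 0, 9_999))
    print(f"{len(batches)} requests needed to cover 10,000 blocks")
```

Every application that wants this data has to repeat the same scan, then decode and store the results itself, which is the overhead the rest of this article is about.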

Both options also work differently for each blockchain and protocol, which makes it much harder to get complete data in the cross-blockchain environment we see emerging today.

The Graph aims to solve that through a decentralized query protocol. If you want to dig deeper into the why, you can read their article comparing traditional APIs and SQL to The Graph Protocol and GraphQL.

A Gentle Introduction to The Graph

The Graph is a decentralized indexing protocol for Web3. It lets you query blockchain data without running your own connection to a blockchain or relying on a centralized third party. In simpler terms, it’s a decentralized API protocol for blockchains and their decentralized applications.

A Short Example

Imagine a decentralized application built on top of Aave that needs access to the protocol’s data. Until now, Aave had to build and maintain a centralized API on its own servers so that others could access and use that data. With The Graph, Aave developers write a subgraph manifest (a description of the data to index), and multiple indexers index Aave’s data, fetching it directly from the Ethereum network, effectively creating a decentralized API for Aave.

The advantages of The Graph are that dapps no longer have to maintain their APIs themselves, protocol data is served in a decentralized way, and developers get a common query structure and language for all protocols on any open blockchain.

How it works

Data indices, called subgraphs, are built from a subgraph manifest — a document describing which data from a specific protocol needs to be indexed and how — so it can later be queried easily by users and applications. Each subgraph can be queried with a standard GraphQL API call. GraphQL is an open-source data query and manipulation language for APIs initially developed by Facebook.

Once these mapping instructions, which describe how blockchain events translate into stored data, are loaded into a Graph Node, the node listens for any changes on the chain and updates its subgraph accordingly.
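The idea of a mapping can be sketched in a few lines. This is a toy illustration in Python, not The Graph's actual API (real subgraph mappings are written in AssemblyScript): the `DepositEvent` type, the `Store` class, and the handler name are all hypothetical, chosen only to show how a stream of events becomes a queryable state.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class DepositEvent:
    """Hypothetical decoded on-chain event (field names are illustrative)."""
    user: str
    amount: int
    block_number: int


@dataclass
class Store:
    """In-memory stand-in for the indexed data a node would persist."""
    balances: Dict[str, int] = field(default_factory=dict)
    last_block: int = 0


def handle_deposit(event: DepositEvent, store: Store) -> None:
    """Mapping step: turn one event into an update of the stored entity."""
    store.balances[event.user] = store.balances.get(event.user, 0) + event.amount
    store.last_block = max(store.last_block, event.block_number)


def index(events: Iterable[DepositEvent], store: Store) -> Store:
    """Apply the handler to events in block order, as an indexer would."""
    for event in sorted(events, key=lambda e: e.block_number):
        handle_deposit(event, store)
    return store


if __name__ == "__main__":
    events = [
        DepositEvent("alice", 5, block_number=2),
        DepositEvent("bob", 3, block_number=1),
        DepositEvent("alice", 7, block_number=3),
    ]
    store = index(events, Store())
    print(store.balances, store.last_block)
```

The point of the design is that the handler is the only protocol-specific part: the node takes care of watching the chain and calling it for each new event.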

Then, each indexed subgraph can be queried like a traditional API through its GraphQL endpoint, and the data is fetched from a decentralized network of indexers. You can find an example of a script I wrote querying data from an Aave subgraph here.
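For readers who want to see the shape of such a call, here is a minimal sketch using only the Python standard library. It assumes the common convention of sending a GraphQL query as a JSON body over HTTP POST; the endpoint URL is a placeholder, and the `reserves` entity with its `id` and `symbol` fields is a hypothetical example, since the real names depend on the schema of the subgraph you query.

```python
import json
import urllib.request
from typing import Any, Dict

# Hypothetical query: entity and field names depend on the subgraph's schema.
RESERVES_QUERY = """
query Reserves($first: Int!) {
  reserves(first: $first) {
    id
    symbol
  }
}
"""


def build_request(
    endpoint: str, query: str, variables: Dict[str, Any]
) -> urllib.request.Request:
    """Wrap a GraphQL query and its variables in an HTTP POST request."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(endpoint: str, query: str, variables: Dict[str, Any]) -> Dict[str, Any]:
    """Send the query and return the `data` part of the GraphQL response."""
    request = build_request(endpoint, query, variables)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    if "errors" in payload:
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    return payload["data"]


if __name__ == "__main__":
    # Placeholder URL: replace it with the endpoint of the subgraph you use.
    endpoint = "https://example.invalid/subgraphs/name/your-subgraph"
    request = build_request(endpoint, RESERVES_QUERY, {"first": 5})
    print(request.get_method(), request.full_url)
```

Calling `run_query(endpoint, RESERVES_QUERY, {"first": 5})` against a real endpoint would return a dictionary keyed by the entities requested in the query.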

If you’re interested in learning more about The Graph, you should read their docs or join their Discord. Feel free to contact me on Twitter if you have any questions!