In the previous lessons, we learned how to identify (address) data in distributed systems such as IPFS using the content addressing approach.

In this section, we will learn how to link data in distributed systems.

But first, let's see why we need to link data at all.

Take the webpage you are reading as an example. If you remove all the links (URLs) from it, you end up with ONLY plain, boring text. There will be no images, videos, or hyperlinks. If you want to see an image or go to another webpage, you will have to type the address (URL) of the image or the webpage yourself. Sounds exhausting, right?

This doesn't sound like the web that we want. Thus, we need links (URLs, in the case of location addressing) to connect different pieces of data together.

Similarly, the distributed web also needs links. We create them with the help of Directed Acyclic Graphs, or DAGs. These structures allow us to do for data what URLs and links did for HTML web pages.

Now, before diving deeper into data structures like Merkle trees and DAGs, let's see what a data structure is.

The Basics: Data Structures

Whether you're a programmer or not, you're surrounded by data structures every day. Lists, dictionaries, and catalogs all help us organize information and take into account the relationships between various pieces of data.

From Wikipedia:

In computer science, a data structure is a data organization, management and storage format that enables efficient access and modification. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

Decentralized data structures

On the decentralized web, where we access data directly from our peers rather than from a central authority, we need specialized data structures that allow us to verify and to link between various pieces of content.

Data structures shared through decentralized systems need to be verifiable. On a single system such as your own laptop, you have a much greater degree of trust in the data structures you work with in memory or on disk. But in a decentralized system (where you also get data from other people's laptops), you have less, or possibly zero, trust among peers, since the data shared by other people may have been tampered with or may even contain viruses.
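
To make "verifiable" a bit more concrete, here is a minimal Python sketch. It assumes that the address of a piece of data is simply the hex SHA-256 hash of its bytes. That is a simplification for illustration (IPFS uses richer identifiers called CIDs), and the function names content_address and verify are hypothetical, not part of any IPFS library.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a simplified content address: the hex SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_address: str) -> bool:
    """Check that data received from an untrusted peer matches the address we asked for."""
    return content_address(data) == expected_address


# The address we already know and trust (for example, from a link we followed).
original = b"hello, distributed web"
address = content_address(original)

# A peer that sends the right bytes passes the check.
honest_ok = verify(original, address)

# A peer that sends altered bytes is detected, because the hash no longer matches.
tampered_ok = verify(b"hello, distributed web!", address)

print(honest_ok, tampered_ok)
```

Because the check depends only on the bytes and the address, it does not matter which peer sent the data. We don't need to trust the peer; we only need to trust the address.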

Large data structures must also be able to be spread out among many peers and linked together to allow for decentralization. In the same way that any web page can link to another web page in a different location, decentralized data structures enable a web of interlinked data.
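
As a rough sketch of what "linked together" can mean, the Python example below builds a tiny page out of three blocks, where the page block links to the other two by their hashes. Everything here is an assumption made for illustration: the store dictionary stands in for blocks held by many different peers, the put and get helpers are hypothetical, and the JSON plus SHA-256 encoding is far simpler than the formats IPFS really uses.

```python
import hashlib
import json

# Hypothetical stand-in for blocks that would be spread across many peers.
store = {}


def put(node: dict) -> str:
    """Store a node and return its address (the hex SHA-256 of its encoded form)."""
    encoded = json.dumps(node, sort_keys=True).encode("utf-8")
    address = hashlib.sha256(encoded).hexdigest()
    store[address] = encoded
    return address


def get(address: str) -> dict:
    """Fetch a node by address and verify it before decoding it."""
    encoded = store[address]
    if hashlib.sha256(encoded).hexdigest() != address:
        raise ValueError("block does not match its address")
    return json.loads(encoded)


# Two leaf blocks with no links of their own.
image = put({"data": "pretend image bytes", "links": []})
text = put({"data": "Hello from the distributed web", "links": []})

# A page block that links to the leaves by their addresses.
page = put({"data": "my page", "links": [image, text]})

# Following the links from the page gives us back the linked data.
linked = [get(link)["data"] for link in get(page)["links"]]
print(linked)
```

Note that a block can only link to blocks that already exist, because it needs their hashes. So links always point "backwards" and can never form a loop, which is why the resulting structure is a directed acyclic graph.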

Now that we understand the significance of data structures, let's see how we link data on the centralized web and what the problems with the current approach are.