Content addressing and linked data

Content addressing is a way to find data in a network using its content rather than its location.

To understand this, let's take the example of a library with a lot of books.

When you look for a book in the library, you often ask for it by the title. That's content addressing because you're asking for what it is.

If you were using location addressing to find that book, you would ask for it by where it is: "I want the book that's on the second floor, first stack, third shelf from the bottom, fourth book from the left." If someone moved that book, you would be out of luck!

Now, let's move from our library to the world's biggest library, the internet.

It's the same on the internet and on your computer. Right now, content is found by its location (location addressing), such as:

https://simpleaswater.com/logo.ico
Location Addressing

This location addressing approach has a number of problems:

  • Suppose I live in Turkey and simpleaswater.com is hosting a clone of Wikipedia. Because Wikipedia is banned here, the simpleaswater.com website will receive a lot of traffic. This will slow the website down and could even crash it.
  • Because Wikipedia is banned, the government will try to shut down any clone of Wikipedia in Turkey. Once the government finds the clone, shutting it down (censorship) is easy, as there is only one server hosting it.

If you recall, these are the same problems that we mentioned in the "Why do we need IPFS" section.

To solve these problems, we use content addressing.

The way we do this is by taking the content of the file and hashing it. Try adding an image to IPFS to see the hash it gets. In the IPFS ecosystem, this hash is called a Content Identifier, or CID.

And as we saw in the previous lesson, we get a unique sequence of letters and digits, like the one below for logo.ico.

QmPAwR5un1YPJEF6iB7KvErDmAhiXxwL5J5qjA3Z9ceKqv
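
To see why a hash works as an address, here is a minimal Python sketch. It uses plain SHA-256 from the standard library as a simplified stand-in: a real IPFS CID encodes more than a bare hash, so the values printed here will not look like the CID above. The function name content_address is made up for this illustration.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a simplified content address.

    Assumption: this is NOT how IPFS builds a real CID; it only shows the idea.
    """
    return hashlib.sha256(data).hexdigest()


original = b"hello ipfs"
copy_elsewhere = b"hello ipfs"   # the same bytes stored on another machine
edited = b"hello ipfs!"          # one character added

addr_original = content_address(original)
addr_copy = content_address(copy_elsewhere)
addr_edited = content_address(edited)

print(addr_original == addr_copy)    # same content gives the same address
print(addr_original == addr_edited)  # changed content gives a different address
```

The address depends only on the bytes, not on where they are stored, which is what lets any computer holding those bytes answer a request for them.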

Now you can use this hash to fetch the image from any computer that holds it (provided that computer is running IPFS).

Content Addressing

Take a look for yourself:

http://gateway.simpleaswater.com/ipfs/QmPAwR5un1YPJEF6iB7KvErDmAhiXxwL5J5qjA3Z9ceKqv

So now we are not going to a folder on a specific server (location addressing). Instead, we are asking anyone in the IPFS network who has a file with the hash value QmPAwR5un1YPJEF6iB7KvErDmAhiXxwL5J5qjA3Z9ceKqv to send it to us.
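
The idea of asking "anyone who has the file" can be sketched as a toy model in Python. The Peer class and the fetch function are hypothetical and not part of any IPFS library, and looping over a list of peers is a simplification of how a real network finds who holds the data. SHA-256 again stands in for a real CID.

```python
import hashlib
from typing import Dict, List, Optional


def content_address(data: bytes) -> str:
    """Simplified content address (assumption: plain SHA-256, not a real CID)."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """Hypothetical peer that stores blobs keyed by their content address."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        address = content_address(data)
        self.store[address] = data
        return address

    def get(self, address: str) -> Optional[bytes]:
        return self.store.get(address)


def fetch(address: str, peers: List[Peer]) -> Optional[bytes]:
    """Ask each peer in turn; accept data only if it hashes to the address."""
    for peer in peers:
        data = peer.get(address)
        if data is not None and content_address(data) == address:
            return data
    return None


alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
address = bob.add(b"logo bytes")
carol.store[address] = b"tampered bytes"  # a dishonest copy under the same address

found = fetch(address, [alice, carol, bob])
print(found)
```

Because the address is the hash of the content, the requester can check whatever it receives and skip a peer that returns the wrong bytes, so it does not matter which peer answers.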

So, if the people of Turkey used content addressing to access data on the internet, the above problems would be solved:

  • Everybody who is using the Wikipedia clone can also share its webpages from their own computer. The load on simpleaswater.com will decrease, so you will get the webpages faster.
  • Even if the government finds out, it will be hard for it to shut down the website (censorship resistance), as there is no longer only one server hosting it: many computers hold copies of the pages.
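
Load sharing and censorship resistance can both be illustrated with a small simulation. This is a sketch under stated assumptions: the reader names, the page content, and the round-robin way of spreading requests are all invented for the example, and SHA-256 stands in for a real CID.

```python
import hashlib
from collections import Counter
from itertools import cycle


def content_address(data: bytes) -> str:
    """Simplified content address (assumption: plain SHA-256, not a real CID)."""
    return hashlib.sha256(data).hexdigest()


page = b"<html>an encyclopedia article</html>"
address = content_address(page)

# Location addressing: a single server holds the only copy.
servers = {"simpleaswater.com": {"/wiki/article": page}}

# Content addressing: every reader who fetched the page also keeps a copy.
peers = {
    "reader-1": {address: page},
    "reader-2": {address: page},
    "reader-3": {address: page},
}


def fetch_by_location(host: str, path: str):
    site = servers.get(host)
    return None if site is None else site.get(path)


def fetch_by_content(addr: str):
    for store in peers.values():
        if addr in store:
            return store[addr]
    return None


# Load: spread 9 requests round-robin across the peers that hold the page.
holders = [name for name, store in peers.items() if address in store]
load = Counter(name for name, _ in zip(cycle(holders), range(9)))

# Censorship: take down the single server and one of the peers.
del servers["simpleaswater.com"]
del peers["reader-1"]

by_location = fetch_by_location("simpleaswater.com", "/wiki/article")
by_content = fetch_by_content(address)
print(load, by_location, by_content == page)
```

With one server, nine requests all land on the same machine and one takedown removes the page. With three holders, each serves a third of the requests, and the page survives the loss of any one of them.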

So, in this lesson, we learned how we can create a more resilient, censorship-resistant and faster internet using content addressing.

Playing with CIDs

In the next section, we will study the InterPlanetary Name System (IPNS).