Download and Installation of IPFS Cluster
In order to run an IPFS Cluster peer and perform actions on the Cluster, you will need to obtain the `ipfs-cluster-service` and `ipfs-cluster-ctl` binaries. The former runs the Cluster peer; the latter allows you to interact with it:
- Visit the download page for instructions on the different ways to obtain `ipfs-cluster-service` and `ipfs-cluster-ctl`.
- Place the binaries somewhere they can be run unattended by an `ipfs` system user (usually `/usr/local/bin`). IPFS Cluster should be installed and run alongside `ipfs` (go-ipfs).
- Consider configuring your systems to start `ipfs` and `ipfs-cluster-service` automatically (but make sure your cluster is fully operational and that peers discover each other before doing so). Some sample systemd service files are available here: ipfs-cluster-service, ipfs. A minimal sketch of enabling them follows this list.
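For example, assuming the sample unit files above were installed as `ipfs.service` and `ipfs-cluster.service` (the exact names depend on the files you use, so treat this only as a sketch), they could be enabled with:
$ sudo systemctl enable ipfs
$ sudo systemctl enable ipfs-cluster
$ sudo systemctl start ipfs ipfs-cluster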
Initialization
To generate a default configuration file, a unique identity for your peer and an empty peerstore file, run:
$ ipfs-cluster-service init --consensus <crdt/raft>
This assumes that the `ipfs-cluster-service` command is installed in one of the folders in your `$PATH`.
If all went well, after running this command there will be three different files in `$HOME/.ipfs-cluster`:
- `service.json` contains a default peer configuration. Usually, all peers in a Cluster should have exactly the same configuration.
- `identity.json` contains the peer private key and ID. These are unique to each Cluster peer.
- `peerstore` is an empty file used to store the addresses of other peers so that this peer knows where to contact them.
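After a successful `init`, listing the folder should therefore show something like:
$ ls $HOME/.ipfs-cluster
identity.json  peerstore  service.json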
The `--consensus` flag chooses whether to initialize the configuration with a `raft` or a `crdt` section. All peers should be initialized in the same way. The choice between `raft` and `crdt` depends on multiple factors and affects how the cluster is started and how the peerset is modified. We have gathered more in-depth explanations in the Consensus Components section.
The new `service.json` file generated by `ipfs-cluster-service init` will have a randomly generated `secret` value in the `cluster` section. For a Cluster to work, this value must be the same on all cluster peers. This is a common source of pitfalls, since initializing default configurations everywhere results in different random secrets.
If present, the `CLUSTER_SECRET` environment variable is used when running `ipfs-cluster-service init` to set the cluster `secret` value.
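For example, one way to generate a 32-byte hex secret and use it during initialization (the generation command is just one possibility) is:
$ export CLUSTER_SECRET=$(od -vN 32 -An -tx1 /dev/urandom | tr -d ' \n')
$ ipfs-cluster-service init --consensus crdt
The same exported value must then be used when initializing every other peer.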
Remote configuration
`ipfs-cluster-service` can be initialized with a remote configuration file: an HTTP(S) location which is read to obtain the running configuration every time the peer is launched. This is useful to initialize all peers with the same configuration and to provide seamless upgrades to it.
A good trick is to use IPFS to store the actual configuration and, for example, call `init` with a gateway URL as follows:
$ ipfs-cluster-service init http://localhost:8080/ipns/config.mydomain.com
(A DNSLink TXT record needs to be configured for the example above to work. A regular URL can be used too.)
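For instance, without DNSLink, you could add the configuration file to IPFS and point `init` at the resulting CID through the local gateway (`<cid>` is a placeholder for the hash printed by `ipfs add`):
$ ipfs add -q service.json
$ ipfs-cluster-service init http://localhost:8080/ipfs/<cid>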
Do not host configurations publicly unless it is OK to expose the Cluster secret. This is only OK in crdt-based clusters which have configured `trusted_peers` to something other than `*`.
Trusted peers
The `crdt` section of the `service.json` file includes a single `*` value in the `trusted_peers` array. By default, peers running in crdt mode trust all other peers. In `raft` mode, all peers trust all other peers and this option does not exist.
Read more about trusted peers in the Security and Ports guide.
The peerstore file
The `peerstore` file will be maintained by the running Cluster peer and will be used to store known-peer addresses. However, you can also pre-fill this file (one multiaddress per line) to help this peer connect to others during its first start. Here is an example:
/dns4/cluster1.domain/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt
/dns4/cluster2.domain/tcp/9096/ipfs/QmdFBMf9HMDH3eCWrc1U11YCPenC3Uvy9mZQ2BedTyKTDf
/ip4/192.168.1.10/tcp/9096/ipfs/QmSGCzHkz8gC9fNndMtaCZdf9RFtwtbTEEsGo4zkVfcykD
Ports
By default, Cluster uses:
- `9096/tcp` as the cluster swarm endpoint, which should be open and diallable by other cluster peers.
- `9094/tcp` as the HTTP API endpoint.
- `9095/tcp` as the Proxy API endpoint.
A full description of the ports and endpoints is available in the Security guide.
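As a purely illustrative sketch, on a host using `ufw` you might open the swarm port widely while restricting the API ports to a trusted subnet (the `192.168.1.0/24` range is an assumption):
$ sudo ufw allow 9096/tcp
$ sudo ufw allow from 192.168.1.0/24 to any port 9094 proto tcp
$ sudo ufw allow from 192.168.1.0/24 to any port 9095 proto tcp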
Settings for production
The default IPFS and Cluster settings are conservative and work for most setups out of the box. There are, however, a number of options that can be optimized with regards to:
- Large pinsets
- Large numbers of peers
- Networks with very high or very low latencies
In addition to the settings mentioned here, the configuration reference contains detailed information for every configuration section, with extended descriptions of what each value means.
IPFS Configuration
IPFS daemons can be optimized for production. The options are documented in the official repository:
Server profile for cloud deployments
Initialize ipfs using the `server` profile: `ipfs init --profile=server`, or `ipfs config profile apply server` if the configuration already exists.
Pay attention to the `AddrFilters` and `NoAnnounce` options. They should be pre-filled to sensible values by the `server` configuration profile, but depending on the type of network you are running on, you may want to modify them.
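For example, to apply the profile to an existing repository and then review the filter values it set (the grep is only a convenience to locate the relevant entries in the output):
$ ipfs config profile apply server
$ ipfs config show | grep -A 8 -E 'AddrFilters|NoAnnounce'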
Datastore settings
For very large repos, consider enabling the Badger datastore. You can convert between datastores using `ipfs-ds-convert` (instructions). Badger should be significantly faster for very large pinsets, at the expense of memory.
Increase `Datastore.BloomFilterSize` according to your repo size (in bytes): `1048576` (1MB) is a good value (more info here).
Do not forget to set `Datastore.StorageMax` according to the amount of disk you want to dedicate to the ipfs repo. This affects how Cluster calculates how much free space there is on every peer.
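A sketch of setting both values (the figures are examples to adjust to your disk and repo size):
$ ipfs config --json Datastore.BloomFilterSize 1048576
$ ipfs config Datastore.StorageMax 100GB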
Connection manager settings
Increase `Swarm.ConnMgr.HighWater` (maximum number of connections) and reduce `GracePeriod` to `20s`. The HighWater value can be as high as your machine can take (`10000` is a good value for large machines). Adjust `Swarm.ConnMgr.LowWater` to about 25% of the HighWater value.
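For example (the figures are only illustrative):
$ ipfs config --json Swarm.ConnMgr.HighWater 10000
$ ipfs config --json Swarm.ConnMgr.LowWater 2500
$ ipfs config Swarm.ConnMgr.GracePeriod 20s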
File descriptor limit
The `IPFS_FD_MAX` environment variable controls the file descriptor (`ulimit`) value that `go-ipfs` sets for itself. Depending on your HighWater value, you may want to increase it to `8192` or more.
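For example, when launching the daemon manually (with systemd you would set the variable in the unit file instead):
$ export IPFS_FD_MAX=8192
$ ipfs daemon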
IPFS Cluster configuration
The `service.json` configuration file contains a few options which should be tweaked according to your environment, capacity and requirements.
`cluster` section
When dealing with a large amount of pins, you may want to further increase `cluster.state_sync_interval` and `cluster.ipfs_sync_interval`. These operations perform checks for every pin in the pinset and trigger `ipfs pin ls --type=recursive` calls, which may be slow when the number of pinned items is huge.
Consider increasing `cluster.monitor_ping_interval` and `monitor.*.check_interval`. These dictate how long the cluster takes to realize that a peer is not responding (which may then trigger re-pins). Re-pinning might be a very expensive operation in your cluster, so you may want to set these values rather high (several minutes). You can use the same value for both.
Under the same consideration, you may want to set `cluster.disable_repinning` to `true` if you do not wish re-pins to be triggered at all on peer downtime and prefer to handle things manually when content becomes underpinned. `replication_factor_max` and `replication_factor_min` allow some leeway: i.e. a 2/3 min/max setting will allow one peer to be down without re-allocating the content assigned to it somewhere else.
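As an illustration only (using `jq`, a generic JSON tool that is not part of Cluster; the values are examples), these options can be edited in `service.json` while the peer is stopped:
$ cd $HOME/.ipfs-cluster
$ jq '.cluster.monitor_ping_interval = "5m" | .cluster.disable_repinning = true' service.json > service.json.new
$ mv service.json.new service.json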
`raft` section
These options only apply when running raft-based clusters.
If you are planning to restart all Raft peers at the same time (for example, after an upgrade), consider setting `raft.wait_for_leader_timeout` to something that gives ample time for all your peers to be restarted and come online at once, usually `30s` or `1m`.
If your network is very unstable, you can try increasing `raft.commit_retries` and `raft.commit_retry_delay`. Note that more retries and higher delays imply slower failures.
For high-latency clusters (like having peers around the world), you can try increasing `heartbeat_timeout`, `election_timeout`, `commit_timeout` and `leader_lease_timeout`, although the defaults are already quite generous. For low-latency clusters, these can all be decreased (at least by half).
For very large pinsets, increase `raft.snapshot_interval`. If your cluster pins or unpins very frequently, increase `raft.snapshot_threshold`.
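Following the same `jq` pattern as above, and assuming the Raft options live under the `consensus.raft` path of `service.json` (check the configuration reference for your version), the leader timeout could be raised with:
$ jq '.consensus.raft.wait_for_leader_timeout = "1m"' service.json > service.json.new
$ mv service.json.new service.json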
`crdt` section
These options only apply when running crdt-based clusters.
Reducing `crdt.rebroadcast_interval` (default `1m`) to a few seconds should make new peers start downloading the state faster, and badly connected peers will have more opportunities to receive bits of information, at the expense of increased pubsub chatter in the network.
You can edit `crdt.cluster_name`, as long as it is the same for all peers.
`restapi` section
Adjust the `api.restapi` network timeouts depending on your API usage. This may protect against misuse of the API or DDoS attacks. Note that there are usually client-side timeouts that can be modified too if you control the clients.
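As a sketch following the same `jq` approach, assuming timeout keys such as `read_timeout` and `write_timeout` as listed in the configuration reference (double-check against your version; the values are examples):
$ jq '.api.restapi.read_timeout = "30s" | .api.restapi.write_timeout = "1m"' service.json > service.json.new
$ mv service.json.new service.json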
The API can be disabled by removing the configuration section.
`ipfshttp` section
Adjust the `ipfs_connector.ipfshttp` network timeouts if you are using the ipfs proxy in the same fashion as the `restapi`.
The Proxy API can be disabled by removing the configuration section.