Off-Chain Data Storage: Ethereum & IPFS
This blog has moved to https://didil.substack.com/
The Ethereum gas costs issue
Most Decentralized Apps running on the Ethereum Platform need to store/retrieve data, just like conventional or “centralized” apps do using PostgreSQL, MongoDB, Redis, etc. The EVM (Ethereum Virtual Machine) does indeed allow us to save variables/state in permanent storage. Let’s look at this simple Solidity contract:
pragma solidity ^0.4.17;

contract Database {
    bytes x;

    function write(bytes _x) public {
        x = _x;
    }

    function read() public view returns (bytes) {
        return x;
    }
}
I’ve deployed this contract on the Rinkeby test net, generated 1024 random bytes using https://www.random.org, and then stored that 1 kB of data using the write function. The resulting transaction can be seen here: https://rinkeby.etherscan.io/tx/0x6575badcafbc4db521e82904fa14b04bd8e862de1c82f62e064e699d0f90ebe3
The gas used amounted to 754,365. At a gas price of 20 Gwei, that is 0.0150873 Ether. At the time of writing this post (Oct 17, 2017), the Ether price is 328.79 USD/ETH, so storing 1 kB of data would have cost $4.96 on the Ethereum Main Net. That works out to roughly 5 million USD per GB!
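The arithmetic behind these figures can be checked with a few lines of JavaScript. This is only a sketch: the numbers are the ones quoted above, the helper name feeInEth is made up for illustration, and the per-GB figure assumes decimal units (1 GB = 1,000,000 kB).

```javascript
// Figures quoted in the post for the 1 kB write transaction.
const gasUsed = 754365;    // gas consumed by the transaction
const gasPriceGwei = 20;   // gas price chosen for the transaction
const usdPerEth = 328.79;  // ETH price on Oct 17, 2017

const GWEI_PER_ETH = 1e9;

// Hypothetical helper: transaction fee in Ether = gas used * gas price.
function feeInEth(gas, priceGwei) {
  return (gas * priceGwei) / GWEI_PER_ETH;
}

const costEth = feeInEth(gasUsed, gasPriceGwei);
const costUsdPerKb = costEth * usdPerEth;

// Assumption: decimal units, so 1 GB = 1,000,000 kB.
const costUsdPerGb = costUsdPerKb * 1e6;

console.log(costEth.toFixed(7), "ETH");                              // 0.0150873 ETH
console.log(costUsdPerKb.toFixed(2), "USD per kB");                  // 4.96 USD per kB
console.log((costUsdPerGb / 1e6).toFixed(2), "million USD per GB");  // 4.96 million USD per GB
```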
Alternatives
Saving a few bytes to the EVM is fine, but for larger chunks of data the costs are probably too high for most projects. One solution is to change our data storage strategy and save the data off-chain (as opposed to the on-chain approach we took above). There are multiple off-chain storage options; IPFS and Swarm are two popular ones. I’ll use IPFS in this post, but the same approach should work with Swarm.
Enter IPFS
According to the Wikipedia article on IPFS:
InterPlanetary File System (IPFS) is a protocol designed to create a permanent and decentralized method of storing and sharing files
IPFS allows p2p storage and we can use it as a distributed file system to store data.
Low Cost Data Storage Strategy
Adding data to IPFS returns a hash that is derived from the content and identifies it. Instead of storing the data itself in the contract, we’ll store only the hash there, and later use that hash to retrieve the data from IPFS.
In production we’d need to create our own IPFS node, but INFURA provides a node for developers which we can use for free.
Here is a JavaScript snippet you can try out on https://npm.runkit.com/ to save data to IPFS:
    const IPFS = require('ipfs-mini');
    const ipfs = new IPFS({host: 'ipfs.infura.io', port: 5001, protocol: 'https'});
    const randomData = "8803cf48b8805198dbf85b2e0d514320"; // random bytes for testing
    ipfs.add(randomData, (err, hash) => {
      if (err) {
        return console.log(err);
      }
      console.log("HASH:", hash);
    });
This should log the hash “Qmaj3ZhZtHynXc1tpnTnSBNsq8tZihMuV34wAvpURPZZMs”, which we can then use to retrieve our data:
    const IPFS = require('ipfs-mini');
    const ipfs = new IPFS({host: 'ipfs.infura.io', port: 5001, protocol: 'https'});
    const hash = "Qmaj3ZhZtHynXc1tpnTnSBNsq8tZihMuV34wAvpURPZZMs";
    ipfs.cat(hash, (err, data) => {
      if (err) {
        return console.log(err);
      }
      console.log("DATA:", data);
    });
And this should log our data: “8803cf48b8805198dbf85b2e0d514320”
Note that the length of the hash string does not depend on the size of the data. This means we can store large data chunks or files on IPFS (I couldn’t find a current size restriction) without increasing our Ethereum transaction costs!
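The fixed-size property comes from the underlying hash function. The sketch below illustrates it with a plain SHA-256 digest from Node's built-in crypto module. It is not how IPFS computes its “Qm...” hashes (IPFS wraps and chunks the data before hashing and then encodes the result), so the digests will not match the IPFS hash above. The helper name sha256Hex and the 1 MiB test buffer are assumptions made for illustration.

```javascript
const crypto = require("crypto");

// Illustration only: a plain SHA-256 digest of the raw bytes.
// This is NOT the IPFS hashing scheme; it only shows that the
// digest length stays the same whatever the input size.
function sha256Hex(data) {
  return crypto.createHash("sha256").update(data).digest("hex");
}

const small = Buffer.from("8803cf48b8805198dbf85b2e0d514320", "hex"); // 16 bytes
const large = Buffer.alloc(1024 * 1024, 0xab);                         // 1 MiB of filler bytes

const smallDigest = sha256Hex(small);
const largeDigest = sha256Hex(large);

// Each pair of hex characters is one byte, so both digests are 32 bytes long.
console.log(small.length, "bytes ->", smallDigest.length / 2, "byte digest");
console.log(large.length, "bytes ->", largeDigest.length / 2, "byte digest");
```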
I’ve used our previous contract to store the IPFS hash generated above :
https://rinkeby.etherscan.io/tx/0x53ae68a0f7302d8808d836c560f54f83b2b870f02b136338c8abde03f2e3cfb9
The gas usage has decreased to 40,907, which at a gas price of 20 Gwei is 0.00081814 Ether, or 0.27 USD. That is about 18 times less gas than storing the 1 kB of data directly.
We now have a much more acceptable storage cost, and it should stay roughly constant regardless of what we’re storing on IPFS!
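Since the contract’s write function takes bytes, the hash string has to be encoded before it is sent. Here is a minimal sketch of one way to do that, assuming the hash is stored as the UTF-8 bytes of its base58 string (this post does not state which encoding the transaction above used). The helper names hashToHexBytes and hexBytesToHash are hypothetical.

```javascript
// Assumption: the IPFS hash is stored on-chain as the UTF-8 bytes of its
// base58 string, passed to write(bytes) as a 0x-prefixed hex string.
function hashToHexBytes(ipfsHash) {
  return "0x" + Buffer.from(ipfsHash, "utf8").toString("hex");
}

// Reverse step: turn the hex bytes returned by read() back into the hash string.
function hexBytesToHash(hexBytes) {
  return Buffer.from(hexBytes.replace(/^0x/, ""), "hex").toString("utf8");
}

const ipfsHash = "Qmaj3ZhZtHynXc1tpnTnSBNsq8tZihMuV34wAvpURPZZMs";
const encoded = hashToHexBytes(ipfsHash);
const byteLength = (encoded.length - 2) / 2; // drop "0x", two hex chars per byte

console.log(encoded);
console.log(byteLength, "bytes");                   // 46 bytes
console.log(hexBytesToHash(encoded) === ipfsHash);  // true
```

Storing the 46-character string as-is keeps the client code simple. Decoding the base58 hash to its raw 34 bytes would be more compact, at the cost of an extra dependency.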
Example Project: Stone Dapp
I’ve built a small proof-of-concept project around this idea called Stone Dapp. Feel free to check it out:
- Github : https://github.com/didil/stone-dapp
- Live version (Rinkeby) : https://stone-dapp.firebaseapp.com
P.S.: In the examples above I’ve set the gas price to 20 Gwei. To help you choose the gas price you want to pay, you could check out http://ethgasstation.info. ETH Gas Station provides transaction confirmation time estimates and other useful network stats.