Simplifying the STORJ Whitepaper

Here we’ll quickly break down what Storj is, along with the system’s consensus protocol, potential vulnerabilities, and projected future state, all straight from the whitepaper and in a way we can all understand!

Here’s my annotated version of the white paper with notes for your reference!

What is Storj?

Storj (pronounced “Storage”) is a peer-to-peer cloud storage network that implements client-side encryption (encrypting the data before it reaches data storage). The project focuses on disrupting the centralized, third-party data storage model by eliminating the overhead costs of data farms and instead incentivizing ordinary people with spare hard drive space to offer that space to the network. This decentralized approach prevents third-party services from “owning” your data, greatly decreases storage costs, and makes data on the network resistant to censorship, tampering, unauthorized access, and data failures (via client-side encryption).

What are Shards?

Storj stores shards across a distributed and decentralized network of storage nodes (“farmers” — those who are renting their hard drives to store data). Here’s a quick description of shards:

  • A shard is a portion of an encrypted file to be stored on this network.
  • Shard size is a negotiable contract parameter. To preserve privacy, it is recommended that shard sizes be standardized as a byte multiple, such as 8 or 32 MB.
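
To make the idea of standardized shard sizes concrete, here is a minimal Python sketch that cuts an already-encrypted byte string into equal-size shards. The function name `split_into_shards`, the zero-byte padding of the final shard, and the reading of 8 MB as 8 * 1024 * 1024 bytes are illustrative assumptions, not details taken from the whitepaper.

```python
from typing import List

# Assumption: "8 MB" is read here as 8 * 1024 * 1024 bytes.
SHARD_SIZE = 8 * 1024 * 1024


def split_into_shards(encrypted: bytes, shard_size: int = SHARD_SIZE) -> List[bytes]:
    """Split already-encrypted data into equal-size shards.

    The last shard is padded with zero bytes (an assumption of this
    sketch) so that every shard has the same length and shard size
    alone does not hint at the size of the original file.
    """
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    shards = []
    for offset in range(0, len(encrypted), shard_size):
        chunk = encrypted[offset:offset + shard_size]
        shards.append(chunk.ljust(shard_size, b"\x00"))
    return shards
```

In this sketch the data owner would also need to record the original length so the padding can be stripped again after the shards are retrieved and rejoined.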

Storj Consensus Protocol

The Storj protocol enables peers on the network to negotiate contracts, transfer data, verify the integrity and availability of remote data, retrieve data, and pay other nodes. Here are some notable qualities of the protocol:

  • Storj is built on Kademlia, a distributed hash table (DHT). Kademlia creates a distributed network with efficient message routing and other desirable qualities. Storj extends this message protocol with its own calls.
  • As the set of shards in the network grows, it becomes exponentially more difficult to locate any given shard set without prior knowledge of the shards’ locations. This implies that security scales exponentially as the network grows linearly.
  • Each Node ID in the Storj network is also a valid Bitcoin address, which the node can spend from
  • Nodes sign all messages, and validate message signatures before processing messages.
  • Data owners are responsible for everything — negotiating contracts, pre-processing shards, issuing and verifying audits, providing payments, managing file state via the collection of shards, managing file encryption keys, etc.
  • Storj provides a standard format for issuing and verifying proofs of retrievability via a challenge-response interaction called an audit or heartbeat.
  • Storj is payment agnostic
  • To facilitate on-disk storage for farmers, Storj implements a local file store called KFS. KFS is an abstraction layer over a set of LevelDB (a key value store) instances that seeks to address scaling problems.
  • Data is transferred via HTTP. Farmers expose endpoints where client applications may upload or download shards
  • Storj Bridge API is an abstraction layer that streamlines the development process. The Bridge API uses public-key cryptography to verify clients. Rather than the Bridge server issuing an API key to each user, users register public keys with the Bridge.
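
The audit (heartbeat) bullet above can be illustrated with a small challenge-response sketch in Python. This is a simplification under stated assumptions: the data owner precomputes a few one-time salts and the expected SHA-256 responses before uploading, and later checks the farmer’s answer against them. The function names (`make_challenges`, `respond`, `verify`) and the choice of SHA-256 are hypothetical, and the sketch keeps the expected responses directly rather than reproducing the whitepaper’s exact proof format.

```python
import hashlib
import hmac
import os
from typing import List, Tuple


def make_challenges(shard: bytes, count: int) -> List[Tuple[bytes, bytes]]:
    """Data owner: precompute (salt, expected_response) pairs before upload."""
    challenges = []
    for _ in range(count):
        salt = os.urandom(32)  # random one-time challenge
        expected = hashlib.sha256(salt + shard).digest()
        challenges.append((salt, expected))
    return challenges


def respond(stored_shard: bytes, salt: bytes) -> bytes:
    """Farmer: answer a challenge by hashing the salt with the stored shard."""
    return hashlib.sha256(salt + stored_shard).digest()


def verify(expected: bytes, response: bytes) -> bool:
    """Data owner: compare the response to the precomputed value."""
    return hmac.compare_digest(expected, response)
```

Because the farmer does not know a salt in advance, it can only answer correctly if it still holds the full shard. Each salt should be used once, since a farmer could cache the answer to a repeated challenge.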

Potential Vulnerabilities

  • Clients have to deliberately implement data redundancy schemes because network consistency is potentially volatile. This creates a steep learning curve, and clients can more easily lose their data when farmer nodes behave inconsistently (for example, by simply deciding to leave the network).
  • The farmer storage network raises two concerns: consistency and future scalability. The scalability concern is not that the network cannot scale economically, but that there will come a time when running a storage node is no longer profitable because of increased electricity costs. The consistency concern is that no one controls a storage node’s availability: it can be turned off or break at any time.
  • Data is transferred via HTTP, and farmers expose endpoints where client applications may upload or download shards. Exposing those endpoints reveals the farmers’ IP addresses, which makes farmer nodes potential targets for attack.
  • Storj Bridge serves as a central point of failure (Bridge is designed to store only metadata)
  • Spartacus attacks, or identity hijacking, are possible on Kademlia.
  • Sybil attacks (somewhat)
  • The Google attack, or nation-state attack (a hypothetical variant of the Sybil attack carried out by an entity with extreme resources)
  • The Honest Geppetto attack — The attacker operates a large number of puppet nodes on the network, accumulating trust and contracts over time. Once he reaches a certain threshold he pulls the strings on each puppet to execute a hostage attack with the data involved, or simply drops each node from the network.
  • Eclipse Attack: an eclipse attack attempts to isolate a node or set of nodes in the network graph by ensuring that all outbound connections reach malicious nodes.
  • Other attacks are documented in the white paper!
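
The first bullet in this list says clients must build their own redundancy. As a rough illustration of what that can involve, the Python sketch below uses single XOR parity: one extra shard that lets the client rebuild any one lost shard. This is a deliberately simple stand-in chosen for the example, not the scheme Storj prescribes, and the function names are hypothetical.

```python
from typing import List


def xor_parity(shards: List[bytes]) -> bytes:
    """Compute a parity shard as the byte-wise XOR of equal-length shards."""
    if not shards:
        raise ValueError("need at least one shard")
    parity = bytearray(len(shards[0]))
    for shard in shards:
        if len(shard) != len(parity):
            raise ValueError("all shards must have the same length")
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(surviving: List[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing shard from the survivors and the parity."""
    return xor_parity(surviving + [parity])
```

With this approach the client stores the parity shard on a different farmer than the data shards. It tolerates exactly one lost shard per group; surviving several simultaneous farmer dropouts would need a stronger erasure code or plain mirroring.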

CEO of Emerging Impact, Former Head of ConsenSys Social Impact, @Goldman Alum, @Cisco Alum, @TFA Alum, Activist, Intense Autodidact