Authentication and integrity, in this instance, are provided by IPFS.
If you don't trust that IPFS is providing you with the correct content, then you also can't trust that it's providing you with the correct code.
If you don't trust the IPFS node, it could just ship off your decryption key to a remote server. There is no getting around trusting IPFS to work correctly. And if you trust IPFS to work correctly then there is no benefit to authenticated encryption.
But you can run your own IPFS node locally. Then you're only trusting it as much as you trust your local system. If you don't trust your local system then you can't trust anything.
I'm not deeply familiar with IPFS, but on a cursory look it does appear to optionally[1] offer authentication via HMAC for file objects. Can you indicate how this works in Hardbin? I don't see any mention of it in the GitHub readme - is it activated by default, or is it assumed that a user has it activated in their own copy of IPFS?
On a meta note, can I ask why you didn't go with something like Tahoe-LAFS[3] if you were looking for a secure, decentralized file store system? I don't immediately know that it would have been better per se, but I'm not quite sure what IPFS provides that you can't get otherwise.
EDIT: I'm not trying to grief you here, but there are three people in this thread already (myself included) who know security very well (professionally so), and I want to point out that comments like this one, and the GitHub issue that was opened, are made in good faith.
There is no HMAC used in Hardbin. You have to trust that the IPFS path you are accessing is correct, otherwise all bets are off.
The code and data are served out of the same place. Anybody who can modify the data can modify the code. There is no benefit whatsoever from trying to do any more authentication.
I'll have a look at SJCL, thanks.
I've not played with Tahoe-LAFS, I don't know anything about it.
I fully intend to fix the non-uniform key generation. That is a bug.
I'm yet to hear a convincing argument that the unauthenticated encryption is a problem.
Ah, I understand, thanks for clearing that up. So each IPFS node in your connection path must support data authentication.
I have two suggestions, then:
1. Implement data integrity and authentication directly in Hardbin, instead of offloading it to IPFS, such that it can be used with both confidentiality and integrity even if assumed-hostile nodes are part of the connection path. To do this I'd recommend using HMAC-SHA256 for authentication, and AES-CBC is probably fine for the encryption. You could combine the authentication and encryption with something like AES-GCM or AES-OCB, but I personally wouldn't do that.
2. Explicitly state, front and center in the readme, that Hardbin currently delegates the duty of authentication and integrity to IPFS, using that sort of terminology. Now it's clear to me why you're saying that an untrusted IPFS path shouldn't be used, but if you used that sort of language it would be more "formally" expressed, cryptographically speaking. The fact that you mention that attackers cannot manipulate files, but don't explicitly describe how authentication works (as you do with encryption), is an antipattern.
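For what it's worth, the first suggestion above could look roughly like the following encrypt-then-MAC sketch. This is not Hardbin's code (Hardbin runs in the browser), and everything here is an assumption for illustration: the function names `seal` and `open_sealed`, the layout IV + ciphertext + tag, and the use of a separate MAC key. The ciphertext is treated as opaque bytes that some AES-CBC implementation has already produced, since the Python standard library has no AES.

```python
import hashlib
import hmac
from typing import Tuple

IV_LEN = 16   # AES block size, which is also the CBC IV length
TAG_LEN = 32  # HMAC-SHA256 output length


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over IV + ciphertext (hypothetical layout)."""
    if len(iv) != IV_LEN:
        raise ValueError("unexpected IV length")
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes) -> Tuple[bytes, bytes]:
    """Verify the tag first; return (iv, ciphertext) only if it matches."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("blob too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return body[:IV_LEN], body[IV_LEN:]
```

The point of the ordering is that decryption never runs on data that failed the MAC check. The MAC key should be independent of the encryption key, for example both derived from one secret with a KDF.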
There's a bit of a runaround here - Hardbin is designed to be an "encrypted, secure pastebin", but Pastebin is inherently an antagonistic medium for file authentication, which you'll really need for file integrity. It's designed to be fairly anonymous, which you have to trade off in some way if you want real file integrity.
The difficulty in this is that "encryption" on its own only offers confidentiality, and in modern cryptography that level of assurance is rarely used on its own in complete cryptosystems. It's not necessarily helpful to have confidentiality across a connection without also having integrity. So you can add decentralized components and encryption to it via Hardbin, but (in my opinion) you're not adding much value, because I'm not clear what the use case is where you want to securely share files in a decentralized manner but don't mind if the files are unprotected against manipulation.
> Ah, I understand, thanks for clearing that up. So each IPFS node in your connection path must support data authentication.
No, I still think you don't understand.
You only need to trust your local node. If another node on the path modifies the data, your local node will reject it because it won't match the hash.
Content is addressed in IPFS by its content hash. The Hardbin "URLs" are content hashes.
EDIT: I see the confusion! When I said you need to trust the "IPFS path", I was talking about a path like "/ipfs/QmXyE...". Not a connection path. You need to trust the IPFS path (i.e. content hash), and your local node, but nothing else.
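To make the "your local node will reject it" part concrete, here is a minimal sketch of content-addressed verification. It is deliberately simplified and is not how IPFS is implemented: real IPFS addresses are multihash-based identifiers over chunked data, whereas this assumes a plain hex SHA-256 of the whole blob. The names `content_address`, `fetch_verified` and `untrusted_fetch` are made up for the example.

```python
import hashlib
from typing import Callable


def content_address(data: bytes) -> str:
    """Simplified stand-in for an IPFS content hash: hex SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(address: str, untrusted_fetch: Callable[[str], bytes]) -> bytes:
    """Fetch from an untrusted peer and reject anything that doesn't hash to the address."""
    data = untrusted_fetch(address)
    if content_address(data) != address:
        raise ValueError("content does not match requested hash")
    return data


# Usage: an honest peer returns the stored bytes, a malicious one swaps them.
original = b"encrypted paste"
address = content_address(original)

honest_peer = lambda addr: original
malicious_peer = lambda addr: b"something else"

accepted = fetch_verified(address, honest_peer)
try:
    fetch_verified(address, malicious_peer)
    swap_detected = False
except ValueError:
    swap_detected = True
```

The check only helps if the address itself came from somewhere you trust, which is the "you need to trust the IPFS path" part.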
Ahhh...so basically, this is a system where you inherently trust the integrity only of your own computer. Yes, sorry about that; I misunderstood what you meant when you used "connection"...in my head I'm thinking of stateful networks, not a filesystem.
That's correct, but to provide some color to the comment since it doesn't really explain why it's correct:
Hash functions can provide integrity against accidental errors or very simple manipulations related to XORing the message or digest. This might not be useful against an active attacker, but technically speaking it is a (weak) form of integrity, which is why we use checksums.
Message authentication codes (MACs) are required to assure integrity against active attackers because you need to elevate to authentication, and hash functions cannot provide authentication (at least not on their own). Authentication is a stronger notion in security than integrity because it simultaneously guarantees and requires integrity - there is no point in having integrity without authentication, because you'd be assuring data integrity without assuring the data origin, which short circuits the integrity problem for an attacker. Conversely, you can't have authentication without data integrity because an attacker could simply forge the data origin.
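A small sketch of the distinction, under the assumption that the digest or tag travels alongside the message over a channel the attacker controls. The helper names `hash_check` and `mac_check` are hypothetical. With a bare hash the attacker just recomputes the digest for the forged message; with an HMAC they would need the key.

```python
import hashlib
import hmac
import secrets


def hash_check(message: bytes, digest: bytes) -> bool:
    """Accept if the digest is the SHA-256 of the message (no secret involved)."""
    return hmac.compare_digest(hashlib.sha256(message).digest(), digest)


def mac_check(key: bytes, message: bytes, tag: bytes) -> bool:
    """Accept only if the tag is the HMAC-SHA256 of the message under the shared key."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# Shared by sender and receiver; the attacker does not know it.
key = secrets.token_bytes(32)
forged = b"pay mallory 100"

# Bare hash: the attacker replaces both the message and its digest.
forged_digest = hashlib.sha256(forged).digest()
hash_accepts_forgery = hash_check(forged, forged_digest)

# HMAC: without the key the attacker can only tag with a key of their own.
attacker_key = secrets.token_bytes(32)
forged_tag = hmac.new(attacker_key, forged, hashlib.sha256).digest()
mac_accepts_forgery = mac_check(key, forged, forged_tag)
```

Here `hash_accepts_forgery` comes out true and `mac_accepts_forgery` false. The situation changes if the digest is obtained out of band rather than sent with the data, since then the attacker cannot replace it.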
> Authentication is a stronger notion in security than integrity because it simultaneously guarantees and requires integrity - there is no point in having integrity without authentication
The "Authentication" in "Message Authentication Code" really is a term of the past that we now use to mean cryptographic integrity.
Another way to see this: integrity, in the cryptographic sense, is not a security property that hash functions provide on their own.
You don't exactly need clairvoyance to predict this outcome.