Creating a CLI tool to easily encrypt (and decrypt) data into multiple parts

TachiCrypt
TachiCrypt on GitHub

The why?

Well - let's say it's a combination of curiosity and a very old idea that's been stuck in my head for years.

While the usual encryption tools you know (VeraCrypt and so on) create some form of one big encrypted blob, I always wanted a tool that lets me encrypt data into multiple parts instead of just one.

Such a tool would allow you to either share the data with another person over multiple channels instead of just one, or store the encrypted data in multiple different locations. In both cases the idea is to make it harder for an attacker to access/crack the encrypted data, because he does not just need to intercept/find (and crack) that one encrypted blob - he has to find all of them.

At this point you might think that to build such a tool you must have at least sufficient experience with cryptography in order to make it actually "safe". Well, yes and no. While I won't guarantee that this tool will be some sort of "holy grail" of data safety, I can still give it a try.

The how?

Concept

So, as I explained in the "The why?" section, the idea is to have a tool that lets me encrypt (and decrypt) file(s) into multiple encrypted parts. Let's summarize in more detail what else this includes in order to make it a useful tool.

Architecture

In order to achieve all the points listed in the concept, I came up with the following architecture.

When encrypting data, the tool will read all the data to encrypt (file(s) and/or directories) into memory and then create a base64-encoded zip from it.

Then the tool will split said zip into X parts, where X is the number of encrypted parts you asked for when running the command. Since the length of the base64 zip string will not always be evenly divisible by the number of parts, the tool adds some padding to the last part in order to keep all parts the same length.
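A minimal sketch of this first step, using only the Go standard library. The function name `zipToBase64` and the in-memory `map` of file names to contents are assumptions made for illustration, not TachiCrypt's actual code.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/base64"
	"fmt"
)

// zipToBase64 (hypothetical helper) packs in-memory files (name -> content)
// into a zip archive and returns the archive as a base64 string.
func zipToBase64(files map[string][]byte) (string, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, content := range files {
		w, err := zw.Create(name)
		if err != nil {
			return "", err
		}
		if _, err := w.Write(content); err != nil {
			return "", err
		}
	}
	// Close writes the central directory; the archive is invalid without it.
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

func main() {
	encoded, err := zipToBase64(map[string][]byte{"notes.txt": []byte("hello")})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(encoded), "base64 characters")
}
```

Everything stays in memory here, which matches the description above but also means the input size is limited by available RAM.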
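The splitting step could look roughly like the sketch below. The pad character `0` and the name `splitPadded` are assumptions. Any filler byte works as long as the pad length is remembered.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// padChar is a hypothetical filler byte; it is stripped again on decrypt
// because the pad length is recorded.
const padChar = "0"

// splitPadded cuts s into n parts of equal length and returns the parts plus
// the number of padding bytes. Padding is appended to the end of s, so for
// inputs much longer than n it sits entirely in the last part.
func splitPadded(s string, n int) ([]string, int, error) {
	if n < 1 || len(s) < n {
		return nil, 0, errors.New("need 1 <= n <= len(s)")
	}
	partLen := (len(s) + n - 1) / n // ceiling division
	padLen := partLen*n - len(s)
	s += strings.Repeat(padChar, padLen)

	parts := make([]string, 0, n)
	for i := 0; i < len(s); i += partLen {
		parts = append(parts, s[i:i+partLen])
	}
	return parts, padLen, nil
}

func main() {
	parts, padLen, err := splitPadded("abcdefg", 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(parts, padLen) // prints: [abc def g00] 2
}
```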

As soon as we have all the same-length parts, the tool will use random data from the system to generate a passkey and a random filename for each part.

It will then encrypt all the parts with the passkeys and store them under the random filenames in the desired output directory.
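A sketch of how one part could be given a random passkey and filename and then sealed with AES-GCM from the standard library. The 32-byte key, the 16-byte hex filename, the nonce-in-front layout and all function names are assumptions for illustration.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// randomBytes reads n bytes from the operating system's CSPRNG.
func randomBytes(n int) ([]byte, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

// newGCM builds an AES-GCM instance; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptPart (hypothetical) creates a fresh key and file name for one part
// and returns them with the sealed data. The nonce is stored in front of the
// ciphertext so that decryption can find it again.
func encryptPart(part []byte) ([]byte, string, []byte, error) {
	key, err := randomBytes(32)
	if err != nil {
		return nil, "", nil, err
	}
	nameBytes, err := randomBytes(16)
	if err != nil {
		return nil, "", nil, err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, "", nil, err
	}
	nonce, err := randomBytes(gcm.NonceSize())
	if err != nil {
		return nil, "", nil, err
	}
	sealed := gcm.Seal(nonce, nonce, part, nil)
	return key, hex.EncodeToString(nameBytes), sealed, nil
}

// decryptPart reverses encryptPart; it fails if the key is wrong or the data
// was modified, because GCM authenticates the ciphertext.
func decryptPart(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed data too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key, name, sealed, err := encryptPart([]byte("part one"))
	if err != nil {
		panic(err)
	}
	plain, err := decryptPart(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(name, string(plain))
}
```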

Now we use these passkeys and filenames, together with the original sequence of the parts, to create an index that allows us to decrypt the parts again. Additionally we record the length of the padding that we added to the last part (an integer from 0 to X). With that we should be able to decrypt the parts and put them back into the correct order to unpack the base64 zip again.
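One possible shape for such an index, serialized as JSON. The type names, field names and the choice of JSON are assumptions. The text above only says what information the index has to carry.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PartEntry (hypothetical) describes one encrypted part.
type PartEntry struct {
	Name string `json:"name"` // random file name of the encrypted part
	Key  []byte `json:"key"`  // passkey of that part (base64 in JSON)
}

// Index (hypothetical) is what would end up encrypted as the masterlock.
type Index struct {
	Parts  []PartEntry `json:"parts"`   // slice order is the original sequence
	PadLen int         `json:"pad_len"` // padding appended to the last part
}

// encodeIndex serializes the index so it can be encrypted afterwards.
func encodeIndex(idx Index) ([]byte, error) {
	return json.Marshal(idx)
}

// decodeIndex parses a decrypted index again.
func decodeIndex(data []byte) (Index, error) {
	var idx Index
	err := json.Unmarshal(data, &idx)
	return idx, err
}

func main() {
	idx := Index{
		Parts: []PartEntry{
			{Name: "9f2c", Key: []byte{1, 2, 3}},
			{Name: "07ab", Key: []byte{4, 5, 6}},
		},
		PadLen: 2,
	}
	data, err := encodeIndex(idx)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

Keeping the sequence implicit in the slice order avoids a separate sequence number per part.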

At this point the tool will prompt the user to input a passphrase. This passphrase will be used to encrypt the index and store it as a file called "masterlock".
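AES needs a fixed-size key, so the passphrase has to be turned into one first. How TachiCrypt does this is not described here. The sketch below is an assumption that hand-rolls a single block of PBKDF2-HMAC-SHA256 with standard library pieces, and the salt and iteration count are hypothetical values.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// deriveKey (hypothetical) computes the first PBKDF2-HMAC-SHA256 block, which
// is exactly 32 bytes and therefore usable as an AES-256 key.
func deriveKey(passphrase string, salt []byte, iterations int) []byte {
	mac := hmac.New(sha256.New, []byte(passphrase))
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // block index 1, big endian
	u := mac.Sum(nil)

	key := make([]byte, len(u))
	copy(key, u)
	for i := 1; i < iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil)
		for j := range key {
			key[j] ^= u[j]
		}
	}
	return key
}

func main() {
	// The salt would have to be stored next to the masterlock (assumption).
	key := deriveKey("my passphrase", []byte("example-salt"), 100000)
	fmt.Println(hex.EncodeToString(key))
}
```

The resulting key could then be used to seal the index with AES-GCM in the same way as the parts.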

And last but not least, we again use random data gathered from the OS to modify the created and last-modified dates of all the encrypted parts, so this information can't be used to infer the original order.

We now have X encrypted parts plus the masterlock file as a result. This data can be stored in multiple different locations or shared over multiple different channels, and only with all of them together are you able to restore the original data.

To decrypt the data, you put all the parts and the masterlock in the same directory. You run the tool, which will first prompt you for your masterlock passphrase. After you enter it, the tool will decrypt the masterlock, use the data stored in the index to decrypt the parts, put them back in the original order and then unpack the resulting base64 zip to your desired output directory.

Simple, huh?
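A sketch of the timestamp scrambling with `os.Chtimes`. Note that `os.Chtimes` only sets the access and modification times, and changing the creation time is platform specific and not covered here. The one-year window and the function names are assumptions.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
	"os"
	"time"
)

// randomizeTimes (hypothetical) sets the access and modification time of path
// to a random moment within the last year.
func randomizeTimes(path string) error {
	window := int64(365 * 24 * time.Hour / time.Second) // seconds in the window
	n, err := rand.Int(rand.Reader, big.NewInt(window))
	if err != nil {
		return err
	}
	// +1 keeps the new time at least one second in the past.
	t := time.Now().Add(-time.Duration(n.Int64()+1) * time.Second)
	return os.Chtimes(path, t, t)
}

// demo creates a throwaway file, scrambles its times and reports whether the
// modification time now lies in the past.
func demo() (bool, error) {
	f, err := os.CreateTemp("", "part-*")
	if err != nil {
		return false, err
	}
	defer os.Remove(f.Name())
	if err := f.Close(); err != nil {
		return false, err
	}
	if err := randomizeTimes(f.Name()); err != nil {
		return false, err
	}
	info, err := os.Stat(f.Name())
	if err != nil {
		return false, err
	}
	return info.ModTime().Before(time.Now()), nil
}

func main() {
	inPast, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("modification time moved into the past:", inPast)
}
```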
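The last steps of decryption could be sketched like this: join the already decrypted parts in index order, cut off the recorded padding and decode the base64 string back into the zip bytes. The name `reassemble` is an assumption, and unpacking the zip itself is left out.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// reassemble (hypothetical) expects the decrypted parts in their original
// order and the pad length from the index. It returns the raw zip bytes.
func reassemble(parts []string, padLen int) ([]byte, error) {
	joined := strings.Join(parts, "")
	if padLen < 0 || padLen > len(joined) {
		return nil, errors.New("invalid padding length")
	}
	return base64.StdEncoding.DecodeString(joined[:len(joined)-padLen])
}

func main() {
	// "aGVsbG8=" is base64 for "hello", split into 3 parts with 1 pad byte.
	data, err := reassemble([]string{"aGV", "sbG", "8=0"}, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data)) // prints: hello
}
```

In the real flow the returned bytes would be a zip archive that can be opened with `archive/zip` and written to the output directory.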

Stumbling blocks

First of all, as mentioned earlier, I'm not an expert in cryptography - therefore I don't want to implement my own half-baked encryption. Instead, I want to rely on something that's known to be secure. After some research I decided to go with AES-GCM.

Also, I'm a friend of vanilla code - and I hate dependency hell. Therefore I wanted to use only Go's standard library and keep the project completely free of third-party libraries. This keeps the project cleaner and also rules out a third-party library compromising the tool's safety. Since the tool is rather simple, this worked out fine for the most part, but I had some trouble implementing a way of reading the user's password from the CLI without showing it. For the first alpha I decided to keep the password input unhidden, though fixing this is for sure one of the main priorities for future iterations.

And, while other tools like the previously mentioned VeraCrypt create some sort of mountable volume that can be altered, this tool's architecture doesn't allow that. I'm aware of this limitation and I'm fine with it. It's not meant to be a daily driver for accessing and altering the protected data.
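For illustration, unhidden passphrase input with only the standard library could look like the sketch below. The function names and the prompt text are assumptions, and the demo feeds a fixed string instead of `os.Stdin` so it runs unattended.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// readPassphrase (hypothetical) prints a prompt and reads one line. The
// terminal still echoes the input because nothing here switches echo off.
func readPassphrase(in io.Reader, out io.Writer) (string, error) {
	fmt.Fprint(out, "Passphrase: ")
	line, err := bufio.NewReader(in).ReadString('\n')
	if err != nil && !errors.Is(err, io.EOF) {
		return "", err
	}
	pass := strings.TrimRight(line, "\r\n")
	if pass == "" {
		return "", errors.New("empty passphrase")
	}
	return pass, nil
}

// demo stands in for an interactive session; a real tool would pass os.Stdin
// and os.Stdout instead.
func demo() (string, error) {
	return readPassphrase(strings.NewReader("correct horse\n"), io.Discard)
}

func main() {
	pass, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("read", len(pass), "characters")
}
```

Hiding the input needs terminal control that the portable standard library does not offer directly, which is the trouble described above.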

Use cases

There are two major possible use cases for such a tool.

First - if you want to share some data with a second party but don't trust a single channel for sharing it (MITM etc.), you can use as many different channels as you like. One part is shared via mail, another via Slack, another is uploaded to a private sharing space and so on. And then, for example, you hand the masterlock file to your target party on a storage device. This way an attacker has a much harder time, because he has to compromise every channel.

Second - if you want to store some data long term in multiple different locations. You can put the parts on USB sticks, a NAS, microSD cards or any other storage you have access to and keep them in different places. So if a thief or any other person tries to get hold of your data, he needs to find them all.

Future plans

Well, at this point I want to stress that this is a hobby project, and I don't have any fixed plans on when and how much time I will spend on improving its quality and safety. It started as an experiment, and I won't promise it will ever reach "production" quality.

That said, the following points would be my main priorities when putting further effort into enhancing it

In case you think this tool is a cool idea and want to contribute - feel free to open an issue or even fork it and send a pull request. Though - please don't be offended if I don't merge everything you suggest. There are certain things that I want to keep the way they are, like not adding third-party libraries. Even if you think the tool would benefit from your idea, it's probably better to open an issue first and make sure the proposed enhancement has a chance to be merged. :)

Conclusion

While this tool might not be some revolutionary replacement for your current tooling, it still was an interesting topic to dive into.

As with all my private projects/experiments, I learned a lot about a new topic - especially when thinking about all the small things that improve such a tool's safety (for example the modification of the created and last-modified timestamps).

Even though I would not recommend using this tool for your vital, high-security data right now - I'm happy about every star you might drop and every issue with suggestions for possible enhancements you might create. The more people show interest in the project, the higher the probability that I will spend time enhancing it and maybe even reach something like production quality at some point.

As always, I hope this was an interesting read and maybe sparked your curiosity to have a look at TachiCrypt on GitHub or to give a similar idea that's been stuck in your head a try.

So long and thanks for all the fish