2c - twocrypt
-------------

Copyright (C) 2003 by Michal Zalewski <lcamtuf@coredump.cx>
GNU GPL, I guess. Go get yourself a copy of the license.

1) What is this?
----------------

2c is a simple symmetric file encryption utility that has but one
interesting feature - it can embed an additional file within the encrypted
data. This is done in a way that cannot be detected without knowing the
passphrase protecting the other file. The design is such that the mere use
of this method does not constitute credible evidence of data hiding
(IANALBMSUTDO).

BE WARNED: this is early prototype code. It may very well have some
terrible, terrible flaws that can expose or even destroy your data. Use
wisely, test, review, send back comments.

2) Huh!?
--------

The idea is to have an encryption utility that lets one chunk of encrypted
data be decoded to several alternative contents, so that, should the creator
of the encrypted file be forced to disclose one key protecting the file, the
remaining portion of the data remains safe and its presence is not evident.
In the simplified implementation I used, only two chunks of data can be
stored in a single encrypted file, and recovering the "deeper" chunk
requires you to know both passwords - but this can be changed.

In essence, there are two ways to approach the problem of creating an
encrypted file that can be decrypted to a set of alternative files.

The first method is based on the observation that, in theory, any data can
be decrypted to any plaintext. An encrypted message that reads
"ZHfmj4tu97kKXctl3z" could mean both "kill the president" and "must buy
more food". The problem with this approach, however, is that it is not
practical.
Even if we ignore the effort needed to find two matching keys that produce
the desired plaintexts - which varies depending on the encryption
algorithm - the key would have to be about as long as the shortest possible
representation (a compressed form, most likely) of the plaintext message.
When hiding files this way, this becomes a considerable problem, because it
is not acceptable to store two sizeable, password-protected keys on the
disk - that would be clear evidence of the user's intent. Thus, most of the
key information would have to be derived from the passphrase, and
passphrases are generally very short compared to the size of the data
protected.

The other approach is to combine two separate passphrase-protected files
into one. The problem is quite obvious - this is not much different from
storing two separate files on the disk, and the attempt to hide some
information would be evident.

3) So...?
---------

The solution this package uses is a modified version of the latter
approach: it selects an algorithm that - quite legitimately - uses some
random data to generate its output when encrypting a single file. We can,
but do not have to, replace the randomness with information that is
meaningful, but retains the statistical characteristics of random data. And
a headerless encrypted file is a perfect example of information that cannot
be told apart from meaningless entropy obtained from the OS (provided that
neither the encryption nor the system RNG is flawed).

The tool itself is an implementation of such an algorithm, with an option
to obtain the "randomness" from another encrypted stream. In the absence of
the additional stream, it implements a regular encryption scheme whose
output is no different from the "covert data" scenario. Since good
encryption is essentially indistinguishable from random data, the presence
of the hidden file is not evident.
To make sure we do not run into problems, however, we routinely encrypt the
randomness just the way we encrypt the input. This way, the random mode is
guaranteed to exhibit the same characteristics and be virtually
indistinguishable. Furthermore, the algorithm is a legitimate one, and the
mere fact that it is being used does not indicate an intent to conceal any
information.

The algorithm in question is an interesting experiment that originated as
an attempt to circumvent US cryptography export embargoes. Instead of
transforming the data with a key, the algorithm leaves the data unmodified
and pads it with random bits instead, so that the next bit is just as
likely to come from an entropy source as from the actual stream. As long as
you don't know the method used to determine which bits are just padding,
you cannot recover the original message. The knowledge of this method
substitutes for a key.

This approach started some interesting discussions, but was not
particularly popular in the real world because of its overhead: the output
file is usually about twice the size of the original. Still, when size is
not a constraint, the algorithm is an interesting method that is generally
very difficult to attack, because it is impossible to determine what the
actual message is.

The tool implements this algorithm. In the default encryption mode, the
primary passphrase is mixed with a seed #1 and transformed with a one-way
function to initiate a "bit walk" stream key, which is used to decide where
to insert random bits and where to insert original ones. Just knowing the
key does not enable you to derive the password used to generate it, so even
if the attacker knows what data was "mixed" with randomness and can deduce
some properties of the key, the remaining sections of the file are not
affected and the password is not compromised.
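
The bit walk can be illustrated with a minimal Python sketch. This is not
the tool's actual code: the MD5 hash chain in walk_bits(), the seed value
and all function names are assumptions made up for illustration, and the
sketch skips the encryption of the data and the padding altogether.

```python
import hashlib
import os


def bits_of(data):
    """Yield the bits of `data`, least significant bit of each byte first."""
    for byte in data:
        for i in range(8):
            yield (byte >> i) & 1


def walk_bits(passphrase, seed):
    """Endless "bit walk" key stream.

    Hypothetical stand-in: an MD5 hash chain seeded with seed + passphrase.
    """
    state = hashlib.md5(seed + passphrase).digest()
    while True:
        yield from bits_of(hashlib.md5(b"out" + state).digest())
        state = hashlib.md5(b"next" + state).digest()


def random_bits():
    """Endless stream of padding bits from the OS entropy source."""
    while True:
        yield from bits_of(os.urandom(32))


def mix(data, passphrase, seed=b"seed #1"):
    """Interleave the bits of `data` with random padding bits.

    A walk bit of 1 means "the next output bit is real data", a walk bit
    of 0 means "the next output bit is padding".
    """
    walk = walk_bits(passphrase, seed)
    padding = random_bits()
    out = []
    for bit in bits_of(data):
        while not next(walk):       # walk says 0: emit a padding bit
            out.append(next(padding))
        out.append(bit)             # walk said 1: emit the real bit
    return out
```

Calling mix(b"attack at dawn", b"passphrase") returns a list of bits that
is, on average, about twice as long as the input, which matches the
overhead mentioned above. Someone who can regenerate the same walk can pick
the real bits back out; without it, every bit looks the same.
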
But, just to be sure, we can go one step further and generate another key,
used to encrypt the data itself, by mixing the primary passphrase with a
seed #2 and transforming it to a hash. This way, the attacker knows neither
the data nor its placement, as long as the passphrase is not known to him.
Furthermore, both keys are independent, so no assumptions about the
relation between the original data and the placement of the encrypted
information can be made.

In the "extra data" mode, the operation is generally the same, except that
the secondary file is also encrypted with a secondary key and used instead
of the random data for as long as the stream is available (once exhausted,
the algorithm falls back to randomness). To make sure that the data is
indistinguishable from "real" randomness, it is routinely compressed prior
to encryption, and the randomness is always routinely encrypted with a
bogus key. Because both the randomness and the compressed secondary file
are encrypted using the same algorithm, the output is indistinguishable
from random data. Of course, you cannot hide more data than you would
normally use for padding, and the program refuses to do so.

The decryption generates a stream key from the primary password, as well as
the primary file key, and splits the bit stream into two container files.
The first file is decrypted with the primary file key. The other bin is
either discarded, if the user says there is no hidden information in there,
or decrypted with a secondary password. There is no way for the program to
tell if the second bin contains some useful information, of course.
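
The "extra data" mode and the two-bin split on decryption can be sketched
in Python as follows. Again, this is only an illustration under stated
assumptions: the function names, the seed and the MD5-based walk are
hypothetical, and the compression and per-file encryption steps are left
out, so the bins here hold plain data.

```python
import hashlib
import itertools
import os


def bits_of(data):
    """Yield the bits of `data`, least significant bit of each byte first."""
    for byte in data:
        for i in range(8):
            yield (byte >> i) & 1


def bytes_of(bits):
    """Pack a list of bits back into bytes, dropping an incomplete tail."""
    whole = len(bits) - len(bits) % 8
    return bytes(
        sum(bit << j for j, bit in enumerate(bits[i:i + 8]))
        for i in range(0, whole, 8)
    )


def walk_bits(passphrase, seed):
    """Hypothetical stand-in for the bit walk key: an MD5 hash chain."""
    state = hashlib.md5(seed + passphrase).digest()
    while True:
        yield from bits_of(hashlib.md5(b"out" + state).digest())
        state = hashlib.md5(b"next" + state).digest()


def random_bits():
    """Endless stream of padding bits from the OS entropy source."""
    while True:
        yield from bits_of(os.urandom(32))


def mix(primary, hidden, passphrase, seed=b"seed #1"):
    """Interleave `primary` with padding taken from `hidden`, then randomness."""
    walk = walk_bits(passphrase, seed)
    filler = itertools.chain(bits_of(hidden), random_bits())
    out, padding_used = [], 0
    for bit in bits_of(primary):
        while not next(walk):           # walk says 0: emit a filler bit
            out.append(next(filler))
            padding_used += 1
        out.append(bit)                 # walk said 1: emit a primary bit
    if padding_used < 8 * len(hidden):  # refuse to hide more than fits
        raise ValueError("hidden file does not fit into the padding")
    return out


def split(mixed, passphrase, seed=b"seed #1"):
    """Sort the mixed bits into two bins by replaying the same walk."""
    walk = walk_bits(passphrase, seed)
    bins = ([], [])                     # bins[0]: filler, bins[1]: primary
    for bit in mixed:
        bins[next(walk)].append(bit)
    return bytes_of(bins[1]), bytes_of(bins[0])
```

With split(mixed, passphrase), the first returned value is the primary
file, and the second bin starts with the hidden bytes followed by random
padding. In this sketch the caller has to know how long the hidden part
is; nothing in the second bin marks where it ends or whether it is there
at all.
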
While the other container can be retrieved once the primary password is
known (this is not a limitation of the approach: if you are willing to live
with slightly larger files, you can implement random bit mixing in there as
well), it cannot be decrypted without the secondary password, should there
be one, so you're safe as long as the base cipher is fairly decent.

4) What ciphers can it use?
---------------------------

Right now, it generates stream keys by seeding an internal state with the
password and some other goodies, and using MD5 to calculate the next state.
Some output information is also carried over to the next MD5 cycle, but
rest assured, the next seed is not leaked as long as MD5 works well. When
encrypting each file, the stream is then simply XORed with the data. This
is very slow and not necessarily the sexiest algorithm ever, but it should
work fairly well, and it was a quick hack.

If you feel better with an algorithm that is better known, you can either
replace encrypt_file and decrypt_file with whatever you want, or simply
encrypt the secondary file with some other algorithm prior to storing it.
Feel free to send patches and comments, of course.

5) Yeah, whatever. Does it work?
--------------------------------

Yes. Be warned that the implementation is very simplified at the moment.
While there is some memory and disk wiping, it is not guaranteed to be
robust, and you get no guarantee of data wiping if the program crashes or
is terminated with Ctrl-C. Do not rely on it too much.

The tool assumes you have a sane environment, that /dev/urandom works and
delivers some good randomness (it does not have to be perfect, just
unpredictable and with good statistical properties), etc. Linux, which uses
a good RNG seeded with external events, is just fine.

Some things should be changed - it would be good to support more files,
have some kind of nested bit-padding, make all passwords work
independently, and so on.
A better selection of file ciphers wouldn't hurt ;-)

The implementation does not support excessively big files. Only a few
hundred MB is acceptable, because file bit offset counters use 32-bit
integers, and a 32-bit counter can address at most 2^32 bits, or 512 MB.
Only x86 is supported at the moment; some changes to wipe_memory() and
types.h are necessary to support other architectures.
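
For reference, the MD5-driven stream cipher described in section 4 can be
sketched in Python like this. The exact way the state is seeded and updated
is an assumption made for the sake of the example (including the `salt`
parameter standing in for the "other goodies"), and the function names are
hypothetical; the real encrypt_file and decrypt_file may differ.

```python
import hashlib


def md5_keystream(password, salt):
    """Yield key stream bytes from an MD5-driven internal state.

    Hypothetical construction: each cycle hashes the state into an output
    block, then hashes state + output block into the next state.
    """
    state = hashlib.md5(salt + password).digest()
    while True:
        block = hashlib.md5(b"out" + state).digest()
        yield from block
        state = hashlib.md5(state + block).digest()


def xor_crypt(data, password, salt):
    """XOR `data` with the key stream; the same call encrypts and decrypts."""
    stream = md5_keystream(password, salt)
    return bytes(a ^ b for a, b in zip(data, stream))
```

Because XOR is its own inverse, calling xor_crypt() twice with the same
password and salt returns the original data. Swapping in a better known
cipher would only mean replacing these two functions.
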