Redesign of Tox’s Cryptographic Handshake

In 2017, Jason A. Donenfeld (known for WireGuard®) reported an issue in Tox’s handshake [1]. This issue is called “Key Compromise Impersonation” (KCI). I will try to explain the issue as simple as possible:

In Tox you don’t register an account (e.g. with username and password), but instead your identity is solely based on (asymmetric) cryptographic information, a so-called asymmetric key pair. Such a key pair consists of a public part (public key) and a private part (private key). The public part, as the naming suggests, is public and contained in your ToxID which you share with your contacts to be able to communicate with them via Tox. The private part, again as the name suggests, needs to stay private! If someone gets in possession of your private key, they stole your Tox identity. This could, for example, be the case if someone got physical access to your computer or successfully installed malware on your system, e.g. a so-called trojan horse, to be able to extract data from it. If this happens, you will most likely have multiple problems and your Tox identity may be just one of them. The password you enter when you create your Tox profile, e.g. when you first start qTox client, is used to encrypt your profile and also your private key on your disk. If you start qTox, you need to enter your password to decrypt your private key, to be able to communicate via Tox. Your private key is then stored unencrypted in memory (i.e. RAM) while qTox is running. This means an attacker either needs to get access to your password (steal or crack it) or to read your Tox private key from memory while your Tox chat client is running.

If someone successfully stole your Tox identity (i.e. this private key), they are you – at least in the context of Tox. So they can successfully impersonate you in Tox. Now in this case the KCI vulnerability leads to “interesting” behavior. It is clear that someone who stole your identity is able to impersonate you. But because of the KCI vulnerability, they may also be able to impersonate others to you. This means, to exploit this vulnerability in practice, someone not only needs to successfully steal your private key, but additionally:

Know the ToxIDs of your Tox friends to be able to impersonate them to you.
Control the network connection between you and your friend. This could be the case e.g. if they are in the same (public) WiFi as you, or via the Internet – which is way harder and is most likely only possible for state actors (e.g. the NSA).
Implement their own version of toxcore because it’s not possible to exploit this issue with the current implementation. There is no public exploit available which can just be used.

In summary, KCI is exploitable, but with a huge effort.

Anyway, this is a real vulnerability and it should be fixed. The current Tox handshake implementation is not state-of-the-art in cryptography and it also breaks the “do not roll your own crypto” principle. As a solution, there is a framework called Noise Protocol Framework (Noise, [2]) which can be used to create a new handshake for Tox. More precisely, the application of Noise will only change a part of Tox handshake — the so-called Authenticated Key Exchange (AKE). Noise-based protocols are already in use in e.g. WhatsApp, which uses it for encrypted client-to-server communication, and WireGuard®, which uses it for establishing Virtual Private Network (VPN) connections. Noise protocols can be used to implement End-to-End Encryption (E2EE) with (perfect) forward secrecy (which is also the case with the current Tox implementation), but further adds KCI-resilience to Tox.

Tobi (goldroom on GitHub) wrote his master’s thesis (“Adopting the Noise Key Exchange in Tox“) on the KCI issue in Tox, designed a new Handshake for Tox based on NoiseIK and implemented a proof-of-concept (PoC) for this new NoiseIK-based handshake by using Noise-C [3]. This PoC has a few drawbacks, which is why it should not be used in practice (see Appendix). If you want to know more about his master’s thesis, see the update in the initial KCI GitHub issue [4].

He applied for funding at NLnet foundation and their NGI Assure fund to continue his work on Tox and to be able to implement a production-ready Noise-based handshake for toxcore. Fortunately, this application was successful [5]. NGI Assure is made possible with financial support from the European Commission’s Next Generation Internet programme (https://ngi.eu/).

The objective of this project is to implement a new KCI-resistant handshake based on NoiseIK in c-toxcore, which is backwards compatible to the current KCI-vulnerable handshake to enable interoperability and smooth transition. The main part of this project is to implement NoiseIK directly in c-toxcore to remove Noise-C as a dependency (as the only other dependency for c-toxcore is NaCl/libsodium) which was used in the PoC and therefore improve maintainability of c-toxcore (see Appendix).

The tasks in this project are:

Implementation of a Noise-based AKE for the use in Tox’s handshake in c-toxcore
- This task is to implement and test an AKE for the Tox handshake based on a Noise protocol (most likely Noise_IK_25519_ChaChaPoly_SHA512, but it may change due to new insights in c-toxcore).
Implementation of a symmetric transport phase encryption based on a Noise-based AKE/handshake
- This task includes the implementation of a symmetric transport phase encryption based on the secret key(s) generated during the Noise-based AKE during the Tox handshake and evaluation of Noise’s rekey feature.
- Subtasks:
  - Decision of which symmetric cipher to use for the Tox transport phase encryption (e.g. ChaCha20-Poly1305 or XSalsa20-Poly1305)
  - Evaluation of Noise’s rekey feature to allow for session rekeying to reduce the volume of data encrypted under a single cipher key (cf. section 11.3 of Noise specification, revision 34). This may not be applicable to be implemented in c-toxcore (e.g. changing keys may be expensive). Also, it may not be necessary for ChaCha20 / XSalsa20. Further, it’s dependent on how the transport phase encryption will be implemented.
Support for the Noise-based AKE and the KCI-vulnerable AKE for backwards compatibility
- This task includes the implementation of a mechanism to fall back to the KCI-vulnerable handshake if one of both peers uses a legacy c-toxcore version to provide backwards compatibility for the handshake transition phase (e.g. via cookie phase (request and response) of Tox’s handshake).
Error handling and testing, Documentation, Blog posts

The plan is to implement this new handshake until July 2023. Since it’s not a trivial task, there are still some obstacles:

In Noise it is necessary to differentiate between the initiator and responder of a handshake. Due to the architecture of Tox it is possible that both peers initiate and respond to a handshake at the same time.
Tox is P2P and UDP-based. Therefore packets can be received out-of-order or be lost altogether. In the Noise specification this is only considered for transport messages (cf. [6]).
- “Note that lossy and out-of-order message delivery introduces many other concerns (including out-of-order handshake messages and denial of service risks) which are outside the scope of this document.” (cf. [6])

Both points are not ideal for a handshake based on NoiseIK (i.e. it would be way easier to implement it in a client-server model using TCP), but it should be possible to work this out.

Tobi is available in #toktok (libera.chat) as tobi/@tobi_fh:matrix.org and ready for any input, questions, remarks, discussions or complaints.

Appendix

The PoC shouldn’t be used in practice/in production because it should be improved in the following aspects (for details see chapter five of Tobi’s thesis [4]):

The PoC was implemented by using the Noise-C library [3]. Instead of using the library, NoiseIK or more specifically the Noise_IK_25519_ChaChaPoly_SHA512 protocol will be implemented directly in c-toxcore. This will remove Noise-C as a dependency for toxcore (i.e. the only other dependency is NaCl/libsodium) and therefore improve maintainability. Additionally this will reduce the number of possibly vulnerable source lines of code.
- Notes on maintainability: Noise-C has a lot of code/functionality which is not necessary for c-toxcore. Also Noise-C is currently (?) not actively maintained. If NoiseIK is implemented directly in c-toxcore it is only necessary to maintain the c-toxcore codebase and it is not necessary to care about Noise-C.
The PoC implementation uses the ChaCha20-Poly1305 AEAD cipher during the AKE/handshake and XSalsa20-Poly1305 during the transport phase. In this project it should be evaluated if it’s possible to also use ChaCha20 during the transport phase to only use one cipher instead of two different ones (XSalsa20 is not supported by the Noise framework). For details see chapter four of Tobi’s thesis.
Further testing of NoiseIK handshake behavior and improved error handling.
The PoC implementation is not backwards compatible with the current KCI-vulnerable Tox handshake. In this project a mechanism should be added to enable interoperability with clients based on an old c-toxcore version.

References:

[1] Tox Handshake Vulnerable to KCI #426: https://github.com/TokTok/c-toxcore/issues/426
[2] Noise Protocol Framework: https://noiseprotocol.org/
[3] Noise-C Library: https://github.com/rweather/noise-c
- https://rweather.github.io/noise-c/index.html
[4] More information on Tobi/goldroom’s master’s thesis: https://github.com/TokTok/c-toxcore/issues/426#issuecomment-759511593
[5] “Adopting the Noise Key Exchange in Tox” project description/information: https://nlnet.nl/project/Noise-Tox/
[6] Noise specification, 11.4. Out-of-order transport messages: https://noiseprotocol.org/noise.html#out-of-order-transport-messages

“WireGuard” is a registered trademark of Jason A. Donenfeld.