Terrapin is an attack against ssh's binary packet protocol. It is a prefix truncation attack; that is, it permits deleting some initial portion of the supposedly-protected data. It posits an attacker with full control over the octet stream between the peers, able to inspect, delay, delete, modify, and insert octets at will. This is the strongest type of attacker normally considered as a crypto opponent, but it is also the kind of attacker ssh is supposed to be able to defeat.

To understand Terrapin, we need a little background. ssh is built on top of a so-called binary packet protocol, BPP. The BPP starts with some cleartext message exchanges, during which algorithm negotiation occurs and a shared secret is generated using Diffie-Hellman (or something philosophically similar; the BPP does not actually require that Diffie-Hellman be used). Each side then computes bulk encryption keys based on the shared secret, the info that went into the algorithm negotiation, and the Diffie-Hellman inputs. This provides protection against an attacker meddling with (for example) the algorithm lists, because any such meddling will mean that the data one peer uses will differ from the data the other peer uses, causing them to derive different encryption keys, leading to connection failure. This initial exchange is called `key exchange', abbreviated `kex'.

The data stream in each direction is broken into variable-sized messages. Each message has a sequence number, starting from zero; these are implicit (that is, they do not appear on the wire anywhere). They count from the very start of the connection, cleartext messages included, and are not reset when encryption begins. Once encryption starts, ssh provides integrity protection by applying a MAC to each message; this MAC covers not only the explicit message contents but also the sequence number.
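To make the role of the implicit sequence number concrete, here is a minimal sketch of a per-message MAC computed over a 32-bit sequence number followed by the message. The choice of HMAC-SHA-256, the fixed key, and the helper names packet_mac and verify are assumptions for illustration only; a real implementation uses whatever MAC and key came out of kex.

```python
import hashlib
import hmac
import struct


def packet_mac(key: bytes, seq: int, packet: bytes) -> bytes:
    """MAC over the implicit 32-bit sequence number followed by the packet."""
    seq_bytes = struct.pack(">I", seq & 0xFFFFFFFF)  # big-endian uint32
    return hmac.new(key, seq_bytes + packet, hashlib.sha256).digest()


def verify(key: bytes, seq: int, packet: bytes, tag: bytes) -> bool:
    """Check a tag against the receiver's own idea of the sequence number."""
    return hmac.compare_digest(packet_mac(key, seq, packet), tag)


key = b"\x01" * 32           # stand-in for a negotiated integrity key (assumption)
packet = b"example payload"  # stand-in for one message
tag = packet_mac(key, 5, packet)

print(verify(key, 5, packet, tag))  # receiver's counter agrees with the sender's
print(verify(key, 6, packet, tag))  # receiver's counter is off by one
```

The sequence number never travels with the message; each side supplies its own count. A message therefore verifies only if both sides agree on how many messages came before it.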
Terrapin operates by inserting an IGNORE message into one data stream (for ease of language, I'll write as if it's always the server->client one; that one is the higher-value target) during the cleartext phase, then dropping the first message sent by the server after encryption starts. The injected IGNORE advances the client's receive counter by one and the dropped message holds it back by one, so from the second encrypted message onward the two sides' sequence numbers agree again. (It has to be the first message, since the MACs include the sequence number; thus, not dropping the first message will cause its MAC to fail with overwhelming probability.) While the Terrapin paper mentions the possibility of injecting more than one IGNORE and dropping more than one initial message, it does not describe attempting that, probably because it would not be useful against the implementations they were working with.

From a theoretical point of view, this breaks the BPP's intent to provide integrity protection, since the supposedly-protected data stream seen by one peer differs from that seen by the other, without the BPP's checks raising any alarm. For it to be exploitable in practice, though, dropping the first packet (or, more generally, the first N packets for some N) after kex has to be useful. Simply sabotaging the connection is not interesting; an attacker with the capabilities posited can do that much more easily already. The major practical use is that some implementations send an ExtInfo packet as the first packet after kex to indicate that certain extensions are supported, and some of the affected extensions are security-relevant, making this a downgrade attack.

The Terrapin paper points out two protocol changes which should be made to address these weaknesses: the BPP should reset its sequence number when encryption begins, and the data included in key generation should include the full conversation before encryption begins, not just selected values from it. The best thing, of course, would be a protocol redesign, integrating the above suggestions.
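The sequence-number bookkeeping behind the attack can be played through in a toy model. This is only a sketch under loud assumptions: the message labels are illustrative rather than a faithful handshake, HMAC-SHA-256 with a fixed key stands in for the negotiated integrity check, and encryption is left out entirely, so the model says nothing about why some cipher modes are harder to attack than others. The names Sender, Receiver, and run are hypothetical.

```python
import hashlib
import hmac
import struct

KEY = b"\x02" * 32  # stand-in for the negotiated integrity key (assumption)


def mac(seq: int, payload: bytes) -> bytes:
    return hmac.new(KEY, struct.pack(">I", seq) + payload, hashlib.sha256).digest()


class Sender:
    def __init__(self):
        self.seq = 0
        self.encrypted = False

    def send(self, payload: bytes):
        tag = mac(self.seq, payload) if self.encrypted else None
        self.seq += 1  # counts every message, cleartext ones included
        return payload, tag


class Receiver:
    def __init__(self):
        self.seq = 0
        self.encrypted = False
        self.accepted = []

    def receive(self, wire):
        payload, tag = wire
        if self.encrypted:
            if tag is None or not hmac.compare_digest(tag, mac(self.seq, payload)):
                raise ValueError("MAC failure at receive seq %d" % self.seq)
        self.seq += 1
        self.accepted.append(payload)


def run(inject: bool, drop: bool):
    """Server->client direction only; returns what the client accepted."""
    server, client = Sender(), Receiver()
    for msg in (b"KEXINIT", b"KEX_REPLY"):   # cleartext phase (labels illustrative)
        client.receive(server.send(msg))
    if inject:
        client.receive((b"IGNORE", None))    # attacker-made, never sent by server
    client.receive(server.send(b"NEWKEYS"))
    server.encrypted = client.encrypted = True
    first = server.send(b"EXT_INFO")         # first protected message
    if not drop:
        client.receive(first)                # otherwise the attacker discards it
    client.receive(server.send(b"SERVICE_ACCEPT"))
    return client.accepted


print(run(inject=False, drop=False))  # untouched connection
print(run(inject=True, drop=True))    # inject plus drop: no MAC failure
try:
    run(inject=False, drop=True)      # drop alone: counters disagree
except ValueError as exc:
    print("drop without inject:", exc)
```

In the inject-plus-drop run the client never sees EXT_INFO, yet every MAC it checks is valid. Dropping without injecting, or injecting without dropping, leaves the counters one apart and the next check fails.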
The major cost of a protocol redesign is that it would introduce a compatibility flag day, so there is a desire to look for mitigations which are compatible with the installed base. There are several.

Deleting the first message from the encrypted stream requires knowing how many octets it occupies. In most current implementations, this is fixed, or has only a few possible values, because that message is fixed or largely fixed. But one mitigation is to make that message an IGNORE, with a randomly-chosen length, forcing the attacker to guess its size. (To avoid revealing it via side-channels such as TCP segment sizes, various techniques can be used; for example, the sender could send a randomly-chosen number of randomly-sized IGNOREs which collectively add up to about 2K of data, writing the entire batch as a single TCP send.) This adds only some six to ten bits of difficulty, but reducing the attack success chance by a factor of 64 or more is worth doing. Furthermore, even if the attacker does manage to delete one or more IGNOREs successfully, it won't affect anything in practice.

Also, some crypto modes are more vulnerable than others. GCM modes, for example, are immune, because they use their own internal sequence number, which amounts to doing a sequence number reset as far as the crypto is concerned. But other modes are exceptionally vulnerable; for example, ChaCha20-Poly1305 is typically implemented in a way that gives Terrapin a 100% chance of success. (This actually is not a flaw in ChaCha20-Poly1305 per se, but rather in how it is typically integrated into the ssh crypto framework.) Thus, avoiding vulnerable crypto can serve as a mitigation.

Even simply rejecting connections that contain IGNOREs in the pre-crypto data stream can be effective.
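The random-IGNORE mitigation described above might be sketched as follows. The target of 2048 data octets, the cap of 8 messages, and the helper names ignore_payload and random_ignore_batch are all assumptions for illustration. Only the message payloads are built here; framing, padding, encryption, and the single write to the socket are left to the surrounding implementation and are not shown.

```python
import secrets
import struct

SSH_MSG_IGNORE = 2  # message number assigned in RFC 4253


def ignore_payload(data_len: int) -> bytes:
    """One IGNORE payload: message type, then a length-prefixed random string."""
    data = secrets.token_bytes(data_len)
    return bytes([SSH_MSG_IGNORE]) + struct.pack(">I", data_len) + data


def random_ignore_batch(target_total: int = 2048, max_count: int = 8):
    """Random number of IGNOREs whose data lengths sum to target_total."""
    count = 1 + secrets.randbelow(max_count)  # 1..max_count messages
    # Choose count-1 distinct cut points, splitting the total into random sizes.
    rng = secrets.SystemRandom()
    cuts = sorted(rng.sample(range(1, target_total), count - 1))
    bounds = [0] + cuts + [target_total]
    sizes = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return [ignore_payload(n) for n in sizes]


batch = random_ignore_batch()
print(len(batch), [len(p) for p in batch])
```

Because the data lengths always sum to the same total, the size of the combined write reveals little about where the first message ends. Per-message framing overhead still makes the on-wire total vary slightly with the number of messages, which is one reason the figure is only "about 2K".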