[SIGCIS-Members] Question about the Tiltman break in Lorenz cypher (1941)

Sun May 23 19:38:08 PDT 2021

I wonder if the problem is in part that Copeland got the teleprinter character set wrong. Some years ago I was working from the version given in his appendix when trying to understand the quirks in the interaction of German and the teleprinter code that explained why the deltas between the bit patterns for successive characters were both much more likely to be 0 than 1. That is what underpinned Tutte's method, used by Colossus, which exploited the fact that the deltas between successive characters were not fully scrambled by the Lorenz equipment. So there was a clear statistical signature that marked the correct positions for the first two "Chi wheels." (That analysis underpinned my attempt to give an accurate but non-mathematical account of the process in https://cacm.acm.org/magazines/2017/1/211102-colossal-genius/fulltext).

I was having some difficulty in squaring tables in archival reports giving the most common two-letter sequences in the encoded messages with the bit patterns for the deltas between them. Then I realized, or my collaborator Mark Priestley pointed out, that the discrepancies were caused by errors in the teleprinter alphabet Copeland included as appendix 2 on pages 248-9. Five years on I can’t remember if there was just one error or several. But I do remember that the frequency of EI in the plaintext was very high, so I just rechecked his entries for those letters. I see that the code Copeland gives for I (oxxox) does not match the one given at https://en.wikipedia.org/wiki/Baudot_code. The other clue is that Copeland has the same bit pattern in successive lines, identified first as coding P and then as coding I. They can't both be right.

Looking at the https://billtuttememorial.org.uk/codebreaking/teleprinter-code/ page that Emmanuel mentions, I notice a similarly blatant error: the page identifies both M and N as being coded by (ooxxx), which doesn't match Copeland or Wikipedia. So it appears that both the Tutte fan site and the Copeland book present incorrect teleprinter alphabets.
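The kind of error described above (one bit pattern assigned to two letters) is easy to catch mechanically. The following is only a minimal sketch: the two table fragments contain just the entries described in this message, reconstructed from that description rather than transcribed from either source, and the function name `duplicated_patterns` is made up for the illustration.

```python
from collections import defaultdict

def duplicated_patterns(table):
    """Return {pattern: [symbols]} for every pattern assigned to more than one symbol."""
    by_pattern = defaultdict(list)
    for symbol, pattern in table.items():
        by_pattern[pattern].append(symbol)
    return {p: sorted(s) for p, s in by_pattern.items() if len(s) > 1}

# Fragments as described in this message (x = cross, o = dot).
# They are NOT full transcriptions of either published table.
copeland_fragment = {"P": "oxxox", "I": "oxxox"}
tutte_site_fragment = {"M": "ooxxx", "N": "ooxxx"}

print(duplicated_patterns(copeland_fragment))    # {'oxxox': ['I', 'P']}
print(duplicated_patterns(tutte_site_fragment))  # {'ooxxx': ['M', 'N']}
```

A valid teleprinter alphabet should give an empty result, since each of the 32 patterns can stand for only one symbol within a given shift.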

Hence Emmanuel shouldn't discount the possibility that the cyphertexts are correct but the teleprinter code to bit conversions are wrongly given in both the sources he is relying on. Having said that, given the rather low standards of proofreading we're seeing here, it wouldn't shock me if the cypher text sequences were also given wrongly in Copeland and copied from one website to another with errors included. I don't have time to check back to primary sources, but if I were doing serious work on the technicalities of this, I'd either dig into the archival sources or at least work as much as possible from Reeds, James A., Whitfield Diffie, and J. V. Field. Breaking Teleprinter Ciphers at Bletchley Park: An Edition of General Report on Tunny With Emphasis on Statistical Methods (1945). Piscataway, NJ: IEEE Press/Wiley, 2015 (https://www.amazon.com/Breaking-Teleprinter-Ciphers-Bletchley-Park/dp/0470465891). This is a heavily annotated and supplemented reprint of a formerly secret report written as the operation was being wound down at the end of the war. Chapters 41-44 give a thorough description of the original methods used prior to mechanization (though not, as far as I can see on a quick look, the full text of the messages in question).

Regarding one of Brian's comments, IIRC "addition" was simply what the BP staff called the operation we're more used to thinking of as XOR. So that might not be a source of error. They also spoke of dots and crosses rather than 0s and 1s. It’s just more natural for us today to XOR 1s and 0s than to add dots and crosses. Brian is right about the shift codes, which helped to create some of the regularities in the deltas exploited by the codebreakers. IIRC, however, the shift codes get encrypted just like the other characters, rather than being stripped out by the Lorenz equipment prior to encryption as he suggests.
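To make the equivalence concrete, here is a small sketch of dot-and-cross "addition" next to ordinary XOR. It assumes cross = 1 and dot = 0, and it reuses the S (10100) and 5 (11011) patterns from Emmanuel's example quoted further down; the helper names `add` and `to_bits` are invented for the illustration.

```python
def add(a, b):
    """Bletchley Park style 'addition' of two five-impulse characters:
    like signs give a dot ('o'), unlike signs give a cross ('x')."""
    return "".join("o" if p == q else "x" for p, q in zip(a, b))

def to_bits(pattern):
    """Read a dot/cross pattern as an integer, with cross = 1 and dot = 0."""
    return int(pattern.replace("x", "1").replace("o", "0"), 2)

S = "xoxoo"      # 10100, as in the quoted example
FIVE = "xxoxx"   # 11011, the pattern Emmanuel uses for 5
V = "oxxxx"      # 01111

total = add(S, FIVE)
print(total)                                      # oxxxx
print(total == V)                                 # True
print(to_bits(total) == to_bits(S) ^ to_bits(FIVE))  # True
```

On these values the two operations agree impulse by impulse, so under this reading the choice of name would not by itself account for a U appearing where a V is expected.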

As I wrote in the CACM article:

When setting wheels it was easy to determine statistically whether a particular combination yielded natural language or random noise. Trying every possible combination of positions across the 12 wheels was clearly impossible: the war would be over, and the Earth swallowed up by the sun, long before the job was finished. But flaws in the design of the Lorenz machine made it possible to break the job into tractable steps. Each channel was encrypted by the successive action of two code wheels. The first set, known at Bletchley Park as the chi wheels, rotated to the next position each time a character was read, whereas the second set, the psi wheels, turned only when directed to do so by two "motor wheels." Decrypting a Tunny message posed two main challenges. First to set the chi wheels, using this information to generate "dechi," text encrypted only by the psi wheels. Then to set the psi and motor wheels, using this information to generate plain text.

Because the psi wheels did not always rotate, their contribution to the cipher text often repeated from one character to the next. This, Tutte realized, gave a statistical method to set the chi wheels without making any assumptions about the psi wheels. Whether the wheels moved or not they still masked the distinctive character distributions of German text. But whenever the psi wheels did not rotate, the deltas between successive characters would pass through them unchanged.
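A toy illustration of that last point, with made-up bit patterns: when both characters receive the same psi contribution, it cancels out of the delta; when the psi wheels step between them, it does not. The function name `dechi_delta` and all the values are hypothetical.

```python
def xor(a, b):
    """Bitwise XOR of two five-bit tuples."""
    return tuple(p ^ q for p, q in zip(a, b))

def dechi_delta(plain_1, plain_2, psi_1, psi_2):
    """Delta between two successive dechi characters (plain XOR psi)."""
    return xor(xor(plain_1, psi_1), xor(plain_2, psi_2))

# Hypothetical values, chosen only for illustration.
plain_1 = (1, 0, 0, 0, 0)
plain_2 = (0, 1, 1, 0, 0)
psi_a = (1, 0, 1, 1, 0)   # psi contribution before a possible step
psi_b = (0, 0, 1, 0, 1)   # a different contribution after a step

plain_delta = xor(plain_1, plain_2)

# psi wheels stand still: same contribution on both characters
print(dechi_delta(plain_1, plain_2, psi_a, psi_a) == plain_delta)  # True
# psi wheels step: contributions differ, so the delta is disturbed
print(dechi_delta(plain_1, plain_2, psi_a, psi_b) == plain_delta)  # False
```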

Analyzing a sample of decrypted messages showed him that the distribution of deltas was far from random. For example, in German "ei" is a very common string. E is (1,0,0,0,0) and I is (0,1,1,0,0). Their delta, (1,1,1,0,0) had a frequency of 5.9% -- almost twice as common as in a random distribution. The delta between two repeated characters, (0,0,0,0,0), occurred 4.6% of the time. German has many double "s" characters, and teleprinter operators often pushed the shift and unshift keys, encoded as their own characters, twice to make sure that they were received.
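The arithmetic in that paragraph can be checked in a few lines. This sketch uses only the bit patterns and percentages quoted above; the uniform baseline of 1/32 follows from there being 32 possible five-bit deltas.

```python
E = (1, 0, 0, 0, 0)
I = (0, 1, 1, 0, 0)

def delta(first, second):
    """Bitwise difference (XOR) between two successive characters."""
    return tuple(a ^ b for a, b in zip(first, second))

print(delta(E, I))   # (1, 1, 1, 0, 0)
print(delta(E, E))   # (0, 0, 0, 0, 0), the delta of any repeated character

UNIFORM = 1 / 32     # expected share of each delta if all 32 were equally likely
print(round(0.059 / UNIFORM, 2))   # 1.89, i.e. "almost twice as common"
print(round(0.046 / UNIFORM, 2))   # 1.47
```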

When the distribution of deltas in successfully dechied text was plotted the same tell-tale "bulges" in the distributions of deltas appeared. If the psi wheels moved about half the time the peaks and valleys would be half as tall but follow the same pattern. However, if dechi was produced with the wrong wheel settings the distribution of deltas should be close to random, with all combinations occurring about 3.1% of the time.
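The "half as tall" claim can be read as a simple mixture, sketched below under a simplifying assumption that is not stated in the article: whenever the psi wheels step, the resulting delta is treated as uniformly random. The function name `observed_frequency` is invented for the illustration.

```python
UNIFORM = 1 / 32   # about 3.1%: the share of each delta in a random stream

def observed_frequency(plain_freq, psi_still_fraction=0.5):
    """Expected frequency of one delta in dechi, assuming (as a simplification)
    that a psi step leaves that delta uniformly random."""
    return psi_still_fraction * plain_freq + (1 - psi_still_fraction) * UNIFORM

for label, freq in [("delta of 'ei'", 0.059), ("repeated character", 0.046)]:
    seen = observed_frequency(freq)
    bulge_before = freq - UNIFORM
    bulge_after = seen - UNIFORM
    print(f"{label}: {seen:.4f} in dechi, bulge ratio {bulge_after / bulge_before:.2f}")
```

Under that assumption each bulge above (or dip below) the 3.1% baseline shrinks by exactly the fraction of the time the psi wheels stand still, which matches the description of peaks and valleys keeping their pattern at reduced height.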

Setting chi wheels meant generating dechi with different wheel combinations and looking for a non-random distribution of deltas. The five chi wheels could take 22 million combinations, but because each acted on only one of the five bit channels that encoded each character there was no need to consider all wheels simultaneously. The five most common deltas were (1,1,0,1,1), (1,1,1,0,0), (0,0,0,1,0), (1,1,1,1,1), and (0,0,0,0,0). In each case the first two channels were either (1,1) or (0,0) and so added to 0. Maybe your school didn't teach you that 1+1=0, but in this context "addition" meant the logical operator XOR. Tutte devised a very simple method: (a) use all 1,271 possible wheel start positions to generate dechi for channels 1 and 2, (b) for each dechi stream count the number of positions where deltas for channels 1 and 2 add to 0, (c) take the wheel settings with the highest count. Once settings for the first two wheels were found the process was repeated to identify the others. If everything went well, this would set all five wheels by processing the encrypted message about 2,400 times.
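Steps (a) to (c) can be tried out on synthetic data. What follows is a toy simulation, not a reconstruction of the Bletchley Park procedure or of Colossus: the wheel patterns are random, the psi wheels are not modelled at all, and the dechi is simply generated so that the channel 1 and 2 deltas add to 0 with a probability (0.7) exaggerated for illustration. The wheel lengths 41 and 31 are the standard figures for the first two chi wheels and give the 1,271 start positions mentioned above. All function names are invented.

```python
import random

CHI1_LEN, CHI2_LEN = 41, 31   # 41 * 31 = 1,271 start positions

def cyclic_delta(wheel):
    """Delta between each wheel position and the next, wrapping around."""
    n = len(wheel)
    return [wheel[j] ^ wheel[(j + 1) % n] for j in range(n)]

def make_toy_cipher(rng, length, bias, chi1, chi2, start1, start2):
    """Channels 1 and 2 of a toy cipher stream whose underlying dechi has
    deltas that add to 0 with probability `bias` (an artificial model)."""
    d1 = [rng.randint(0, 1) for _ in range(length)]
    s = [rng.randint(0, 1)]                 # s = channel 1 XOR channel 2
    for _ in range(length - 1):
        s.append(s[-1] ^ (0 if rng.random() < bias else 1))
    d2 = [a ^ b for a, b in zip(d1, s)]
    z1 = [d1[i] ^ chi1[(start1 + i) % CHI1_LEN] for i in range(length)]
    z2 = [d2[i] ^ chi2[(start2 + i) % CHI2_LEN] for i in range(length)]
    return z1, z2

def set_chi_1_and_2(z1, z2, chi1, chi2):
    """Score every start position pair; return (best_count, start1, start2)."""
    zsum = [a ^ b for a, b in zip(z1, z2)]
    dz = [zsum[i] ^ zsum[i + 1] for i in range(len(zsum) - 1)]
    dc1, dc2 = cyclic_delta(chi1), cyclic_delta(chi2)
    best = (-1, None, None)
    for s1 in range(CHI1_LEN):
        for s2 in range(CHI2_LEN):
            # Count positions where the dechi deltas on channels 1 and 2 add to 0.
            zeros = sum(
                1 for i, bit in enumerate(dz)
                if bit ^ dc1[(s1 + i) % CHI1_LEN] ^ dc2[(s2 + i) % CHI2_LEN] == 0
            )
            if zeros > best[0]:
                best = (zeros, s1, s2)
    return best

def demo(seed=1941, length=2000, bias=0.7):
    """Build a toy message with hidden start positions, then recover them."""
    rng = random.Random(seed)
    chi1 = [rng.randint(0, 1) for _ in range(CHI1_LEN)]
    chi2 = [rng.randint(0, 1) for _ in range(CHI2_LEN)]
    true_start = (rng.randrange(CHI1_LEN), rng.randrange(CHI2_LEN))
    z1, z2 = make_toy_cipher(rng, length, bias, chi1, chi2, *true_start)
    count, s1, s2 = set_chi_1_and_2(z1, z2, chi1, chi2)
    return true_start, (s1, s2), count

if __name__ == "__main__":
    true_start, found, count = demo()
    print("true start:", true_start, "found:", found, "zero count:", count)
```

With a bias this strong the correct pair stands out clearly; the real signal was far weaker, which is why long messages and careful counting mattered.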

Best wishes,

Tom

-----Original Message-----
From: Members <members-bounces at lists.sigcis.org> On Behalf Of Brian E Carpenter
Sent: Sunday, May 23, 2021 6:52 PM
To: E. Lazard <Emmanuel.Lazard at dauphine.psl.eu>; members at sigcis.org
Subject: Re: [SIGCIS-Members] Question about the Tiltman break in Lorenz cypher (1941)

Emmanuel,

On 24-May-21 09:47, E. Lazard wrote:

> Dear all,

> 

> I’m looking for some original information on the famous "Tiltman break" which led to the cryptanalysis of the Lorenz cipher in 1941.

> (https://en.wikipedia.org/wiki/Cryptanalysis_of_the_Lorenz_cipher)

> (http://www.eprg.org/computerphile/lorenz-combined.pdf)

> 

> The story is: the British intercepted two messages sent with the same key (HQIBPEXEZMUG), a situation known as a "depth".

> When adding the two cipher texts with the exclusive-or function, the key cancels out and what is left is the exclusive-or of the two plain texts.

> From there, Brigadier John Tiltman recovered the two messages by trying various likely pieces of plaintext. He found that the first message started with the word SPRUCHNUMMER (message number) and that the second message used the same word but shortened to SPRUCHNR.

> 

> EVERY SINGLE WEBSITE and the Copeland book "Colossus" list the two intercepted cypher texts as:

> 

> C1 = JSH5N ZYMFS 01151 VKU1Y U4NCE JEGPB

> C2 = JSH5N ZYZY5 GLFRG XO5SQ 5DA1J JHD5O

> 

> and their exclusive-or as:

> 

> D  = ///// //FOU GF14M AQSG5 SEKZR /YWHE

Indeed, S and 5 combine to give V according to https://billtuttememorial.org.uk/codebreaking/teleprinter-code/

> My problem is that IT DOES NOT ADD UP!

> The U in 10th position is not the correct result, it should be a V.

> (S is 10100, 5 is 11011, so their exclusive-or is 01111 which is V)

However, there are two problems with your comment:

1) The code for 5 is actually 11110, the same as the code for T, but in figure shift. I assume that the Lorenz system removed the figure shift and letter shift codes before starting the crypto work.

11011 is indeed the figure shift code, so the actual bit stream would have contained 11011 11110 11111.

2) The appropriate operation is not XOR. It's what Bletchley Park called "addition", as described at the above web site. While that doesn't explain this U/V error, which I suppose started as a transcription error, it probably explains the other errors you mention.

Regards

   Brian Carpenter

> And I found other issues with all examples using the cypher text, the messages, the key… I always have several letters which are wrong.

> 

> So I’m wondering whether I’ve misunderstood something, or whether the cypher texts were written down incorrectly once and everybody has simply copied them without checking.

> 

> Does anybody have genuine information, or can anyone point me to a source?

> 

> Regards

> Emmanuel Lazard

> 

> 

> 

> _______________________________________________

> This email is relayed from members at sigcis.org, the email discussion 

> list of SHOT SIGCIS. Opinions expressed here are those of the member 

> posting and are not reviewed, edited, or endorsed by SIGCIS. The list 

> archives are at http://lists.sigcis.org/pipermail/members-sigcis.org/

> and you can change your subscription options at

> http://lists.sigcis.org/listinfo.cgi/members-sigcis.org

> 

