-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path1984-08-01_ken-thompson_reflections-on-trusting-trust.html
387 lines (383 loc) · 16.8 KB
/
1984-08-01_ken-thompson_reflections-on-trusting-trust.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 2.0.20">
<title>Reflections on Trusting Trust</title>
<link rel="stylesheet" href="../css/paq-dark.css">
</head>
<body class="article">
<div id="header">
<h1>Reflections on Trusting Trust</h1>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>TURING AWARD LECTURE</p>
</div>
<div class="paragraph">
<p>(original: <a href="https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf" class="bare">https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf</a>)</p>
</div>
<div class="paragraph">
<p><em>Communications of the ACM, August 1984, Volume 27, Number 8</em></p>
</div>
<div class="quoteblock epigraph">
<blockquote>
To what extent should one trust a statement that a program is free of Trojan horses? Perhaps it is
more important to trust the people who wrote the software.
</blockquote>
</div>
<hr>
<div class="paragraph text-center">
<p>KEN THOMPSON</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_introduction">INTRODUCTION</h2>
<div class="sectionbody">
<div class="paragraph">
<p>I thank the ACM for this award. I can’t help but feel that I am receiving this honor for timing and
serendipity as much as technical merit. UNIX<sup class="footnote">[<a id="_footnoteref_1" class="footnote" href="#_footnotedef_1" title="View footnote.">1</a>]</sup> swept into popularity with an industry-wide change from central main- frames to
autonomous minis. I suspect that Daniel Bobrow<sup class="footnote">[<a id="_footnoteref_2" class="footnote" href="#_footnotedef_2" title="View footnote.">2</a>]</sup>
would be here instead of me if he could not afford a PDP-10 and had had to “settle” for a PDP-11.
Moreover, the current state of UNIX is the result of the labors of a large number of people.</p>
</div>
<div class="paragraph">
<p>There is an old adage, “Dance with the one that brought you,” which means that I should talk about
UNIX. I have not worked on mainstream UNIX in many years, yet I continue to get undeserved credit
for the work of others. Therefore, I am not going to talk about UNIX, but I want to thank everyone
who has contributed.</p>
</div>
<div class="paragraph">
<p>That brings me to Dennis Ritchie. Our collaboration has been a thing of beauty. In the ten years
that we have worked together, I can recall only one case of miscoordination of work. On that
occasion, I discovered that we both had written the same 20-line assembly language program. I
compared the sources and was astounded to find that they matched character-for-character. The result
of our work together has been far greater than the work that we each contributed.</p>
</div>
<div class="paragraph">
<p>I am a programmer. On my 1040 form, that is what I put down as m y occupation. As a programmer, I
write programs. I would like to present to you the cutest program I ever wrote. I will do this in
three stages and try to bring it together at the end.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_stage_i">STAGE I</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In college, before video games, we would amuse ourselves by posing programming exercises. One of the
favorites was to write the shortest self-reproducing program. Since this is an exercise divorced
from reality, the usual vehicle was FORTRAN. Actually, FORTRAN was the language of choice for the
same reason that three-legged races are popular.</p>
</div>
<div class="paragraph">
<p>More precisely stated, the problem is to write a source program that, when compiled and executed,
will produce as output an exact copy of its source. If you have never done this, I urge you to try
it on your own. The discovery of how to do it is a revelation that far surpasses any benefit
obtained by being told how to do it. The part about “shortest” was just an incentive to
demonstrate skill and determine a winner.</p>
</div>
<div class="paragraph">
<p>Figure 1 shows a self-reproducing program in the C<sup class="footnote">[<a id="_footnoteref_3" class="footnote" href="#_footnotedef_3" title="View footnote.">3</a>]</sup>
programming language. (The purist will note that the program is not precisely a self-reproducing
program, but will produce a self-reproducing program.) This entry is much too large to win a prize,
but it demonstrates the technique and has two important properties that I need to complete my
story: 1) This program can be easily written by another program. 2) This program can contain an
arbitrary amount of excess baggage that will be reproduced along with the main algorithm. In the
example, even the comment is reproduced.</p>
</div>
<div class="listingblock">
<div class="title">FIGURE 1.</div>
<div class="content">
<pre>char s[ ] = {
'\t',
'0',
'\n',
'}',
';',
'\n',
'\n',
'/',
'*',
'\n',
(213 lines deleted)
0
};
/*
* The string s is a
* representation of the body
* of this program from '0'
* to the end.
*/
main( )
{
int i;
printf("char\ts[ ] = {\n");
for(i=0; s[i]; i++)
printf("\t%d,\n", s[i]);
printf("%s", s);
}
+--------------------------------------------------------------------------------------+
| Here are some simple transliterations to allow a non-C programmer to read this code. |
| = assignment |
| == equal to .EQ. |
| != not equal to .NE. |
| ++ increment |
| 'x' single character constant |
| "xxx" multiple character string |
| %d format to convert to decimal |
| %s format to convert to string |
| \t tab character |
| \n newline character |
+--------------------------------------------------------------------------------------+</pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_stage_ii">STAGE II</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The C compiler is written in C. What I am about to describe is one of many “chicken and egg”
problems that arise when compilers are written in their own language. In this case, I will use a
specific example from the C compiler.</p>
</div>
<div class="paragraph">
<p>C allows a string construct to specify an initialized character array. The individual characters in
the string can be escaped to represent unprintable characters. For example,</p>
</div>
<div class="listingblock">
<div class="content">
<pre>"Hello world\n"</pre>
</div>
</div>
<div class="paragraph">
<p>represents a string with the character “<em>\n</em>,” representing the new line character.</p>
</div>
<div class="paragraph">
<p>Figure 2.1 is an idealization of the code in the C compiler that interprets the character escape
sequence. This is an amazing piece of code. It “knows” in a completely portable way what character
code is compiled for a new line in any character set. The act of knowing then allows it to recompile
itself, thus perpetuating the knowledge.</p>
</div>
<div class="paragraph">
<p>Suppose we wish to alter the C compiler to include the sequence “<em>\v</em>” to represent the vertical tab
character. The extension to Figure 2.1 is obvious and is presented in Figure 2.2. We then recompile
the C compiler, but we get a diagnostic. Obviously, since the binary version of the compiler does
not know about “<em>\v</em>,” the source is not legal C. We must “train” the compiler. After it “knows”
what “<em>\v</em>” means, then our new change will become legal C. We look up on an ASCII chart that a
vertical tab is decimal 11. We alter our source to look like Figure 2.3. Now the old compiler
accepts the new source. We install the resulting binary as the new official C compiler and now we
can write the portable version the way we had it in Figure 2.2.</p>
</div>
<div class="listingblock">
<div class="title">FIGURE 2.2.</div>
<div class="content">
<pre> ...
c = next( );
if(c != '\\')
return(c);
c = next( );
if(c == '\\')
return('\\');
if(c == 'n')
return('\n');
...</pre>
</div>
</div>
<div class="listingblock">
<div class="title">FIGURE 2.1.</div>
<div class="content">
<pre> ...
c = next( );
if(c != '\\')
return(c);
c = next( );
if(c == '\\')
return('\\');
if(c == 'n')
return('\n');
if(c == 'v')
return('\v');
...</pre>
</div>
</div>
<div class="listingblock">
<div class="title">FIGURE 2.3.</div>
<div class="content">
<pre> ...
c = next( );
if(c != '\\')
return(c);
c = next( );
if(c == '\\')
return('\\');
if(c == 'n')
return('\n');
if(c == 'v')
return(11);
...</pre>
</div>
</div>
<div class="paragraph">
<p>This is a deep concept. It is as close to a “learning” program as I have seen. You simply tell it
once, then you can use this self-referencing definition.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_stage_iii">STAGE III</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Again, in the C compiler, Figure 3.1 represents the high level control of the C compiler where the
routine “compile” is called to compile the next line of source. Figure 3.2 shows a simple
modification to the compiler that will deliberately miscompile source whenever a particular pattern
is matched. If this were not deliberate, it would be called a compiler “bug.” Since it is
deliberate, it should be called a “Trojan horse.”</p>
</div>
<div class="paragraph">
<p>The actual bug I planted in the compiler would match code in the UNIX “login” command. The
replacement code would miscompile the login command so that it would accept either the intended
encrypted password or a particular known password. Thus if this code were installed in binary and
the binary were used to compile the login command, I could log into that system as any user.</p>
</div>
<div class="paragraph">
<p>Such blatant code would not go undetected for long. Even the most casual perusal of the source of
the C compiler would raise suspicions.</p>
</div>
<div class="paragraph">
<p>The final step is represented in Figure 3.3. This simply adds a second Trojan horse to the one that
already exists. The second pattern is aimed at the C compiler. The replacement code is a Stage I
self-reproducing program that inserts both Trojan horses into the compiler. This requires a learning
phase as in the Stage II example. First we compile the modified source with the normal C compiler to
produce a bugged binary. We install this binary as the official C. We can now remove the bugs from
the source of the compiler and the new binary will reinsert the bugs whenever it is compiled. Of
course, the login command will remain bugged with no trace in source anywhere.</p>
</div>
<div class="listingblock">
<div class="title">FIGURE 3.1.</div>
<div class="content">
<pre> compile(s)
char *s;
{
...
}</pre>
</div>
</div>
<div class="listingblock">
<div class="title">FIGURE 3.2.</div>
<div class="content">
<pre> compile(s)
char *s;
{
if(match(s, "pattern")) {
compile("bug");
return;
}
...
}</pre>
</div>
</div>
<div class="listingblock">
<div class="title">FIGURE 3.3.</div>
<div class="content">
<pre> compile(s)
char *s;
{
if(match(s, "pattern 1")) {
compile("bug 1");
return;
}
if(match(s, "pattern 2")) {
compile("bug 2");
return;
}
...
}</pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_moral">MORAL</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The moral is obvious. You can’t trust code that you did not totally create yourself. (Especially
code from companies that employ people like me.) No amount of source-level verification or scrutiny
will protect you from using untrusted code. In demonstrating the possibility of this kind of attack,
I picked on the C compiler. I could have picked on any program-handling program such as an
assembler, a loader, or even hardware microcode. As the level of program gets lower, these bugs will
be harder and harder to detect. A well-installed microcode bug will be almost impossible to detect.</p>
</div>
<div class="paragraph">
<p>After trying to convince you that I cannot be trusted, I wish to moralize. I would like to criticize
the press in its handling of the “hackers,” the 414 gang, the Dalton gang, etc. The acts performed
by these kids are vandalism at best and probably trespass and theft at worst. It is only the
inadequacy of the criminal code that saves the hackers from very serious prosecution. The companies
that are vulnerable to this activity, (and most large companies are very vulnerable) are pressing
hard to update the criminal code. Unauthorized access to computer systems is already a serious crime
in a few states and is currently being addressed in many more state legislatures as well as
Congress.</p>
</div>
<div class="paragraph">
<p>There is an explosive situation brewing. On the one hand, the press, television, and movies make
heros of vandals by calling them whiz kids. On the other hand, the acts performed by these kids will
soon be punishable by years in prison.</p>
</div>
<div class="paragraph">
<p>I have watched kids testifying before Congress. It is clear that they are completely unaware of the
seriousness of theft acts. There is obviously a cultural gap. The act of breaking into a computer
system has to have the same social stigma as breaking into a neighbor’s house. It should not matter
that the neighbor’s door is unlocked. The press must learn that misguided use of a computer is no
more amazing than drunk driving of an automobile.</p>
</div>
<div class="paragraph">
<p><strong><em>Acknowledgment</em></strong>. I first read of the possibility of such a Trojan horse in an Air Force
critique<sup class="footnote">[<a id="_footnoteref_4" class="footnote" href="#_footnotedef_4" title="View footnote.">4</a>]</sup> of the security of an early implementation of
Multics. I cannot find a more specific reference to this document. I would appreciate it if anyone
who can supply this reference would let me know.</p>
</div>
<div class="paragraph">
<p><span class="small">Author’s Present Address: Ken Thompson, AT&TBell Laboratories, Room 2C-519, 600 Mountain
Ave., Murray Hill, NJ 07974.</span></p>
</div>
<div class="paragraph">
<p><span class="small">Permission to copy without fee all or part of this material is granted provided that the
copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the
title of the publication and its date appear, and notice is given that copying is by permission of
the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or
specific permission.</span></p>
</div>
<div class="paragraph">
<p>© 1984 0001-0782/84/0800—​0761 75¢</p>
</div>
</div>
</div>
</div>
<div id="footnotes">
<hr>
<div class="footnote" id="_footnotedef_1">
<a href="#_footnoteref_1">1</a>. UNIX is a trademark of AT&T Bell Laboratories.
</div>
<div class="footnote" id="_footnotedef_2">
<a href="#_footnoteref_2">2</a>. Bobrow, D.G., Burchfiel, J.D., Murphy, D.L., and Tomlinson, R.S. TENEX, a paged time-sharing system for the PDP-10. <em>Commun. ACM 15</em>, 3 (Mar. 1972), 135-143.
</div>
<div class="footnote" id="_footnotedef_3">
<a href="#_footnoteref_3">3</a>. Kernighan, B.W., and Ritchie, D.M. <em>The C Programming Language</em>. Prentice-Hall, Englewood Cliffs, N.J., 1978.
</div>
<div class="footnote" id="_footnotedef_4">
<a href="#_footnoteref_4">4</a>. Unknown Air Force Document.
</div>
</div>
<div id="footer">
<div id="footer-text">
Last updated 2024-07-10 23:33:30 +0300
</div>
</div>
</body>
</html>