[Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases

R. A. Hettinga mac_crypto@vmeng.com
Tue, 6 Apr 2004 11:43:21 -0400

--- begin forwarded text

Date: Tue, 6 Apr 2004 10:33:54 +0100
To: "R. A. Hettinga" <rah@shipwright.com>
Cc: cryptography@metzdowd.com
Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to
authenticate software releases
User-Agent: Mutt/1.3.28i
From: Zefram <zefram@fysh.org>

R. A. Hettinga wrote:
>       In practice you'll probably find something that you can alter in
>the last few hundred KB but still the raw processing cost will be a few
>orders of magnitude harder than a simple hash collision problem.

This disucssion suggests a simple countermeasure: put something at the
beginning of the package that depends on all the significant content
of the package.  For example, the first file in the archive could be
a list of digests of individual files.  This means an attacker has to
either process the entire archive in searching for a collision or find
tweakable space not covered by the leading checksum file.

Having such dead space (not trivially checksummed) is quite likely in
an uncompressed archive.  For example, tar pads each file to an integral
multiple of its block size.  However, compression of the entire archive
will make such space more difficult to arrange.

A more extreme approach that removes that risk entirely is to take
the digest of two concatenated copies of the package.  Precalculation
can still be used to speed up the search for a collision, but the
unoptimisable tail is now guaranteed to be at least as long as the
entire package.  I see a couple of potentially useful variations on this:

	0. publish Digest(Digest(Archive) || Archive)
		(similar to Digest(Archive || Archive))

	1. publish Digest(Archive || Tail) where Tail is large and public
		(increases difficulty by a fixed amount that depends on
		the length of Tail -- suitable for small packages)

Are these approaches sound?  I'm a crypto-plumber, not a cryptographer.

I'm wondering whether we can define a new type of cryptographic primitive
here: the non-precomputable message digest.  The interesting feature of
an NPCMD would be that computing the digests of two related messages
cannot be optimised to take any less time than the computation of the
digests of two unrelated messages of the same sizes.  The suggestions
I made above are clearly not NPCMDs, but if one does

	M[0] = M   ; original message
	M[i+1] = Digest(M[i]) || M[i]
	D = Digest(M[n])   ; final digest

with Digest being a good conventional message digest, then this appears to
approach a NPCMD as n approaches infinity.  Of course, computation time
for this construction is linear in n (for a particular message size),
so this is not a practical way to achieve this goal.  Can a NPCMD be
done in a more reasonable time?  I imagine structures that might lead
to O(l*log(l)) time, where l is the message length, but that's just
my speculation.


--- end forwarded text

R. A. Hettinga <mailto: rah@ibuc.com>
The Internet Bearer Underwriting Corporation <http://www.ibuc.com/>
44 Farquhar Street, Boston, MA 02131 USA
"... however it may deserve respect for its usefulness and antiquity,
[predicting the end of the world] has not been found agreeable to
experience." -- Edward Gibbon, 'Decline and Fall of the Roman Empire'