[Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases

Arnold G. Reinhold mac_crypto@vmeng.com
Sun, 4 Apr 2004 06:17:55 -0500


The cryptographic hash function MD5 has long been used to 
authenticate software packages, particularly in the Linux/Unix/open 
source community. This has carried over to Apple's OS-X. The MD5 hash 
of an entire package is calculated and its value is transmitted 
separately from the package. Users who download the package compute 
the hash of the copy they received and match that value against the 
original.

Putting aside the question of how the the hash value is safely 
transmitted, there is a potential attack on this method due to the 
128 bit length of the MD5 hash output. If all the individuals having 
input to the creation of the original software package are 
trustworthy, then 128 bits provides adequate security. Someone trying 
to substitute a version of that package containing a malicious 
modification (Trojan horse, virus,  backdoor) would have to solve a 
128 bit problem to create an infected package that passed the hash 
verification. That is considered computationally infeasible, at least 
until the advent of quantum computing.

One might think the above argument proves MD5 is sufficient. After 
all, if an attacker had an agent working inside the organization that 
produced the package, that agent could simply incorporate the 
malicious software patch in the original package. However such an 
insertion is very risky. A sophisticated software company would 
likely have code reviews that would make introduction of the 
malicious code difficult. Use of a source control system makes is 
easy to track down whoever inserted the malicious change once it is 
discovered. The malicious code would be distributed to everyone, 
increasing the likelihood of detection. In an open source model, a 
defect in source code is particularly hard to hide. The agent risks 
being uncovered and and perhaps prosecuted, the organization he works 
for risks being identified and the technical means that the malicious 
code employed would be compromised.

A safer attack would be for the agent to insert an apparently 
innocent modification to the package, with the modification selected 
so that the MD5 hash of the package with the malicious code matches 
the hash of the officially released package. Since the attacker (or 
whomever he is working for) controls the malicious code, calculating 
the value of this modification is subject to a meet-in-the-middle 
attack and presents presents a 64-bit problem. Solving such a problem 
is within the means of a well-funded attacker today* and will become 
easier in the future.

The modification could be designed to get past code reviews in a 
number of ways. For example, 64 low order bits in a JPEG icon might 
be altered. The agent would have to make the last modification to the 
software package prior to release and perhaps send a final 
pre-release version of the package to someone on the outside who does 
the collision calculation, but those are hardly insurmountable 
hurdles.  In situations where new releases are relatively frequent, 
it may suffice for this attack to succeed only occasionally, allowing 
periodic entry into selected systems to recover private keys, for 
example.  The attacker merely submits modifications late in the 
release cycle and if his happens to be last then the full attack is 
mounted.

The obvious solution to this problem is to use a wider hash for 
package authentication. For example, SHA-256 would present an 
attacker using this approach with a 128-bit problem. Even SHA1 would 
be preferable, making such an attack an 80 bit problem.  If both MD5 
and SHA1 hashes are provided, the attacker faces the problem of 
forging them both. It costs almost nothing to provide a wider hash 
along with the MD5 hash whenever a new package is released. It seems 
the prudent thing for Apple to do.

Arnold Reinhold


* From: http://www.rsasecurity.com/rsalabs/faq/3-6-6.html

"Van Oorschot and Wiener [VW94] have considered a brute-force search 
for collisions (see Question 2.1.6) in hash functions, and they 
estimate a collision search machine designed specifically for MD5 
(costing $10 million in 1994) could find a collision for MD5 in 24 
days on average. The general techniques can be applied to other hash 
functions."

VW94]
P. van Oorschot and M. Wiener, Parallel collision search with 
application to hash functions and discrete logarithms, Proceedings of 
2nd ACM Conference on Computer and Communication Security(1994).

** SHA1 is available in OS-X as part of openssl. Type "openssl  sha1 filename"