Check out the new USENIX Web site.

Home About USENIX Events Membership Publications Students
CARDIS '02 Paper    [CARDIS '02 Tech Program Index]

Pp. 59-68 of the Proceedings

Breaking the Liardet-Smart Randomized Exponentiation Algorithm

Colin D. Walter

Comodo Research Laboratory

10 Hey Street, Bradford, BD7 1DQ, UK

colin.walter@comodo.net       www.comodo.net

 

 

Abstract.  In smartcard encryption and signature applications, randomised algorithms are used to increase tamper resistance against attacks based on side channel leakage.  Recently several such algorithms have appeared which are suitable for RSA exponentiation and/or ECC point multiplication.  We show that under certain apparently reasonable hypotheses about the countermeasures in place and the attacker’s monitoring equipment, repeated use of the same secret key with the algorithm of Liardet and Smart is insecure against any side channel which leaks enough data to differentiate between the adds and doubles in a single scalar multiplication.  Thus the scalar needs to be blinded in the standard way, or some other suitable counter-measures employed, if the algorithm is to be used safely in such a context.

 

Key words: m-ary exponentiation, Liardet-Smart randomized algorithm, ECC, addition chains, sliding windows, addition-subtraction chains, power analysis, SPA, SEMA, blinding, smartcard.

 

 

 

1    Introduction

 

Major progress in the theory and practice of side channel attacks [5, 6] on embedded cryptographic systems threatens to enable the capture of secret keys from single applications of cryptographic functions [10, 11, 14].  This is particularly true for the more computationally intensive functions such as exponentiation, which is a major process in many crypto-systems such as RSA, ECC and Diffie-Hellman. 

 

Timing attacks on modular multiplication can usually be avoided easily by removing data-dependent conditional statements [16], but, with timing variations removed, attacks which make use of data-dependent variation in power and electro-magnetic radiation become easier.  Initial attacks of this type required averaging over a number of exponentiations [8]. 

 

One counter-measure is to modify the exponent from e to e+rg where r is a random number, typically 32-bits, and g is the order of the (multiplicative) group in which the exponentiation is performed [5].  This results in a different exponentiation being performed every time.  However, if squares and multiplications can be distinguished during a single exponentiation, then use of the standard binary exponentiation algorithm immediately leads to exposure of the secret key.

 

For elliptic curve cryptography (ECC), the most efficient schemes for point addition and point doubling involve different numbers of operations in the field over which the curve is defined, and these numbers vary depending on the representation used for the curve.  A counter-measure which reduces the likelihood of distinguishing between these point operations involves equalising the number and type of the component field operations [12] or making the point addition look exactly the same as two point doublings [1]. 

 

However, squares and multiplications in the field behave differently [13] and so there is no reason to believe that such recoding will necessarily hide fully the distinction between point additions and point doublings: for example, in [12], field squares appear for point additions, but field cubes when the same formula is used for point doublings.  Side channels can distinguish these if the Hamming weight of arguments can be deduced.  So exponentiation algorithms are chosen in which there is still an ambiguity in the correspondence between multiplications (i.e. point additions in ECC terms) and properties of the secret key (such as bit or digit values).  M-ary exponentiation [4] for m > 2 provides one solution because each addition represents an unknown choice from a set of several non-zero digits.

 

If the same unblinded key value is re-used for many exponentiations, there is a danger that the repeated use of the same operand can be determined [14].  This would enable individual digits of the exponent base m to be identified and hence the key recovered.  Unfortunately, particularly for ECC as opposed to RSA,

applying the above exponent blinding technique is expensive when the secret key is typically only 192 bits.  It adds about 17% to the cost of point multiplication.  Hence randomised exponentiation algorithms may be a preferred option for ECC.

 

There are currently several algorithms which randomise the operations associated with specific inputs so that the exponentiation scheme is different on successive runs with the same data [7, 9, 17, 2, 3].  That of Liardet and Smart [7] uses a sliding window of random, variable width.  If the attacker’s equipment is insufficient to obtain information from a single EC point multiplication, then it seems that averaging over different multiplications with the same key would dilute any data dependency in the side channel leakage.  However, we will show here that if individual point multiplications do leak information about what operation is being performed, then the secret key can be obtained straightforwardly.  Indeed, one might even be better off with m-ary multiplication.

 

We begin by recalling the algorithm and looking at various parameters which might be chosen to improve efficiency or security.  Next, the assumptions about the attacker’s equipment and cryptosystem counter-measures are outlined.  These are initially quite tight to make the presentation of the attack easier.  The attack starts with extracting a least significant digit, and then uses this repeatedly to reconstruct one possible representation for the secret key.  An essential part of the discussion is an assessment of the probability that the attack can be completed successfully. Before concluding with some counter-measures and alternatives, we explain how the attack can still be performed in a more realistic environment where the side channel leakage is much poorer.

 

 

2    The Algorithm

 

This section contains a brief outline of the (exponentiation) algorithm of Liardet and Smart.  Because it generates an addition-subtraction chain rather than simply an addition chain, inverses have to be computed when it is applied.  This means that applications to RSA cryptography are unlikely because of the expense of computing inverses.  However, in elliptic curve cryptography (ECC), inverses are essentially for free.  Hence, we will assume the algorithm is applied to an additive group, such as that formed by the points on an elliptic curve, and use appropriate terminology.  Processing of the secret key k therefore produces a sequence of instructions which result in additions (A) and doublings (D) of group elements.

 

Suppose we wish to compute the element Q = kP for a given positive integer k (the secret key) and a given member P of some group E.  As in m-ary exponentiation, Liardet and Smart pre-compute the odd multiples iP of P for integers i Î (–½m, ½m] where m = 2R, and then employ the standard sliding windows technique but with a window which has a random width showing up to R bits.  In other words, k is recoded to obtain digits ki (0 ≤ i n) which are determined using a randomly-chosen variable base mi which divides m.  The digits are chosen in the order k0, k1, ..., kn and the digit representation knkn1...k1k0 satisfies

k = ((...((kn)mn1+kn1)mn2+...)m1+k1)m0+k0      (1)

The group element multiplication processes these digits

from most to least significant following the related scheme defined by

kP = m0(m1(...mn2(mn1(knP)+kn1P)+...)+k1P)+k0P      (2)

 

 

2.1  Code for the Key Recoding

 

More explicitly, if minmod is the function which returns a residue of minimal absolute value, the algorithm for choosing the digits is this:

 

Randomised Signed m-ary Window Decomposition [7]:

i ← 0 ;

While k > 0 do

{       If (k mod 2) = 0 then

{       mi ← 2 ;

ki  ← 0 ;

}

else

{       Randomly choose base mi Î {21,22,..., 2R} ;

ki ← k minmod mi ;

} ;

k ← (k–ki)/mi ;

i ← i+1 ;

} ;

 

Here, both 1 and  could be allowed as digits for base 1, but that involves the added complication of a random bit to decide which to select, and also (to avoid non-termination) restricting the choice to only 1 when k reaches 1.  Our attack would work also in these circumstances with few changes.

 

 

2.2  Efficiency Considerations

 

There are still some parameters to be chosen in the algorithm.  Varying these affects efficiency, but there are also security implications.  As we see later, certain choices will increase the difficulty of mounting the attack, forcing, in particular, more samples to be required. 

 

The value for R has the greatest effect on efficiency.  In elliptic curve applications, subtraction may have the same cost as addition.  Then it will be unnecessary to store the negative pre-computed multiples of the input point.  So only space for 2R2 multiples is likely to be required.  Increasing R improves speed, but with diminishing returns for the space required for pre-computed values.

 

No suggestions are made in [7] about how to choose the mi randomly.  A uniform distribution is not very efficient, and indeed perhaps the least secure under this attack.  It is most efficient to make the maximum possible use of the pre-computed values by choosing the maximum base size 2R always.  But, to maintain generality for later, suppose mi = 2j is chosen with probability pj when k is odd and p0 = 1 is the probability of selecting base 2 when k is even. 

 

Choosing pR = 1 means that mi = 2R whenever k is odd.  This yields the usual m-ary sliding window method with fixed m = 2R.  Taking R = 1 yields the usual binary “square-and-multiply” algorithm.  However, such choices would remove any non-determinism from the sequence of point operations.

 

Observe that biasing in the choice of mi does not change the uniformity in the distribution of residues k mod mi inherited from k, assuming k is randomly chosen.  This means that every new key value k generated during the recoding retains the same random properties: in particular, residues modulo 2j will be uniform for every key encountered. 

 

 

3    The Attack

 

3.1  Introduction & Initial Hypotheses

 

The purpose of randomised exponentiation algorithms is to frustrate side channel analysis by an attacker.  In particular, they are counter-measures against using knowledge of the exponentiation process to extract the secret key k.  Several different levels of leakage are possible, depending on the resources of the attacker.  A poor signal-to-noise ratio (SNR) means that many samples have to be taken, and averaging the side channel leakage is one way of improving the SNR.  So a critical parameter is whether or not the attacker’s equipment is good enough for him to extract sufficient meaningful data from the side channel trace of a single scalar multiplication.  If it is, then the standard key blinding described earlier suddenly fails to provide the data hiding protection afforded by averaging away local data dependencies.  Improved equipment and laboratory techniques mean that this barrier might soon be breached without excessive expenditure [10, 11].

 

The categories of leakage which could be considered include the following:

      i)   individual point operations can be observed on power, EM or other side channel traces;

     ii)   point doublings and point additions can be distinguished from each other;

   iii)   re-use of operands can be observed;  and

   iv)   operand addresses can be deduced.

Point (i) may hold simply because program instructions and data need to be fetched at the start of each point operation, and these cause different effects on the side channels than field operations.  Point (ii) may then hold as a result of different patterns of field operations for point additions than for point multiplications.  Properties (iii) and (iv) might hold as a result of being able to deduce Hamming weights of data and address words travelling along the bus.

 

Randomisation prevents the obvious averaging of the traces of many point multiplications which was used in initial power analysis attacks on the binary “square-and-multiply” algorithm.  Here every point multiplic-ation determines a different sequence of doublings and additions.  With matched code for additions and doublings, averaging may hide the difference between the two operations because they are no longer separated in time, but in current implementations such averaging will certainly reveal the start and end of the individual point operations which make up the scalar multiplication.

 

The attack described here requires the SNR to be good enough to extract some useful data from single multiplications on the curve.  Specifically, initially we assume that

·         Adds and doublings can always be identified correctly and distinguished from each other using traces obtained from side channel leakage for a single point multiplication, and

·         A number of traces are available corresponding to the same secret key value applied to different scalar multiplications.

Both of these hypotheses will be relaxed later to some extent, providing a more realistic scenario.

 

 

3.2  Overview of the Attack & Notation

 

The outline of the attack is as follows.  For simplicity, by the first hypothesis,

·         Every trace can be viewed as a word over the alphabet {A,D}.

Every occurrence of an A (add) in the trace splits the word into a prefix and a suffix which correspond to two integers kp and ks that are precisely defined in terms of k and the position of A in the trace.  All traces determine the same values to within ±1.  By looking at the patterns to the left of a given A, one obtains the residue of kp modulo a small power of 2, and hence a few bits of k.  Repeating this for the position of each A enables all the bits of k to be recovered.

 

Definition.  The position of a specific instance of character A or D in a trace word is the number of Ds which are to the right of the selected character.

 

We will exploit a close relationship between positions in which A appears in traces and bits which are 1 in the corresponding position of the binary representation of k.

 

In order to be able to give examples, we fix the character order of words over {A,D} to correspond to a left-to-right processing order.  Thus, the digit sequence 1202404, with most significant bit on the left, is processed from most to least significant bit, i.e. from left to right, and so would result in the word DADDDADD.  There are As in positions 2 and 5, and Ds in positions 0 to 5.  In fact, every A is paired with a preceding D with the same position, and so one could view the DA combination as a single character.  Then the position would correspond directly to a character index, counting from 0 at the right hand end, as in a binary representation.

 

The initial DA corresponds to the digit 12 and might be omitted if efficient initialisation takes place instead.  Assuming this is the case, we will delete any initial Ds but leave the initial A as an unambiguous reminder that there is an initial digit to take into account.  Thus words always commence with A.  In the above example, ADDDADD will be the word corresponding to the given digit sequence.

 

 

3.3  Properties of Key Digits

 

We now look at the sequence of digits generated by the Liardet-Smart algorithm.  The notation used here is the same as for a fixed base, and many standard properties have analogues.

 

Lemma 1.  Suppose knkn1...k1k0 is a digit represent-ation of k generated by the Liardet-Smart algorithm, with sequence mn, mn1, ..., m1, m0 of bases.  For some i, let kp(i) denote the integer corresponding to the prefix knkn1...ki and let ks(i) denote the integer corresponding to the suffix ki1...k1k0.  Then k = kp(i)m(i)+ks(i) where m(i) =  and |ks(i)| < m(i).

 

This lemma is obvious from the definition of the digit sequence given by equation (1) except, perhaps, for the last part.  That part follows easily by induction.  Digit ki is chosen with |ki| ½mi mi1.  This property for i = 0 starts the induction.  Then the induction step is

|ks(i+1)|  =  |kim(i)+ks(i)|  <  |(ki+1)m(i)|   mim(i)  =  m(i+1).

 

We will continue to use the notation m(i) for  and kp(i) and ks(i) for the key values associated respect-ively with the prefix knkn1...ki and the suffix ki1...k1k0 of the digit sequence.  The next lemma uses the equality and bound of Lemma 1 to identify two possible values for ks(i), corresponding to it being a positive or negative residue of k modulo m(i):

 

Lemma 2. With the previous notation, either

         i)   ks(i) = k mod m(i)  and  kp(i) = k div m(i),  or

       ii)   ks(i) = (k mod m(i))m(i) and  kp(i) = (k div m(i))+1

where mod returns the least non-negative residue, and div is integer division given by rounding down the real quotient.

 

This shows that, whatever choices are made for the base elements, a given digit suffix can determine only one of two possible values when the product of the corresponding base elements is fixed.  We would like at least the occurrence of the one which will make the corresponding prefix odd because it leads to an addition (A) which can be used to identify a corresponding point in the trace.

 

Lemma 3.  For all powers 2j k, there is a choice of base elements mi' and an integer i such that 2j = m(i).  On average, for at least 12R of all values of j, there are choices which make kp(i) odd, and, for at least half of all values of j, there are choices which make kp(i) even.

 

Proof.  The existence of the choice of basis elements is clear: taking mi' = 2 for all i' allows one to satisfy the equality for m(i) with i = j.  For that choice, ki is the usual index i bit of k, and takes either parity with equal probability.  It is the lowest bit of kp(i), and so kp(i) is the desired parity with probability ½. 

 

Other choices of base elements exist, and they may result in kp(i) being odd even when the bit of interest in k is even.  This increases the average number of cases for which oddness occurs.  Instead of choosing mj1 = 2, we try choosing mji' = 2i' for any i' with 1 i' R.  The ability to select them depends on a corresponding bit of kp(jR) being 1 (otherwise base 2 must be chosen).  These bits are independent and so the alternative bases can be chosen with probability ½.  Each will give odd parity to the next kp if chosen.  Hence the prefix key corresponding to 2j can be made odd with probability at least 12R. 

 

Of course, this argument just gives a lower bound on how many js will give rise to two key prefix and two suffix values.  It doesn’t guarantee that when both values are possible they will both appear with non-negligible frequencies.  The actual relative frequencies appear to depend on the lowest R bits of the kp corresponding to m(i') = 2jR.  However, in the next section, a lower bound on the ratio will be produced as necessary for each choice of these bits.

 

 

3.4  Recovering One Digit of k

 

In this section we show how to recover the least significant digit k0 and associated base m0 in one representation of k and how to identify the subset of traces which correspond to the associated prefix key kp such that k = kpm0+k0.  Exactly the same process yields other digits of k independently.  Those digits can then be assembled together to give k in the manner described in the next section.

 

If Tr is the full set of all sample traces, then we denote by Tri the set of traces obtained by taking each member of Tr and deleting the suffix to the right of, but not including, the D of position i.  Thus Tr0 = Tr.  Tri is partitioned into two complementary subsets: TriA which consists of those traces which terminate with A, and TriD which consists of those traces which terminate with D.  We need to identify one of these subsets for each digit choice so that its neighbour to the left can be selected correctly.  TriA always represents the odd choice for kp(i), but some traces in TriD may contain only some of the operations for the rightmost prefix digit, and so not represent any kp(i) properly.

 

The derivation here does make specific use of the fact that in this implementation  is not allowed as a digit for base 2.  Similar arguments apply when  is allowed, but there is a duality which leaves a complicating ambiguity between the two values of ±ks throughout the reconstruction process.  This is only resolved when the complete value of k is reconstructed and under the assumption that the true sign of k is known.

 

Lemma 4.  Select any trace for key k.  Then k is exactly divisible by 2i where i is the uniquely defined integer such that ADi is a suffix of the trace. 

 

Proof.  Clearly, if k is divisible by 2i then base 2 must be chosen for the lowest i digits, which are then all zeros.  This leads to a character sequence Di of i consecutive Ds as a suffix in every trace.  If k is not divisible by 2i+1 then, whatever the next choice of base, the digit will be non-zero and hence cause A to be appended to the sequence, yielding the suffix ADi. 

 

This result enables these i occurrences of D to be identified with i least significant digits 0, each of base 2.  Moreover, all traces confirm this conclusion.  So, removing the digits one at a time,

 

Lemma 5.  If every trace in Tr has final character D then we may take k0 = 0, m0 = 2 and the traces of Tr1 all represent the associated kp.

 

If k is odd, no digit has been deduced yet, and further work must be done.

 

Lemma 6.  Suppose k ≡ 1 mod 2i where i R.  Then  k ≡ 2i+1 mod 2i+1 if Tr contains a trace with suffix ADiA.  If k ≡ 2i+1 mod 2i+1 then the probability that Tr contains no trace with suffix ADiA is (1 pi')|Tr| where pi' = p1+p2+...+pi.

 

Proof.  If k ≡ 1 mod 2i+1 then a base of m0 = 2i+1 or larger will lead to suffix Di+1A.  However, a smaller base m0 = 2j will lead to suffix DjA with digit k0 = 1 and the forced selection of base 2 at least i+1j times.  This again leads to suffix Di+1A.

 

Now suppose k ≡ 2i+1 mod 2i+1.  A base of m0 = 2i+1 or larger again leads to suffix Di+1A.  However, the choice of base m0 = 2i means lowest digit k0 = 1 and next digit determined by k div 2i, which is odd.  Hence the suffix is ADiA for that choice.  Similarly, a base m0 = 2j with j < i, will lead to suffix DjA, digit k0 = 1 and kp ≡ 2i–j mod 2i–j+1.  So this choice is followed by the forced selection of base 2 exactly ij times with associated digit 0. The subsequent digit is then odd, resulting in the overall suffix ADiA. 

 

Thus, suffix ADiA guarantees k ≡ 2i+1 mod 2i+1 and it occurs precisely when the least significant base choice is 2j with ji.  These choices occur for a given trace with probability pi' = p1+p2+...+pi.  Hence suffix ADiA will not happen for any trace in Tr with probability  (1pi')|Tr|.

 

Lemma 7.  Suppose k ≡ 1 mod 2i.  If Tr contains a trace with suffix ADiA then k ≡ 2i+1 mod 2i+1, we may take k0 = 1 and m0 = 2i, and the traces of TriA all represent the associated kp.

 

This lemma deals with the recognisable instances of k ≡ 2i+1 mod 2i+1.  When base 2j is chosen for any ji, the suffix is ADiA for these cases.  As this occurs in pi' of cases, so we expect TriA to contain approximately pi'|Tr| elements.  

 

We will assume k ≡ 1 mod 2i+1 if there is no suffix ADiA but we know k ≡ 1 mod 2i.  By Lemma 6, this introduces a small probability of error which can be decreased by taking a larger sample if necessary, or by further analysis, such as through a more exhaustive analysis of suffixes and their expected frequencies than there is space for here.  Note, however, that if pi' = 0 then this choice of m0 will not resolve which residue mod 2i+1 is correct.  Hence an increase in security might be obtained by having p1 = p2 = ... = pi = 0 where i is as large as possible.

 

Theorem 1.  Assume each base 2i is selected with probability pi for odd key values, and digit  is only used for bases greater than 2.  Let pi' = p1+p2+...+pi and  = 1pi'.  Suppose k is a random odd integer that has generated trace set Tr and j (1 j R+1) is such that Tr contains no trace with suffix ADiA for any i < j.  Then k ≡ 1 mod 2j with probability .

 

Proof.  We prove this by induction on j.  For j = 1 the statement claims nothing, and so holds.  For the induction step, assume the statement holds for some j R.  Suppose also that Tr contains no trace with suffix ADiA for any i j.  By the induction hypothesis, k ≡ 1 mod 2j with probability .

 

Since k is random, the two possibilities for k mod 2j+1 are equally likely.  So, by Lemma 6, no occurrence of suffix ADjA means k ≡ 1 mod 2j+1 with probability .  This factor just needs multiplying into the product to obtain the claim for j+1 in place of j.

 

Theorem 2.  With assumptions and notation as in Theorem 1, suppose k is odd and j is minimal such that 1 j R+1 and Tr contains a trace with suffix ADjA.  Then k ≡ 2j+1 mod 2j+1 with probability

.

If j R we may take k0 = 1 and m0 = 2j and then the set of traces for the associated kp is TrjA.

 

Proof.  Theorem 1 shows that, for the given definition of j, k ≡ 1 mod 2j with probability .  If k ≡ 1 mod 2j+1 then, as in the proof of Lemma 6, all traces must terminate with suffix Dj+1A, which is not the case.  Hence k ≡ 2j+1 mod 2j+1 with the stated probability.

 

For the base m0 = 2j, k ≡ 1 mod m0 and so the associated digit is k0 = 1.  However, kp = k div 2j is odd, which forces the next digit to be non-zero.  Hence A is the next operation leftwards after the suffix DjA which corresponds to m0.  Thus, the relevant traces for the next digit are those of TrjA. 

 

The values of k for which no least significant digit has yet been assigned are those satisfying k ≡ 1 mod 2R+1.  Picking maximal base m0 = 2R gives k0 = 1 and makes kp even.  The associated set of prefix traces should be TrRD.  A possible difficulty with this definition is that for some traces removing the suffix DRA may split subsequences which correspond to one digit.  However, every choice of base 2i corresponds to a suffix DiA where i R, and must be followed by a number of instances of base 2 with digit 0 which makes the total modular division by at least 2R+1.  Hence the suffix DRA corresponds to the operations for a whole number of digits.  Therefore TrRD does indeed contain traces which represent only operations for sequences of complete digits, and so those traces all represent the same key value.

 

Theorem 3.  With the same assumptions and notation as in Theorem 1, suppose every trace in Tr has suffix DR+1A.  Then, with probability , k ≡ 1 mod 2R+1 and we may pick m0 = 2R.  For this choice k0 = 1, kp is the common key for TrRD, kp is even, and TrRD = TrR has the same cardinality as Tr.

 

 

3.5  Combining Digits to Recover k

 

For every position j at which there is an occurrence of A in some trace of Tr, the procedures of the previous sub-section can be applied to TrjA to obtain a base and digit at that point.  These digits are used when determining a digit sequence for k.  Starting at j = 0, the digits are selected iteratively.  As well as a digit and base, each trace set TrjA gives rise to another trace set defined at some position j' > j.  We will show that:

·         For this definition of j', the next digit is determined by whichever is appropriate of Trj'A or (TrjA)RD.

 

Here we need to check on the definition of the trace subsets.  If applied iteratively, the procedures above would actually determine smaller and smaller subsets: each time we apparently take a subset of the traces from the previous step.  However, because only two key values (one odd, one even) are associated with any position, every prefix which represents the operations of a complete number of digits must correspond to the odd key if it terminates with A and the even key if it terminates with D.  Of course, every trace prefix terminating with A must consist of the operations for a whole number of digits since A cannot appear in the middle of the sequence of operations for a single digit.  So every trace in Trj'A is generated from a key value which is common to them all.  Hence, the full set Trj'A can be used to determine the next digit, not just the subset of TrjA determined by the procedures above.

 

In the case of the prefix trace set Trj+RD, it is not clear which traces are generated by a complete key.  In some cases, the final D may not be the final operation of the digit sequence from which it was derived.  Hence, the subset (TrjA)RD must be used, not Trj+RD.  However, the construction observed that every such trace had suffix DR+1A.  So (TrjA)RD has the same cardinality as TrjA.  Hence the trace subsets bulletted above are indeed the correct ones to use for the key digits, and they do not progressively decrease in size.

 

The process of digit determination only begins to fail once a leading instance of A is encountered: Theorem 2 guarantees progress up to that point.  Traces are not all the same length.  Some will use a large base for the most significant digit.  Their initial Ds are deleted, giving them fewer instances of D overall, making their traces shorter.  These traces are simply discarded when fully processed.  The procedures above still apply to the subset.  Again, following Theorem 2, further digits can still be defined until the trace set becomes empty.  However, once the first (i.e. shortest) traces run out, the remaining key is representable by a single digit, so it is bounded in absolute value by 2R11.  Each increment of the position in the trace set reduces the representable key by a factor of 2.  Eventually, assuming there are enough traces, the initial A of the longest trace has a digit bounded by 1, and so must be 1.  Hence k is completely determined.  “Enough” traces would be present if, for example, base 2 were chosen for the most significant digit.  Insufficient traces just increases the number of possible values of k which may need testing by a small factor (under 2R–1).

 

 

3.6  The Probability of Error

 

We have been careful to obtain the probability of error in each digit in order i) to see if it is feasible to recover the key and ii) to see how the probabilities pi might be adjusted in the algorithm definition to provide improved security. 

 

The procedures of §3.4 define the probabilities in terms of the size of the trace set being employed at that time.  Generally, it is equal to the cardinality of a set of the form TrjA.  This is equal to |Tr| times the number of DAs in position j divided by the number of DAs or Ds in that position.  This can be approximated by |Tr| times the overall probability pA of DA divided by the overall probability pD of DA or D.  Since the choice of base m = 2i produces i1 occurrences of D followed by one of DA when i > 0, |TrjA| ≈ π |Tr| where

π =  =

For a uniform distribution this works out at π =  where typically we might expect R = 3; and for 2R-ary sliding windows it works out at π = .

 

In fact, the formula under-estimates the average size of TrjA.  Some positions do not have any occurrences of A, and we do not use the associated trace subsets.  This increases the average for those positions which do have occurrences of A.

 

Next, the distribution of base choices in the reconstructed key differs from that generated by the re-coding process.  Suppose k is odd for the set of traces at some point during the reconstruction.  In Theorem 2, the distribution of odd residues k mod 2R+1 is uniform.  So, neglecting the assumed small numbers of incorrectly assigned cases resulting from some of the possible suffixes not occurring, base 2j will be selected for the reconstructed key with probability 2j for 0 < j R and produce an odd next key.  Further, base 2R will turn up in the remaining 2R cases of odd keys but produce an even next key.  In half of all cases, an even key will lead to an even key.  Consequently, out of every 2R+2 digit choices in the reconstruction, on average 2R will be odd and 2 will be even.

 

By Theorems 2 and 3, the probability of the reconstructed key being correct is a product of factors of the form .  These factors can be approximated by .  Since choosing base 2j leads to j such factors, there is essentially one such factor for every bit of k.  The exceptions are where an even key causes base 2 and digit 0.  Then there is no doubt about the correctness of the digit 0.  This last case occurs for 2(2R+2)1log2k bits of the initial key k.  Otherwise, for odd keys, the relative frequency of different bases means that the factor  will appear on average for 2Ri(2R+2)1log2k bits if 0 < i < R and for 2(2R+2)1log2k bits if i = R. 

 

Because pR' = p1+p2+...+pR = 1, the factor for i = R is 1, and so can be ignored.  Hence,

 

Lemma 8.  The key k can be recovered with a probability approximately

where n = (1+21R)1log2k and π is as defined above. 

 

The property ≤ 1–p1 provides a lower bound for this product.  Consequently,

 

Lemma 9.  For a uniform distribution of base, the key k can be recovered with a probability at least

(1+π|Tr|)n

where  n = (1–21R)(1+21R)1log2k  and  π = .

 

For specific choices, it is possible to evaluate the product in Lemma 8 exactly.  A typical choice would be to have a key with 192 bits, R = 3 (which requires storing two pre-computed multiples, namely P and 3P), and a uniform choice of base, i.e. p1 = p2 = p3 = ⅓.  Then π = ⅓ and the product in Lemma 8 is just under 2–31 for |Tr| = 9.  So, if a key can be reconstructed and checked for correctness in unit time,

 

Theorem 4.  If doubles and adds can be distinguished on individual traces, and traces are captured from 9 applications of the same unblinded 192-bit key, then the Liardet-Smart algorithm with uniform selection of base 23 can be broken with a computational effort of about O(231).  With twice as many traces, the computational effort falls to under O(210).

 

Of course, the full force of all the patterns available and their relative frequencies has not yet been applied.  Hence the danger is probably substantially under-estimated.  Once a possible key has been recovered, there is considerable unused data in the traces that has not yet been used and can be investigated for checking purposes.  In the uniform case, about  of the data is so far unused – that in the complementary sets TrjD.  This contains information about digits whose bases were not aligned with those of the reconstructed representation of k.  Choosing a different base from that of the reconstruction process described above will provide confirmation about the correctness of each bit of k.  Indeed, each trace has to be consistent with some choice of bases, and the rightmost inconsistency in a trace will usually be very close to the rightmost bit in error.  There is insufficient space here to improve the probabilities which are a consequence of this approach, but the computational feasibility of the attack is already assured.

 

If the attacker is unable to distinguish clearly between adds and doubles, then the unused data vastly increases his ability to make corrections.  Moreover, as each digit is obtained through a purely local extraction of data from traces, it is easy to automate an exhaustive process to check for the overall best digit solutions using all traces, and hence prioritise the order for considering the most likely values for k.  However, for the data that has been used, any indistinctness between A and D is unimportant.  In this attack, it is only necessary to establish whether or not an A has appeared at each position.  The relative frequency of As means that the certainty of this can be determined with high degree just by increasing the number of traces sufficiently.

 

 

4    Counter-Measures

 

Our formulae for bounding the accuracy repeatedly used the probabilities of smaller bases much more than larger bases, and the accuracy improves when these probabilities are increased at the expense of the probabilities of larger bases.  This is consistent with the

greater ambiguity afforded by digits of larger bases.  Thus we recommend not using a uniform choice for the base, but employing a strong bias towards large bases, such as was illustrated in §2.2.  In the extreme, the standard, non-randomised, m-ary exponentiation technique is obtained, and this is not susceptible to the attack.

 

The cost of key masking is not entirely trivial in the context of ECC.  Adding a 32-bit random multiple of the group order to the key increases the point multiplication cost by some 17% for 192-bit keys, although it is a much smaller fraction of the total encryption cost.  Adding a smaller random multiple is probably ineffective if it results in a number of repetitions of the same key value within the lifetime of the key.  The highly repetitive nature of the traces resulting from the same prefix keys turning up again and again means that a duplicated key could be assumed if, and only if, traces matched closely enough.

 

The “double-and-add-always” method of computation provides a good measure of protection, but is expensive.  The attacker then has to determine whether or not the result of the addition is used before he can mount the attack.  This is much more difficult than distinguishing the two operations.  Hence traces will be susceptible to much more frequent errors, and a much greater number of traces will have to be recovered.

 

There are alternative randomised algorithms for which this type of attack does not apply, and others that display similar weaknesses.  That of Oswald and Aigner [9] can be attacked in a similar way.  Mist [15, 17] does not exhibit the same repetition of key values during key processing, and so may be a safer choice.  A new algorithm by Itoh et al. [18] may also be worthy of consideration.

 

 

5    Conclusion

 

It might have been hoped that the Liardet-Smart algorithm would avoid the cost of any additional counter-measures such as key blinding when the same secret key is repeatedly re-used, but this now appears not to be so.  Specifically, the key needs to be masked, or the pattern of adds and doubles has to be well hidden for individual point multiplications. 

 

Of course, there are many circumstances in which the algorithm is clearly of value, such as ECDSA, for which a different random key is used every time.  Then, for suitable parameter choices, the space of keys generating a given pattern of adds and doubles is infeasibly large, and so cannot be attacked successfully without additional data.

 

 

References

 

[1]     C. H. Gebotys & R. J. Gebotys, Secure Elliptic Curve Implementations: An Analysis of Resistance to Power Attacks in a DSP Processor, Cryptographic Hardware and Embedded Systems CHES 2002, B. Kaliski, Ç. Koç & C. Paar (editors), Lecture Notes in Computer Science, 2523, Springer-Verlag, 2002, to appear.

 

[2]     J. C. Ha & S. J. Moon, Randomized Signed-Scalar Multiplication of ECC to Resist Power Attacks, Cryptographic Hardware and Embedded Systems CHES 2002, B. Kaliski, Ç. Koç & C. Paar (editors), Lecture Notes in Computer Science, 2523, Springer-Verlag, 2002, to appear.

 

[3]     K. Itoh, J. Yajima, M. Takenaka & N. Torii, DPA Countermeasures by Improving the Window Method, Cryptographic Hardware and Embedded Systems CHES 2002, B. Kaliski, Ç. Koç & C. Paar (editors), Lecture Notes in Computer Science, 2523, Springer-Verlag, 2002, to appear.

 

[4]     D. E. Knuth, The Art of Computer Programming, vol. 2, “Seminumerical Algorithms”, 2nd Edition, Addison-Wesley, 1981, 441466.

 

[5]     P. Kocher, Timing Attack on Implementations of Diffie-Hellman, RSA, DSS, and other systems, Advances in Cryptology Crypto ’96, N. Koblitz (editor), Lecture Notes in Computer Science, 1109, Springer-Verlag, 1996, 104113.

 

[6]     P. Kocher, J. Jaffe & B. Jun, Differential Power Analysis, Advances in Cryptology Crypto ’99, M. Wiener (editor), Lecture Notes in Computer Science, 1666, Springer-Verlag, 1999, 388397.

 

[7]     P.-Y. Liardet & N. P. Smart, Preventing SPA/DPA in ECC Systems using the Jacobi Form, Cryptographic Hardware and Embedded Systems CHES 2001, Ç. Koç, D. Naccache & C. Paar (editors), Lecture Notes in Computer Science, 2162, Springer-Verlag, 2001, 391401.

 

[8]     T. S. Messerges, E. A. Dabbish & R. H. Sloan, Power Analysis Attacks of Modular Exponentiation in Smartcards, Cryptographic Hardware and Embedded Systems (Proc CHES 99), C. Paar & Ç. Koç (editors), Lecture Notes in Computer Science, 1717, Springer-Verlag, 1999, 144157.

 

[9]     E. Oswald & M. Aigner, Randomized Addition-Subtraction Chains as a Countermeasure against Power Attacks, Cryptographic Hardware and Embedded Systems CHES 2001, Ç. Koç, D. Naccache & C. Paar (editors), Lecture Notes in Computer Science, 2162, Springer-Verlag, 2001, 3950.

 

[10]  J.-J. Quisquater & D. Samyde, ElectroMagnetic Analysis (EMA): Measures and Counter-Measures for Smart Cards, Smart Card Programming and Security (E-smart 2001), Lecture Notes in Computer Science, 2140, Springer-Verlag, 2001, 200210.

 

[11]  J.-J. Quisquater & D. Samyde, Eddy current for Magnetic Analysis with Active Sensor, Smart Card Programming and Security (E-smart 2002), Lecture Notes in Computer Science, Springer-Verlag, 2002, to appear.

 

[12]  M. Joye & J.-J. Quisquater, Hessian Elliptic Curves and Side Channel Attacks, Cryptographic Hardware and Embedded Systems CHES 2001, Ç. Koç, D. Naccache & C. Paar (editors), Lecture Notes in Computer Science, 2162, Springer-Verlag, 2001, 402410.

 

[13]  C. D. Walter & S. Thompson, Distinguishing Exponent Digits by Observing Modular Subtractions, Topics in Cryptology CT-RSA 2001, D. Naccache (editor), Lecture Notes in Computer Science, 2020, Springer-Verlag, 2001, 192207.

 

[14]  C. D. Walter, Sliding Windows succumbs to Big Mac Attack, Cryptographic Hardware and Embedded Systems CHES 2001, Ç. Koç, D. Naccache & C. Paar (editors), Lecture Notes in Computer Science, 2162, Springer-Verlag, 2001, 286299.

 

[15]  C. D. Walter, Improvements in, and relating to, Cryptographic Methods and Apparatus, UK Patent Application 0126317.7, Comodo Research Laboratory, 2001.

 

[16]  C. D. Walter, Precise Bounds for Montgomery Modular Multiplication and Some Potentially Insecure RSA Moduli, Proceedings of CT-RSA 2002, Lecture Notes in Computer Science, 2271, Springer-Verlag, 2002, 3039.

 

[17]  C. D. Walter, Mist: An Efficient, Randomized Exponentiation Algorithm for Resisting Power Analysis, Proceedings of CT-RSA 2002, Lecture Notes in Computer Science, 2271, Springer-Verlag, 2002, 5366.

 

[18]  K. Itoh, J. Yajima, M. Takenaka & N. Torii, DPA Countermeasures by Improving the Window Method, Cryptographic Hardware and Embedded Systems CHES 2002, B. Kaliski, Ç. Koç & C. Paar (editors), Lecture Notes in Computer Science, 2523, Springer-Verlag, 2002, to appear.

 

 

 

 


This paper was originally published in the Proceedings of the Fifth Smart Card Research and Advanced Application Conference, November 21–22, 2002, San Jose, CA, USA
Last changed: 11 Oct. 2002 aw
Technical Program
CARDIS '02 Home
USENIX home