Empirical explorations of faster Fermat factorisation, part 6

by | Published

Introduction

This is the sixth in a series of technical articles presenting original work towards an open-source, free software implementation of the Fermat factorisation algorithm optimised for speed. This article uses concepts and terminology introduced in the previous articles. In particular it continues the explorations of Part 3.

See the Related section below for the other articles in the series. An explanation of some of the terminology is also replicated in the next section.

Part 3 included empirically-generated tables (Tables 1 - 3) of the increasing filter reduction ratio with the increasing number of LSDs considered in the filter, for the first few small prime bases. It also showed in its Figure 1 the empirically-generated mapping from the value of the LSD of N to the binary selection of series ‘A’ or ‘B’ for this increasing reduction ratio, for odd primes.

Part 4 reported the empirical discovery that the reduction ratios in the case of considering a single LSD were given by a simple expression, for all the primes tested. This meant that these values can be derived without iterating over all the possible values of LSD. It did not attempt mathematical proof of this expression but left it as a conjecture.

This article continues in this vein with two more empirical observations about reduction ratios, with no mathematical proof attempted. These observations might or might not turn out to be useful in optimising factorisation but they are discussed here on the premise that they are interesting in their own right.

Terminology

big square
Successful factorisation finds b2 - s2 = N where b2 is called here the big square. See Part 1.
GPL
GNU general public license
LSB
least significant bit
LSD
least significant digit
M
The input range, or modulus, of the filter. In effect, what happens during filtering is that a big presquare candidate is reduced mod M and the residue tested in the filter to discover whether it is viable. This is named after the variable in the YAFU code. See Part 3.
N
The number to factorise, of the form pq where p and q are large primes; possibly the modulus of an RSA public key.
presquare
The root x of any square x2 will be called the presquare.
RR
reduction ratio, defined in Part 3.
RSA
Rivest Shamir Adleman
small square
Successful factorisation finds b2 - s2 = N where s2 is called here the small square. See Part 1.
YAFU
Yet Another Factorisation Utility. See Part 2.

Empirical observation 1: for odd primes, the series A or B selector is the Legendre symbol

Figure 1 of Part 3 presented a mapping from the value of the LSD of N to the determination ‘A’ or ‘B’ for the reduction ratio series. Inspection shows it corresponds exactly the the Legendre symbol (l/p) where l is the LSD of N and p is the base. Series ‘A’ is selected where the Legendre symbol would be 1 and series ‘B’ where it would be -1.

Free open source GPL software used to empirically check this is available to download below. The conjecture held for all the cases for which I tested it, namely all values of LSD for all primes up to 127.

Empirical observation 2: the increase in stop patterns with increasing LSDs follows a simple pattern

Table 1 shows the increase in the number stop patterns in series A for each of the first odd prime bases at each increment in the number of LSDs considered in the filter. It was derived in the same way as the information in the Tables 1 - 3 of Part 3, but is presented here in a different form.

The increase in this table is normalised for the pattern space at the number of LSDs included in the filter. For example, for base 5 upon adding a second LSD, there is an increase of 8 in the number of stop patterns in a total pattern space of 52. This is explained in more detail next.

A simple pattern in these increases is apparent in this table. The initial number of stop patterns at one LSD is (p - 1)/2 where p is the odd prime base. This empirical observation was reported in Part 4. For example in base 5, there are 2 stop patterns out of a pattern space of 5.

When the second LSD is added, the increase in the number of stop patterns is four times the increase at the first LSD. For example in the case of base 5, the increase is now 8 stop patterns, out of a pattern space of 25. Since at 1 LSD there were already 2 stop patterns out of 5, that is to say 10 stop patterns out of 25, the total number of stop patterns at 2 LSDs is now 18 out of a pattern space of 25.

When the third LSD is added, the increase in the number of stop patterns is half the increase at the second LSD. In the case of base 5, the increase is now 4 stop patterns, out of a pattern space of 125.

Subsequent additions of LSDs continue in the same way, with the absolute increase in stop patterns upon adding an odd LSD half the increase upon adding an even LSD. This pattern is regular and the same for all odd prime bases for all numbers of LSDs for which I have tested it.

The absolute size of the increase in stop patterns does not grow but the denominator of the reduction ratio, the pattern space, grows exponentially with increasing number of digits. Hence there is little filtering achieved by adding further digits in primes beyond the first few.

Base 2, not shown here, also exhibits regular patterns when treated in the same way.

Table 1: The increase in stop patterns at each increment in the number of LSDs considered in the filter. This for series A in the first odd prime bases, with up to 6 LSDs shown. The increase here is normalised for the pattern space at the included number of LSDs, revealing its predictability.
base 1d 2d 3d 4d 5d 6d
3 1 4 2 4 2 4
5 2 8 4 8 4 8
7 3 12 6 12 6 12
11 5 20 10 20 10 20
13 6 24 12 24 12 24
17 8 32 16 32 16 32
19 9 36 18 36 18 36
23 11 44 22 44 22 44
29 14 56 28 56 28 56
31 15 60 30 60 30 60

GPL free software to download

The open-source software here generates an extended version of Figure 1 of Part 3 which shows which values of LSD in N map to series A and which to series B. At the same time it checks the conjecture of Part 4 and the empirical observation of this article that the selection of series A or B follows the Legendre symbol. It is a modified version of lsd.cc, which was published in Part 4.

It consists of the following files with the following SHA256 hashes at the following locations on this website.

3972dc9744f6499f0f9b2dbf76696f2ae7ad8af9b23dde66d6af86c9dfb36986  /download/GPLv3.txt
7c1643853e05a98fbf0ea59284e2aa6bc0043ee85b0de4a32cac809824775b98  /download/lsd2.cc
bec26e3f1db46594409c8ead98d078129547783c77093e60870c57a4a0ce0117  /download/stop2.cc
ad05fa368d32fc3cc3a7521eed6cb4a6a52ba196a39ec3e928580d1d3a687b84  /download/stop2.h

On GNU/Linux it has been built with

g++ lsd2.cc stop2.cc -o lsd2

and run with no arguments as follows.

~/work # ./lsd2
"xx", // 2
"xAB", // 3
"xABBA", // 5
"xAABABB", // 7
"xABAAABBBAB", // 11
...

Related