Innovation and design

Empirical explorations of faster Fermat factorisation, part 7

by Stephen Hewitt | Published 2 September 2022 | Last updated 4 September 2022

Introduction

This is the seventh in a series of technical articles presenting original work towards an open-source, free software implementation of the Fermat factorisation algorithm optimised for speed.

This article uses concepts and terminology introduced in the previous articles and is not intended to be understood without them. See the Related section below for the other articles in the series.

Part 3 reported a fourfold increase in speed after tuning the YAFU filter for a particular chosen value of number to factor. This article reports an empirical test of speed with tuning, across the full range of numbers to factor, representative of a random distribution.

Terminology

big square: Successful factorisation finds b² - s² = N where b² is called here the big square. See Part 1.
GPL: GNU general public license
LSB: least significant bit
LSD: least significant digit
M: The input range, or modulus, of the filter. In effect, what happens during filtering is that a big presquare candidate is reduced mod M and the residue tested in the filter to discover whether it is viable. This is named after the variable in the YAFU code. See Part 3.
N: The number to factorise, of the form pq where p and q are large primes; possibly the modulus of an RSA public key.
presquare: The root x of any square x² will be called the presquare.
RR: reduction ratio, defined in Part 3.
RSA: Rivest Shamir Adleman
small square: Successful factorisation finds b² - s² = N where s² is called here the small square. See Part 1.
YAFU: Yet Another Factorisation Utility. See Part 2.

Experiment

For each of the 24 categories of N a design value of filter modulus M was selected from the Appendix in Part 5. These values are shown in Table 1.

Two modified versions of the YAFU Fermat factorisation function, zFermat(), were created. One was with a compact filter table of size M/RR (as discussed in Part 3 and in Part 5) and one with the original YAFU filter table of size M.

The modified compact table version allows M to be changed at run time, so that it can be tuned.

The benchmark software factorises each test number in three different ways.

It uses the original YAFU zFermat() filter table size of M with the fixed YAFU M value of 176400.
For the tuned filter it categorises each number to be factorised into one of the 24 categories and uses the value of M shown in Table 1 for that category.
It uses the same version of zFermat() with a compact filter table, but with the original YAFU fixed value of M = 176400.

Table 1: The values of tuned filter modulus `M` chosen from the Pareto front in the Appendix in Part 5 for each of the 24 categories of number to factorise. The reduction ratio RR and the table ratio (TR) are fixed by the choice of M.
`N` category	`M`	RR	TR
AAA1	101606400	2362.5	43008
AAA3	793800	443.0	1792
AAA5	3175200	885.9	3584
AAB1	3763200	700.0	5376
AAB3	29400	131.2	224
AAB5	117600	262.5	448
ABA1	20321280	1653.8	12288
ABA3	158760	310.1	512
ABA5	635040	620.2	1024
ABB1	752640	490.0	1536
ABB3	5880	91.9	64
ABB5	23520	183.8	128
BAA1	14515200	1800.0	8064
BAA3	113400	337.5	336
BAA5	453600	675.0	672
BAB1	537600	533.3	1008
BAB3	4200	100.0	42
BAB5	16800	200.0	84
BBA1	2903040	1260.0	2304
BBA3	22680	236.2	96
BBA5	90720	472.5	192
BBB1	107520	373.3	288
BBB3	840	70.0	12
BBB5	3360	140.0	24

Results

Table 2: Empirical measurements of Fermat factorisation time for three different implementations: the original YAFU zFermat() function, zFermat() with a compact table with an untuned filter and zFermat() with a compact table with a tuned filter. Filter parameters and measured times are shown for 24 categories of number to factorise.
`N` category	Original	Untuned	Tuned	Untuned/Tuned	RR_tuned/RR_untuned
AAA1	185.8s	118.0s	11.8s	10.0	12.0
AAA3	156.9s	105.2s	46.9s	2.2	2.2
AAA5	104.1s	68.1s	17.0s	4.0	4.5
AAB1	96.7s	58.6s	13.9s	4.2	5.3
AAB3	105.1s	71.5s	71.5s	1.0	1.0
AAB5	206.1s	129.9s	73.8s	1.8	2.0
ABA1	89.2s	55.6s	5.5s	10.1	12.0
ABA3	140.6s	95.3s	42.5s	2.2	2.2
ABA5	258.1s	162.1s	40.9s	4.0	4.5
ABB1	170.5s	105.3s	24.1s	4.4	5.3
ABB3	352.5s	245.8s	247.9s	1.0	1.0
ABB5	269.6s	178.1s	102.0s	1.7	2.0
BAA1	128.7s	80.8s	8.3s	9.7	12.0
BAA3	219.4s	147.7s	66.1s	2.2	2.2
BAA5	208.8s	140.9s	36.3s	3.9	4.5
BAB1	346.8s	220.8s	52.5s	4.2	5.3
BAB3	309.6s	206.5s	210.2s	1.0	1.0
BAB5	155.1s	98.2s	56.6s	1.7	2.0
BBA1	268.0s	160.4s	16.2s	9.9	12.0
BBA3	230.3s	150.0s	67.2s	2.2	2.2
BBA5	310.1s	197.0s	49.4s	4.0	4.5
BBB1	402.2s	258.8s	58.8s	4.4	5.3
BBB3	278.9s	190.3s	200.3s	0.9	1.0
BBB5	293.8s	187.2s	107.9s	1.7	2.0
Mean	221.3s	163.3s	80.6s	2.0	-

The mean in Table 2 has been adjusted to reflect the fact that all categories of N occur with equal probability except for categories ending in '3' which occur twice as often. (This is because the '3' ending reflects either the LSB pattern 011 or 111 in N, whereas '5' reflects the bit pattern 101 only and '1' reflects the bit pattern 001 only, as described in Part 3)

Discussion

The results in Table 2 show that the version of the factorisation function modified to use a compact filter table (without tuning) factored the number in about 65-70% of the time of the version using the original sparse filter table, for every value of N.

As expected, the tuned version was faster than the untuned version, and multiplication in speed correlated with the calculated improvement in filter RR due to tuning. The last two columns of Table 2 illustrate this. However the correlation is not exact, and it is noticeable that the empirical improvement in almost all cases is only around 85% of the improvement in RR. For example for the category AAA1, the improvement in RR from tuning is a factor of 12 but the measured time reduction is only a factor of 10. For the category BBB5, the improvement in RR is 2 but the measured improvement is 1.7 Exceptions to this are the categories ending '3' where the RR is 1.0 or 2.2.

GPL free software to download

The open-source software here produces benchmark timings like the ones in Table 2.

It consists of the following files with the following SHA256 hashes at the following locations on this website.

/download/GPLv3.txt: 3972dc9744f6499f0f9b2dbf76696f2ae7ad8af9b23dde66d6af86c9dfb36986
/download/benchmark.cc: 2af8c3410899a696f6d3945993f5a0b2e227d4d41191912e1d157e18209d175f
/download/benchmark.mk: a68feaaaea73d4d9f6574de8ac61541dc49599d73f0e1591af52ab39e6b41c03
/download/check.c: 6d06e857886ab6290412af17dcf294de54908e8c4bf02f5c3c96df74fdf1b388
/download/generate.cc: 2daaa50eff3d0953a912c1b1ae8a089886264a37aea334eb4986108011601cbc
/download/optima.gh: 8037db0a4b7fc32e9e96741950e0d154e34c8580c680c22571fb1503522f4df8
/download/timings.cc: 0ac100ee43662d69f9695728a5a9cd79332d14bcfb752b80cb99c428deb02365
/download/timings.h: e855144097e59ba7036e18f0ceecc08b937b085e8d14266fc51ed2bb42be6f1b
/download/zf.h: b9f239a4e595122a27d0c715cc05136ea46f493146c996be33b2a41ad1d0003e
/download/zf1.c: ed0fd4971718c76b1ff7fe8ee641ddb4d674d4577f649e4afed103852af08b42
/download/zf2.c: 2de98117dbb6ea4066facf91d111a05431f045e7ec5dcf9113ecf1b5bc2deda9

The files zf1.c and zf2.c each contain a modified version of the zFermat() factorisation function from YAFU. This code generated the times recorded in the benchmarks in Table 2.

This build also requires a generated header file lsd2series.gh which is the output from the programme lsd2.cc, released in Part 6.

Empirical explorations of faster Fermat factorisation, part 7

Introduction

Terminology

Experiment

Results

Discussion

GPL free software to download

Related articles

Empirical explorations of faster Fermat factorisation, part 1

Empirical explorations of faster Fermat factorisation, part 2

Empirical explorations of faster Fermat factorisation, part 3

Empirical explorations of faster Fermat factorisation, part 4

Empirical explorations of faster Fermat factorisation, part 5

Empirical explorations of faster Fermat factorisation, part 6

Empirical explorations of faster Fermat factorisation, part 8

Empirical explorations of faster Fermat factorisation, part 9

How to remember a 128-bit key using ‘constrained choice’

Manual encryption with a one-time pad revisited

How to remember a provably strong password: a new way using ‘constrained choice’

Authenticating a public key from just a name - a feasibility study with GPL code

Lepron, a project to develop pronounceable pseudowords for representing binary strings - Part 1: introduction and goals

Which web search engines have their own index? Evidence from website visits

External links