[GNU Manual] [No POSIX requirement] [Linux man] [FreeBSD man]
Summary
factor - print prime factors
Lines of code: 2662
Principal syscall: write() via printf()
Support syscalls: None
Options: 2 (help and version)
Descended from sort in Version 4 UNIX (1973)
Added to Textutils in December 1994 [First version]
Number of revisions: 152
The factor utility is a collection of distinct algorithms that try to solve the prime factor problem. The possible procedures are determined at compile time for that machine. We'll walk through the general utility design, leaving the math details to more authorative sources. The algorithms include:
- Lucas primality test
- Miller-Rabin composite test
- Pollard-Brent rho
- Square form factorization
- Trial division
Since I'm not a mathematician, I won't try to add insight to these implementations except to recommend reviewing Montgomery reductions and Hensel forms. The bottom line is speed -- especially far large input.
The utility also includes multiple implementations of the same algorithms: with and without the GNU Multiple Precision Arithmetic Library
Helpers:do_stdin()
- Tokenize stdinfactor()
- Single-precision factor procedurefactor_insert_large()
- Add a factor for single-precision inputfactor_insert_multiplicity()
- Add several factor multiples for single-precision inputfactor_insert_refind()
- Reproduce and add the discovered primefactor_using_division()
- Single-word trial divisionfactor_using_pollard_rho()
- Single-word Pollard-Brent rho (up to 64-bits)factor_using_pollard_rho2()
- Double-word Pollard-Brent rho (up to 128-bits)factor_using_squfof()
- Square form factorizationgcd_odd()
- Single-word GCDgcd2_odd()
- Double-word GCDis_square()
- Returns the root or 0 if not a squareisqrt()
- Single-word square rootisqrt2()
- Double-word square rootlbuf_alloc()
- Initializes the line bufferlbuf_flush()
- Pushes the line bufferlbuf_putc()
- Adds a character to the line bufferlbuf_putint()
- Adds an integer to the line buffermillerrabin()
- Single-word Miller-Rabin algorithmmillerrabin2()
- Double-word Miller-Rabin algorithmmod2()
- Double-word modulusmp_factor()
- Multi-precision factor procedure using GMPmp_factor_clear()
- Freesmp_factors
mp_factor_init()
- Initializesmp_factors
mp_factor_insert()
- Add a multi-precision factormp_factor_insert_ui()
- Adds an unsigned integer to themp_factors
mp_factor_using_division()
- Multi-precision trial division using GMPmp_factor_using_pollard_rho()
- Multi-precision Pollard-Brent rho using GMPmp_millerrabin()
- Miller-Rabin using GMPmp_prime_p()
- Multi-precision primality testing using GMPmpz_va_init()
- Variadic multi-precision input parsingmulredc()
- Single-word Montgomery reductionmulredc2()
- Double-word Montgomery reductionpowm()
- Single-word modular exponentiationpowm2()
- Double-word modular exponentiationprime_p()
- Lucas primality test for single uint inputprime2_p()
- Lucas primality test for double uint inputprint_factors()
- Starts factorization for all inputprint_factors_single()
- Starts factorization for a single inputprint_uintmaxes()
- Prints integers potentially stored as multiple intsstrto2uintmax()
- Parses input numbers in to potentially multiple unsigned ints
add_ssaaaa()
- Adds unsigned double-word values (no carry)addmod()
- Single-word modular additionaddmod2()
- Double-word modular additioncount_leading_zeroes()
- Counts the number of MSB zeroescount_trailing_zeroes()
- Counts the number of LSB zeroesge2()
- Double-word greater-than or equalgt2()
- Double-word greater-thanlsh2()
- Double-word left shiftrsh2()
- Double-word right shiftsub_ddmmss()
- Double-word subtractionsubmod()
- Single-word modular subtractionsubmod2()
- Double-word modular subtractionudiv_qrnnd()
- Divides unsigned double-word valuesumul_ppmm()
- Multiplies unsigned double-word values
die()
- Exit with mandatory non-zero error and message to stderrerror()
- Outputs error message to standard error with possible process termination
Setup
Two important data structure defined are struct factors
and struct mp_factors
. Both are used to store factor information with the former used for single precision (up to 128-bit) values and the latter for multi-precision. In general, all multi-precision operations are prefixed with mp_*.
After common initialization, main()
allocates the line buffer (lbuf
) which holds the output string.
Parsing
Parsing factor only checks the default help and version options. Any arguments are assumed to be numbers to be factored. The validity of factors is checked during execution.
Parsing may fail if the user provides an unknown option. The result is a short error message followed by the usage instructions.
Execution
The execution path for factor is more convoluted than a typical utility due to using the preprocessor for control flow. The definitions to watch for are:
HAVE_GMP
- The GNU Multiple Precision Arithmetic Library is available to useUSE_SQUFOF
- Defines (and prioritizes) square form factorizationSTAT_SQUFOF
- Output statistics. Not a branching decision (only extends computation)
The exact execution path is determined by both the machine capability and the size of the input. Multiple numbers may be input for testing. The general goal is to minimize execution time .
The basic flow for a single input number begins by determining if the input is single-precision (expressable 128-bits) or larger.
Factoring single-precision input
- Check base cases (less than 2)
- Try for a quick kill with trial division
- Factor out powers of 2 (count number of LSB zeroes and shift)
- Check the first 669 primes from the difference table
- Check that any remaining value is significant (greater than 1)
- Test the remaining value for prime (if so, we're done)
- Factor the remaining composite with either square form (if available) or Pollard-Brent rho otherwise
Factoring values larger than 128 bits requires using GMP.
Failure cases:
- Factoring a value larger than 128-bits without GMP
- Attempting to factor a negative integer or non-integer
- Primality test failure
- Unable to write to output buffer