In this article, we recommend the number 89 as the default initial size for Java built-in hash table. This initial size allows the table size to remain prime after 5 expansions. During expansions up to 2^32 bit range, there are no divisors found smaller than 11.
The Java built-in hash table has an initial capacity that can be specified by the programmer. If the number of entries exceeds the threshold, the capacity is multiplied by the formula of 2n+1.
It is a common practice to recommend a prime number to be the size of a hash table. This way, a funnel will occur only if the keys are multiples of the prime number. On the other hand, if the hash table size is not a prime number, keys that are multiples of any of the factors of the hash table size will form a funnel.
Since Java's hash table expansion uses formula of 2n+1. It is natural to expect the best candidate for initial size to remain a prime under a number of table expansions. We will search for a number X with the following properties:
Large prime numbers end with the decimal digit 1, 3, 7 or 9. Any number ending with digit 0, 2, 4, 5, 6, 8 would not be a prime (eg. 10, 22, 34, 65, 78). Under the expansion of 2n+1, it can be shown that any number with digit ending in 1, 3, or 7 will become divisible by 5 after four expansions. (11 -> 23 -> 47 -> 95) Any hash table size that is divisible by 5 is very bad, as keys in multiples of 10 are common.
The search would be based on finding a prime number ending in the digit 9 that have the properties we desire.
Primes under 100 with ending digit of 9 consist of 19, 29, 59, 79, and 89. The expansion of 19 is 39, which is divisible by 3 - too small. The expansion of 29 is 59, a prime number. This is good, however the expansion of 59 is 119, which is divisible by 7 - somewhat small. We looked next at the number 79. The expansion of 79 is 159, which is again divisible by 3.
Lastly we look at the number 89. Fortunately 89 turns out to be a good choice. The following section lists the expansion of 89.
89 179 359 719 1439 2879 5759 = 13 * 443 11519 23039 46079 = 11 * 59 * 71 92159 = 157 * 587 184319 = 19 * 89 * 109 368639 = 43 * 8573 737279 1474559 2949119 5898239 = 643 * 9173 11796479 = 41 * 149 * 1931 23592959 = 13 * 1814843 47185919 = 11 * 4289629 94371839 = 881 * 107119 188743679 377487359 = 53 * 79 * 89 * 1013 754974719 = 29 * 1129 * 23059 1509949439 = 107 * 14111677
You can run this test yourself with this C program.
#includeint main() { int val = 89; int indx; char buf[80]; for (indx=0; indx < 26; ++indx) { sprintf(buf, "factor %d", val); system(buf); val += val + 1; } return 0; }
89 is a good size. There might be better sizes for hash tables that are known to be of large size. The above test program can be used to find them.
What does the Java library use as a default hash table size if you do not specify anything - is it 89? No, the source code indicates 101. This is a number we rejected in the initial step.