20080308, 07:45  #1 
Jun 2003
Oxford, UK
3·661 Posts 
Twin primes expected first k formula
I am having problems trying to find an empirical formula to calculate the expected k value of the first twin prime for a given n in k*2^n+/1. Can anyone help?

20080308, 20:33  #2  
Feb 2007
110110000_{2} Posts 
Quote:
but I don't know if the specification of the problem is sufficiently precise... what does "expected" mean here ? there must be some averaging done  on what ? should the estimation just take into account the size of n? or some properties like numdiv(), ... ? 

20080309, 01:37  #3 
"William"
May 2003
New Haven
23·103 Posts 
Do you mean empirical or heuristic?
For a quick and dirty heuristic we can ignore the additional information given by k, and observe the probability that any two odd numbers x and x+2 are a twin pair is 4C/ln(x)^2, so the expected value of k is approximately 0.182 n^{2} For a more refined estimate we can argue analogously to the twin prime constant. The twin prime constant can be developed by the argument that the probability that either number is prime is 1/ln(x). But the probabilities aren't independent. For p>2, the probability each is not divisible by p is (p1)/p, so the independent approximation would have the probability that neither is divisible by p as ((p1)/p)^{2}. But the real probability that neither is divisible by p is (p2)/p. So the independent probability needs to increased by the factor p*(p2)/(p1)^{2} to compensate for the error in divisibility by p. Doing the same thing for all values of p gives the twin prime constant. p=2 must be argued separatelyThe presence of "k" in the fomula changes this argument to: the probability that either number is prime is 1/ln(x). But the probabilities aren't independent. For p>3, the probability each is not divisible by p is (p1)/p, so the independent approximation would have the probability that neither is divisible by p as ((p1)/p)^{2}. But the real probability that neither is divisible by p depends on k. With probability 1/p, p divides k and hence divides neither number. With probability (p1)/p p does not divide k. In this case there are p1 values near x with one divisible p, so the probability neither of our pair is divisible by p is (p3)/(p1). Hence the probability neither of these is divisible by p isSo you could evaluate this adjusted twin constant. Or, if you really meant empirical rather than heurisitc, you could fit the known values of k to k(n) = x * n^{2}. 
20080309, 07:20  #4 
Jun 2003
Oxford, UK
1983_{10} Posts 
Thank you for your responses.
Apologies for using empirical loosely. Actually I mean empirical as well as heuristic! http://www.mersenneforum.org/showthr...211#post128211 posts 33 and 41 have the factual first k for n from 1 to 2000. Using 0.182 n^2 has the advantage that it is simple and depends only on n, but it appears to give results that are too low. For example, at n=2000, the formula gives k=728000, but a moving average over the previous 50 n gives a k value in excess of 1 million for all n>1740. An important observation is that it is a polynomial relationship to k and not logarithmic relationship. The dependence on k only comes from the fact that we are looking for the first k. This k has the additional property that no k smaller than it provides a twin. So I would think some formula might exist that would have to evaluate the product of probabilities of a twin that every k up the chosen value has, and then deducing a multiplier constant to the resulting probability product, to match the average observed values. How one does this I do not know. In terms of probability, one empirical hunch I had is that A=ln(ln(k/n)) > 2. 87% of A values for n from 1501 to 2000 fall in the range 1.5 to 2.1 most exceptions between 2 and 2.1 and the trend of A values looks to be increasing quite slowly  at 2000 A is approx 1.73 on a 50 moving average and 1.77 on a logarithmic trend line. Regarding variables, I am looking for a formula that depends only on n. My reason for pursuing this is to prove (at least to myself) that randomly targeting 333333 for example, is not the most efficient way to find a large twin. To begin with, 333333 might be a rogue n with no small k values that are twins. The A>1.5 observation, suggests that it is not worth testing small k, which has been done by the Twin Prime Search. It might be more sensible to start testing for a twin at close to the expected k and then expanding out (up and down) from there. Certainly testing more than one n seems sensible, as it is possible to apply portfolio theory to overcome the reliance on one n. 
20080317, 13:36  #5 
Jun 2003
Oxford, UK
3·661 Posts 
And here is another thing 
An analysis of the first k for n from 1 to 3000 shows a preponderance for 0mod5 and 0mod7 and a lesser preponderance towards 0mod11..... I can understand that k should be 0mod3, (except for n=2), otherwise, one of k*2^n+/1 will be 0mod3 if the other is prime k=xmod5: x count 0 1016 1 489 2 498 3 487 4 510 I suppose this is because k=5#.Y/2 where Y is some integer, guarantees that, if k*2^n1 is prime, then k*2^n+1 is not 0mod3 whereas any other random k might produce this condition on average 50% of the time. Results for xmod7: 0 591 1 387 2 400 3 428 4 417 5 403 6 374 and xmod11: 0 335 1 275 2 270 3 258 4 289 5 270 6 262 7 225 8 256 9 296 10 264 Last fiddled with by robert44444uk on 20080317 at 13:45 
20080324, 12:03  #6 
Jun 2003
Oxford, UK
3·661 Posts 
An analysis of the first 4000 k values, with found first twin k=y and n=x provides "best fit" equations, according to Microsoft Excel:
y = 0.3049x^1.9514 with R2 = 0.6926 The power series seems to consistently underestimate the moving average (over 100n), which appears to increasing along its own curve which mimics the polynomial: y = 0.2566x^2 + 383.78x  174906 (with R2 = 0.2964) rather well. (I am mystified why the power equation produces such a high R2 compared to the polynomial) Plotting the moving average (over previous 100 n) provides a best fit: y = 0.1498x^2.1119 with R2 = 0.9947 And average over 100n looking forward 50n and back 50n: y = 0.5442x^1.951 R2 = 0.9956 I am taking the actual data to much higher levels to see whether I can get a better feel for it. When the values are viewed graphically they appear like atoms bubbling off a substrate, especially when a relatively narrow bank of values is taken, such as, say n=3000 to n=4000 Last fiddled with by robert44444uk on 20080324 at 12:04 
20080325, 11:37  #7  
Nov 2003
2^{2}×5×373 Posts 
Quote:
You are trying to apply "curve fitting" without any use of the underlying mathematics. Anyone can fit a curve to data. Whether that curve is *meaningful* depends whether it models the underlying process(es). In this case, your curves have no meaning. I suggest that you look closely at Hardy & Wright's "Introduction to Number Theory" where they make conjectures about the distribution of twin primes. Theoretically, the number of twin primes of the form k*2^n +/ 1 less than N will be O(N/ [log(k*2^n+1)]^2) as either k varies or as n varies while the other is held fixed. The implied constant in the O() estimate can be computed (in theory) as a singular infinite product. Look again at Hardy & Wright's derivation of the twin prime constant. 

20080325, 12:44  #8 
Jun 2003
Oxford, UK
3×661 Posts 
Bob, thank you for the guidance and direction, I am certain it is spot on. Will look it up for sure from my ecopy of H&W, although I must admit, I am skating at the very edge of what I can actually understand with that book. Large parts of it are unintelligible to the layman! I am slowly getting comfortable with pi(2) given the excellent and confidence building paper by DA Goldston "Are there infinitely many twin primes?"
I will for sure come back when I understand the O and N in your formula Last fiddled with by robert44444uk on 20080325 at 12:51 
20080325, 13:54  #9  
Nov 2003
2^{2}·5·373 Posts 
Quote:
and will be hard to read for a "layman". What I say now is no reflection on you personally: I argue that a "layman" trying to do foundationless curve fitting to a mathematical process that the "layman" does not understand is a case of "trying to run before you walk". It is NOT something that a "layman" can reasonably do, any more than a "layman" can do (say) brain surgery. 

20080326, 04:32  #10 
Jun 2003
Oxford, UK
3×661 Posts 
Tried reading H&W last night, but maybe I should try in the mornings, and not with a glass of wine in my hand.
In answer to Bob, Maybe that is why I posted this here, knowing that it was beyond my abilities! If I knew how to arrive at the answer, why would I post? That is why I really appreciate the responses to the post. The reason for curve fitting is to ensure, at least in my mind, that a formula of the kind A*n^2, which has basis in the underlying mathematics, would be appropriate. The fact that I am getting best fit curve fitting of the form a*n^1.95 and b*n^2.11 suggest that in fact, A*n^2 is probably the underlying formula, even if it does not give best fit, it would be close. wblipp has placed A=0.182 with an underlying theoretical basis involving the Twin Prime Constant, but the results suggest A is in fact larger than that. Curve fitting over many thousands of values will at least give an approximation to A that can be retrofitted into a theory, I would have thought. I have the confidence I think, in the next set of curve fitting to search for a value of A. I will do this by fitting a curve A*n^2 to moving average values that look forward 50 values and back 50 values. At least I will establish something to mull over. Last fiddled with by robert44444uk on 20080326 at 04:34 
20080326, 07:23  #11 
Jun 2003
Oxford, UK
3×661 Posts 
Line of best fit of the form k=A*n^2 for the first 4621 n values, comparing the moving average of 50 values n25 to n+25, appears to be A=0.368, which is close to twice wblipps A=0.182
Now it will be interesting to back into the theory Applying to n=333333, and using A=0.364, then the average expected first prime, on average, would be at k=40.4G. The TPS site is sieving to 100G which is quoted at providing 90% certainty. Last fiddled with by robert44444uk on 20080326 at 07:59 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Mersenne Primes p which are in a set of twin primes is finite?  carpetpool  Miscellaneous Math  3  20170810 13:47 
The expected number of primes  Batalov  Computer Science & Computational Number Theory  5  20160811 01:17 
Expected number of primes in OEIS A007908  ewmayer  Probability & Probabilistic Number Theory  6  20151110 16:33 
I get 13% less primes than I expected:(  mart_r  Math  2  20101029 17:31 
Odds of prime / expected # of primes  gd_barnes  Riesel Prime Search  15  20101014 22:00 