This is only a partial answer but it makes some progress towards a faster way to find the required optimal expression (if it exists) which I think is now worth sharing. They are mostly heuristics to put bounds on the search parameters a,b,c
narrowing down the search window and a couple of new functional forms.
The most interesting new serendipitous discovery is that a single xor based pattern can apparently catch ~80% of known valid solutions (they would also be found by other rules). One rule to bind them all...
((x - c) ^ b) < a // -xor<
The objective is to reformulate the conditional range test.
((x >= A) && (x < B)) || (( x >= C) && (x < D))
where A < B < C < D and C > B+1 the expression can also be written more symmetrically:
(x >= A) ^ (x < B) ^ (x >= C) ^ (x < D)
I have made a full brute force attack on the entire range of valid possibilities word widths 3, 4, 5 & 6 and from those results it is possible to offer some heuristics and a couple of new canonical forms. It may be possible to do iterative deepening starting with these partial answers as cribs (work in progress).
@njuffa originally suggested:
((x & b) - c) < a // and-<
((x | b) - c) < a // or-<
((x ^ b) - c) < a // xor-<
My notation for the rules shows the logic function written longhand and arithmetic ones as themselves. I concur that `or` and `and` find equivalent solutions although one form might be preferred if one of a,b,c is zero. The corresponding equality tests do not create any new solutions (though might have a simpler form).
To get a feel for the patterns I histogrammed the frequency of occurrence of values ABCD in valid solutions. Brute force search of the entire parameter space is O(N^7)
so the problem gets out of hand quickly. The biggest problem is that to be sure of finding every solution all the failures have to be searched to exhaustion (very time consuming. I settled on a crude approximation that appear to get about 99% success rate.
These are summary results using Njuffa's original functions (I'll fill in gaps when I have results).
log2N | N | possible | solutions |
---|---|---|---|
3 | 8 | 35 | 10 |
4 | 16 | 1365 | 119 |
5 | 32 | 31465 | 852 |
6 | 64 | 595665 | ? |
7 | 128 | 10334625 | ? |
8 | 256 | 172061505 | ? |
testN 3 8 count = 35/4096 fraction = 0.008545 solutions 10
N A B C D
0 : 5 0 0 0
1 : 3 2 0 0
2 : 2 4 0 0
3 : 0 3 1 0
4 : 0 1 2 1
5 : 0 0 3 1
6 : 0 0 4 3
7 : 0 0 0 5
testN 4 16 count = 1365/65536 fraction = 0.020828 solutions 119
N A B C D
0 : 44 0 0 0
1 : 16 8 0 0
2 : 14 16 0 0
3 : 8 11 1 0
4 : 20 24 8 1
5 : 5 10 6 1
6 : 2 15 10 3
7 : 0 10 6 5
8 : 5 11 9 18
9 : 3 3 10 1
10 : 2 5 15 3
11 : 0 3 10 7
12 : 0 3 23 25
13 : 0 0 9 12
14 : 0 0 12 20
15 : 0 0 0 23
testN 5 32 count = 31465/1048576 fraction = 0.030007 solutions 852
[snip]
My canonical functions are:
bool mytarget(uint8_t x)
{
return ((x >= A) && (x < B)) || ((x >= C) && (x < D));
}
bool myproto_and(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x & b) - a) < c; // try a=0 first
}
We are in effect trying by bitwise manipulation create a trap against 0 or 255 so that both original target intervals alias onto the same narrow domain roughly of the form.
abs( x - e) < E
The heuristics I used to make everything go faster are fairly basic but seem to work bounds on ranges etc.
Crucially it accepts and prints the first solution found in FAST mode, but can also print all solutions or more usefully the first found and any where one of `a,b,c` are zero.
The prototype solutions are tested in the order of their frequency finding a solution. It turns out that one novel formula accounts for 85% of the solutions found and that almost all solutions appear to be findable using xor in combination with arithmetic operators. Only a tiny fraction require and/or. Table shows percentages to lg2N = 5.
-xor> | -xor< | xor-< | -and> |
---|---|---|---|
85 | 10 | 5 | <1 |
Additional heuristics on the initial values of the parameters a,b,c.
a and c are additive constants, b is a bitmask applied by a logical operator.
1. Try a = 0
first (this saves the subtraction if a solution exists)
2. Try c = (B-A)+(D-C)
first (seems to often be right)
3. If (B-A) == (D-C)
try also c = (B-A)
, c = 1
, or c= 9
Empirically it seems that rules 2 & 3 taken together cover all of the solutions for c \< N/2
.
Hack here. Try b = c = 255 - (B-A)-(D-C)-11
and loop on c++ until c==0
, and loop on b++ all 256 states.
Originally loops were over A
, B
, C
, D
in strict order. But looking at the solution tables there is merit altering it so that the widest region where a solution might most likely be found is probed first.
A = 0 ; A < N-5; A++
D = 255; D > A+4; D--
B = A+1; A < D-2; B++
C = D-1; C > B+1; C--
FAST mode accepts the first solution found but still probes all failing cases to exhaustion.
CRUDE mode assumes that if no solution found for A=0
, D=N-1
and any B,C
means no solutions. (still experimental and it breaks the total count)
I am worried by the asymmetry of the table. `0` and `1` symbols feel like they should be interchangeable. So the deficit of entries in the higher half of the table (and maybe the lower half too) must be due to missing solutions not found by the set of rules presently being used.
I realise that is too simplistic since the bits have a natural ordering in a number.
Looking for new rules to find the missing cases is worthwhile.
In addition I confirm that `AND &` vs `OR |` achieve essentially the same results. `XOR ^` gets some new ones. None of the other operator combos found anything new. So search focusses on just that pair.
In the search for new simple expressions to fatten up the solutions table I stumbled upon the following.
return (((x) - a) ^ b) \< c; // -xor\<
return (((x) - a) ^ b) \> c; // -xor\>
Swapping the order of operations generates a very decent crop of new solutions previously missing. In fact -xor< aka `xor_lt` finds almost as many solutions as `and` and `or` put together! There may be other fast functional forms using "~", ">>" or "*" that might also get some more. That is possibly another question entirely.
This is a table of how the latest code performs with the additional xor based tests added (which slows it down somewhat). There are hints that `xor_gt` becomes more important as the number of bits increases.
`xor` feels like it ought to be optimal for this trick since it is information preserving. The application of the functions has been ordered to maximise early cutoffs. New candidate functions are added to the end of the list.
lg2N | N | count | solutions | time/s | and | xor | xor_lt | xor_gt |
---|---|---|---|---|---|---|---|---|
3 | 8 | 35 | 19 | 6 | 5 | 5 | 0 | 9 |
4 | 16 | 1365 | 321 | 404 | 53 | 78 | 49 | |
5 | 32 | 31465 | 3040 | 15807 | 385 | 622 | 825 | 1199 |
6 | 64 | \>150000 |
The crude approximation to avoid exhaustive searches on hopeless cases. This risks missing some solutions but seems not to at least on the range of N that I have been able to test fully.
The vast majority will be fails. There are only about 172 million possible solutions for 8 bits out of a state space 2^32 ~= 4 billion.
Trading accuracy for speed with the lastest gofaster stripes all applied gives
lg2N | solutions | time /s |
---|---|---|
3 | 19 | 0 |
4 | 318 | 10 |
5 | 2769 | 147 |
6 | 21400 | 75690 |
Adding one extra functional match
return ((x + a) ^ b) \< c; // +xor\<
increased those scores to 20, 352, 3366 & 24842 respectively.
I'm still looking for any other functions that add to the score. This is the unpolished experimental C sourcecode for the brute force search. It iterates from bit depth 3 to 6. Run time of anything beyond 32
or lg2N = 5
is huge ~10-12 hours for 64
(beyond that impossible without a cluster).
#include <iostream>
#include <time.h>
unsigned int A, B, C, D, N; // globals used by mytarget and brute force
bool mytarget(uint8_t x)
{
return ((x >= A) && (x < B)) || ((x >= C) && (x < D));
}
#define DEBUGPRINT (1)
#define CASE (1)
#define FAST (1)
//#define CRUDE (1)
bool symmetric_form(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return (x >= A) ^ (x < B) ^ (x >= C) ^ (x < D); // naughty
}
// convention used
// a is the target for comparisons
// b is the bitmask
// c is the add/subtract constant
// return ((x [^&|] b) [-+*>>] c) [<=>] a;
// return ((x [-+*>>] c) [^&|] b) [<=>] a;
//
// heuristic for 'a' almost works - order of magnitude faster but misses 3 solutions out of 321 for bits 4
bool myproto_xor(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x ^ b) - c) < a;
}
bool myproto_xor_lt(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x - c) ^ b) < a;
}
bool myproto_xor_gt(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x - c) ^ b) > a;
}
bool myproto_plus_xor_lt(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x + c) ^ b) < a;
}
bool myproto_xor_eq(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x ^ b) - c) == a;
}
bool myproto_xor_and(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x ^ b) & b) == a;
}
bool myproto_and_xor(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x & b) ^ c) == a;
}
bool myproto_and(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x & b) - c) < a;
}
bool myproto_and_lt(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x - c) & b) < a;
}
bool myproto_and_gt(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x - c) & b) > a;
}
bool myproto_and_eq(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x & b) - c) == a;
}
bool myproto_or(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x | b) - c) < a;
}
bool myproto_or_eq(uint8_t x, unsigned a, unsigned b, unsigned c)
{
return ((x | b) - c) == a;
}
unsigned bruteforce_bc(unsigned a, const char* name, bool (*target)(uint8_t), bool (*proto)(uint8_t, unsigned, unsigned, unsigned), bool verbose = false)
{
unsigned b, b0, c, c0, pass = 0;
b0 = B - A + D - C+11;
b = 255 - b0;
b = b0 = 0;
// for (b = 0; b < 256; b++)
do
{
// for (c = 0; c < 256; c++)
c0 = B - A + D - C + 10;
c = 255-c0; // C - B;
c = c0 = 0;
do
{
uint8_t x = 0;
bool ref, res;
do
{
ref = target(x);
res = proto(x, a, b, c & 0xff);
if (res != ref) break;
} while (++x);
if ((ref == res) && (x == 0))
{
pass++;
#if DEBUGPRINT
if (verbose || (pass == 1) || (a == 0) || (b == 0) || c == 0)
if (DEBUGPRINT) printf("\nABCD %3u %3u %3u %3u [%02x,%02x,%02x,%02x] %s %3i abc %3u %3u %3u [%02x,%02x,%02x]", A, B, C, D, A, B, C, D, name, pass, a, b, c, a, b, c);
#endif
#if FAST==1
return pass;
#endif
}
c = (c + 1) & 0xff;
} while (c != c0);
b = (b + 1) & 0xff;
} while (b != b0);
return pass;
}
unsigned bruteforce(const char *name, bool (*target)(uint8_t), bool (*proto)(uint8_t, unsigned, unsigned, unsigned), bool verbose=false)
{
unsigned a, a0, pass = 0;
// if(DEBUGPRINT) printf("\n function %s ", name);
a0 = a = (B - A) + (D - C);
if (bruteforce_bc(a, name, target, proto)) return 1;
if ((B - A) == (D - C))
{
if (bruteforce_bc(a >> 1, name, target, proto)) return 1;
if (bruteforce_bc(1, name, target, proto)) return 1;
}
a = 256 - a;
// a = 0; // reinstate this line for full brute force mode
do
{
if (bruteforce_bc(a, name, target, proto)) return 1;
#ifdef CRUDE
break;
#endif
} while (((++a) & 0xff) != 0); // caution ad hoc may need tweaking
#if (DEBUGPRINT)
if (pass) printf("\n %s has %i solutions\n", name, pass);
#endif
return pass;
}
unsigned tryall(bool (*target)(uint8_t))
{
unsigned pass = 0;
pass += bruteforce("-xor>", target, myproto_xor_gt);
if (FAST && pass) return pass;
pass += bruteforce("-xor<", target, myproto_xor_lt);
if (FAST && pass) return pass;
pass += bruteforce("xor-<", target, myproto_xor);
if (FAST && pass) return pass;
pass += bruteforce("-and>", target, myproto_and_lt);
if (FAST && pass) return pass;
pass += bruteforce("+xor<", target, myproto_plus_xor_lt);
if (FAST && pass) return pass;
// insert new prototype to test here
// pass += bruteforce("-xor=", target, myproto_xor_eq);
// if (FAST && pass) return pass;
return pass;
}
void testN(int lgN)
{
unsigned int i, j, k, l;
unsigned long long N4 = N, count = 0, passcount =0;
unsigned hist[4][256]{0};
time_t start, end;
N = 1 << lgN;
N4 *= N4;
N4 *= N4;
printf("testN %i %u\n", lgN, N);
if (FAST) bruteforce("symmetric xor", mytarget, symmetric_form);
start = time(NULL);
for (i = 0; i < N - 4; i++)
{
A = i;
for (l = N-1; l >= A+3; l--)
{
D = l;
for (j = i + 1; j < l - 1; j++)
{
B = j;
for (k = j + 2; k < l; k++)
{
unsigned pass;
C = k;
count++;
// if (lgN < 5) printf("\n%I64u: ABCD[ %x %x %x %x ] #", count, i, j, k, l);
pass = tryall(mytarget);
if (pass)
{
hist[0][i]++;
hist[1][j]++;
hist[2][k]++;
hist[3][l]++;
passcount++;
}
}
}
}
}
end = time(NULL);
printf("\ncount = %llu/%llu fraction = %f solutions %llu in %llu seconds\n", count, N4, ((float) count)/N4, passcount, end-start);
printf(" N\tA\tB\tC\tD\n");
for (int i = 0; i < 256; i++)
{
unsigned sum=0;
for (int j = 0; j < 4; j++) sum += hist[j][i];
if (sum)
{
printf("\n%3i : ", i);
for (int j = 0; j < 4; j++) printf(" %4i ", hist[j][i]);
}
}
printf("\n");
}
int main()
{
for (int i = 3; i < 8; i++) testN(i);
}
Testing to bit depth 5 seems to be sufficient to find and validate new rules effectiveness (and fairly quick ~3 mins on an i5-12600). Suggestions for improvements to speed and accuracy and any new rules that find additional valid solutions are welcome.