基于汉明重量的索引
假设我们有一个整数的bitsize n=4;
我所描述的问题是如何根据汉明权重及其知道bitsize
值将数字索引到数组位置。 例如,一个包含16个用于bitsize 4的元素的数组将会/看起来像这样:
|0|1|2|4|8|3|5|6|9|10|12|7|11|13|14|15|
元素按其汉明重量(必要)分组,并根据大小(不必要)排序。 只要你可以采取例如3(0011)做一些操作并取回索引5,5(0101) – > 6等,就不需要排序。
将出现n
位的所有组合,并且不会重复。 例如, 3
bitsize将具有数组:
|0|1|2|4|3|5|6|7|
我最好有一个没有循环的解决方案。 或任何讨论simillar解决方案的论文。 或者最后只是抛出任何关于如何做到这一点的想法。
请注意,您可以使用以下函数枚举具有相同汉明重量的数字(按计数顺序):
int next(int n) { // get the next one with same # of bits set int lo = n & -n; // lowest one bit int lz = (n + lo) & ~n; // lowest zero bit above lo n |= lz; // add lz to the set n &= ~(lz - 1); // reset bits below lz n |= (lz / lo / 2) - 1; // put back right number of bits at end return n; } int prev(int n) { // get the prev one with same # of bits set int y = ~n; y &= -y; // lowest zero bit n &= ~(y-1); // reset all bits below y int z = n & -n; // lowest set bit n &= ~z; // clear z bit n |= (z - z / (2*y)); // add requried number of bits below z return n; }
例如,在x = 5678上重复应用prev():
0: 00000001011000101110 (5678) 1: 00000001011000101101 (5677) 2: 00000001011000101011 (5675) 3: 00000001011000100111 (5671) 4: 00000001011000011110 (5662) 5: 00000001011000011101 (5661) 6: 00000001011000011011 (5659) .....
因此理论上你可以通过重复应用来计算数字的索引。 然而,这可能需要很长时间。 更好的方法是“跳过”某些组合。
有两个规则:
1. if the number starts with: ..XXX10..01..1 we can replace it by ..XXX0..01..1 adding corresponding number of combinations 2. if the number starts with: ..XXX1..10..0 again replace it by XXX0..01..1 with corresponding number of combinations
以下算法计算具有相同汉明权重的数字中的数字的索引(我不打扰快速实现二项式):
#define LOG2(x) (__builtin_ffs(x)-1) int C(int n, int k) { // simple implementation of binomial int c = n - k; if(k < c) std::swap(k,c); if(c == 0) return 1; if(k == n-1) return n; int b = k+1; for(int i = k+2; i <= n; i++) b = b*i; for(int i = 2; i <= c; i++) b = b / i; return b; } int position_jumping(unsigned x) { int index = 0; while(1) { if(x & 1) { // rule 1: x is of the form: ..XXX10..01..1 unsigned y = ~x; unsigned lo = y & -y; // lowest zero bit unsigned xz = x & ~(lo-1); // reset all bits below lo unsigned lz = xz & -xz; // lowest one bit after lo if(lz == 0) // we are in the first position! return index; int nn = LOG2(lz), kk = LOG2(lo)+1; index += C(nn, kk); // C(n-1,k) where n = log lz and k = log lo + 1 x &= ~lz; //! clear lz bit x |= lo; //! add lo } else { // rule 2: x is of the form: ..XXX1..10..0 int lo = x & -x; // lowest set bit int lz = (x + lo) & ~x; // lowest zero bit above lo x &= ~(lz-1); // clear all bits below lz int sh = lz / lo; if(lz == 0) // special case meaning that lo is in the last position sh=((1<<31) / lo)*2; x |= sh-1; int nn = LOG2(lz), kk = LOG2(sh); if(nn == 0) nn = 32; index += C(nn, kk); } std::cout << "x: " << std::bitset<20>(x).to_string() << "; pos: " << index << "\n"; } }
例如,给定数字x = 5678,算法将仅在4次迭代中计算其索引:
x: 00000001011000100111; pos: 4 x: 00000001011000001111; pos: 9 x: 00000001010000011111; pos: 135 x: 00000001000000111111; pos: 345 x: 00000000000001111111; pos: 1137
注意,1137是具有相同汉明权重的数字组中的5678的位置。 因此,您必须相应地移动此索引以考虑具有较小汉明权重的所有数字
这是一个概念工作,只是为了开始讨论。
第一步是最困难的 – 使用近似值求解计算阶乘。
还有更好的想法吗?
Ideone链接
#include #include //gamma function using Lanczos approximation formula //output result in log base e //use exp() to convert back //has a nice side effect: can store large values in small [power of e] form double logGamma(double x) { double tmp = (x-0.5) * log(x+4.5) - (x+4.5); double ser = 1.0 + 76.18009173 / (x+0) - 86.50532033 / (x+1) + 24.01409822 / (x+2) - 1.231739516 / (x+3) + 0.00120858003 / (x+4) - 0.00000536382 / (x+5); return tmp + log(ser * sqrt(2*M_PI) ); } //result from logGamma() are actually (n-1)! double combination(int n, int r) { return exp(logGamma(n+1)-( logGamma(r+1) + logGamma(n-r+1) )); } //primitive hamming weight counter int hWeight(int x) { int count, y; for (count=0, y=x; y; count++) y &= y-1; return count; } //------------------------------------------------------------------------------------- //recursively find the previous group's "hamming weight member count" and sum them int rCummGroupCount(int bitsize, int hw) { if (hw <= 0 || hw == bitsize) return 1; else return round(combination(bitsize, hw)) + rCummGroupCount(bitsize,hw-1); } //------------------------------------------------------------------------------------- int main(int argc, char* argv[]) { int bitsize = 4, integer = 14; int hw = hWeight(integer); int groupStartIndex = rCummGroupCount(bitsize,hw-1); printf("bitsize: %d\n", bitsize); printf("integer: %d hamming weight: %d\n", integer, hw); printf("group start index: %d\n", groupStartIndex); }
输出:
bitsize:4
整数:14汉明重量:3
小组开始指数:11