Performance improvement strategies for a VM / interpreter?

I wrote a simple VM in C, using a plain switch over the instructions with no instruction decoding whatsoever, but the performance is terrible.

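For simple arithmetic operations, the VM is about 4000 times slower than native C code doing the same operations. I tested with a group of arrays of length 10 million: the first holds the program instructions (random + - * / operations), two arrays hold random integers, and the last array is the storage for the results.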

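I was expecting a 3-4x drop in arithmetic performance, so that 4000x really blew me away. Even the slowest interpreted languages seem to offer higher performance. So where am I going wrong with my approach, and how can I improve performance without resorting to JIT compilation to machine code?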

The implementation is... basically the simplest I could come up with:

    begin:
    {
        switch (*(op+(c++)))
        {
        case 0:
            add(in1+c, in2+c, out+c);
            goto begin;
        case 1:
            sub(in1+c, in2+c, out+c);
            goto begin;
        case 2:
            mul(in1+c, in2+c, out+c);
            goto begin;
        case 3:
            div(in1+c, in2+c, out+c);
            goto begin;
        case 4:
            cout << "end of program" << endl;
            goto end;
        default:
            cout << "ERROR!!!" << endl;
        }
    }
    end:
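For comparison, the same dispatch can be written as an ordinary loop around a switch, packaged as a function that is easy to test in isolation. This is only a sketch under assumptions: the opcode numbering follows the snippet above (0 to 3 for + - * /, 4 for end of program), `run_switch` and the `Op` names are hypothetical, and each instruction here reads the operands at its own index, whereas the snippet increments `c` before using it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opcode names; the numbering matches the snippet above.
enum Op : char { OP_ADD = 0, OP_SUB = 1, OP_MUL = 2, OP_DIV = 3, OP_END = 4 };

// Runs the program in `op` and returns the number of arithmetic
// instructions executed, or -1 if an unknown opcode is met.
// Assumes `op` is terminated by OP_END and that all arrays have the
// same length.
long run_switch(const std::vector<char>& op,
                const std::vector<int>& in1,
                const std::vector<int>& in2,
                std::vector<int>& out) {
    std::size_t pc = 0;
    for (;;) {
        switch (op[pc]) {
        case OP_ADD: out[pc] = in1[pc] + in2[pc]; break;
        case OP_SUB: out[pc] = in1[pc] - in2[pc]; break;
        case OP_MUL: out[pc] = in1[pc] * in2[pc]; break;
        case OP_DIV: out[pc] = in1[pc] / in2[pc]; break;  // in2[pc] must be nonzero
        case OP_END: return static_cast<long>(pc);
        default:     return -1;
        }
        ++pc;  // advance to the next instruction
    }
}
```

Keeping the handlers inline like this, rather than calling helper functions through pointers, gives the compiler a chance to turn the switch into a single jump table.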

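UPDATE: I was toying with the length of the program when I noticed that the QElapsedTimer I was using for profiling was actually broken. I am now using the clock() function, and according to it the computed goto version runs on par with the native code, maybe a tad slower. Is that result legit? Here is the full source (it is ugly, I know, it is just for testing after all):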

    // Headers needed for qrand(), cout, and clock().
    #include <QtGlobal>
    #include <iostream>
    #include <ctime>
    using namespace std;

    #define LENGTH 70000000

    void add(int & a, int & b, int & r) {r = a + b;}
    void sub(int & a, int & b, int & r) {r = a - b;}
    void mul(int & a, int & b, int & r) {r = a * b;}
    void div(int & a, int & b, int & r) {r = a / b;}

    int main()
    {
        char * op = new char[LENGTH];
        int * in1 = new int[LENGTH];
        int * in2 = new int[LENGTH];
        int * out = new int[LENGTH];

        // Program: repeating + - * / pattern over random operands.
        for (int i = 0; i < LENGTH; ++i)
        {
            *(op+i) = i % 4;
            *(in1+i) = qrand();
            *(in2+i) = qrand()+1;   // +1 avoids division by zero
        }

        *(op+LENGTH-1) = 4; // end of program

        long long sClock, fClock;
        unsigned int c = 0;

        sClock = clock();
        cout << "Program begins" << endl;

        // Computed goto dispatch (GCC labels-as-values extension).
        static void* table[] = {
            &&do_add, &&do_sub, &&do_mul, &&do_div,
            &&do_end, &&do_err, &&do_fin};

    #define jump() goto *table[op[c++]]

        jump();
    do_add:
        add(in1[c], in2[c], out[c]); jump();
    do_sub:
        sub(in1[c], in2[c], out[c]); jump();
    do_mul:
        mul(in1[c], in2[c], out[c]); jump();
    do_div:
        div(in1[c], in2[c], out[c]); jump();
    do_end:
        cout << "end of program" << endl;
        goto *table[6];
    do_err:
        cout << "ERROR!!!" << endl;
        goto *table[6];
    do_fin:

        fClock = clock();
        cout << fClock - sClock << endl;

        delete [] op;
        delete [] in1;
        delete [] in2;
        delete [] out;

        // Native version: same operation pattern, fresh random inputs.
        in1 = new int[LENGTH];
        in2 = new int[LENGTH];
        out = new int[LENGTH];

        for (int i = 0; i < LENGTH; ++i)
        {
            *(in1+i) = qrand();
            *(in2+i) = qrand()+1;
        }

        cout << "Native begins" << endl;
        sClock = clock();

        for (int i = 0; i < LENGTH; i += 4)
        {
            *(out+i) = *(in1+i) + *(in2+i);
            *(out+i+1) = *(in1+i+1) - *(in2+i+1);
            *(out+i+2) = *(in1+i+2) * *(in2+i+2);
            *(out+i+3) = *(in1+i+3) / *(in2+i+3);
        }

        fClock = clock();
        cout << fClock - sClock << endl;

        delete [] in1;
        delete [] in2;
        delete [] out;

        return 0;
    }
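The listing times the two loops on different random inputs and never compares their outputs, so one way to judge whether the numbers are plausible is to run both on the same data, check that the results agree, and report the clock() ticks in seconds via CLOCKS_PER_SEC. The sketch below is one possible way to do that, not a definitive harness: `run_threaded`, `run_native`, and `cpu_seconds` are hypothetical names, the computed goto relies on the GCC/Clang labels-as-values extension, and each instruction uses the operands at its own index.

```cpp
#include <cstddef>
#include <ctime>
#include <vector>

// Interpreter using computed goto (GNU extension, not standard C++).
// Assumes opcodes 0..3 are + - * /, opcode 4 ends the program, and
// no other opcode values occur.
void run_threaded(const std::vector<char>& op, const std::vector<int>& in1,
                  const std::vector<int>& in2, std::vector<int>& out) {
    static void* table[] = { &&do_add, &&do_sub, &&do_mul, &&do_div, &&do_end };
    std::size_t pc = 0;
#define DISPATCH() goto *table[static_cast<unsigned char>(op[pc])]
    DISPATCH();
do_add: out[pc] = in1[pc] + in2[pc]; ++pc; DISPATCH();
do_sub: out[pc] = in1[pc] - in2[pc]; ++pc; DISPATCH();
do_mul: out[pc] = in1[pc] * in2[pc]; ++pc; DISPATCH();
do_div: out[pc] = in1[pc] / in2[pc]; ++pc; DISPATCH();
do_end: return;
#undef DISPATCH
}

// Native reference for the fixed pattern op[i] = i % 4 used in the
// benchmark. Processes the first n elements; n must be a multiple of 4.
void run_native(const std::vector<int>& in1, const std::vector<int>& in2,
                std::vector<int>& out, std::size_t n) {
    for (std::size_t i = 0; i + 3 < n; i += 4) {
        out[i]     = in1[i]     + in2[i];
        out[i + 1] = in1[i + 1] - in2[i + 1];
        out[i + 2] = in1[i + 2] * in2[i + 2];
        out[i + 3] = in1[i + 3] / in2[i + 3];
    }
}

// Runs `work` once and returns the CPU time it used, in seconds.
template <typename F>
double cpu_seconds(F work) {
    std::clock_t start = std::clock();
    work();
    std::clock_t stop = std::clock();
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

A possible use is to fill `op` with `i % 4` plus a final 4, time `run_threaded` and `run_native` with `cpu_seconds` on the same operand vectors, and compare the two output vectors element by element before trusting either timing.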

Darek Mihocka has a good, in-depth write-up on creating fast interpreters in portable C: http://www.emulators.com/docs/nx25_nostradamus.htm