iconv encoding conversion problem

I am having trouble converting a string from UTF-8 to GB2312. My conversion function is below:

    void convert(const char *from_charset, const char *to_charset, char *inptr, char *outptr)
    {
        size_t inleft = strlen(inptr);
        size_t outleft = inleft;
        iconv_t cd; /* conversion descriptor */

        if ((cd = iconv_open(to_charset, from_charset)) == (iconv_t)(-1)) {
            fprintf(stderr, "Cannot open converter from %s to %s\n", from_charset, to_charset);
            exit(8);
        }

        /* return code of iconv() */
        int rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
        if (rc == -1) {
            fprintf(stderr, "Error in converting characters\n");
            if (errno == E2BIG)
                printf("errno == E2BIG\n");
            if (errno == EILSEQ)
                printf("errno == EILSEQ\n");
            if (errno == EINVAL)
                printf("errno == EINVAL\n");
            iconv_close(cd);
            exit(8);
        }
        iconv_close(cd);
    }

Here is an example of how I use it:

    int len = 1000;
    char *result = new char[len];
    convert("UTF-8", "GB2312", some_string, result);

Edit: Most of the time I get an E2BIG error.

outleft should be the size of the output buffer (e.g. 1000 bytes), not the size of the incoming string.
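As a minimal sketch of that fix, the variant below takes the output buffer's capacity as an explicit parameter and uses it to initialize outleft. The name convert_fixed and its out_size parameter are hypothetical, not part of the original function, and the single terminating NUL byte assumes a byte-oriented, stateless target encoding such as GB2312.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts `in` into `out`, which holds `out_size` bytes.
 * Returns the number of bytes written (not counting the terminating NUL),
 * or (size_t)-1 on failure with errno set by iconv_open() or iconv().
 * On failure the contents of `out` are unspecified and not NUL-terminated. */
size_t convert_fixed(const char *from_charset, const char *to_charset,
                     const char *in, char *out, size_t out_size)
{
    if (out_size == 0) {
        errno = E2BIG;
        return (size_t)-1;
    }

    iconv_t cd = iconv_open(to_charset, from_charset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inptr = (char *)in;       /* iconv() takes char **, but only reads the input */
    size_t inleft = strlen(in);
    char *outptr = out;
    size_t outleft = out_size - 1;  /* output capacity, keeping one byte for the NUL */

    size_t rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    int saved_errno = errno;        /* iconv_close() may overwrite errno */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        errno = saved_errno;
        return (size_t)-1;
    }

    *outptr = '\0';
    return (size_t)(outptr - out);  /* outptr was advanced past the written bytes */
}
```

With the buffer from the question, the call would look like `convert_fixed("UTF-8", "GB2312", some_string, result, len)`, and the return value tells the caller how many bytes of result are valid.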

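When converting, the length of the string usually changes, and you cannot know how long the result will be until the conversion is done. E2BIG means the output buffer was not large enough; in that case you need to give iconv() more output space (note that it has already converted part of the data and adjusted the four variables passed to it accordingly).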
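To make the "adjusted variables" point concrete, here is a rough sketch that deliberately uses a small scratch buffer and keeps calling iconv() until the input is exhausted. The function name converted_length and the 16-byte chunk size are assumptions chosen for illustration; the function only counts output bytes, but the same loop could append each chunk to a growing buffer.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts `in` through a small scratch buffer and
 * returns the total number of output bytes, or (size_t)-1 on error. */
size_t converted_length(const char *from_charset, const char *to_charset,
                        const char *in)
{
    iconv_t cd = iconv_open(to_charset, from_charset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char chunk[16];
    char *inptr = (char *)in;   /* iconv() takes char **, but only reads the input */
    size_t inleft = strlen(in);
    size_t total = 0;

    for (;;) {
        /* Each pass starts with a fresh, empty output window. */
        char *outptr = chunk;
        size_t outleft = sizeof chunk;

        size_t rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
        size_t produced = sizeof chunk - outleft;
        total += produced;

        if (rc != (size_t)-1)
            break;              /* all input consumed */

        /* Anything other than E2BIG (EILSEQ, EINVAL) is treated as fatal here;
         * E2BIG with no progress means the chunk cannot hold even one character. */
        if (errno != E2BIG || produced == 0) {
            iconv_close(cd);
            return (size_t)-1;
        }

        /* E2BIG: inptr and inleft already point past the converted input,
         * so the next pass resumes exactly where this one stopped. */
    }

    iconv_close(cd);
    return total;
}
```

The design choice worth noting is that the loop never rewinds inptr: after an E2BIG return, only the output side needs new space.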

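As others have noted, E2BIG means that the output buffer was not large enough for the conversion, and you were passing the wrong value for outleft.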

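But I also noticed some other problems with your function. Namely, the way it works, the caller has no way of knowing how many bytes are in the output string. Your convert() function neither nul-terminates the output buffer nor tells its caller how many bytes it wrote to outptr.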

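If you want to deal with nul-terminated strings (and that looks like what you want, since your input string is nul-terminated), you may find the following approach better: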

    #include <errno.h>
    #include <iconv.h>
    #include <stdlib.h>
    #include <string.h>

    char *
    convert (const char *from_charset, const char *to_charset, const char *input)
    {
        size_t inleft, outleft, converted = 0;
        char *output, *outbuf, *tmp;
        const char *inbuf;
        size_t outlen;
        iconv_t cd;

        if ((cd = iconv_open (to_charset, from_charset)) == (iconv_t) -1)
            return NULL;

        inleft = strlen (input);
        inbuf = input;

        /* we'll start off allocating an output buffer which is the same size
         * as our input buffer. */
        outlen = inleft;

        /* we allocate 4 bytes more than what we need for nul-termination... */
        if (!(output = malloc (outlen + 4))) {
            iconv_close (cd);
            return NULL;
        }

        do {
            errno = 0;
            outbuf = output + converted;
            outleft = outlen - converted;

            converted = iconv (cd, (char **) &inbuf, &inleft, &outbuf, &outleft);
            if (converted != (size_t) -1 || errno == EINVAL) {
                /*
                 * EINVAL  An incomplete multibyte sequence has been
                 *         encountered in the input.
                 *
                 * We'll just truncate it and ignore it.
                 */
                break;
            }

            if (errno != E2BIG) {
                /*
                 * EILSEQ  An invalid multibyte sequence has been encountered
                 *         in the input.
                 *
                 * Bad input, we can't really recover from this.
                 */
                iconv_close (cd);
                free (output);
                return NULL;
            }

            /*
             * E2BIG   There is not sufficient room at *outbuf.
             *
             * We just need to grow our outbuffer and try again.
             */
            converted = outbuf - output;
            outlen += inleft * 2 + 8;

            if (!(tmp = realloc (output, outlen + 4))) {
                iconv_close (cd);
                free (output);
                return NULL;
            }

            output = tmp;
            outbuf = output + converted;
        } while (1);

        /* flush the iconv conversion */
        iconv (cd, NULL, NULL, &outbuf, &outleft);
        iconv_close (cd);

        /* Note: not all charsets can be nul-terminated with a single
         * nul byte. UCS2, for example, needs 2 nul bytes and UCS4
         * needs 4. I hope that 4 nul bytes is enough to terminate all
         * multibyte charsets? */

        /* nul-terminate the string */
        memset (outbuf, 0, 4);

        return output;
    }