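How do I convert UTF-16 to UTF-32 and print the resulting wchar_t in C?

I'm trying to print a string of UTF-16 characters. I posted this question a while back, and the advice given was to convert to UTF-32 using iconv and print the result as a string of wchar_t.

I did some research and managed to write the following code: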

    // *c is the pointer to the characters (UTF-16) i'm trying to print
    // sz is the size in bytes of the input i'm trying to print

    iconv_t icv;
    char in_buf[sz];
    char* in;
    size_t in_sz;
    char out_buf[sz * 2];
    char* out;
    size_t out_sz;

    icv = iconv_open("UTF-32", "UTF-16");

    memcpy(in_buf, c, sz);

    in = in_buf;
    in_sz = sz;
    out = out_buf;
    out_sz = sz * 2;

    size_t ret = iconv(icv, &in, &in_sz, &out, &out_sz);
    printf("ret = %d\n", ret);
    printf("*** %ls ***\n", ((wchar_t*) out_buf));

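The iconv call always returns 0, so I guess the conversion should be OK?

However, the printing seems to be hit and miss. Sometimes the converted wchar_t string prints fine. Other times, printf seems to hit a problem while printing the wchar_t string and aborts the call entirely, so that even the trailing "***" is not printed.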

I've also tried using

    wprintf(((wchar_t*) "*** %ls ***\n"), out_buf);

but nothing gets printed at all.

Am I missing something?

Reference: How to print UTF-16 characters in C?

UPDATE

Incorporated some of the suggestions from the comments.

Updated code:

    // *c is the pointer to the characters (UTF-16) i'm trying to print
    // sz is the size in bytes of the input i'm trying to print

    iconv_t icv;
    char in_buf[sz];
    char* in;
    size_t in_sz;
    wchar_t out_buf[sz / 2];
    char* out;
    size_t out_sz;

    icv = iconv_open("UTF-32", "UTF-16");

    memcpy(in_buf, c, sz);

    in = in_buf;
    in_sz = sz;
    out = (char*) out_buf;
    out_sz = sz * 2;

    size_t ret = iconv(icv, &in, &in_sz, &out, &out_sz);
    printf("ret = %d\n", ret);
    printf("*** %ls ***\n", out_buf);
    wprintf(L"*** %ls ***\n", out_buf);

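Still the same result: not all UTF-16 strings get printed (with either printf or wprintf).

What else could I be missing?

By the way, I'm on Linux, and I have verified that wchar_t is 4 bytes.

Here is a short program that converts UTF-16 to a wide character array and then prints it.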

    #include <endian.h>
    #include <errno.h>
    #include <iconv.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <wchar.h>

    #define FROMCODE "UTF-16"

    #if (BYTE_ORDER == LITTLE_ENDIAN)
    #define TOCODE "UTF-32LE"
    #elif (BYTE_ORDER == BIG_ENDIAN)
    #define TOCODE "UTF-32BE"
    #else
    #error Unsupported byte order
    #endif

    int main(void)
    {
        void *tmp;
        char *outbuf;
        const char *inbuf;
        long converted = 0;
        wchar_t *out = NULL;
        int status = EXIT_SUCCESS;
        size_t inbytesleft, outbytesleft, size, n;
        /* "Hello, World!" as UTF-16LE, preceded by a byte order mark */
        const char in[] = {
            '\xff', '\xfe',
            'H', 0x0, 'e', 0x0, 'l', 0x0, 'l', 0x0, 'o', 0x0, ',', 0x0,
            ' ', 0x0, 'W', 0x0, 'o', 0x0, 'r', 0x0, 'l', 0x0, 'd', 0x0,
            '!', 0x0
        };
        iconv_t cd = iconv_open(TOCODE, FROMCODE);

        if ((iconv_t)-1 == cd) {
            if (EINVAL == errno) {
                fprintf(stderr, "iconv: cannot convert from %s to %s\n",
                        FROMCODE, TOCODE);
            } else {
                fprintf(stderr, "iconv: %s\n", strerror(errno));
            }
            goto error;
        }

        size = sizeof(in) * sizeof(wchar_t);
        inbuf = in;
        inbytesleft = sizeof(in);

        while (1) {
            /* one extra wchar_t leaves room for the terminator */
            tmp = realloc(out, size + sizeof(wchar_t));
            if (!tmp) {
                fprintf(stderr, "realloc: %s\n", strerror(errno));
                goto error;
            }
            out = tmp;
            outbuf = (char *)out + converted;
            outbytesleft = size - converted;

            /* iconv returns a size_t; failure is reported as (size_t)-1 */
            n = iconv(cd, (char **)&inbuf, &inbytesleft, &outbuf, &outbytesleft);
            if ((size_t)-1 == n) {
                if (EINVAL == errno) {
                    /* junk at the end of the buffer, ignore it */
                    break;
                } else if (E2BIG != errno) {
                    /* unrecoverable error */
                    fprintf(stderr, "iconv: %s\n", strerror(errno));
                    goto error;
                }
                /* increase the size of the output buffer */
                converted = size - outbytesleft;
                size <<= 1;
            } else {
                /* done */
                break;
            }
        }

        converted = (size - outbytesleft) / sizeof(wchar_t);
        out[converted] = L'\0';
        fprintf(stdout, "%ls\n", out);

        /* flush the iconv buffer */
        iconv(cd, NULL, NULL, &outbuf, &outbytesleft);

    exit:
        if (out) {
            free(out);
        }
        /* a failed iconv_open returns (iconv_t)-1, which must not be closed */
        if ((iconv_t)-1 != cd) {
            iconv_close(cd);
        }
        exit(status);

    error:
        status = EXIT_FAILURE;
        goto exit;
    }

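Because UTF-16 is a variable-length encoding, you are guessing at how big your output buffer needs to be. A proper program should handle the case where the output buffer is not large enough to hold the converted data.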
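If you would rather size the buffer up front than grow it on E2BIG, one option is to count the code points first. The sketch below is only an illustration: `utf16_code_points` and `utf32_bytes_needed` are hypothetical helper names, and it assumes the input is already an array of 16-bit code units in host byte order with no byte order mark. A surrogate pair (two code units) becomes a single UTF-32 unit, so the count can be smaller than the number of code units. Note that a target name without an explicit byte order, such as "UTF-32", may make iconv emit a byte order mark of its own, which this count does not include.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the code points in n UTF-16 code units
 * (host byte order, no byte order mark). An unpaired surrogate is
 * counted as one unit here; iconv itself would reject such input. */
size_t utf16_code_points(const uint16_t *units, size_t n)
{
    size_t count = 0;

    assert(units != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* A high surrogate followed by a low surrogate is one code point. */
        if (units[i] >= 0xD800 && units[i] <= 0xDBFF &&
            i + 1 < n &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            i++;
        }
        count++;
    }
    return count;
}

/* Bytes needed for the UTF-32 output, including room for a terminator. */
size_t utf32_bytes_needed(const uint16_t *units, size_t n)
{
    return (utf16_code_points(units, n) + 1) * sizeof(uint32_t);
}
```

For example, the four code units 0x0048, 0xD83D, 0xDE00, 0x0069 hold three code points, so `utf32_bytes_needed` asks for 16 bytes: three characters plus the terminator.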

You should also note that iconv does not NUL-terminate your output buffer.
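The termination step can be factored out as a small function. This is a sketch under assumptions: `terminate_wide` is a hypothetical name, the buffer must have been allocated with one spare wchar_t beyond the space handed to iconv, and the conversion target must match the platform's wchar_t (4-byte UTF-32 in host byte order on Linux).

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: terminate a wchar_t buffer that iconv has filled.
 * capacity_bytes is the space that was offered to iconv (the buffer itself
 * must be one wchar_t larger); bytes_left is the value iconv left in its
 * outbytesleft argument. Returns the number of wide characters written. */
size_t terminate_wide(wchar_t *buf, size_t capacity_bytes, size_t bytes_left)
{
    assert(buf != NULL && bytes_left <= capacity_bytes);

    size_t written = (capacity_bytes - bytes_left) / sizeof(wchar_t);
    buf[written] = L'\0';
    return written;
}
```

After a successful conversion you would call it with the original output size and the updated outbytesleft, and only then pass the buffer to a "%ls" conversion.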

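iconv is a stream-oriented processor, so you need to flush the iconv_t if you want to reuse it for another conversion (the sample code does this near the end). If you want to do stream processing, you would handle the EINVAL error by copying any bytes left in the input buffer to the beginning of a new input buffer and then calling iconv again.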
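The carry-over step for stream processing might look roughly like this. It is a sketch, not a complete streaming converter: `feed_chunk` is a hypothetical helper, and it assumes the caller refills the same input buffer after the carried bytes before each call.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the len bytes in buf, then move whatever
 * iconv did not consume to the front of buf and report its size in
 * *carried. Returns 0 on success or on EINVAL (incomplete trailing
 * sequence), and -1 with errno set for any other iconv error. */
int feed_chunk(iconv_t cd, char *buf, size_t len,
               char **out, size_t *outleft, size_t *carried)
{
    char *in = buf;
    size_t inleft = len;
    int saved = 0;

    assert(buf != NULL && carried != NULL);

    if (iconv(cd, &in, &inleft, out, outleft) == (size_t)-1) {
        saved = errno;
    }

    /* The unconsumed tail becomes the start of the next input chunk. */
    memmove(buf, in, inleft);
    *carried = inleft;

    if (saved != 0 && saved != EINVAL) {
        errno = saved;   /* E2BIG, EILSEQ, ... are left to the caller */
        return -1;
    }
    return 0;
}
```

With a descriptor opened as `iconv_open("UTF-32LE", "UTF-16LE")`, feeding the three bytes `'H', 0, 'i'` should convert the first character and carry one byte over; the next chunk is then appended at `buf + carried`. On E2BIG the caller can enlarge the output buffer and call again with `len` set to the carried count.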
