Win32/C: Convert line endings to DOS/Windows format

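I have the following C function in a Windows API project. It reads a file and, depending on the line endings it finds (UNIX, Mac, DOS), replaces them with the proper Windows line endings (\r\n):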

    // Standard C header needed for string functions
    #include <string.h>

    // Defines for line-ending conversion function
    #define LESTATUS INT
    #define LE_NO_CHANGES_NEEDED (0)
    #define LE_CHANGES_SUCCEEDED (1)
    #define LE_CHANGES_FAILED (-1)

    /// <summary>
    /// If the line endings in a block of data loaded from a file contain UNIX (\n) or MAC (\r) line endings, this function replaces it with DOS (\r\n) endings.
    /// </summary>
    /// <param name="inData">An array of bytes of input data.</param>
    /// <param name="inLen">The size, in bytes, of inData.</param>
    /// <param name="outData">An array of bytes to be populated with output data. This array must already be allocated</param>
    /// <param name="outLen">The maximum number of bytes that can be stored in outData.</param>
    /// <param name="bytesWritten">A pointer to an integer that receives the number of bytes written into outData.</param>
    /// <returns>
    /// If no changes were necessary (the file already contains \r\n line endings), then the return value is LE_NO_CHANGES_NEEDED.
    /// If changes were necessary, and it was possible to store the entire output buffer, the return value is LE_CHANGES_SUCCEEDED.
    /// If changes were necessary but the output buffer was too small, the return value is LE_CHANGES_FAILED.
    /// </returns>
    LESTATUS ConvertLineEndings(BYTE* inData, INT inLen, BYTE* outData, INT outLen, INT* bytesWritten)
    {
        char *posR = strstr(inData, "\r");
        char *posN = strstr(inData, "\n");

        // Case 1: the file already contains DOS/Windows line endings.
        // So, copy the input array into the output array as-is (if we can)
        // Report an error if the output array is too small to hold the input array; report success otherwise.
        if (posN != NULL && posR != NULL)
        {
            if (outLen >= inLen)
            {
                strcpy(outData, inData);
                return LE_NO_CHANGES_NEEDED;
            }
            return LE_CHANGES_FAILED;
        }
        // Case 2: the file contains UNIX line endings.
        else if (posN != NULL && posR == NULL)
        {
            int i = 0;
            int track = 0;
            for (i = 0; i < inLen; i++)
            {
                if (inData[i] != '\n')
                {
                    outData[track] = inData[i];
                    track++;
                    if (track > outLen) return LE_CHANGES_FAILED;
                }
                else
                {
                    outData[track] = '\r';
                    track++;
                    if (track > outLen) return LE_CHANGES_FAILED;
                    outData[track] = '\n';
                    track++;
                    if (track > outLen) return LE_CHANGES_FAILED;
                }
                *bytesWritten = track;
            }
        }
        // Case 3: the file contains Mac-style line endings.
        else if (posN == NULL && posR != NULL)
        {
            int i = 0;
            int track = 0;
            for (i = 0; i < inLen; i++)
            {
                if (inData[i] != '\r')
                {
                    outData[track] = inData[i];
                    track++;
                    if (track > outLen) return LE_CHANGES_FAILED;
                }
                else
                {
                    outData[track] = '\r';
                    track++;
                    if (track > outLen) return LE_CHANGES_FAILED;
                    outData[track] = '\n';
                    track++;
                    if (track > outLen) return LE_CHANGES_FAILED;
                }
                *bytesWritten = track;
            }
        }
        return LE_CHANGES_SUCCEEDED;
    }

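However, I feel this function is too long (almost 70 lines) and could probably be shortened somehow. I searched Google but could not find anything useful. Is there any function in the C library or the Windows API that would let me do a string search-and-replace, instead of manually scanning the string byte by byte in O(n) time?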

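Every character needs to be looked at exactly once, no more and no less. The very first lines of your code already make duplicate comparisons, because both strstr calls start from the same position. You could use something like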

    char *posR = strstr(inData, "\r");
    if (posR && posR[1] == '\n')
    {
        // Case 1: the file already contains DOS/Windows line endings.
    }

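and if that test fails, continue from where you left off if you did find a \r, or start again from the top if posR == NULL. But by then you have already made strstr "look at" every character all the way to the end!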

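Two additional notes:

1. Because you are looking for a single character, there is no need for strstr; use strchr next time.
2. The strXXX functions all assume that your input is a properly formed C string: it should end with a terminating 0. However, you already provide the length in inLen, so you do not have to check for zeros. If a 0 may or may not appear in the input before inLen bytes, you need to take appropriate action. Based on the purpose of this function, I am assuming you do not need to check for zeros at all.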
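
To illustrate the second note: when the length is already known, the standard memchr takes a byte count and never looks for a terminating 0. The following is only a minimal sketch. The helper name find_first_eol is hypothetical, and plain C types stand in for BYTE and INT so that it compiles without the Windows headers. It can still visit some leading bytes twice, so it demonstrates the length-aware call rather than a strict single pass.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first '\r' or '\n' in
   data[0..len), or len if the buffer contains neither. */
size_t find_first_eol(const unsigned char *data, size_t len)
{
    /* memchr scans at most len bytes and does not stop at a 0 byte. */
    const unsigned char *cr = memchr(data, '\r', len);

    /* If a '\r' was found, a '\n' only matters when it comes earlier,
       so the second scan is limited to the bytes before that '\r'. */
    size_t limit = cr ? (size_t)(cr - data) : len;
    const unsigned char *lf = memchr(data, '\n', limit);

    return lf ? (size_t)(lf - data) : limit;
}
```

A return value equal to len means the buffer has no line endings at all, so it can be copied unchanged.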

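My suggestion: look at every character from the start, and only take action when it is either a \r or a \n. If the first one you encounter is a \r and the next character is a \n, then you are done. (This assumes the line endings are not "mixed".)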
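
The detection step in this suggestion can be sketched on its own. This is an illustration under the same "not mixed" assumption, not part of the original answer; the names eol_kind and detect_eol are hypothetical, and plain C types are used instead of BYTE and INT.

```c
#include <stddef.h>

/* Hypothetical result type for this sketch. */
enum eol_kind { EOL_NONE, EOL_CRLF, EOL_CR, EOL_LF };

/* Walks the buffer once and stops at the first '\r' or '\n'.
   Assumes the endings are not mixed, so the first one found is taken
   to be representative of the whole buffer. */
enum eol_kind detect_eol(const unsigned char *data, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (data[i] == '\n')
            return EOL_LF;
        if (data[i] == '\r') {
            /* Only peek at the next byte if it is inside the buffer. */
            if (i + 1 < len && data[i + 1] == '\n')
                return EOL_CRLF;
            return EOL_CR;
        }
    }
    return EOL_NONE;
}
```

EOL_CRLF and EOL_NONE both mean the data can be copied as-is; only EOL_CR and EOL_LF require the conversion loop.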

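If you do not return from that first loop, then there is something other than \r\n in the data, and you can continue from that point on. But you still only have to act on a \r or a \n! So I propose this shorter code (with an enum instead of your defines):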

    #include <string.h>   /* memcpy */
    #include <windows.h>  /* BYTE, INT */

    enum LEStatus_e { LE_CHANGES_FAILED = -1, LE_NO_CHANGES_NEEDED, LE_CHANGES_SUCCEEDED };

    enum LEStatus_e ConvertLineEndings(BYTE *inData, INT inLen, BYTE *outData, INT outLen, INT *bytesWritten)
    {
        INT sourceIndex = 0, destIndex;

        if (outLen < inLen)
            return LE_CHANGES_FAILED;

        /* Find first occurrence of either \r or \n.
           This will return immediately for No Change Needed. */
        while (sourceIndex < inLen)
        {
            if (inData[sourceIndex] == '\r')
            {
                if (sourceIndex < inLen-1 && inData[sourceIndex+1] == '\n')
                {
                    memcpy(outData, inData, inLen);
                    *bytesWritten = inLen;
                    return LE_NO_CHANGES_NEEDED;
                }
                break;
            }
            if (inData[sourceIndex] == '\n')
                break;
            sourceIndex++;
        }

        /* We processed this far already: */
        memcpy(outData, inData, sourceIndex);
        if (sourceIndex == inLen)
        {
            /* No line endings at all: the copy above is the whole output. */
            *bytesWritten = inLen;
            return LE_NO_CHANGES_NEEDED;
        }

        destIndex = sourceIndex;
        while (sourceIndex < inLen)
        {
            switch (inData[sourceIndex])
            {
                case '\n':
                case '\r':
                    sourceIndex++;
                    /* A line ending needs room for two bytes. */
                    if (destIndex + 2 > outLen)
                        return LE_CHANGES_FAILED;
                    outData[destIndex++] = '\r';
                    outData[destIndex++] = '\n';
                    break;
                default:
                    /* An ordinary character needs room for one byte. */
                    if (destIndex >= outLen)
                        return LE_CHANGES_FAILED;
                    outData[destIndex++] = inData[sourceIndex++];
            }
        }
        *bytesWritten = destIndex;
        return LE_CHANGES_SUCCEEDED;
    }

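There are some ancient and rare "plain text" formats that use other constructions; from memory, something like \r\n\n. If you want to be able to clean up anything, you can add a skip for all \r's that follow a single \n, and likewise for the reverse case. That will also clean up any "mixed" line endings, because it treats \r\n correctly as well.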
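
One possible reading of that skip rule, as a rough sketch: after a \r, every directly following \n is swallowed, and after a \n, every directly following \r is swallowed, so each such run produces exactly one \r\n. The function name normalize_eol, the size_t-based signature, and the (size_t)-1 failure value are assumptions made for this illustration, not part of the code above.

```c
#include <stddef.h>

/* Hypothetical helper. Writes the normalized text to out[0..out_cap) and
   returns the number of bytes written, or (size_t)-1 if out is too small. */
size_t normalize_eol(const unsigned char *in, size_t in_len,
                     unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        unsigned char c = in[i++];

        if (c != '\r' && c != '\n') {
            /* Ordinary character: needs room for one byte. */
            if (o + 1 > out_cap)
                return (size_t)-1;
            out[o++] = c;
            continue;
        }

        /* Swallow the run of "opposite" characters that follows. */
        {
            unsigned char other = (c == '\r') ? '\n' : '\r';
            while (i < in_len && in[i] == other)
                i++;
        }

        /* Emit one canonical line ending for the whole run. */
        if (o + 2 > out_cap)
            return (size_t)-1;
        out[o++] = '\r';
        out[o++] = '\n';
    }
    return o;
}
```

Note the trade-off of this reading: \r\n\n collapses to a single line ending, so a \r\n line followed by a blank line that ends in a bare \n loses the blank line. Two consecutive \n characters still produce two line endings.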

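Here is what I think is simpler code, in about half as many lines. Of course, as Ben Voigt pointed out, you cannot beat O(n) time, so I did not try to. I did not use any library functions, because it seemed simpler this way, and I doubt that extra function calls would make the code any faster.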

    enum lestatus {
        le_no_changes_needed = 0,
        le_changes_succeeded = 1,
        le_changes_failed = -1
    };

    enum lestatus ConvertLineEndings(char *indata, int inlen, char *outdata, int outlen)
    {
        int outpos = 0, inpos;
        enum lestatus it_changed = le_no_changes_needed;

        for (inpos = 0; inpos < inlen; inpos++) {
            if (outpos + 1 > outlen)
                return le_changes_failed;
            if (indata[inpos] != '\r' && indata[inpos] != '\n') {
                /* it is an ordinary character, just copy it */
                outdata[outpos++] = indata[inpos];
            } else if (outpos + 2 > outlen) {
                return le_changes_failed;
            } else if (inpos + 1 < inlen &&  /* do not read past the end of indata */
                       (indata[inpos+1] == '\r' || indata[inpos+1] == '\n') &&
                       indata[inpos] != indata[inpos+1]) {
                /* it is \r\n or \n\r, output it in canonical order */
                if (indata[inpos] == '\n')
                    it_changed = le_changes_succeeded;  /* \n\r was reordered */
                outdata[outpos++] = '\r';
                outdata[outpos++] = '\n';
                inpos++;  /* skip the second character */
            } else {
                /* it is a mac or unix line ending, convert to dos */
                outdata[outpos++] = '\r';
                outdata[outpos++] = '\n';
                it_changed = le_changes_succeeded;
            }
        }
        return it_changed;
    }

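The biggest differences in my code are:

- I used the increment operators.
- I avoided library functions for simplicity.
- My function handles files with mixed line endings correctly (in my interpretation).
- I prefer lowercase identifiers. This is obviously a style preference.
- I prefer enums to #defines. Also a style preference.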
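
Both versions above fail when the output buffer is too small, and neither tells the caller how large it has to be. One way around that is a sizing pass before allocating. This is a minimal sketch; the helper name converted_length is hypothetical, and it assumes that only \r\n counts as an existing pair. For input whose endings are not mixed it gives the exact size, and in the worst case (every byte a lone \r or \n) the result is twice the input length.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes the converted text will occupy,
   treating "\r\n" as already correct and every lone '\r' or '\n' as a
   line ending that grows from one byte to two. */
size_t converted_length(const unsigned char *in, size_t in_len)
{
    size_t i, needed = in_len;

    for (i = 0; i < in_len; i++) {
        if (in[i] == '\r') {
            if (i + 1 < in_len && in[i + 1] == '\n')
                i++;          /* "\r\n" pair: no growth, skip the '\n' */
            else
                needed++;     /* lone '\r' becomes "\r\n" */
        } else if (in[i] == '\n') {
            needed++;         /* lone '\n' becomes "\r\n" */
        }
    }
    return needed;
}
```

A caller could allocate converted_length(data, len) bytes and pass that value as the output size. Because the second version above also pairs \n\r, this count is a safe overestimate there rather than an exact figure.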
