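What is the correct way to read a text file into a buffer in C?

I am working with small text files that I want to read into a buffer while I process them, so I came up with the following code: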

    ...
    char source[1000000];

    FILE *fp = fopen("TheFile.txt", "r");
    if(fp != NULL)
    {
        while((symbol = getc(fp)) != EOF)
        {
            strcat(source, &symbol);
        }
        fclose(fp);
    }
    ...

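Is this the correct way to put the contents of the file into the buffer, or am I abusing strcat()?

I then iterate through the buffer like this: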

    for(int x = 0; (c = source[x]) != '\0'; x++)
    {
        //Process chars
    }

    char source[1000000];

    FILE *fp = fopen("TheFile.txt", "r");
    if(fp != NULL)
    {
        while((symbol = getc(fp)) != EOF)
        {
            strcat(source, &symbol);
        }
        fclose(fp);
    }

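There are quite a few things wrong with this code:

It is very slow (you are filling the buffer one character at a time).
If the file size exceeds sizeof(source), it is prone to buffer overflows.
Really, when you look at it closely, this code should not work at all. As the man page states: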

The strcat() function appends a copy of the null-terminated string s2 to the end of the null-terminated string s1, then adds a terminating '\0'.

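You are appending a character (not a NUL-terminated string!) to a string that may or may not be NUL-terminated. The only case in which I can imagine this working as the man page describes is if every character in the file were followed by a NUL, in which case the exercise would be rather pointless. So yes, this is most definitely a terrible abuse of strcat().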
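To make the man-page requirement concrete, here is a minimal sketch of what strcat() would need in order to append one character legally: a destination that is already NUL-terminated, and a source that is a real one-character string. The helper name append_char and its cap parameter are made up for illustration; this only shows the contract and is not a recommendation to build a buffer this way.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the question): appends ch to the string in
 * dst, whose total capacity is cap bytes. Returns 0 on success, -1 if the
 * character would not fit. dst must already hold a NUL-terminated string. */
int append_char(char *dst, size_t cap, char ch)
{
    size_t len = strlen(dst);        /* only valid because dst is terminated */

    if (len + 2 > cap) {             /* need room for ch and the final '\0' */
        return -1;
    }

    char piece[2] = { ch, '\0' };    /* a proper one-character string */
    strcat(dst, piece);              /* both arguments now meet the contract */
    return 0;
}
```

Note that the destination has to start out as a valid string, for example char source[1000000] = ""; the question's array is never initialized.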

Here are two alternatives to consider.

If you know the maximum buffer size ahead of time:

    #include <stdio.h>

    #define MAXBUFLEN 1000000

    char source[MAXBUFLEN + 1];
    FILE *fp = fopen("foo.txt", "r");
    if (fp != NULL) {
        /* fread returns the number of elements (here: bytes) actually read. */
        size_t newLen = fread(source, sizeof(char), MAXBUFLEN, fp);
        if ( ferror( fp ) != 0 ) {
            fputs("Error reading file", stderr);
        } else {
            source[newLen++] = '\0'; /* Just to be safe. */
        }

        fclose(fp);
    }

Or, if you do not:

    #include <stdio.h>
    #include <stdlib.h>

    char *source = NULL;
    FILE *fp = fopen("foo.txt", "r");
    if (fp != NULL) {
        /* Go to the end of the file. */
        if (fseek(fp, 0L, SEEK_END) == 0) {
            /* Get the size of the file. */
            long bufsize = ftell(fp);
            if (bufsize == -1) { /* Error */ }

            /* Allocate our buffer to that size, plus one byte for the '\0'. */
            source = malloc(sizeof(char) * (bufsize + 1));
            if (source == NULL) { /* Error */ }

            /* Go back to the start of the file. */
            if (fseek(fp, 0L, SEEK_SET) != 0) { /* Error */ }

            /* Read the entire file into memory. */
            size_t newLen = fread(source, sizeof(char), bufsize, fp);
            if ( ferror( fp ) != 0 ) {
                fputs("Error reading file", stderr);
            } else {
                source[newLen++] = '\0'; /* Just to be safe. */
            }
        }
        fclose(fp);
    }

    free(source); /* Don't forget to call free() later! */

Yes, you would probably be arrested for your terrible abuse of strcat!

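Take a look at getline(). It reads the data one line at a time, and importantly the number of characters read can be bounded so that you do not overflow the buffer (fgets() takes an explicit limit, while POSIX getline() grows its own buffer as needed).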

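strcat is relatively slow because it has to search the whole string for its end on every character insertion. You would normally keep a pointer to the current end of the string storage and pass that to getline as the position to read the next line into.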
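The pointer-to-the-end idea above can be sketched roughly as follows. This version uses fgets() rather than getline(), because POSIX getline() manages its own allocation and cannot be pointed into the middle of an existing buffer. The helper name read_lines and its parameters are assumptions made for the example, and it assumes a text file without embedded NUL bytes.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads lines from fp into buf (capacity cap bytes),
 * writing each line at the current end instead of rescanning with strcat.
 * Returns the number of characters stored, not counting the final '\0'. */
size_t read_lines(FILE *fp, char *buf, size_t cap)
{
    char *end = buf;       /* current end of the stored text */
    size_t left = cap;     /* bytes still free, including room for '\0' */

    if (cap == 0) {
        return 0;
    }
    buf[0] = '\0';         /* valid empty string even if the file is empty */

    while (left > 1) {
        /* fgets takes an int; it stores at most chunk - 1 characters. */
        int chunk = left > (size_t)INT_MAX ? INT_MAX : (int)left;

        if (fgets(end, chunk, fp) == NULL) {
            break;         /* end of file or read error */
        }
        size_t n = strlen(end);   /* length of the piece just read */
        end += n;
        left -= n;
    }

    *end = '\0';           /* re-terminate in case a failed read touched it */
    return (size_t)(end - buf);
}
```

A call would look like size_t n = read_lines(fp, source, sizeof source); anything that does not fit in the buffer is simply left unread in the stream.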

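See this article from JoelOnSoftware for why you do not want to use strcat.

Look at fread for an alternative. Use it with a size of 1 when you are reading bytes or characters.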
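As a rough illustration of calling fread with an element size of 1, so that its return value is a plain byte count, something like the following should do. The helper name read_bytes and its parameters are invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: fills buf with at most cap - 1 bytes from fp and
 * NUL-terminates it. Returns the number of bytes stored. */
size_t read_bytes(FILE *fp, char *buf, size_t cap)
{
    size_t total = 0;

    if (cap == 0) {
        return 0;
    }

    while (total < cap - 1) {
        /* Element size 1: the return value counts bytes, not records. */
        size_t got = fread(buf + total, 1, cap - 1 - total, fp);
        if (got == 0) {
            break;         /* end of file or error; ferror(fp) tells which */
        }
        total += got;
    }

    buf[total] = '\0';
    return total;
}
```

Swapping the two middle arguments (one element of cap - 1 bytes) would make fread return 0 for any file shorter than the buffer, which is why the size of 1 matters.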

Why don't you just use the array of chars you already have? This ought to do it:

    source[i] = getc(fp);
    i++;

Not tested, but it should work. And yes, it could be better implemented with fread; I'll leave that as an exercise for the reader.
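Spelled out with a loop, a bounds check, and a terminator, the indexing approach might look like the sketch below. The helper name read_chars and the cap parameter are assumptions for the example; the one detail that really matters is that the result of getc() is kept in an int so EOF can be distinguished from a data byte.

```c
#include <stdio.h>

/* Hypothetical helper: copies characters from fp into source by index,
 * stopping at end of file or when only the terminator slot is left.
 * Returns the number of characters stored. */
size_t read_chars(FILE *fp, char *source, size_t cap)
{
    size_t i = 0;
    int c;                 /* int, not char, so EOF is representable */

    if (cap == 0) {
        return 0;
    }

    while (i < cap - 1 && (c = getc(fp)) != EOF) {
        source[i] = (char)c;
        i++;
    }

    source[i] = '\0';      /* lets the '\0'-terminated scan loop work */
    return i;
}
```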

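    #include <stdio.h>
    #include <stdlib.h>

    #define DEFAULT_SIZE 100
    #define STEP_SIZE 100

    /* fp is assumed to be a FILE * that has already been opened. */
    char *buffer = malloc(DEFAULT_SIZE);   /* heap block, so realloc is legal */
    size_t buffer_sz = DEFAULT_SIZE;
    size_t i = 0;
    int c;                                 /* int, so EOF can be detected */
    if (buffer == NULL) { exit(1); }
    while ((c = fgetc(fp)) != EOF) {       /* test the read, not feof() */
        buffer[i] = (char)c;
        i++;
        if (i >= buffer_sz) {
            buffer_sz += STEP_SIZE;
            void *tmp = buffer;
            buffer = realloc(buffer, buffer_sz);
            if (buffer == NULL) { free(tmp); exit(1); } /* ensure we don't have a memory leak */
        }
    }
    buffer[i] = 0;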

If you are on a Linux system, once you have the file descriptor you can get a lot of information about the file using fstat().

http://linux.die.net/man/2/stat

So you might have

    #include <sys/stat.h>

    int main(void)
    {
        struct stat stat;
        int fd;
        /* get file descriptor */
        fstat(fd, &stat);
        /* the size of the file is now in stat.st_size */
        return 0;
    }

This avoids seeking to the end of the file and back to the beginning.
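Putting the fstat() idea together with an actual read, a POSIX-only sketch could look like this. The helper name read_file_fstat and its out_len parameter are invented for the example, and it does not retry reads interrupted by signals.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a malloc'd, NUL-terminated copy of the file
 * at path, or NULL on failure. The caller must free() the result. */
char *read_file_fstat(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size < 0) {
        close(fd);
        return NULL;
    }

    size_t size = (size_t)st.st_size;
    char *buf = malloc(size + 1);          /* +1 for the terminator */
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    size_t total = 0;
    while (total < size) {                 /* read() may return short counts */
        ssize_t got = read(fd, buf + total, size - total);
        if (got < 0) {
            free(buf);
            close(fd);
            return NULL;
        }
        if (got == 0) {
            break;                         /* file is shorter than reported */
        }
        total += (size_t)got;
    }
    close(fd);

    buf[total] = '\0';
    if (out_len != NULL) {
        *out_len = total;
    }
    return buf;
}
```

The size from fstat() is only meaningful for regular files; pipes and some special files report 0, in which case a growing buffer is the safer route.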

I think what you want is fread:

http://www.cplusplus.com/reference/clibrary/cstdio/fread/

Have you considered mmap()? You can read directly from the file as if it were already in memory.

http://beej.us/guide/bgipc/output/html/multipage/mmap.html
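For a feel of what that looks like, here is a small POSIX sketch that maps a file read-only and scans it in place. The helper name count_newlines_mmap is made up for the example, and counting newlines just stands in for whatever per-character processing you need.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: counts '\n' characters by scanning a read-only
 * mapping of the file. Returns -1 on failure. */
long count_newlines_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {                 /* mmap rejects a length of 0 */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                             /* the mapping outlives the fd */
    if (data == MAP_FAILED) {
        return -1;
    }

    long lines = 0;
    for (size_t i = 0; i < len; i++) {     /* bounded by len, not by '\0' */
        if (data[i] == '\n') {
            lines++;
        }
    }

    munmap((void *)data, len);
    return lines;
}
```

One caveat: the mapped bytes are not NUL-terminated, so a loop that stops at '\0' (like the one in the question) has to be rewritten to stop at the file length instead.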
