C file input/output: Difference between revisions

 
The standard function <code>'''getchar'''</code>, also defined in <code><stdio.h></code>, takes no arguments, and is equivalent to <code>fgetc([[stdin]])</code>.
 
===The EOF pitfall===
A common mistake when using <code>fgetc</code>, <code>getc</code>, or <code>getchar</code> is to assign the result to a variable of type <code>char</code> ''before'' comparing it to <code>EOF</code>. The first snippet below exhibits this mistake; the second shows the correct approach:
 
<code>
char c;
while ((c = getchar()) != EOF) { /* Bad! c is a char, so EOF cannot be told apart from a real character */
putchar(c);
}
</code>
 
<code>
int c;
while ((c = getchar()) != EOF) { /* Okay! c is an int, so it can hold every character value and EOF */
putchar(c);
}
</code>
 
Consider a system in which the type <code>char</code> is 8&nbsp;bits wide, representing 256&nbsp;different values. <code>getchar</code> may return any of the 256&nbsp;possible characters, and it also may return <code>EOF</code> to indicate end-of-file, for a total of 257 different possible return values.
 
When <code>getchar</code>'s result is assigned to a <code>char</code>, which can represent only 256 different values, some information is necessarily lost: when packing 257&nbsp;items into 256&nbsp;slots, there [[Pigeonhole principle|must be a collision]]. The <code>EOF</code> value, when converted to <code>char</code>, becomes indistinguishable from whichever one of the 256 characters shares its numerical value. If type <code>char</code> is signed and that character is found in the file, the first example above may mistake it for an end-of-file indicator. Just as bad, if type <code>char</code> is unsigned, then because <code>EOF</code> is negative, it can never compare equal to any unsigned <code>char</code> value, so the first example will not terminate at end-of-file. It will loop forever, repeatedly printing the character that results from converting <code>EOF</code> to <code>char</code>.
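
The collision can be made concrete with a short sketch. The helper names <code>char_hides_eof</code>, <code>int_hides_eof</code> and <code>eof_survives_in_char</code> are hypothetical, introduced only for illustration. The sketch assumes an implementation where <code>int</code> is wider than <code>char</code>; the result of converting an out-of-range value to a signed <code>char</code> is implementation-defined, so the exact colliding byte may vary.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if a byte read from a file would compare
   equal to EOF after being stored in a plain char, as in the "bad" loop. */
int char_hides_eof(unsigned char byte)
{
    int from_getchar = byte;       /* getchar returns the byte as a nonnegative int */
    char c = (char)from_getchar;   /* implementation-defined if the value does not fit */
    return c == EOF;               /* c is promoted back to int for the comparison */
}

/* Hypothetical helper: the same check with the byte kept in an int, as in
   the "okay" loop. Assumes int is wider than char, so this is never true. */
int int_hides_eof(unsigned char byte)
{
    int c = byte;
    return c == EOF;
}

/* Hypothetical helper: returns 1 if EOF still compares equal to EOF after a
   round trip through plain char. Where char is unsigned this returns 0,
   which is why the "bad" loop never terminates on such systems. */
int eof_survives_in_char(void)
{
    char c = (char)EOF;
    return c == EOF;
}
```

On a typical implementation with a signed 8-bit <code>char</code> and <code>EOF</code> defined as −1, <code>char_hides_eof(0xFF)</code> is expected to return 1, while <code>int_hides_eof(0xFF)</code> returns 0.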
 
On systems where <code>int</code> and <code>char</code> are the same size, even the "good" example will suffer from the indistinguishability of <code>EOF</code> and some character's value. The proper way to handle this situation is to check <code>[[feof]]</code> and <code>[[ferror]]</code> on the stream after <code>getchar</code> returns <code>EOF</code>. If <code>feof</code> indicates that end-of-file has not been reached, and <code>ferror</code> indicates that no errors have occurred, then the <code>EOF</code> returned by <code>getchar</code> can be assumed to represent an actual character. These extra checks are rarely done, because most programmers assume that their code will never need to run on one of these "big <code>char</code>" systems.
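
The extra checks described above might look like the following minimal sketch. The function <code>copy_stream</code> is a hypothetical helper, not part of the standard library, and it uses <code>getc</code> on an arbitrary stream rather than <code>getchar</code> so that it can be applied to any <code>FILE *</code>; passing <code>stdin</code> gives the <code>getchar</code> case.

```c
#include <stdio.h>

/* Hypothetical helper: copies every character from in to out.
   Returns the number of characters copied, or -1 on an I/O error. */
long copy_stream(FILE *in, FILE *out)
{
    long count = 0;

    for (;;) {
        int c = getc(in);

        if (c == EOF) {
            if (ferror(in))
                return -1;     /* a read error occurred */
            if (feof(in))
                break;         /* genuine end-of-file */
            /* Neither indicator is set: the value equal to EOF is a real
               character. This can happen only where int is no wider than char. */
        }

        putc(c, out);
        if (ferror(out))
            return -1;         /* a write error occurred */
        count++;
    }
    return count;
}
```

On common systems, where <code>int</code> is wider than <code>char</code>, the branch for a character equal to <code>EOF</code> is never taken and the function behaves like the ordinary "good" loop.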
 
===External link===
*[http://c-faq.com/stdio/getcharc.html Question 12.1] in the C FAQ: using <code>char</code> to hold <code>getc</code>'s return value
 
==Writing to a stream using fputc==