
Re: fgets/strtok and LINE_MAX

To: Don Cragun <dcragun@xxxxxxxxx>
Subject: Re: fgets/strtok and LINE_MAX
From: Don Cragun <dcragun@xxxxxxxxx>
Date: Mon, 9 Nov 2009 21:03:46 -0800
Cc: shwaresyst@xxxxxxx, austin-group-l@xxxxxxxxxxxxx
References: <ee3ffd5cb61c11046aea665c63c5adbd@austingroupbugs.net> <8CC2B9DD22E07E8-AF4-14C01@webmail-m088.sysops.aol.com> <20091105102458.GA11253@squonk.masqnet> <8CC2C4B8D552A03-2DFC-9667@webmail-m095.sysops.aol.com> <BB989B9C-79E4-438F-9A65-5BB0588375AA@sonic.net>
On Nov 5, 2009, at 3:26 PM, Don Cragun wrote:

> ... ... ...
> I will work off-line to come up with a better example for fgets() before
> the conference call next Thursday. I don't know yet if I will try to
> make it a tutorial on how to process arbitrary length lines or if I will
> just add comments noting that lines can be arbitrarily long and state
> that the example has limitations by ignoring this possibility.
> ... ... ...

As promised...

The Utility Limits rationale (subclause C.1.3 on P3639, L123638-123646)
already warns application writers that it is not a good idea to
"blindly" allocate an array of size [LINE_MAX] and assume that it
will be large enough to hold a full input line. Further guidance
later in that section (P3640, L123685-123701) warns about creating
lines longer than LINE_MAX and needing to be careful if those lines
will later become input to a utility that is specified to process
text files.

I believe that it is safe to assume that applications using fgets()
already assume that any stream being read with fgets() is connected
to a text file. If the application does not have reason to believe
that the input is from a text file, it should be using functions
like read() or fread() rather than functions like fgets() and
getline().

I will also work on an update to get rid of the references to LINE_MAX
in the strtok() example. The strtok() function works on strings
whether or not they are lines. There is no need for the example to
imply that the string being parsed contains one or more lines.
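
To illustrate the point, here is a minimal sketch of strtok() applied to a string that is not a line at all. The helper name count_tokens() and the PATH-like input in the usage note are my own assumptions for illustration; they are not taken from the standard's example.

```c
#include <string.h>

/*
 * Hypothetical helper (not from the standard): count the tokens in s
 * that are separated by any byte in delims. strtok() overwrites the
 * delimiters it finds with NUL bytes, so s must be writable.
 */
static int count_tokens(char *s, const char *delims)
{
    int n = 0;

    /* The first call passes the string; later calls pass NULL. */
    for (char *tok = strtok(s, delims); tok != NULL;
         tok = strtok(NULL, delims))
        n++;
    return n;
}
```

Called on a writable copy of "/usr/bin:/bin::/usr/local/bin" with ":" as the delimiter set, this counts three tokens, because strtok() treats a run of adjacent delimiters as a single separator. No buffer sized by {LINE_MAX} is involved anywhere.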

Cheers,
Don

<pre>
As has been pointed out on the austin-group-l alias, there are
several problems with the current example in the description of
fgets(). Due to the comments discussed on the alias, I propose
changing the example on P852, L28298-28308 in the EXAMPLES section
for fgets() from:
    The following example uses fgets() to read each line of
    input. {LINE_MAX}, which defines the maximum size of the
    input line, is defined in the <limits.h> header.

        #include <stdio.h>
        ...
        char line[LINE_MAX];
        ...
        while (fgets(line, LINE_MAX, fp) != NULL) {
            ...
        }
        ...
to:
    The following example uses fgets() to read lines of input.
    It assumes that the file it is reading is a text file and
    that lines in this text file are no longer than 16384 (or
    {LINE_MAX} if it is less than 16384 on the implementation
    where it is running) bytes long. (Note that the standard
    utilities have no line length limit if sysconf(_SC_LINE_MAX)
    returns -1 without setting errno.)

        #include <limits.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        #define MYLIMIT 16384

        char *line;
        int line_max;

        if (LINE_MAX >= MYLIMIT) {
            // Use maximum line size of MYLIMIT. If LINE_MAX is
            // bigger than our limit, sysconf() can't report a
            // smaller limit.
            line_max = MYLIMIT;
        } else {
            long limit = sysconf(_SC_LINE_MAX);
            line_max = (limit < 0 || limit > MYLIMIT) ? MYLIMIT : (int) limit;
        }

        // line_max + 1 leaves room for the NUL byte added by fgets().
        line = malloc(line_max + 1);
        if (line == NULL) {
            // out of space
            ...
            return error;
        }
        while (fgets(line, line_max + 1, fp) != NULL) {
            // Verify that a full line has been read...
            // If not report an error or prepare to treat the
            // next time through the loop as a read of a
            // continuation of the current line.
            ...
            // Process line...
            ...
        }
        free(line);
        ...
</pre>
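
For anyone who wants to take the "continuation of the current line" route mentioned in the comments of the proposed example, the following is a minimal sketch of one way to do it using only fgets() and realloc(). It is not part of the proposed text. The helper name read_full_line() and the starting buffer size of 128 bytes are my own assumptions, and the sketch treats a buffer that filled up without a <newline> as a partial line to be extended.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the proposed standard text).
 * Returns a malloc()ed string holding the next line, including its
 * <newline> if it has one, or NULL at end-of-file, on a read error,
 * or when memory runs out. The caller frees the result.
 */
static char *read_full_line(FILE *fp)
{
    size_t cap = 128;           /* arbitrary starting size (assumption) */
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    /* Each call appends at buf + len, continuing the same line. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            return buf;         /* a full line has been read */

        if (len + 1 == cap) {   /* buffer full, no <newline>: grow it */
            char *tmp;

            if (cap > INT_MAX / 2) {    /* fgets() takes an int size */
                free(buf);
                return NULL;
            }
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
    }

    /* fgets() returned NULL: end-of-file or a read error. */
    if (len > 0 && !ferror(fp))
        return buf;             /* final line had no <newline> */
    free(buf);
    return NULL;
}
```

A caller would loop with something like `while ((line = read_full_line(fp)) != NULL) { ...; free(line); }`. The trade-off against the fixed-size buffer in the proposed example is that memory use is bounded only by the longest line in the input, so an application that needs a hard limit should add its own cap.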
