[GNU Manual] [POSIX requirement] [Linux man] [FreeBSD man]
Summary
wc - print newline, word, and byte counts
Lines of code: 875
Principal syscall: write()
Support syscalls: open(), close()
Options: 13 (5 short, 8 long)
Descended from wc introduced in Version 1 UNIX (1971)
Added to Textutils in November 1992 [First version]
Number of revisions: 186
compute_number_width()
- Computes an optimal output print width for counters
get_input_fstatus()
- stat() input files and retain results
wc()
- Top level counting procedure for all files
wc_file()
- Counts a single file
write_counts()
- Print provided file count results
argv_iter_init_argv()
- Creates an argument iterator for argv (from gnulib)
die()
- Exit with mandatory non-zero error and message to stderr
error()
- Outputs error message to standard error with possible process termination
Setup
wc keeps track of the counts and other variables as globals, including:
have_read_stdin
- Flag set if input is read from STDIN
max_line_length
- Tracks the longest line seen
number_width
- The width of the output display for counts
page_size
- The number of bytes in a memory page (a la getpagesize())
print_bytes
- Flag set to display byte counts
print_chars
- Flag set to display character counts
print_linelength
- Flag set to display line length
print_lines
- Flag set to display line counts
print_words
- Flag set to display word counts
total_bytes
- The total number of bytes across all files
total_chars
- The total number of characters across all files
total_lines
- The total number of lines across all files
total_words
- The total number of words across all files
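To make the role of number_width concrete, here is a minimal sketch of how a column width could be derived from the inputs. It assumes the width is the number of decimal digits in the combined byte size of the files, with a caller-supplied floor. The names digits_needed() and pick_number_width() are hypothetical and do not appear in wc.c.

```c
#include <stdint.h>

/* Hypothetical helper: decimal digits needed to print `total`.
   Any value below 10 still needs one column. */
int digits_needed(uintmax_t total)
{
    int width = 1;
    for (; total >= 10; total /= 10)
        width++;
    return width;
}

/* Hypothetical helper: choose a column width from the combined size of
   all inputs (an assumption of this sketch), never below minimum_width. */
int pick_number_width(const uintmax_t *sizes, int nfiles, int minimum_width)
{
    uintmax_t total = 0;
    for (int i = 0; i < nfiles; i++)
        total += sizes[i];

    int width = digits_needed(total);
    return width < minimum_width ? minimum_width : width;
}
```

Sizing the column from the grand total rather than from each file means every row, including the final totals row, lines up without a second pass over the output.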
main() introduces a few local variables:
**files
- The input file names
*files_from
- The name of the reference file holding all the target file names
fstatus
- A custom structure holding stat() info and execution status
nfiles
- The number of files to count
ok
- The final return status
optc
- The character for the next option to process
tok
- Token structure based on obstacks for input file list
Parsing
Parsing user options defines the execution parameters, specifically:
- Are we counting by bytes, characters, words, or lines?
- Should the maximum line length be reported?
- Is there a separate file which lists the target files to analyze?
The only parsing error caught is when the user provides files both on the command line and through a list file (--files0-from). The result is a short error message followed by the usage instructions.
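The parsing stage can be sketched with getopt_long(). This is an illustration under assumptions, not the code from wc.c: struct wc_opts and parse_wc_options() are hypothetical names (wc itself stores these flags in globals), and the sketch returns -1 where the utility would print its message and usage text.

```c
#include <getopt.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the parsed settings. */
struct wc_opts {
    bool print_lines, print_words, print_chars, print_bytes, print_linelength;
    const char *files_from;            /* value of --files0-from, or NULL */
};

/* Internal code for the long-only option; above CHAR_MAX so it cannot
   collide with any short option character. */
enum { FILES0_FROM_OPTION = CHAR_MAX + 1 };

/* Returns 0 on success, -1 on a usage error. */
int parse_wc_options(int argc, char **argv, struct wc_opts *o)
{
    static const struct option longopts[] = {
        {"bytes",           no_argument,       NULL, 'c'},
        {"chars",           no_argument,       NULL, 'm'},
        {"lines",           no_argument,       NULL, 'l'},
        {"words",           no_argument,       NULL, 'w'},
        {"max-line-length", no_argument,       NULL, 'L'},
        {"files0-from",     required_argument, NULL, FILES0_FROM_OPTION},
        {NULL, 0, NULL, 0}
    };
    int optc;

    *o = (struct wc_opts){0};
    optind = 1;                        /* restart the scan on every call */

    while ((optc = getopt_long(argc, argv, "clLmw", longopts, NULL)) != -1) {
        switch (optc) {
        case 'c': o->print_bytes = true; break;
        case 'm': o->print_chars = true; break;
        case 'l': o->print_lines = true; break;
        case 'w': o->print_words = true; break;
        case 'L': o->print_linelength = true; break;
        case FILES0_FROM_OPTION: o->files_from = optarg; break;
        default:  return -1;           /* unknown option */
        }
    }

    /* With no selection, fall back to lines, words, and bytes. */
    if (!(o->print_lines || o->print_words || o->print_chars
          || o->print_bytes || o->print_linelength))
        o->print_lines = o->print_words = o->print_bytes = true;

    /* File operands and a list file are mutually exclusive. */
    if (o->files_from != NULL && optind < argc)
        return -1;

    return 0;
}
```

After a successful call, any remaining operands start at argv[optind], which is where the file list would be read from when no list file is given.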
Execution
The most complicated aspect of wc is that reading files may use one of several procedures, each optimized for how the user is counting. This is reflected in the wc() function, which accounts for over 550 lines (60%) of the source file. The high-level procedure for the utility is:
- Access the file list (source file or command line arguments)
- Tokenize the file list (if using a file list)
- Get the status of each file using (f)stat() and define fstatus structures
- Compute optimal number widths based on all files
- While there are still files to process:
  - Verify file accessibility (exists? name? openable?)
  - Read each element from the file (byte, word, line)
  - Check for delimiters to count each element correctly
  - At the end of a file, print the file counts and add them to the total counts
- Repeat above for all input files
- Print the totals
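The counting and totaling steps above can be sketched as follows. This is a simplified single-pass version that assumes single-byte characters and whitespace-delimited words; the real wc() selects among several specialized paths. struct counts, count_buffer(), add_to_totals(), and format_counts() are hypothetical names used only for illustration, and the output layout is an assumption of the sketch.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-file (or grand total) counters. */
struct counts {
    uintmax_t lines, words, bytes;
};

/* Count newlines, whitespace-delimited words, and bytes in one buffer,
   adding the results to *c. */
void count_buffer(const char *buf, size_t len, struct counts *c)
{
    bool in_word = false;

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)buf[i];

        if (ch == '\n')
            c->lines++;

        if (isspace(ch)) {
            in_word = false;           /* a delimiter ends the current word */
        } else if (!in_word) {
            in_word = true;            /* first byte of a new word */
            c->words++;
        }
    }
    c->bytes += len;
}

/* Fold one file's counts into the running totals. */
void add_to_totals(struct counts *total, const struct counts *file)
{
    total->lines += file->lines;
    total->words += file->words;
    total->bytes += file->bytes;
}

/* Format one result row with right-aligned columns of the given width. */
int format_counts(char *out, size_t outlen, int width,
                  const struct counts *c, const char *name)
{
    return snprintf(out, outlen, "%*ju %*ju %*ju %s",
                    width, c->lines, width, c->words, width, c->bytes, name);
}
```

Calling count_buffer() once per file, then add_to_totals(), and finally format_counts() on the totals mirrors the loop described in the list. A version that reads in chunks would need to carry the in_word state across calls.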
Failure cases:
- Unable to open or close input files
- Unable to read from input source
- Unable to allocate memory for reading input
- Unable to close standard input after reading
All failures at this stage output an error message to STDERR and return without displaying usage help.
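A minimal sketch of this failure handling, assuming each failure is reported on stderr, recorded in an ok flag, and does not stop the remaining files from being processed. count_file_bytes() and count_all() are hypothetical names, and the sketch uses stdio rather than the raw file descriptors the utility works with.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the bytes in one file. On any failure it
   prints a message to stderr and returns false; no usage text is shown. */
bool count_file_bytes(const char *path, unsigned long long *bytes)
{
    char buf[4096];
    size_t n;
    bool ok = true;

    *bytes = 0;

    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        fprintf(stderr, "wc: %s: %s\n", path, strerror(errno));
        return false;                  /* unable to open */
    }

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        *bytes += n;

    if (ferror(fp)) {                  /* unable to read */
        fprintf(stderr, "wc: %s: read error\n", path);
        ok = false;
    }
    if (fclose(fp) != 0) {             /* unable to close */
        fprintf(stderr, "wc: %s: %s\n", path, strerror(errno));
        ok = false;
    }
    return ok;
}

/* Process every file; a failure on one file is remembered but does not
   prevent the others from being counted. Returns the exit status. */
int count_all(const char *const *paths, int nfiles)
{
    bool ok = true;

    for (int i = 0; i < nfiles; i++) {
        unsigned long long bytes;
        if (count_file_bytes(paths[i], &bytes))
            printf("%llu %s\n", bytes, paths[i]);
        else
            ok = false;
    }
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Accumulating the result in a single flag lets the program report every problem it meets while still returning a non-zero status at the end.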