Decoded: tac (coreutils) – MaiZure's Projects

[Back to Project Main Page]

Note: This page explores the design of command-line utilities. It is not a user guide.
[GNU Manual] [No POSIX requirement] [Linux man] [FreeBSD man]

Logical flow of tac command (coreutils)

Summary

tac - concatenate and write files in reverse

[Source] [Code Walkthrough]

Lines of code: 714
Principal syscall: write()
Support syscalls: open(), close() (both via the standard fstream functions)
Options: 2 (3 short, 5 long)

Lineage unclear -- not part of Research UNIX or System V
Added to Fileutils in November 1992 [First version]
Number of revisions: 191

The tac utility is functionally similar to cat, but the implementation strategy is necessarily different, in much the same way that head differs from tail. Output cannot begin until the end of the input has been found, so tac must read and buffer the file contents first. Non-seekable data sources (such as ttys and pipes) complicate the problem further.

Helpers:

copy_to_temp() - Streams data to a temporary file
output() - Writes buffer data to output
record_or_unlink_tempfile() -
tac_file() - Top level tac procedure
tac_nonseekable() - tac procedure for nonseekable source
tac_seekable() - tac procedure for seekable sources (normal files)
temp_stream() - Creates a temporary stream file
unlink_tempfile() - Closes and deletes a file

External non-standard helpers:

die() - Exit with mandatory non-zero error and message to stderr
error() - Outputs error message to standard error with possible process termination

Setup

tac uses several flags and variables as globals for managing execution:

G_buffer - The input buffer
G_buffer_size - The size of the input buffer (generally 2 full reads, plus sentinel, plus 2)
have_read_stdin - Flag set if data is read from STDIN
match_length - The length of a successful match
read_size - The size of the read buffer
sentinel_length - The length of a separator
separator - The record/line separating string
separator_ends_record - Flag if the separator should be at the end (or beginning of subsequent)

tac may use regular expressions to match a separator. This feature relies on the POSIX regular expression library, so we bring in some structures and variables to support it:

compiled_separator - The compiled pattern buffer
compiled_separator_fastmap - The fastmap for lookups
regs - The match information structure
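As an illustration of regex-based separator matching, the sketch below finds the last occurrence of a pattern in a string using the standard POSIX regcomp()/regexec() calls. This is a swapped-in technique: it scans forward and remembers the final match, whereas the utility searches backward through its buffer with its own compiled pattern and fastmap. The function name last_match() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: locates the last match of pattern in text.
   On success returns 0 and stores the match offset and length.
   Returns -1 if the pattern does not compile or never matches. */
int last_match(const char *pattern, const char *text,
               size_t *start, size_t *len)
{
    regex_t re;
    regmatch_t m;
    size_t pos = 0;
    int found = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* After the first search, text + pos is no longer a line start. */
    while (regexec(&re, text + pos, 1, &m, pos ? REG_NOTBOL : 0) == 0) {
        size_t next = pos + (size_t)m.rm_eo;

        *start = pos + (size_t)m.rm_so;
        *len = (size_t)(m.rm_eo - m.rm_so);
        found = 0;

        /* Step over empty matches so the loop always advances. */
        if (m.rm_eo == m.rm_so) {
            if (text[next] == '\0')
                break;
            next++;
        }
        pos = next;
    }
    regfree(&re);
    return found;
}
```

For example, searching "a1b22c333d" for the pattern [0-9]+ reports the match at offset 6 with length 3, which corresponds to the match_length value tracked in the globals above.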

main() introduces several local variables for processing:

*file - The name of the next file to process
*error_message - The regex compiler message
default_file_list[] -
half_buffer_size -
ok - The final return status
optc - The character for the next option to process

Parsing

Parsing the tac command line is fairly simple given how few options there are. The information provided by the user concerns separators and targets:

Should separators apply to the beginning or end of a record (line)?
What is the separator?
Should we match separators based on regular expression?

Parsing failures

These failure cases are explicitly checked:

Not defining a separator if using a regex
Using an unknown option

User-specified parsing failures result in a short error message followed by the usage instructions. Access-related parsing errors die with an error message.
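The parsing questions and failure checks above can be sketched with getopt_long(). The option letters (-b/--before, -r/--regex, -s/--separator) are tac's documented options; the struct tac_opts type and the parse_tac_opts() function are hypothetical names used only for this illustration, and the real utility reports failures through its own usage and error routines rather than a return code.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the three user-facing choices. */
struct tac_opts {
    bool before;            /* -b: separator begins a record */
    bool regex;             /* -r: separator is a regular expression */
    const char *separator;  /* -s STRING: defaults to a newline */
};

/* Returns 0 on success, -1 on an unknown option or an empty
   separator combined with regex matching. */
int parse_tac_opts(int argc, char **argv, struct tac_opts *o)
{
    static const struct option longopts[] = {
        {"before",    no_argument,       NULL, 'b'},
        {"regex",     no_argument,       NULL, 'r'},
        {"separator", required_argument, NULL, 's'},
        {NULL, 0, NULL, 0}
    };
    int c;

    o->before = false;
    o->regex = false;
    o->separator = "\n";
    optind = 1;  /* restart scanning so the function can be called again */

    while ((c = getopt_long(argc, argv, "brs:", longopts, NULL)) != -1) {
        switch (c) {
        case 'b': o->before = true; break;
        case 'r': o->regex = true; break;
        case 's': o->separator = optarg; break;
        default:  return -1;  /* getopt_long already printed a message */
        }
    }
    /* A regex needs a non-empty pattern to compile. */
    if (o->regex && o->separator[0] == '\0')
        return -1;
    return 0;
}
```

After a successful call, argv[optind] onward holds the file names to process; with none given, the utility falls back to reading standard input.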

Execution

If a non-seekable data source is used (e.g. ttys, pipes, sockets, FIFOs), then tac begins by creating a temporary file and streaming all of the source contents to that file. Now we can apply the seekable procedure to all cases:

Find the end of the file and align partial first read to buffer size
Scan backwards to match end of record/line separator (regex and normal match cases)
Copy record to buffer
Repeat backward scan until buffer is full then output buffer
When the beginning of the file is found, output current buffer
Close the file
Repeat entire process with any subsequent files requested
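The backward scan at the heart of these steps can be sketched on a single in-memory buffer. This is a simplification under stated assumptions: the whole input already fits in memory, the separator is one character, and it ends each record. The real utility works through the file in buffer-sized pieces and supports string and regex separators. The function name tac_buffer() is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the records of in[0..len) to out in
   reverse order, where each record ends with sep. out must have room
   for len bytes. Returns the number of bytes written (always len). */
size_t tac_buffer(const char *in, size_t len, char sep, char *out)
{
    size_t written = 0;
    size_t end = len;   /* one past the last byte not yet emitted */
    size_t i = len;

    while (i > 0) {
        i--;
        /* A separator at i closes the preceding record, so everything
           in in[i+1 .. end) is a complete record. The i + 1 < end test
           keeps a trailing separator attached to the final record. */
        if (in[i] == sep && i + 1 < end) {
            memcpy(out + written, in + i + 1, end - (i + 1));
            written += end - (i + 1);
            end = i + 1;
        }
    }
    /* Beginning of input reached: emit the first record. */
    memcpy(out + written, in, end);
    written += end;
    return written;
}
```

Given "a\nb\nc\n" the sketch produces "c\nb\na\n". When the final record lacks a separator, as in "a\nb\nc", the result is "cb\na\n", since the unterminated last record is emitted with nothing between it and the record that follows.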

The logical diagram of tac shown above doesn't capture the many ways the utility could fail during processing.

Failure cases:

Unable to open or close data source
Unable to create temporary file for non-seekable sources
Unable to read from data source/temp file at any point
Invalid regular expression separator defined
A single line/record is too large to fit in the buffer
Seek fails on a seekable source
Write fails while streaming non-seekable source to temp file
Output write fails

All failures at this stage output an error message to STDERR and return without displaying usage help.
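As a hedged illustration of the "report to STDERR and return" pattern for a failed output write, the sketch below loops over write() until the whole buffer is sent, and reports the first hard error. The name write_all() is hypothetical; the utility's own output() helper and error() calls are not reproduced here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes from buf to fd, retrying on
   short writes and interrupted calls. Returns 0 on success; on failure
   prints a message to stderr and returns -1. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted before writing anything: retry */
            fprintf(stderr, "write error: %s\n", strerror(errno));
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The caller treats a -1 return as a processing failure: it records the bad status and moves on without printing usage help, matching the behavior described above.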
