The NOW Trace Collection Project

Trace Information

The trace environment and trace collection process are described in the Berkeley technical report UCB//CSD-98-1029. Please include a reference to this report for any work that uses the traces. Note that these are the postprocessed versions of the traces described in the report. File and program names have been removed for privacy reasons. However, the id of a given file's parent directory can be found by using the directory databases and the getdir.c source code included with the sample code described below.

Corrected Technical Report

Here is the corrected version of technical report UCB//CSD-98-1029. The corrections change only the graph labels. The labels on Figure 8 have been corrected; the "sequential" and "random" lines on the WEB graph were previously switched. The labels on Figure 4 have been made more legible.

Other Errata

Figure 1 in the technical report incorrectly indicates that the user id occurs before the process id in the record header. The order of these two fields should be reversed: the process id comes first, followed by the user id.

More Information

The trace collection process and postprocessing are described in more detail in Chapter 2 of Drew Roselli's dissertation.

postscript

pdf

Trace Availability

Currently, the entire web trace set (Jan-Feb 1997) and the first 3.5 months (Sep 11 - Dec 31, 1996) of the instructional and research traces are available. The remaining traces will be added as they become available.

Sample Code

Sample source code for using the traces is available for download. Note that this is only sample code, not guaranteed to be portable.

A Perl script contributed by David Petrou may supply a more portable version of some of this code. I have not tested the script myself.

The directory databases are dirdb.ins, dirdb.res, and dirdb.web.

Trace Size

The total sizes of the traces are roughly 5 GB instructional, 1.5 GB research, and 300 MB web. For each workload, the traces are broken into separate files, one for each day. The size of each day's file varies but is generally between 15 and 150 MB compressed.

Traces

Instructional Workload
Research Workload
Web Workload


Other Traces

Note: we do not maintain these traces.