The NOW Trace Collection Project
The trace environment and trace collection process are described
in the Berkeley technical report UCB//CSD-98-1029.
Please include a reference to this report for any work that
uses the traces. Note that these are the postprocessed versions
of the traces described in the report. File and program names have been
removed for privacy reasons. However, the id of a given file's
parent directory can be found by using the directory databases and
the getdir.c source code included with the sample code described below.
Corrected Technical Report
Here is the corrected version of technical report
UCB//CSD-98-1029. The corrections change only the graph labels. The labels
on Figure 8 have been corrected; the "sequential" and "random" lines on the
WEB graph were previously switched. The labels on Figure 4 have been made
more legible.
Figure 1 in the technical report incorrectly indicates that the user id
occurs before the process id in the record header. The order of
these two fields should be reversed.
The trace collection process and postprocessing are described in more
detail in Chapter 2 of Drew Roselli's dissertation.
Currently, the entire web trace set (Jan-Feb 1997) and the first
3.5 months of the instructional and research traces
(Sep 11 - Dec 31, 1996) are available.
The remaining traces will be added as they become available.
Sample source code for using the traces is available.
Note that this is only sample code and is not guaranteed to be portable.
A Perl script
contributed by David Petrou may supply a more portable
version of some of this code. I have not tested the script myself.
The directory databases are
dirdb.res , and
The total sizes of the traces are roughly: 5 GB instructional, 1.5 GB research,
and 300 MB web.
For each workload, the traces are broken into separate files, one for each
day. The size of each day's file varies but is generally between
15 and 150 MB compressed.
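As a rough sanity check on these figures, the sketch below estimates the
average size of a daily file from the total sizes and the date range quoted
above. It rests on several assumptions that this page does not confirm: that
the quoted totals cover only the Sep 11 - Dec 31, 1996 period, that there is
exactly one file for every calendar day in that period, and that 1 GB means
1024 MB. The function names are illustrative and are not part of the sample
code.

```python
from datetime import date


def days_inclusive(start, end):
    """Number of calendar days from start to end, counting both endpoints."""
    return (end - start).days + 1


def average_mb_per_day(total_gb, num_days, mb_per_gb=1024):
    """Average daily file size in MB (assumes one file per day, no gaps)."""
    return total_gb * mb_per_gb / num_days


# Date range quoted for the instructional and research traces.
span_days = days_inclusive(date(1996, 9, 11), date(1996, 12, 31))

# Approximate totals quoted on this page, in GB.
for name, total_gb in (("instructional", 5.0), ("research", 1.5)):
    avg = average_mb_per_day(total_gb, span_days)
    print(f"{name}: about {avg:.0f} MB/day over {span_days} days")
```

If the totals in fact cover a longer or shorter collection period, or some
days are missing, the per-day averages change accordingly, so treat the
result as an order-of-magnitude estimate only.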
Note: we do not maintain these traces.