Corsaro ships with several tools which leverage the libcorsaro library. This section describes the purpose of each tool and how to use it.
The corsaro tool is the main tool in the Corsaro suite; it provides a lightweight wrapper around the Corsaro-Out features of libcorsaro. It processes trace files and uses a set of plugins to analyze and generate aggregated statistics about the packets they contain. For more information about this process, see the Corsaro-Out and Plugins sections of this manual.
In addition to processing existing trace files, corsaro can capture packets from a live interface by using the special pcapint:<interface> trace URI parameter. This feature is still in the alpha testing phase and so has only limited functionality; for example, output files are not rotated.
corsaro takes two mandatory arguments: an output filename template, and an input trace URI.
The output template must contain the string %P, which will be expanded to the name of the plugin (and possibly a plugin-specific identifier) for each file created. corsaro will scan the filename to determine which, if any, compression should be used when creating the file. A gz extension will cause gzip compression to be used, whereas a bz2 extension will cause bzip2 compression to be used. Uncompressed files will be created for all other extensions.
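The extension rule described above can be sketched in a few lines of Python. This only illustrates the documented behavior and is not corsaro's own code: the function name compression_for is hypothetical, and matching on the end of the filename is an assumption about how the scan works.

```python
def compression_for(filename):
    """Return the compression implied by a filename's extension.

    Hypothetical helper mirroring the rule in the text; matching on the
    end of the name is an assumption, not confirmed corsaro behavior.
    """
    if filename.endswith(".gz"):
        return "gzip"
    if filename.endswith(".bz2"):
        return "bzip2"
    # Every other extension results in an uncompressed file.
    return None
```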
The template may optionally contain the string %N, which will be expanded to the monitor name. The monitor name can be specified either with the -n option at run time or with the --with-monitorname option to configure.
In addition, the template can contain any of the specifiers supported by strftime(3), which will be replaced with the appropriate representation of the start time of the first interval in the file.
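As a rough illustration of how such a template expands, the following Python sketch substitutes the plugin and monitor specifiers and then hands the remainder to strftime. It assumes the two specifiers are spelled %P and %N and that times are rendered in UTC; the function name expand_template and the example values are hypothetical, and corsaro's real implementation may differ.

```python
import time


def expand_template(template, plugin, monitor, interval_start):
    """Expand a corsaro-style output template (illustrative only).

    Assumes the plugin and monitor specifiers are written %P and %N.
    All remaining %-specifiers are passed to strftime, evaluated in UTC
    (the time zone is an assumption made for this sketch).
    """
    # Replace the corsaro-specific specifiers first so that strftime
    # never sees them (some platforms give %P a meaning of their own).
    expanded = template.replace("%P", plugin).replace("%N", monitor)
    return time.strftime(expanded, time.gmtime(interval_start))
```

With these assumptions, expand_template("%N.%Y-%m-%d.%P.cors.gz", "flowtuple", "mon1", 1325390400) returns "mon1.2012-01-01.flowtuple.cors.gz".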
For example, at CAIDA, we use:
The input URI will most commonly just be the path to a file of any format supported by libtrace. It can, however, also be a FlowTuple file; this is useful for re-processing data using additional plugins (many of the Core Plugins support processing FlowTuple records).
The IO library (libwandio) used by libcorsaro automatically detects gzip and bzip2 compression when it is used. Multiple trace URIs can be supplied and will be processed in the order they are listed. Take care to ensure packets are sorted in chronological order; plugin behavior is undefined for unordered packets.
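Because plugin behavior is undefined for unordered packets, it can be worth checking the ordering before a long run. The sketch below assumes the packet timestamps have already been extracted into a Python iterable by some other means; the function name first_out_of_order is hypothetical and not part of Corsaro.

```python
def first_out_of_order(timestamps):
    """Return the index of the first timestamp that is earlier than its
    predecessor, or None if the whole sequence is in chronological order."""
    previous = None
    for index, ts in enumerate(timestamps):
        if previous is not None and ts < previous:
            return index
        previous = ts
    return None
```

Equal timestamps are treated as in order, since several packets can share the same timestamp.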
The remaining arguments are all optional and alter how the trace is processed.
-a    align
-f    filter
-i    interval
-l    legacy
-m    mode: binary and ascii are supported values (default: binary)
-n    name (default: --with-monitorname); applies when the %N specifier is used
-p    plugin
-P    promiscuous
-r    rotate
-R    meta-rotate (default: -r)
The output files generated by the corsaro tool can be viewed either with a standard text viewer (for the ASCII output format), or using the cors2ascii tool (for the binary output format).
The cors2ascii tool converts binary output from any corsaro plugin that implements the Corsaro-In API to an ASCII format. The output from cors2ascii depends on the specific plugin used, but the output will be in a format which is mostly human-readable, as well as supporting ad-hoc analysis scripts (e.g. written in Perl).
Currently cors2ascii supports the FlowTuple and RS DoS binary output formats, as well as the Corsaro Global Output File. See the File Formats page for details about the output formats for plugins.
usage: cors2ascii input_file
cors2ascii takes a single argument: the path to the file to be converted to ASCII. Because cors2ascii uses the IO Framework, gzip and bzip2 compressed files are also supported.
The cors-ft-aggregate tool re-aggregates FlowTuple data based on time and sub-tuples.
The re-aggregation features of cors-ft-aggregate provide a powerful method for analyzing specific dimensions of a dataset, much more efficiently and reliably than parsing and manually aggregating the data output by cors2ascii.
The current version of the tool only supports the FlowTuple ASCII output format. The fields of the tuple which are not included in the re-aggregation will be zeroed out as shown in the example output below. Also, the tool does not preserve the classes from the original data - tuples from all classes are aggregated into a single table. Support for binary output and class preservation is planned for a future release.
field
flowtuple_file
l[egacy]
interval    -1 indicates that all data should be aggregated into a single interval; 0 uses the original interval in the file (60 seconds for CAIDA data).
value    corresponds to packet_cnt in a raw FlowTuple file. For a field other than packet_cnt, the value will be the number of unique elements in the set; for example, src_ip will give a value for each tuple which is the number of unique source IP addresses which match the sub-tuple (as specified by the field arguments).
file_list    use - to read the list from standard input.
Re-aggregating data with a 24 hour interval, using protocol as the field and src_ip as the value, gives per-day tables describing the number of unique source IP addresses observed for each protocol.
Command used:
Sample output:
Note that this output has been sorted in post-processing.
The output shows the familiar FlowTuple ASCII format, albeit with all fields except the protocol zeroed out. Also, the packet count value has been replaced with a count of the number of unique source IPs observed for the corresponding sub-tuple over the interval. In this example, we can see that UDP (protocol 17, 6th line of output) packets were received from a total of 1,042,968 different sources during the interval.
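The re-aggregation described in this example can be sketched in Python as follows. This illustrates the idea only and is not how cors-ft-aggregate is implemented: the in-memory record layout, the alignment of intervals to multiples of the interval length, and the function name unique_sources_per_protocol are all assumptions made for the sketch.

```python
from collections import defaultdict


def unique_sources_per_protocol(records, interval=86400):
    """Count unique source IPs per (interval start, protocol).

    `records` is an iterable of (timestamp, src_ip, protocol) tuples;
    this layout is assumed for illustration and is not the FlowTuple
    file format itself.
    """
    sources = defaultdict(set)
    for timestamp, src_ip, protocol in records:
        # Align each record to the start of its aggregation interval.
        start = timestamp - (timestamp % interval)
        sources[(start, protocol)].add(src_ip)
    # The value for each sub-tuple is the size of its set of sources.
    return {key: len(ips) for key, ips in sources.items()}
```

Keeping a set per sub-tuple is the simplest way to count unique elements, at the cost of holding every distinct address in memory for the duration of an interval.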
Re-aggregate and filter data using the protocol and dst_port fields, leaving the interval and value unchanged.
Command used:
Sample output:
We re-aggregate the data using the protocol and dst_port fields, maintaining the original interval (60 seconds) and leaving the packet count as the value field. We then use a simple grep to filter the output to only records which have a destination port of 5060 (SIP) and a protocol value of 17 (UDP). In this example, the first minute of data, which begins at 01/01/12 04:00 UTC (1325390400), contained 3,577 packets to UDP port 5060. Note that future versions of cors-ft-aggregate will directly support filters such as this to greatly improve processing speed.
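The filtering step can also be expressed in a script rather than with grep. The following Python sketch assumes the re-aggregated records have already been parsed into (interval_start, protocol, dst_port, packet_cnt) tuples; that layout and the function name packets_to_port are hypothetical and chosen only for illustration.

```python
def packets_to_port(records, protocol, dst_port):
    """Sum packet counts per interval for one protocol and port.

    `records` is an iterable of (interval_start, protocol, dst_port,
    packet_cnt) tuples; this layout is an assumption of the sketch.
    """
    totals = {}
    for interval_start, proto, port, packet_cnt in records:
        # Keep only records matching both the protocol and the port.
        if proto == protocol and port == dst_port:
            totals[interval_start] = totals.get(interval_start, 0) + packet_cnt
    return totals
```

Calling packets_to_port(records, 17, 5060) would then select the same UDP port 5060 records as the grep in the example above.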
This simple script reads ASCII FlowTuple data from STDIN and sums the tuple values in each interval. The output is of the format:
This is useful for converting the output from cors-ft-aggregate into a simple time series.
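The summing step itself is small. A minimal Python sketch, assuming the input has already been parsed into (interval_start, value) pairs, might look like the following; the function name interval_totals and the returned list of pairs are choices made for this illustration and do not describe the script's actual output format.

```python
def interval_totals(records):
    """Sum tuple values per interval.

    `records` is an iterable of (interval_start, value) pairs (an
    assumed layout). Returns a list of (interval_start, total) pairs
    sorted by interval start, i.e. a simple time series.
    """
    totals = {}
    for interval_start, value in records:
        totals[interval_start] = totals.get(interval_start, 0) + value
    return sorted(totals.items())
```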
Because this script reads from STDIN, the data to be converted should be piped to it. If you have an existing ASCII-format FlowTuple file, simply do something like:
But more likely you will chain it directly to cors-ft-aggregate like so:
The cors-splitascii.pl script takes the output from cors2ascii and splits each interval into a separate file. It is useful for generating a single file per interval for further processing without the need to detect interval start and end records.
As with the corsaro tool, an output file name template must be specified, in which the string %INTERVAL% will be replaced with the interval start time in each output file.
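The per-interval file naming can be sketched as a plain string substitution. The Python below assumes the placeholder is spelled %INTERVAL% and that the start time is written as a Unix timestamp; both are assumptions, as is the function name interval_filename.

```python
def interval_filename(template, interval_start):
    """Build the output file name for one interval (illustrative only).

    Assumes the placeholder is written %INTERVAL% and that the start
    time is rendered as an integer Unix timestamp.
    """
    placeholder = "%INTERVAL%"
    if placeholder not in template:
        # Without the placeholder every interval would map to the same
        # file, so this sketch rejects such templates.
        raise ValueError("template must contain " + placeholder)
    return template.replace(placeholder, str(interval_start))
```

Rejecting a template without the placeholder is a choice made for this sketch; the text above does not say how the script itself handles that case.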
The input file can be any file which contains ASCII formatted corsaro interval data. To read data from stdin, use - as the input file. This allows cors-splitascii.pl to be chained directly to cors2ascii like this:
The cors-trace2tuple tool quickly converts a trace file into an easily parseable list of tuples.
While cors-trace2tuple does not use the libcorsaro framework, it is useful for quickly generating a high-level representation of the packets contained in a trace file.
Usage: cors-trace2tuple [-H|--libtrace-help] [--filter|-f bpf ]... libtraceuri...
Each packet in the input trace that is accepted by the optional BPF filter is represented by a single line in the tab-separated ASCII output.
Output is in the following format:
Not yet implemented