The Corsaro 2.0.0 public release includes several Core Plugins that both provide useful trace analysis functionality and serve as templates for developing new plugins. These plugins can be loosely divided into three categories: Aggregation/Analysis Plugins, Meta-data Plugins, and Filter Plugins.
All plugins must support processing packet trace files, but they may also optionally support processing FlowTuple files. Plugins that support processing FlowTuple data are:
The FlowTuple plugin compresses a trace by counting the number of packets which share a common set of header fields. This is conceptually similar to a NetFlow report, but with a set of header fields specifically tailored to facilitate darknet data analysis.
We selected the fields for inclusion in the tuple based on a review of the types of analysis performed over the last decade using UCSD Network Telescope data. We found a combination of eight fields that allows the majority of analysis to be carried out without needing to resort to full pcap traces.
The eight fields that the FlowTuple plugin aggregates packets based on are:
Note, if the Protocol value is 1 (ICMP), then the Source and Destination Port fields are used to represent the ICMP Type and Code fields, respectively.
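As a minimal sketch of the note above (not code taken from the plugin), the following Python function shows how the two port fields could be filled depending on the protocol. The function and parameter names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple

IPPROTO_ICMP = 1  # protocol value for ICMP, as stated in the note above


class PortFields(NamedTuple):
    src_port: int
    dst_port: int


def port_fields(protocol, src_port=0, dst_port=0, icmp_type=0, icmp_code=0):
    """Return the values stored in the Source and Destination Port fields.

    For ICMP packets the two fields carry the ICMP Type and Code instead
    of transport-layer ports.
    """
    if protocol == IPPROTO_ICMP:
        return PortFields(icmp_type, icmp_code)
    return PortFields(src_port, dst_port)
```

For example, an ICMP echo request (type 8, code 0) would be recorded with a Source Port of 8 and a Destination Port of 0.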
The flows also have an extra implicit field, the class that the packets belong to. There are currently three possible classes:
These classes have been derived from the CAIDA crl_attack_flow software. The logic to classify a packet into a class is contained in the flowtuple_classify_packet function in the corsaro_flowtuple.c file.
The plugin maintains a table of flows for each class, and simply counts the number of packets that belong to each flow in a given interval. At the end of an interval, the plugin traverses each table and writes out a series of key-value pairs, where the key is the 8 fields and the value is the number of packets in the flow. The tables are then cleared and the process begins again for the next interval.
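The per-interval counting described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's C implementation: the field names in FlowKey are hypothetical stand-ins for the eight key fields, the class labels are arbitrary strings, and the output is ordered by plain tuple comparison rather than by corsaro_flowtuple_lt.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    # Hypothetical names standing in for the eight key fields.
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    ttl: int
    tcp_flags: int
    ip_len: int


class FlowTables:
    """One table of packet counts per class, reset at each interval end."""

    def __init__(self):
        self.tables = defaultdict(Counter)  # class label -> Counter of FlowKey

    def add_packet(self, flow_class, key):
        # Count one more packet for this flow within its class table.
        self.tables[flow_class][key] += 1

    def end_interval(self):
        """Return {class: [(key, packet_count), ...]} and clear all tables."""
        out = {cls: sorted(table.items()) for cls, table in self.tables.items()}
        self.tables.clear()
        return out
```

Calling end_interval() a second time without adding packets returns an empty dictionary, which mirrors the tables being cleared before the next interval begins.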
By default, flow tables are sorted before being written out because we found with empirical testing that gzip compression was maximized by using the sort comparator found in corsaro_flowtuple_lt.
See the FlowTuple section of the File Formats page for information about the format of the output created by the FlowTuple plugin.
We also provide an efficient tool for re-aggregating the FlowTuple data based on a different time interval, a subset of key fields and/or a different value field. See cors-ft-aggregate for a more detailed description of the tool.
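To make the idea of re-aggregation concrete, here is a small Python sketch that merges FlowTuple-like records into longer intervals and a reduced key. It does not reproduce cors-ft-aggregate or its options; the record layout (timestamp, key dictionary, packet count) and all names are assumptions made for this example, and only the interval and key-subset cases are shown.

```python
from collections import Counter


def reaggregate(records, key_fields, interval):
    """Re-aggregate (timestamp, key_dict, packet_count) records.

    Records are grouped by the start of the `interval`-second bucket they
    fall into and by the subset of key fields named in `key_fields`.
    """
    out = Counter()
    for ts, key, pkts in records:
        bucket = ts - (ts % interval)  # start time of the new interval
        sub_key = tuple(key[f] for f in key_fields)
        out[(bucket, sub_key)] += pkts
    return out
```

With records at 0, 60 and 120 seconds and an interval of 120, the first two fall into the same bucket and their packet counts are summed whenever their reduced keys are equal.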
The FlowTuple plugin has one optional argument, -s, to disable output sorting.
The RS DoS plugin uses heuristics described by Moore et al. in [3] to detect backscatter packets caused by Randomly Spoofed Denial of Service attacks. It then groups the backscatter by suspected attack victim and reports statistics about each attack.
In addition to keeping high-level statistics about the attack, the plugin also preserves a copy of the initial packet observed to be a part of the attack.
This plugin makes use of the plugin chaining feature described in Plugins to detect backscatter packets based on the classification performed by the FlowTuple plugin. As such, the RS DoS plugin requires the FlowTuple plugin to be enabled.
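The grouping step can be illustrated with the following Python sketch. It is not the plugin's implementation and does not reproduce the heuristics from [3]. It assumes that each packet already carries a hypothetical is_backscatter flag (standing in for the FlowTuple classification) and that the suspected victim is the source address of the backscatter packet. The statistics kept here are examples only.

```python
from dataclasses import dataclass


@dataclass
class Attack:
    victim: str
    first_packet: dict   # copy of the initial packet observed for this attack
    start_time: float
    latest_time: float
    packet_count: int = 0


def group_backscatter(packets):
    """Group backscatter packets by suspected victim (assumed: source IP)."""
    attacks = {}
    for pkt in packets:
        if not pkt["is_backscatter"]:
            continue  # only packets classified as backscatter are considered
        victim = pkt["src_ip"]
        attack = attacks.get(victim)
        if attack is None:
            # First packet seen for this victim: keep a copy of it.
            attack = Attack(victim, dict(pkt), pkt["ts"], pkt["ts"])
            attacks[victim] = attack
        attack.packet_count += 1
        attack.latest_time = pkt["ts"]
    return attacks
```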
See the RS DoS section of the File Formats page for information about the format of the output files written by the RS DoS plugin.
The RS DoS plugin currently has no run-time configuration options. These may be added in a subsequent release. Please contact corsaro-info@caida.org if this would be of benefit to you.
Smee uses the same analysis library as the IATmon tool. See [2] and the iatmon project page for more information.
Smee requires the third-party libsmee library which is currently only available as a part of the Corsaro release. To use the Smee plugin, you will first need to install the libsmee library located in the thirdparty directory of the Corsaro tarball (if you do not have root access see the note below).
At this point, you should be able to build Corsaro using the --with-smee option to configure.
If you are building Corsaro on a machine that you do not have root access to (or do not want to install libsmee into a system-wide location), you can use the --prefix option to configure when building smee, as follows:
And then when configuring Corsaro, do the following:
The simplest plugin which ships with Corsaro is the pcap pass-through filter. It simply captures packets and writes them out to a file in pcap format.
If Corsaro is operating on existing trace files, the output file will be identical to the input (though potentially a different size due to compression). In testing we find that, even using threaded IO for the gzip compression, the volume of data generated causes a bottleneck in processing. As such, we strongly recommend against enabling the pcap plugin when processing existing trace files.
The pcap plugin becomes more useful when Corsaro is attached to a live interface. It allows Corsaro to simultaneously capture raw trace data (for archival and later analysis) and perform real-time analysis and aggregation with other plugins. As noted earlier, the gzip compression can cause a bottleneck in processing when the thread writing the pcap data cannot clear the write buffer fast enough. This can be alleviated somewhat by tweaking the buffer size in libtrace.
The pcap plugin can also be useful in conjunction with one of the Filter Plugins for extracting a subset of packets from a trace file.
The pcap plugin currently has no run-time configuration options.
The Crypto-PAn anonymization plugin uses the Corsaro implementation of the Crypto-PAn algorithm to anonymize source and/or destination IP addresses in packet headers.
This implementation is adapted from the traceanon tool distributed with libtrace.
This plugin writes no output itself, but can be chained with other plugins (such as FlowTuple or Raw pcap) to produce anonymized output.
The only mandatory argument is the encryption key (or prefix when using the prefix substitution mode). The key can be up to 32 bytes and will be padded with NULLs. If using prefix substitution, the prefix to substitute must be given.
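The key padding rule can be sketched as below. This is an illustration only: the function name is hypothetical, and rejecting keys longer than 32 bytes is an assumption of this sketch, since the text above does not say how the plugin handles over-long keys.

```python
KEY_LEN = 32  # maximum key length in bytes, per the description above


def pad_key(key: bytes) -> bytes:
    """Pad a key of up to 32 bytes with NUL bytes to exactly 32 bytes."""
    if len(key) > KEY_LEN:
        # Assumption: over-long keys are treated as an error in this sketch.
        raise ValueError("key must be at most %d bytes" % KEY_LEN)
    return key.ljust(KEY_LEN, b"\0")
```

A 6-byte key is therefore followed by 26 NUL bytes.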
There are three optional arguments: -d enables encryption of the destination address, -s enables encryption of the source address (the default is no encryption), and -t specifies the encryption type (the default is cryptopan).
The Prefix-to-AS ASN Lookup plugin uses CAIDA's Prefix-to-AS databases to determine the AS that the source IP of a packet belongs to. The plugin does not write any data. Instead, it registers as a Geolocation Provider (see Geolocation Framework) to annotate packets as they are processed, so that later plugins in the chain can leverage the ASN for further analysis.
The pfx2as plugin has one mandatory argument: -f, which specifies the pfx2as database file to use for ASN lookups. Database files can be downloaded from the Prefix-to-AS page of the CAIDA website.
There is also one optional argument: -c, which causes the plugin to cache IP-to-ASN results. This is useful when processing traces that contain many repeated IP addresses. Beware that the cache may grow large when running over long trace files.
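The trade-off behind the cache can be illustrated with a short Python sketch. It is not the plugin's data structure: the class name, the (prefix, ASN) input format and the linear longest-prefix scan are all assumptions made to keep the example small. The point is that a repeated address is answered from the cache, while the cache gains one entry per distinct address seen.

```python
import ipaddress


class Pfx2AsLookup:
    """Longest-prefix IP-to-ASN lookup with an optional result cache."""

    def __init__(self, prefixes, use_cache=False):
        # prefixes: iterable of (cidr_string, asn) pairs (assumed format).
        # Sorting by descending prefix length makes the first hit the longest.
        self.table = sorted(
            ((ipaddress.ip_network(p), asn) for p, asn in prefixes),
            key=lambda entry: entry[0].prefixlen,
            reverse=True,
        )
        self.cache = {} if use_cache else None

    def lookup(self, ip):
        if self.cache is not None and ip in self.cache:
            return self.cache[ip]  # repeated address: no table scan needed
        addr = ipaddress.ip_address(ip)
        asn = next((a for net, a in self.table if addr in net), None)
        if self.cache is not None:
            self.cache[ip] = asn  # grows by one entry per distinct address
        return asn
```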
The geolocation plugin provides per-packet geographic annotation using either Maxmind or Digital Envoy's NetAcuity databases. Maxmind provides a free version of their database, Maxmind Geo-Lite, which is in CSV format and fully supported by this plugin.
The NetAcuity database support requires an ASCII-dump version of the NetAcuity Edge database (not a free product) which has been pre-processed into a format similar to the Maxmind CSV databases. For more information about pre-processing the database, please contact corsaro-info@caida.org.
This plugin writes no output data. Instead, it registers as a Geolocation Provider (see Geolocation Framework) and annotates packets with geographic information based on the source IP address, so that later plugins (such as the Geographic Filter plugin) can leverage the meta-data for further analysis.
The geodb plugin has one optional argument, -p, which specifies the type of geolocation database to use. The default database type is maxmind.
Additionally, both blocks and locations database files must be specified. This can be done in one of two ways: with the -b and the -l arguments, or with the -d argument and a directory containing both files.
If the -d option is used to specify a directory, the plugin will search the given directory for a blocks file named GeoLiteCity-Blocks.csv.gz, and a locations file named GeoLiteCity-Location.csv.gz. If your database files use other names, then the -b and -l arguments must be used instead.
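The way the two database files are located can be sketched as follows. This is not the plugin's code: the function and parameter names are hypothetical, and letting the directory take precedence when both forms are supplied is an assumption of this sketch.

```python
import os

BLOCKS_NAME = "GeoLiteCity-Blocks.csv.gz"
LOCATIONS_NAME = "GeoLiteCity-Location.csv.gz"


def resolve_db_files(blocks=None, locations=None, directory=None):
    """Return (blocks_path, locations_path) from explicit files or a directory."""
    if directory is not None:
        # Directory form: both files must use the default names.
        blocks = os.path.join(directory, BLOCKS_NAME)
        locations = os.path.join(directory, LOCATIONS_NAME)
    if blocks is None or locations is None:
        raise ValueError("both a blocks file and a locations file are required")
    return blocks, locations
```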
Filter plugins can be considered a special class of meta-data plugins. They perform some analysis on each packet and determine whether subsequent plugins should ignore it. If a packet is to be ignored (i.e. removed), a filter plugin will set the CORSARO_PACKET_STATE_IGNORE flag. Plugins should then consider this flag when processing packets.
For example, the FlowTuple plugin ignores any packets for which the ignore flag is set. This has the effect of removing these packets from the resulting FlowTuple file.
Note that setting the ignore flag does not guarantee that subsequent plugins will ignore the packet. It is up to each plugin to determine whether it will obey the ignore request.
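The cooperative nature of the ignore flag can be illustrated with a small Python model of a plugin chain. Only the flag name comes from the text above. Its numeric value, the Packet class and the three plugin functions are hypothetical and exist purely for this example.

```python
CORSARO_PACKET_STATE_IGNORE = 0x01  # hypothetical bit value for illustration


class Packet:
    def __init__(self, src_ip):
        self.src_ip = src_ip
        self.state_flags = 0


def filter_plugin(packet, blocked):
    # A filter marks the packet; it does not drop it from the chain.
    if packet.src_ip in blocked:
        packet.state_flags |= CORSARO_PACKET_STATE_IGNORE


def counting_plugin(packet, counts):
    # A plugin that obeys the flag skips marked packets.
    if packet.state_flags & CORSARO_PACKET_STATE_IGNORE:
        return
    counts[packet.src_ip] = counts.get(packet.src_ip, 0) + 1


def archiving_plugin(packet, archive):
    # A plugin that chooses not to check the flag still sees every packet.
    archive.append(packet)


def run_chain(packets, blocked):
    counts, archive = {}, []
    for pkt in packets:
        filter_plugin(pkt, blocked)
        counting_plugin(pkt, counts)
        archiving_plugin(pkt, archive)
    return counts, archive
```

In this model the counting plugin omits the filtered packets, while the archiving plugin still receives them because it never inspects the flag.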
The geographic filter plugin allows a subset of packets to be extracted from the input based on a given list of country codes. It can either extract only packets that belong to the given countries, or only those packets which do not.
This plugin requires that a geolocation provider plugin which populates the country code field (currently only the Geolocation plugin) is also enabled and runs before this plugin. This plugin does not do any geolocation itself.
The filtergeo plugin operates based on a list of two-character ISO 3166-1 alpha-2 country codes. These can be passed to the plugin in one of two ways: with the -c argument (can be used multiple times), or in a file given with the -f option.
The -c option is best suited for use with a small number of countries. For a large set, please use the file option.
The plugin also accepts one optional argument, -i, which inverts the matching. The default behavior is to remove all packets which do not match one of the input countries. Using the -i option inverts this so that packets which do match are removed.
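The default and inverted matching behavior can be summarized in a few lines of Python. This is a sketch of the decision only, with hypothetical names. Treating a packet with no country annotation as a non-match is an assumption of this example.

```python
def geo_should_ignore(country_code, countries, invert=False):
    """Decide whether a packet should be flagged as ignored.

    country_code: two-letter code annotated on the packet, or None.
    countries: set of upper-case ISO 3166-1 alpha-2 codes to match.
    invert: models the -i option.
    """
    matched = country_code is not None and country_code.upper() in countries
    # Default: remove packets that do not match. Inverted: remove matches.
    return matched if invert else not matched
```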
The prefix filter plugin operates in much the same manner as the Geographic Filter plugin, except that it takes a list of IPv4 prefixes as input.
The plugin uses longest-prefix matching to determine whether the source (or destination, if given the -d option) IP address of a packet is within one of the specified prefixes.
The filterpfx plugin operates based on a list of IPv4 prefixes in CIDR notation. These prefixes can be passed to the plugin in one of two ways: with the -p argument (can be used multiple times), or in a file given with the -f option.
The -p option is best suited for use with a small number of prefixes. For a large set, please use the file option.
The plugin also accepts one optional argument, -i, which inverts the matching. The default behavior is to remove all packets which do not match one of the input prefixes. Using the -i option inverts this so that packets which do match are removed.
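The matching logic can be sketched in Python using the standard ipaddress module. This is an illustration rather than the plugin's implementation: the function names and the dictionary-based packet are hypothetical, and the linear scan stands in for whatever longest-prefix structure the plugin actually uses.

```python
import ipaddress


def longest_match(ip, prefixes):
    """Return the most specific prefix containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in prefixes if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)


def pfx_should_ignore(pkt, prefixes, use_dst=False, invert=False):
    """Decide whether a packet should be flagged as ignored.

    use_dst models the -d option (match on the destination address) and
    invert models the -i option.
    """
    ip = pkt["dst_ip"] if use_dst else pkt["src_ip"]
    matched = longest_match(ip, prefixes) is not None
    # Default: remove packets that do not match. Inverted: remove matches.
    return matched if invert else not matched
```

For a pure keep-or-remove decision any containing prefix is sufficient. The longest match matters only when the caller needs to know which prefix was responsible.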