This tutorial gives some background for the Corsaro plugin architecture and describes how one could go about designing and implementing a new plugin.

Because Corsaro has been designed to make adding a new plugin as easy as possible, there is a minimal API (the Corsaro-Out plugin API) for writing a plugin that can process packets and write output. In addition, some plugins will also implement the Corsaro-In plugin API, allowing the data that they generate to be de-serialized and used by other programs.

An example of a plugin that implements both the Corsaro-Out and Corsaro-In plugin APIs is the FlowTuple plugin included in the Corsaro distribution. The FlowTuple plugin extracts a key composed of several values from the packet header and maintains pseudo-flows of packets that match this key. The optimized binary output generated by FlowTuple uses as little storage space as possible, but at the expense of human readability. For this reason, the FlowTuple plugin also implements the Corsaro-In API to allow other tools to further process FlowTuple data. An example of such a tool is cors-ft-aggregate, which can re-aggregate FlowTuple distributions based on sub-keys.

Because of the bidirectional nature of corsaro plugins (Corsaro-Out and Corsaro-In), this tutorial is split into three sections: the first describes the general process for creating and bootstrapping a new plugin, the second describes the API to be implemented to process packets and write output (the Corsaro-Out plugin API), and the third describes the API for reading existing corsaro plugin output data (the Corsaro-In API).

Bootstrapping a New Plugin

Bootstrapping a new plugin generally consists of two steps:

Creating boilerplate plugin code
Adding references to the newly created plugin to the appropriate autotools config files and to the corsaro plugin manager.

Creating Boilerplate Code

To ready a plugin for analysis code to be added, several steps must be followed:

Create Interface Code
Create Implementation Code
1. Create corsaro_plugin Instance
2. Create State Structure(s)
3. Create Helper Macros

Each corsaro plugin must have both interface (.h) and implementation (.c) files. Currently, the convention is for these files to be placed in the libcorsaro/plugins/ directory to keep them separate from the base corsaro code.

There is also a strict naming scheme which allows the plugin macros to function correctly. That is, a plugin has both a name and an ID. The name is used for assigning a namespace to the plugin and also for output file names, logs, etc. The ID is used to allow the plugin manager to maintain state about each plugin currently in operation.

Each plugin is free to choose any name provided it does not conflict with existing plugins. We suggest keeping names as brief as possible while conveying the purpose of the plugin (remember, this is how you will identify the file(s) each plugin has generated). For example, we use the name 'flowtuple' for our plugin which generates distributions of key fields (tuples) in the packet headers.

As mentioned earlier, the plugin name provides a unique namespace for each plugin. To this end, the source files for each plugin must be named using the following format:

corsaro_<plugin_name>.[ch]

Building on our previous example, the FlowTuple code is located in the following two files:

libcorsaro/plugins/corsaro_flowtuple.c

libcorsaro/plugins/corsaro_flowtuple.h

Interface Code

The simpler of the two files needed for each plugin is the interface (.h) file. A minimal plugin needs only a single line in the header file:

CORSARO_PLUGIN_GENERATE_PROTOS(<plugin_name>)

The CORSARO_PLUGIN_GENERATE_PROTOS macro is located in corsaro_plugin.h and will expand to all of the required function prototypes to satisfy the plugin API. This is another example of why the plugin name is important - this macro assumes that the functions which implement the plugin API are prefixed with the name of the plugin. This can be seen when we look at the function prototypes that would be generated (by the pre-processor) for the FlowTuple plugin:

int corsaro_flowtuple_probe_filename(const char *fname);                       
int corsaro_flowtuple_probe_magic(struct corsaro_in * corsaro, corsaro_file_in_t *file); 
int corsaro_flowtuple_init_input(struct corsaro_in *corsaro);                  
int corsaro_flowtuple_init_output(struct corsaro *corsaro);                    
int corsaro_flowtuple_close_input(struct corsaro_in *corsaro);                 
int corsaro_flowtuple_close_output(struct corsaro *corsaro);                   
off_t corsaro_flowtuple_read_record(struct corsaro_in *corsaro,
                           enum corsaro_in_record_type *record_type,    
                           struct corsaro_in_record *record);           
off_t corsaro_flowtuple_read_global_data_record(struct corsaro_in *corsaro,    
                           enum corsaro_in_record_type *record_type,    
                           struct corsaro_in_record *record);           
int corsaro_flowtuple_start_interval(struct corsaro *corsaro,                  
                           struct corsaro_interval *int_start);      
int corsaro_flowtuple_end_interval(struct corsaro *corsaro,                    
                           struct corsaro_interval *int_end);          
int corsaro_flowtuple_process_packet(struct corsaro *corsaro,                  
                           struct corsaro_packet *packet);

These prototypes specify the names of the functions that the implementation file must contain.
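To illustrate how a single macro invocation can produce name-prefixed prototypes, the following minimal sketch uses the token-pasting operator of the C pre-processor. DEMO_PLUGIN_GENERATE_PROTOS and the demo_example_ functions are hypothetical stand-ins invented for this example; the real CORSARO_PLUGIN_GENERATE_PROTOS macro in corsaro_plugin.h generates the complete set of prototypes listed above and may be written differently.

```c
#include <stdio.h>

/* Hypothetical, cut-down stand-in for CORSARO_PLUGIN_GENERATE_PROTOS.
 * The ## operator pastes the prefix onto each API function name. */
#define DEMO_PLUGIN_GENERATE_PROTOS(prefix)        \
  int prefix##_init_output(void);                  \
  int prefix##_close_output(void);

/* Interface (.h) side: expands to prototypes for
 * demo_example_init_output() and demo_example_close_output(). */
DEMO_PLUGIN_GENERATE_PROTOS(demo_example)

/* Implementation (.c) side: the definitions must use the same prefix,
 * otherwise they would not match the generated prototypes. */
int demo_example_init_output(void)
{
  puts("init_output called");
  return 0;
}

int demo_example_close_output(void)
{
  puts("close_output called");
  return 0;
}
```

Because the function names are derived mechanically from the prefix, a misspelled implementation name would typically surface as a link-time error rather than being silently ignored.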

If you look at the actual header file (corsaro_flowtuple.h) for the FlowTuple plugin, you will find a good deal of additional code. This is used for the Corsaro-In API and is discussed in the Implementing the Corsaro-In API section.

Implementation Code

Moving to the implementation (.c) file, there are two data structures, along with some helper macros for accessing them, that we recommend you define before implementing the API functions. These provide some state for the plugin (because there can potentially be multiple instances of Corsaro used simultaneously).

corsaro_plugin Instance

The first structure that is needed is an instance of corsaro_plugin which describes the plugin. This will be copied by the plugin manager when an instance of the plugin is created, so it should be declared as static. The plugin name and magic number are at the discretion of the plugin author so long as they are unique across plugins. The plugin ID must match that which is listed in corsaro_plugin.h (adding a new ID is described in the Integrating a New Plugin section). The remainder of the fields can be filled in using the CORSARO_PLUGIN_GENERATE_PTRS macro. The following is the corsaro_plugin definition for the FlowTuple plugin:

static corsaro_plugin_t corsaro_flowtuple_plugin = {
  PLUGIN_NAME,                                        /* name */
  CORSARO_PLUGIN_ID_FLOWTUPLE,                        /* id */
  CORSARO_FLOWTUPLE_MAGIC,                            /* magic */
  CORSARO_PLUGIN_GENERATE_PTRS(corsaro_flowtuple),    /* func ptrs */
  NULL,                                               /* next */
};

State Structure(s)

The second structure needed is one that will hold any per-instance state needed by the plugin, for example, pointers to output files, data structures for analysis, etc. This structure can be in any format because the plugin manager handles it as a void pointer, but using a well-structured name will allow the use of the convenience macros described below for retrieving the state from the plugin manager. The following is the definition of the state structure for the FlowTuple plugin:

struct corsaro_flowtuple_state_t {
  khash_t(sixt) *st_hash[CORSARO_FLOWTUPLE_CLASS_MAX+1];
  corsaro_file_t *outfile;
};

If implementing the Corsaro-In API, the plugin will also require a state structure to use in Corsaro-In mode. While these two may be the same structure, we suggest keeping them separate to avoid confusion. The following is the Corsaro-In state structure for the FlowTuple plugin:

struct corsaro_flowtuple_in_state_t {
  corsaro_in_record_type_t expected_type;
  int tuple_total;
  int tuple_cnt;
};

Helper Macros

As mentioned earlier, there are two macros provided by the plugin manager (corsaro_plugin.h) which assist with the retrieval of plugin and state structures from the plugin manager. These are:

CORSARO_PLUGIN_PLUGIN
- Takes a pointer to the corsaro state structure, and the ID of the plugin to retrieve. Returns a pointer to the corsaro_plugin structure registered for that plugin ID.
CORSARO_PLUGIN_STATE
- Takes a pointer to the corsaro state structure, the type of the structure, and the plugin ID. Returns a (correctly cast) pointer to the state structure registered for the given plugin ID.
- The type field assumes a state structure which is named corsaro_<type>_state_t

While these can be used alone to retrieve state and plugin structures from the plugin manager, it might be useful to wrap them in additional macros which are specific to the plugin being created. The core plugins included in the Corsaro distribution all define additional macros: one to retrieve the plugin structure, one to retrieve the Corsaro-Out state structure, and optionally one to retrieve the Corsaro-In state structure. The FlowTuple plugin defines all three of these as follows:

/* Extends the generic plugin state convenience macro in corsaro_plugin.h */
#define STATE(corsaro)                               \
  (CORSARO_PLUGIN_STATE(corsaro, flowtuple, CORSARO_PLUGIN_ID_FLOWTUPLE))
/* Extends the generic plugin state convenience macro in corsaro_plugin.h */
#define STATE_IN(corsaro)                            \
  (CORSARO_PLUGIN_STATE(corsaro, flowtuple_in, CORSARO_PLUGIN_ID_FLOWTUPLE))
/* Extends the generic plugin plugin convenience macro in corsaro_plugin.h */
#define PLUGIN(corsaro)                              \
  (CORSARO_PLUGIN_PLUGIN(corsaro, CORSARO_PLUGIN_ID_FLOWTUPLE))

Each of these simply takes a pointer to the corsaro (or corsaro_in) structure and returns the appropriate pointer.
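The lookup-and-cast idea behind these macros can be sketched as follows. Every name in this example (demo_core, demo_plugin_manager, DEMO_PLUGIN_STATE and so on) is a hypothetical stand-in, and the layout of the real plugin manager is an assumption; only the general pattern of indexing by plugin ID and casting to a type named by convention is taken from the description above.

```c
#include <stddef.h>

/* All names below are hypothetical stand-ins, not Corsaro definitions. */
enum demo_plugin_id {
  DEMO_PLUGIN_ID_EXAMPLE = 1,
  DEMO_PLUGIN_ID_MAX = DEMO_PLUGIN_ID_EXAMPLE
};

/* The manager keeps one untyped state pointer per possible plugin ID. */
struct demo_plugin_manager {
  void *plugins_state[DEMO_PLUGIN_ID_MAX + 1];
};

struct demo_core {
  struct demo_plugin_manager *plugin_manager;
};

/* Per-instance state; the name follows a demo_<type>_state_t pattern. */
struct demo_example_state_t {
  unsigned long packet_cnt;
};

/* Generic lookup: builds the struct name from 'type' and casts the
 * void pointer stored under 'id' back to that type. */
#define DEMO_PLUGIN_STATE(core, type, id)                          \
  ((struct demo_##type##_state_t *)                                \
   ((core)->plugin_manager->plugins_state[(id)]))

/* Plugin-specific wrapper, analogous to the STATE macro above. */
#define STATE(core)                                                \
  (DEMO_PLUGIN_STATE(core, example, DEMO_PLUGIN_ID_EXAMPLE))

/* Example use inside an API function. */
int demo_example_count_packet(struct demo_core *core)
{
  struct demo_example_state_t *state = STATE(core);
  if(state == NULL)
    {
      /* no state registered for this plugin */
      return -1;
    }
  state->packet_cnt++;
  return 0;
}
```

Wrapping the generic macro keeps the plugin ID and type name in one place, so the API functions themselves only ever write STATE(corsaro).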

Once these structures and macros have been defined, stub implementations of the API functions declared by the CORSARO_PLUGIN_GENERATE_PROTOS macro should be created, and then the plugin can be integrated into Corsaro.

Integrating a New Plugin

There are 4 files that must be updated in order to include a new corsaro plugin:

configure.ac
libcorsaro/plugins/Makefile.am
libcorsaro/corsaro_plugin.c
libcorsaro/corsaro_plugin.h

configure.ac

configure.ac contains a list of plugins that are available for compilation into Corsaro. Each plugin that is to be made available must be declared using the ED_WITH_PLUGIN macro. This macro takes four arguments: the full name of the plugin (e.g. corsaro_pcap), the short name of the plugin (e.g. pcap), the 'macro' name for the plugin (e.g. PCAP), and whether the plugin should be enabled by default (yes or no). The Raw pcap plugin is declared as follows:

ED_WITH_PLUGIN([corsaro_pcap],[pcap],[PCAP],[no])

Note that the order in which plugins are declared in this file is the default order in which they will be run. That is, a plugin declared after all others will be given a packet to process only once all other plugins have processed it.

libcorsaro/plugins/Makefile.am

Using the values defined by the ED_WITH_PLUGIN macro declaration in the configure.ac file, add the plugin to libcorsaro/plugins/Makefile.am as follows:

if WITH_PLUGIN_<macro_name>
PLUGIN_SRC += <full_name>.c <full_name>.h
endif

The corresponding Makefile.am entry for the Raw pcap plugin ED_WITH_PLUGIN example given above is:

if WITH_PLUGIN_PCAP
PLUGIN_SRC+=corsaro_pcap.c corsaro_pcap.h 
endif

libcorsaro/corsaro_plugin.c

The code that needs to be added to libcorsaro/corsaro_plugin.c is very similar to that which was added to libcorsaro/plugins/Makefile.am in the previous section. The format is as follows:

#ifdef WITH_PLUGIN_<macro_name>
#include "<full_name>.h"
#endif

Using the Raw pcap example again, the entry would be:

#ifdef WITH_PLUGIN_PCAP
#include "corsaro_pcap.h"
#endif

libcorsaro/corsaro_plugin.h

The final change that must be made is to add a unique ID for the plugin to the corsaro_plugin_id enum in libcorsaro/corsaro_plugin.h.

The ID value can be any number not already taken, but the plugin manager will allocate memory to hold plugins for every possible value up to the maximum, so we suggest keeping this number reasonably small. The ID value does not affect the order in which plugins are run; this is (currently) determined either by the order of the ED_WITH_PLUGIN macros in configure.ac or at run-time by using corsaro_enable_plugin. The ID value defined for a plugin in this list must be the same value given in the corsaro_plugin structure described in the Implementation Code section.

For example, the ID definition for the Raw pcap plugin is:

CORSARO_PLUGIN_ID_PCAP = 1,

Testing the Plugin

At this point, the (stub) plugin is fully integrated into Corsaro, and should compile cleanly. Because we have altered files needed by autoconf, the autoreconf -vfi command should be used to regenerate configure and each Makefile. Also remember that unless the ED_WITH_PLUGIN macro in configure.ac has a yes as the final argument, the plugin will need to be explicitly enabled by passing --with-<short_name> to configure (e.g. --with-pcap).

To fully rebuild everything, use the following:

autoreconf -vfi
./configure [add any options needed]
make

There is also a build_latest.sh script included in the distribution which will take care of these tasks, and also build a distribution tarball that can be tested on another system.

Implementing the Corsaro-Out API

This section describes each of the functions that must be implemented to fully comply with the Corsaro-Out API. Each function has some code snippets that are almost always used. The actual function implementations will have corsaro_<plugin_name>_ prefixed to the names. For example, the Raw pcap implementation of the alloc function is called corsaro_pcap_alloc.

The required functions are:

alloc

The alloc function is called when the plugin manager needs to instantiate a new instance of a plugin. If you followed the steps earlier in this tutorial (see Implementation Code), then this function should simply return a pointer to the static corsaro_plugin structure. The plugin manager will then make a copy for this specific instance.

For example, the Raw pcap plugin uses the following static corsaro_plugin structure definition and alloc function:

static corsaro_plugin_t corsaro_pcap_plugin = {
  PLUGIN_NAME,                                 /* name */
  CORSARO_PLUGIN_ID_PCAP,                      /* id */
  CORSARO_PCAP_MAGIC,                          /* magic */
  CORSARO_PLUGIN_GENERATE_PTRS(corsaro_pcap),  /* func ptrs */
  NULL,                                        /* next */
};

corsaro_plugin_t *corsaro_pcap_alloc(corsaro_t *corsaro)
{
  return &corsaro_pcap_plugin;
}
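The relationship between the static structure returned by alloc and the per-instance copy made by the plugin manager can be sketched as below. The demo_plugin_t type and demo_manager_instantiate function are hypothetical; how the real plugin manager performs the copy is not shown in this tutorial, so the use of malloc and memcpy here is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced plugin descriptor (not the real corsaro_plugin_t). */
typedef struct demo_plugin {
  const char *name;
  int id;
  struct demo_plugin *next;
} demo_plugin_t;

/* Shared template describing the plugin. */
static demo_plugin_t demo_example_plugin = {
  "example",  /* name */
  1,          /* id */
  NULL,       /* next */
};

/* alloc: simply hand the template to the manager. */
demo_plugin_t *demo_example_alloc(void)
{
  return &demo_example_plugin;
}

/* Manager side: copy the template so every instance owns its descriptor. */
demo_plugin_t *demo_manager_instantiate(demo_plugin_t *(*alloc)(void))
{
  demo_plugin_t *copy = malloc(sizeof(demo_plugin_t));
  if(copy == NULL)
    {
      return NULL;
    }
  memcpy(copy, alloc(), sizeof(demo_plugin_t));
  return copy;
}
```

Since each instance works on its own copy, changing a field such as next on one instance leaves the static template, and therefore every other instance, untouched.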

init_output

The init_output function is called by Corsaro to ready a plugin for use in the Corsaro-Out mode. This event should be used to establish any state necessary for analyzing packets. Plugins should expect the next event to be start_interval.

We will use the RS DoS plugin as an example for a simple init_output function which establishes some state, and opens an output file to write data to. The full corsaro_dos_init_output function is as follows:

int corsaro_dos_init_output(corsaro_t *corsaro)
{
  struct corsaro_dos_state_t *state;
  corsaro_plugin_t *plugin = PLUGIN(corsaro);
  assert(plugin != NULL);
  if((state = malloc_zero(sizeof(struct corsaro_dos_state_t))) == NULL)
    {
      corsaro_log(__func__, corsaro, 
                "could not malloc corsaro_dos_state_t");
      goto err;
    }
  corsaro_plugin_register_state(corsaro->plugin_manager, plugin, state);
  /* open the output file */
  if((state->outfile = corsaro_io_prepare_file(corsaro, plugin->name)) == NULL)
    {
      corsaro_log(__func__, corsaro, "could not open %s output file", 
                plugin->name);
      goto err;
    }
  state->attack_hash = kh_init(av);
  return 0;
 err:
  corsaro_dos_close_output(corsaro);
  return -1;
}

Breaking it down, there are three important steps:

Allocate memory for our state structure
Register the state structure with the plugin manager
Open the output file

Allocate Memory for State

Because Corsaro is designed such that multiple instances of a plugin can be used at once, per-instance state must not be stored in static structures. Instead we create a special state structure and register it with the plugin manager so that it can be retrieved at any time it is needed.

struct corsaro_dos_state_t *state;
...
if((state = malloc_zero(sizeof(struct corsaro_dos_state_t))) == NULL)
  {
    corsaro_log(__func__, corsaro, 
                    "could not malloc corsaro_dos_state_t");
    goto err;
  }

Corsaro provides a utility function, malloc_zero, which will allocate and zero a block of memory. If the malloc fails, we issue a log message using corsaro_log and jump to the err block which simply calls the corsaro_dos_close_output function described in the next section.
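For readers unfamiliar with this kind of helper, a zeroing allocator can be sketched with the C standard library as follows. demo_malloc_zero and demo_state_t are hypothetical names, and Corsaro's own malloc_zero may be implemented differently. The sketch also relies on all-zero bytes representing NULL pointers, which holds on mainstream platforms.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical equivalent of a zeroing allocator. */
void *demo_malloc_zero(size_t size)
{
  void *ptr = malloc(size);
  if(ptr != NULL)
    {
      memset(ptr, 0, size);
    }
  return ptr;
}

/* Hypothetical state structure with two resources to be set up later. */
struct demo_state_t {
  void *outfile;
  void *attack_hash;
};

/* Every member of the returned structure starts out as NULL. */
struct demo_state_t *demo_state_create(void)
{
  return demo_malloc_zero(sizeof(struct demo_state_t));
}
```

Starting from a zeroed structure is what allows a cleanup function to test each member against NULL and release only what was actually allocated.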

Register State with Plugin Manager

To register the newly created state structure with the plugin manager, we simply call the corsaro_plugin_register_state function:

corsaro_plugin_t *plugin = PLUGIN(corsaro);
...
corsaro_plugin_register_state(corsaro->plugin_manager, plugin, state);

This function takes three arguments, a pointer to the plugin manager (contained in the corsaro state structure), a pointer to the plugin which is registering the state, and a void pointer to a state structure. We use the PLUGIN macro which was described in Implementation Code to retrieve a pointer to the plugin structure.
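Conceptually, registering state amounts to storing the pointer in a slot keyed by the plugin ID, and freeing it reverses that. The sketch below shows one way this could look; the demo_ types and functions and the DEMO_PLUGIN_ID_MAX bound are hypothetical, and the internals of corsaro_plugin_register_state and corsaro_plugin_free_state are assumptions rather than the actual implementation.

```c
#include <stdlib.h>

#define DEMO_PLUGIN_ID_MAX 4  /* hypothetical upper bound on plugin IDs */

typedef struct demo_plugin {
  const char *name;
  int id;
} demo_plugin_t;

typedef struct demo_plugin_manager {
  /* one untyped slot per possible plugin ID */
  void *plugins_state[DEMO_PLUGIN_ID_MAX + 1];
} demo_plugin_manager_t;

/* Store the state pointer in the slot belonging to this plugin. */
void demo_plugin_register_state(demo_plugin_manager_t *manager,
                                demo_plugin_t *plugin, void *state)
{
  manager->plugins_state[plugin->id] = state;
}

/* Release the state and clear the slot so later lookups return NULL. */
void demo_plugin_free_state(demo_plugin_manager_t *manager,
                            demo_plugin_t *plugin)
{
  free(manager->plugins_state[plugin->id]);
  manager->plugins_state[plugin->id] = NULL;
}
```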

Open Output File

/* open the output file */
if((state->outfile = corsaro_io_prepare_file(corsaro, plugin->name)) == NULL)
  {
    corsaro_log(__func__, corsaro, "could not open %s output file", 
                        plugin->name);
    goto err;
  }

Because the RS DoS plugin uses a generic output file, we simply used the corsaro_io_prepare_file function as described in IO Framework. We store the returned corsaro_file pointer in the state structure for later use. If the file could not be opened, we issue a log message and jump to the err block to clean up and return an error code.

These are the common steps that nearly every plugin will follow when initializing for output. At this time you should also establish any other state required. For example, the RS DoS plugin initializes a hash table and stores a pointer in the state structure:

state->attack_hash = kh_init(av);

close_output

The close_output function should free any state that was established in the init_output function (or during processing). The close_output function should also be able to free partial state, i.e. when an error occurs during initialization and some (or all) of the state is therefore unallocated. The close_output function corresponding to the init_output function shown above for the RS DoS plugin is as follows:

int corsaro_dos_close_output(corsaro_t *corsaro)
{
  struct corsaro_dos_state_t *state = STATE(corsaro);
  if(state != NULL)
    {
      if(state->attack_hash != NULL)
        {
          kh_free(av, state->attack_hash, &attack_vector_free);
          kh_destroy(av, state->attack_hash);
          state->attack_hash = NULL;
        }
      if(state->outfile != NULL)
        {
          corsaro_file_close(corsaro, state->outfile);
          state->outfile = NULL;
        }
      corsaro_plugin_free_state(corsaro->plugin_manager, PLUGIN(corsaro));
    }
  return 0;
}

It starts by using the STATE macro to retrieve a pointer to the state structure stored for this plugin. If this pointer is not NULL, it then proceeds to free the hash table and close the output file (if they are still allocated). The output file is closed using the corsaro_file_close function described in the IO Framework section. Once the members have been freed, the state structure itself is freed by calling corsaro_plugin_free_state and passing a pointer to the plugin manager and to 'itself', the structure for the current plugin.
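The interplay between the err block of an init function and a close function that tolerates partial state can be sketched using only the C standard library. The demo_ names, the single static slot standing in for the plugin manager, and the use of FILE in place of a corsaro_file are all simplifications made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical state: an output file plus one analysis structure. */
struct demo_state_t {
  FILE *outfile;
  int *counters;
};

/* Stand-in for the slot the plugin manager would keep for this plugin. */
static struct demo_state_t *demo_registered_state = NULL;

/* Safe to call at any point: every member is checked before release. */
int demo_close_output(void)
{
  struct demo_state_t *state = demo_registered_state;
  if(state != NULL)
    {
      if(state->counters != NULL)
        {
          free(state->counters);
          state->counters = NULL;
        }
      if(state->outfile != NULL)
        {
          fclose(state->outfile);
          state->outfile = NULL;
        }
      free(state);
      demo_registered_state = NULL;
    }
  return 0;
}

int demo_init_output(const char *path)
{
  struct demo_state_t *state;
  /* calloc zeroes the block, so unset members read as NULL */
  if((state = calloc(1, sizeof(struct demo_state_t))) == NULL)
    {
      goto err;
    }
  /* register first, so the close function can find partial state */
  demo_registered_state = state;
  if((state->outfile = fopen(path, "w")) == NULL)
    {
      goto err;
    }
  if((state->counters = calloc(16, sizeof(int))) == NULL)
    {
      goto err;
    }
  return 0;
 err:
  demo_close_output();
  return -1;
}

/* Reports whether any state is currently registered. */
int demo_has_state(void)
{
  return demo_registered_state != NULL;
}
```

Registering the state before acquiring further resources is the design choice that makes a single cleanup path sufficient: whichever step fails, the close function sees exactly what has been allocated so far.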

start_interval

The start_interval function is used by Corsaro to inform a plugin that a new interval (see the Intervals section) has begun. The plugin may need to initialize some per-interval state at this point. It is perfectly acceptable to ignore a start_interval event (i.e. simply return 0) if the plugin has no use for it.

end_interval

The end_interval function is used by Corsaro to inform a plugin that the current interval is ending. The plugin may need to write out aggregated data (using the Corsaro IO Framework) for the interval, and possibly free some per-interval state at this point. As with the start_interval event, a plugin may safely ignore an end_interval event.
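A typical use of the two interval events is to reset a counter when an interval starts and to write it out when the interval ends. The following sketch shows that pattern under simplifying assumptions: the demo_interval fields (number and time), the demo_state_t structure, and the use of a plain FILE instead of the Corsaro IO Framework are all hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; the field names are assumptions. */
struct demo_interval {
  uint16_t number;  /* sequence number of the interval */
  uint32_t time;    /* timestamp at which the event fired */
};

struct demo_state_t {
  uint64_t pkt_cnt; /* per-interval packet counter */
  FILE *outfile;
};

/* Reset per-interval state. */
int demo_start_interval(struct demo_state_t *state,
                        const struct demo_interval *int_start)
{
  (void)int_start; /* unused in this sketch */
  state->pkt_cnt = 0;
  return 0;
}

/* Accumulate while the interval is open. */
int demo_process_packet(struct demo_state_t *state)
{
  state->pkt_cnt++;
  return 0;
}

/* Write one line of aggregated data for the interval that just ended. */
int demo_end_interval(struct demo_state_t *state,
                      const struct demo_interval *int_end)
{
  if(fprintf(state->outfile, "%" PRIu16 " %" PRIu32 " %" PRIu64 "\n",
             int_end->number, int_end->time, state->pkt_cnt) < 0)
    {
      return -1;
    }
  return 0;
}
```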

process_packet

The process_packet event is likely where the majority of a plugin's analysis code will go. The function is called once for every packet that Corsaro processes. Corsaro provides two arguments to the function - a pointer to the current corsaro state structure, from which the plugin and plugin state structures can be retrieved, and also a pointer to a corsaro_packet structure.

A corsaro_packet is a lightweight wrapper around a libtrace packet and a corsaro_packet_state structure. The libtrace packet encapsulates the actual packet that was captured, and all the regular libtrace API functions may be used on it. See the libtrace API documentation for more details. Corsaro provides a simple macro, LT_PKT, which dereferences the corsaro_packet pointer to allow access to the libtrace packet. For example, the RS DoS plugin uses this simple check to ensure that there are no non-IPv4 packets in the trace:

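libtrace_packet_t *ltpacket = LT_PKT(packet);
void *temp = NULL;              /* layer 3 header returned by libtrace */
libtrace_ip_t *ip_hdr = NULL;   /* set once the packet is known to be IPv4 */
uint16_t ethertype;
uint32_t remaining;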
...
/* check for ipv4 */
if((temp = trace_get_layer3(ltpacket, &ethertype, &remaining)) != NULL &&
   ethertype == TRACE_ETHERTYPE_IP)
  {
    ip_hdr = (libtrace_ip_t *)temp;
  }
else
  {
    /* not an ip packet */
    corsaro_log(__func__, corsaro, "non-ip packet found (ethertype: %x)", 
              ethertype);
    return 0;
  }

The corsaro_packet_state structure which is contained in the corsaro_packet structure is used to pass meta-data about a packet down the plugin chain. Currently, the corsaro_packet_state structure (in corsaro_int.h) must be altered to store new meta-data. This will likely change in a future version. To make use of the packet state, a plugin would add an appropriate field to the structure, and then after some analysis of a packet, set the field to an appropriate value. The state is then passed down the plugin chain, and successive plugins can leverage the meta-data derived by the earlier plugin.

For example, if we had some code to perform geolocation of IP addresses, we could augment each packet with the latitude and longitude of the source. First we would add fields to the corsaro_packet_state structure:

typedef struct corsaro_packet_state
{
  ...
  /* The latitude of the source IP address (using xxx geolocation) */
  int16_t latitude;
  /* The longitude of the source IP address (using xxx geolocation) */
  int16_t longitude;
} corsaro_packet_state_t;

Then in the process_packet function, our geolocation plugin would take the packet provided, find the geolocation data for the source address, and store the latitude and longitude in the packet state structure:

int corsaro_dos_process_packet(corsaro_t *corsaro, 
                             corsaro_packet_t *packet)
{
  ...
  int16_t latitude;
  int16_t longitude;
  /* code to do geolocation here... */
  packet->state.latitude = latitude;
  packet->state.longitude = longitude;
  return 0;
}
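To show how a later plugin in the chain might consume meta-data set by an earlier one, the following self-contained sketch runs two process_packet functions over the same packet. All demo_ names, the has_geo flag, and the fixed coordinates are hypothetical; in particular, the way Corsaro itself iterates over the plugin chain is not shown in this tutorial and is only approximated here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet state carrying geolocation meta-data. */
struct demo_packet_state {
  uint8_t has_geo;    /* set once latitude/longitude are valid */
  int16_t latitude;
  int16_t longitude;
};

struct demo_packet {
  struct demo_packet_state state;
};

typedef int (*demo_process_fn)(struct demo_packet *packet);

/* Earlier plugin: annotates the packet. Fixed values stand in for a
 * real geolocation lookup. */
int demo_geo_process_packet(struct demo_packet *packet)
{
  packet->state.latitude = 33;
  packet->state.longitude = -117;
  packet->state.has_geo = 1;
  return 0;
}

static unsigned long demo_northern_cnt = 0;

/* Later plugin: relies on the annotation instead of repeating the lookup. */
int demo_report_process_packet(struct demo_packet *packet)
{
  if(packet->state.has_geo && packet->state.latitude > 0)
    {
      demo_northern_cnt++;
    }
  return 0;
}

unsigned long demo_report_northern_cnt(void)
{
  return demo_northern_cnt;
}

/* Hand the same packet to each plugin in order; stop on the first error. */
int demo_run_chain(demo_process_fn *chain, size_t chain_len,
                   struct demo_packet *packet)
{
  size_t i;
  for(i = 0; i < chain_len; i++)
    {
      if(chain[i](packet) != 0)
        {
          return -1;
        }
    }
  return 0;
}
```

The order of the chain matters: if the reporting plugin ran before the geolocation plugin, has_geo would still be zero and the packet would not be counted.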

Implementing the Corsaro-In API

Under Development

This section describes each of the functions that must be implemented to fully comply with the Corsaro-In API. Each function has some code snippets that are almost always used. The actual function implementations will have corsaro_<plugin_name>_ prefixed to the names.

This section assumes that the Corsaro-Out API has already been fully implemented (in order to have generated the data), but it is theoretically possible to implement only the alloc function from that API and still have the Corsaro-In API function correctly.