WAN Bandwidth Estimation

Stefano Masini

ste@caida.org

Summer 2001

Existing Bandwidth Estimation (bwest) tools

Actively being tested (collecting data)

pathchar (first, by Van Jacobson)

pathrate (Dovrolis, U Delaware)

pipechar (LBL, Berkley Labs)

pchar (Bruce Mah, Cisco)

{bc}probe (Oceans, Boston U)

Work in progress

nettest

nettimer (Kevin Lai, Stanford)

bing (not maintained)

clink (Downey, Wellesley Coll.)

netperf (HP)

treno (Pittsburg Supercomp. Ctr)

ttcp (Army Research Lab)

nttcp (Tech. Univ. Munchen)

sprobe (U Washington)

The testbed

The main testbed (cont)

Fully connected graph

20 distinct source-destination pairs

Each host probes each other every 3 minutes via traceroute

Few hundreds of distinct paths seen in 3 months

Controlling the testbed

Centralized headquarter (Stefano’s laptop)

Perl server running on each host

Headquarter issues commands to hosts

Convenient command line syntax

Data is encrypted

Fast response (no ssh connection delay)

Headquarter keeps local mirror of hosts file systems (rsync) for convenient access to the data

Server features

Simple encryption schema for communication with headquarter

XOR of data with shared secret key

Resistant to crashes and/or kill

Job control

Job timeout

Output of issued commands logged to file

Automated tool execution

Tools must be run across every distinct source-destination pair (as long as portability allows).

At any given time only one src-dest pair is being tested.

Some tools may suffer from machine load and/or network load.

Easiest way to avoid both is to do one thing at a time.

Tool execution is “wrapped” around an executor script, which takes care of scheduling the runs at the various hosts.

Distributed approach preferred over centralized one:

No need for centralized scheduler

No need for extra communication

The executor wrapper

A global run defined as a sequence of local runs.

A local run is an execution of the tool over a given src-dest pair.

A global run contains exactly one local run for every src-dest pair.

The state of a global run represents which src-dest pairs have been tested so far (and which have not).

Wrapper running at a given host starts the tool with the correct destination basing on the state, and after completion it updates the state and runs a new wrapper on the next designated host with the new state.

Tools evaluation

Evaluation strategy

Collect outputs of the various tools

as many src-dest pairs as possible

testing period as long as possible

Parse the outputs to produce only numbers in a way that allows for statistical analysis of many runs

Compare the estimated network properties with real network topology (where known)

Sample results

Pipechar results. 647 runs. 2 weeks.

The topology knowledge

Information about the real topology

CAIDA peering with sysadmins of many ASes.

Topology information is difficult to manage:

Incredibly high level of detail

No homogeneous structure/technology

No steady technology (changing fast over time)

Need to make as few assumptions and simplifications as possible because we don’t know which factors are important for bandwidth estimation

Database overview

Main components:

nodes (router, workstation, switch, etc.)

interfaces (network card, router/switch ports, etc.)

links

IP addresses (including names and ASes)

Extreme importance to keeping track of the information sources and people.

Important relations among components

One interface ® many IP addresses

e.g. IP aliasing

Many interfaces ® one IP address

e.g. Load balancing in server cluster

Many interfaces ® one link

e.g. Ethernet

One interface ® many links

e.g. Special hardware for low-level load balancing (Cisco?)

The Database E/R Scheme

Links: logical or physical?

Links technology is difficult to categorize:

Ethernet, Token Ring, FDDI

Packet over ATM

ATM virtual circuits

Packet over Sonet (POS)

Software limited bandwidth channels

... and in a few months?

Links: logical or physical? (cont)

Definition left vague in the DB design

Added attribute “containing_link_id” to link_info

Currently unused

Allows for representation of tree structures:

Information sources

Every piece of knowledge about the topology is added to the database along with the source of the information and the time

allows to fetch a “snapshot” of the topology at any given time, past or present

useful for estimating tools performance because at the time the logs are analyzed the topology may have changed

How is the topology determined?

Real topology is impossible to determine in an automated way

Need to gather infos from the sysadmins

Slow process

Massive use of Traceroute and/or Skitter

Discovered topology is just an approximation of the reality, but enough for tool performance estimation

Fast and automated process

Sample topology and traceroute behaviour

Traceroute reconstructed path:

Sample topology and traceroute behaviour

Traceroute reconstructed path:

Interfaces C and E are missing!

Hidden interfaces assumption

In interpreting a traceroute path we assume the existence of hidden interfaces.

e.g. In the following example in the database would be inserted not only the four interfaces A, B, D and F, but also two unnamed interfaces (which correspond in reality to C and E)

this assumption may cause a reverse path to appear distinct from the forward one even if they are symmetric

the only way to solve the problem is asking to the sysadmins L

More details to come...

Database on Kokoro

Tools and documentation on Cider

Stefano on the Moo

Thanks CAIDA !!


	Actively being tested (collecting data)
		pathchar (first, by Van Jacobson)
		pathrate (Dovrolis, U Delaware)
		pipechar (LBL, Berkley Labs)
		pchar (Bruce Mah, Cisco)
		{bc}probe (Oceans, Boston U)
	Work in progress
		nettest
		nettimer (Kevin Lai, Stanford)
		bing (not maintained)
		clink (Downey, Wellesley Coll.)
		netperf (HP)
		treno (Pittsburg Supercomp. Ctr)
		ttcp (Army Research Lab)
		nttcp (Tech. Univ. Munchen)
		sprobe (U Washington)


	Fully connected graph
		20 distinct source-destination pairs

	Each host probes each other every 3 minutes via traceroute
		Few hundreds of distinct paths seen in 3 months


	Centralized headquarter (Stefano’s laptop)
	Perl server running on each host
	Headquarter issues commands to hosts
		Convenient command line syntax
		Data is encrypted
		Fast response (no ssh connection delay)
	Headquarter keeps local mirror of hosts file systems (rsync) for convenient access to the data


	Simple encryption schema for communication with headquarter
		XOR of data with shared secret key
	Resistant to crashes and/or kill
	Job control
	Job timeout
	Output of issued commands logged to file


	Tools must be run across every distinct source-destination pair (as long as portability allows).
	At any given time only one src-dest pair is being tested.
		Some tools may suffer from machine load and/or network load.
		Easiest way to avoid both is to do one thing at a time.
	Tool execution is “wrapped” around an executor script, which takes care of scheduling the runs at the various hosts.
	Distributed approach preferred over centralized one:
		No need for centralized scheduler
		No need for extra communication


	A global run defined as a sequence of local runs.
	A local run is an execution of the tool over a given src-dest pair.
	A global run contains exactly one local run for every src-dest pair.
	The state of a global run represents which src-dest pairs have been tested so far (and which have not).
	Wrapper running at a given host starts the tool with the correct destination basing on the state, and after completion it updates the state and runs a new wrapper on the next designated host with the new state.


	Collect outputs of the various tools
		as many src-dest pairs as possible
		testing period as long as possible
	Parse the outputs to produce only numbers in a way that allows for statistical analysis of many runs
	Compare the estimated network properties with real network topology (where known)


	CAIDA peering with sysadmins of many ASes.
	Topology information is difficult to manage:
		Incredibly high level of detail
		No homogeneous structure/technology
		No steady technology (changing fast over time)
		Need to make as few assumptions and simplifications as possible because we don’t know which factors are important for bandwidth estimation


	Main components:
		nodes (router, workstation, switch, etc.)
		interfaces (network card, router/switch ports, etc.)
		links
		IP addresses (including names and ASes)
	Extreme importance to keeping track of the information sources and people.


	One interface ® many IP addresses
		e.g. IP aliasing
	Many interfaces ® one IP address
		e.g. Load balancing in server cluster
	Many interfaces ® one link
		e.g. Ethernet
	One interface ® many links
		e.g. Special hardware for low-level load balancing (Cisco?)


	Links technology is difficult to categorize:
		Ethernet, Token Ring, FDDI
		Packet over ATM
		ATM virtual circuits
		Packet over Sonet (POS)
		Software limited bandwidth channels
		... and in a few months?


	Definition left vague in the DB design

	Added attribute “containing_link_id” to link_info
		Currently unused
		Allows for representation of tree structures:


	Every piece of knowledge about the topology is added to the database along with the source of the information and the time
		allows to fetch a “snapshot” of the topology at any given time, past or present
		useful for estimating tools performance because at the time the logs are analyzed the topology may have changed


	Real topology is impossible to determine in an automated way
		Need to gather infos from the sysadmins
		Slow process

	Massive use of Traceroute and/or Skitter
		Discovered topology is just an approximation of the reality, but enough for tool performance estimation
		Fast and automated process


	Traceroute reconstructed path:


	Interfaces C and E are missing!


	In interpreting a traceroute path we assume the existence of hidden interfaces.
		e.g. In the following example in the database would be inserted not only the four interfaces A, B, D and F, but also two unnamed interfaces (which correspond in reality to C and E)
		this assumption may cause a reverse path to appear distinct from the forward one even if they are symmetric
		the only way to solve the problem is asking to the sysadmins L



	Database on Kokoro

	Tools and documentation on Cider

	Stefano on the Moo