Comparing Multi6 and Atoms

Introduction
============

This document summarises the state of the IETF multi6 working group around the time of IETF 57 from the perspective of the atoms project. The proposals made within the multi6 group are briefly reviewed and compared with the atoms architecture.

Multi6 goals
------------

http://ietf.org/html.charters/multi6-charter.html

Multihoming today is done largely by having a site obtain a block of address space and then advertising a route for that prefix through each of its ISP connections. The address block may be from the so-called provider independent space, or may be a sub-allocation from one of its ISPs. A site's ISPs in turn advertise the prefix to some or all of their upstream connections and the route for the prefix may propagate to all of the routers connected to the default-free zone (DFZ). As the number of sites multihoming in this manner increase, the number of routes propagated throughout the DFZ increases and overall routing stability decreases because of the burden on convergence time.

This WG will explore alternative approaches with better scaling properties. Specifically, the WG will prefer multi-homing solutions that tend to minimise adverse impacts on the end-to-end routing system and limit the number of prefixes that need to be advertised in the Default-Free Zone (DFZ).

Atoms goals
-----------

https://www.caida.org/funding/atomized_routing/

CAIDA is researching and implementing modifications to BGP routing that aggregate prefixes into equivalence classes (policy atoms) based on common AS path from a given topological location. The motivation behind development of BGP atomization mechanisms is to achieve potential savings in computation and communication costs (by absorbing routing dynamics of prefixes into coarser grained atoms), as well as reduction in BGP table size (there will be far fewer atoms than prefixes).

Current multi6 proposals
========================

Preliminary note: multi6 deals with site multihoming, not per-host multihoming. Since in this context there can be no confusion, I'm referring to a host in a multihomed site as a 'multihomed host'.

The multi6 proposals fall into the following (overlapping) categories:

Category: multiple prefixes
---------------------------

Under the proposals in this category a multihomed site has multiple prefixes, one prefix per provider. The main proposal in this category is [HOSTS].

Each host in the multihomed site is assigned as many addresses as there are providers: one address from each provider's prefix. The prefix obtained from a provider is fully aggregatable in that provider's address space, and is not globally advertised. In addition, no provider-independent address space assignments are made to multihomed sites. As a result, there are no global routes *specifically* for a multihomed site, as there are today. A multihomed site is (globally) reachable through one of its providers' routes.

[ In some cases PI address space might still be used for multihomed sites, but this would be the exception rather than the rule. For example, only very large multinational sites might be assigned PI address space. ]

Today, traffic to a multihomed site is forwarded on the site's globally advertised prefix. BGP keeps track of the 'best' path to the site's prefix from any given point in the Internet (based on reachability, metrics and policy). Under the proposals in this category, with multiple prefixes per site and each prefix aggregated into a site's provider's address space, BGP can only keep track of what paths are best to the site's providers.

When sending traffic to a multihomed host, the sender must pick one of the host's addresses as the destination address. But picking a destination address affects which of the site's providers, and therefore which path, is used to deliver traffic from the sender to the site.
Therefore destination address selection effectively makes a routing decision that is made today by BGP, and should be sensitive to reachability and metrics associated with the path corresponding to each address.

With multiple addresses assigned to a multihomed host it becomes possible to dynamically change the IP address used to address the host. For example, when a multihomed site loses connectivity to one of its providers, the hosts in the site can continue to communicate if they are subsequently addressed by an IP address obtained from another provider.

This raises the issue that transport layers and applications use IP addresses to identify hosts. In particular TCP connections do not tolerate IP address changes of a host. One of the goals set by multi6 is that long-lived transport connections (e.g. TCP connections) should be preserved given address changes due to multihoming.

Note that dynamically switching between addresses during the lifetime of a connection is not unlike mobility. Some proposals therefore incorporate IP mobility mechanisms [MULTI6-MNM]. Other proposals invoke SCTP to solve transport connection survivability [SCTP-MULTIHOME], or depend on the locator/identifier distinction described below.

Other issues for 'Multiple prefixes'
------------------------------------

When a multihomed host sends a packet, a choice can be made which of the site's providers to use. An issue here is that the packet's source address must be part of the chosen provider's address space, or the provider will reject the packet due to ingress filtering (a commonly used measure against source address spoofing) [HOSTS].

Another issue with source address selection is that unless something special is done, the source address used by a host will be used by the peer as the destination address for reverse traffic. Therefore choosing a source address may affect how reverse traffic is forwarded, which may not be desirable.

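As a rough illustration of the ingress filtering constraint described above, the following Python sketch picks a source address that lies inside the chosen provider's prefix. The function names, prefixes and addresses are hypothetical (the prefixes are taken from the 2001:db8::/32 documentation range) and are not part of [HOSTS].

```python
from ipaddress import IPv6Address, IPv6Network

# Hypothetical provider prefixes and host addresses (documentation range).
PROVIDER_A = "2001:db8:a000::/36"
PROVIDER_B = "2001:db8:b000::/36"
HOST_ADDRESSES = ["2001:db8:a000:1::10", "2001:db8:b000:1::10"]


def passes_ingress_filter(source, provider_prefix):
    """True if the provider would accept a packet with this source address."""
    return IPv6Address(source) in IPv6Network(provider_prefix)


def pick_source_address(host_addresses, provider_prefix):
    """Return the first host address inside the chosen provider's prefix.

    Returns None if the host has no address from that provider, in which
    case the packet cannot be sent through that provider at all.
    """
    for address in host_addresses:
        if passes_ingress_filter(address, provider_prefix):
            return address
    return None


if __name__ == "__main__":
    chosen = pick_source_address(HOST_ADDRESSES, PROVIDER_B)
    print("source address for provider B:", chosen)
    # Sending via provider B with an address from provider A would be dropped.
    print("A-address accepted by B:",
          passes_ingress_filter(HOST_ADDRESSES[0], PROVIDER_B))
```

Note that the address returned here is also the address the peer will normally reply to, so in this sketch the choice of outbound provider fixes the inbound provider as well.
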
There are cases in which it is impossible to find a pair of addresses that allows two hosts (in different multihomed sites) to communicate bidirectionally. [???]

Relevant drafts in this category:

The main draft is [HOSTS]. [SCTP-MULTIHOME] describes different ways in which SCTP can work or fails to work for multihomed hosts. [MULTI6-MNM] describes how MIPv6 can be applied to multihoming. [NAROS] is a traffic engineering solution for a multihomed site.

Category: Loc/ID
----------------

Traditionally, IP addresses have been used for identification and location of a host (or network interface). An example of IP address based identification is checksum computation by transport layers. An example of location is traffic forwarding based on IP addresses.

The proposals in this category explicitly distinguish identity from location: each host is assigned an identifier, which is associated with one or more locators. (I'm ignoring the fact that a host can have more than one identifier, and identifiers can be associated with entities other than hosts, such as interfaces.)

Locators are used to forward packets through the network and could be today's IP addresses. The set of locators of a host can change dynamically, e.g. be dependent on reachability properties.

Identifiers are used by transport (and security) protocols and applications to identify hosts. The identifier of a host is more stable than its locators and can be maintained despite changes to the set of locators. An example of an identifier is a (hash of a) public key.

In the context of multihoming, 'Loc/ID' appears to be mostly a subcategory of 'Multiple prefixes'. The locator/identifier distinction allows a host to be associated with multiple addresses (possibly changing over time) and yet preserve its identity to the transport layer and applications. In particular it solves the specific problem of how to enable transport connections (e.g.
TCP connections) to survive events in which the 'preferred' address of a host changes (as described earlier under 'Multiple prefixes'). However, the locator/identifier distinction has broader applicability than multihoming. It can be applied in the area of mobility and enables sites to cope with IPv6 renumbering events. As a special case, it may allow sites to switch providers more easily. Another reason to distinguish location from identity is to allow a more independent evolution of architectural layers.

The locator/identifier distinction comes in several varieties:

o  'Combined': A locator and an identifier are combined to form an IPv6 address (e.g. 8 bytes of each). To properly identify a host an application or transport layer must only consider the identifier part of an address, or the locator part must be cleared (e.g. zeroed out) before the IPv6 address is presented to the application or transport layer. [GSE, E2E-MULTIHOMING, LIN6-MULTIHOME-API]

o  'Separated': IDs and locators are separate entities. The locator is commonly an IPv6 address, and the identifier takes on different forms in different proposals. Having separated IDs and locators may imply an explicit mapping of ID to locators and possibly also a reverse mapping.

   HIP [HIP-ARCH, HIP-MM] proposes a new namespace, the Host Identity namespace, and a new layer, the Host Identity Layer, between the internetworking and transport layers. Identifiers are based on public keys.

   In [2PI1A] a multihomed host uses one of its IPv6 addresses as identifier and the remainder as locators. (The identifier optionally does double duty as a locator.)

   [MHAP] is discussed in more detail later.

Category: IPv4-style
--------------------

In this category multihoming is performed as it is today and as described in the multi6 charter quoted above. However the effects on the global routing table are mitigated in some way.
There are currently two proposals based on this principle:

o  [ASN-PI]
   This proposal derives multihomed IPv6 provider-independent prefixes from AS numbers *currently* allocated. This is explicitly a temporary limited application of IPv4-style multihoming. It is restricted to a limited number of sites by incorporating an AS number in the range of 1 - 32767 in the PI prefix.

o  [ISP-INT-AGGR]
   Within an AS, the global routing table is partitioned and distributed over the border routers. Border routers are dedicated to carrying the assigned partition and merely carry an aggregated route for each partition not assigned to them. 'Scenic routing' is avoided by assigning address space in a geographically aggregatable manner. Proposed as an intermediate solution.

MHAP
----

[MHAP] belongs to the Loc/ID category ('separated' variant), but is sufficiently distinct from other drafts to warrant a separate discussion. The MHAP draft is quite extensive, and the following is highly summarised and omits many details. Also, I have replaced some of the terminology.

MHAP's Internet consists of two BGP-based infrastructures:

o  A 'PA infrastructure'
   A multihomed site is assigned multiple provider-aggregatable prefixes, one for each provider. Single-homed sites have one PA prefix. The PA address space can be highly aggregated.

o  A 'PI infrastructure'
   Each multi-homed site has one provider-independent prefix. A single-homed site does not have PI address space.

The infrastructures have their own default-free zones (DFZs) and are able to forward packets independently.

The bulk of the traffic is carried over the PA infrastructure as follows. A host that wishes to send a packet to a multihomed host addresses the packet to the remote host's PI address. Before the packet leaves the source site, the PI destination address is 'aliased' to (replaced by) one of the corresponding PA addresses and then forwarded by the PA infrastructure based on the chosen PA address.
As the packet enters the destination site (before it reaches the destination host) the destination PA address is aliased back to the original PI destination address. Note that the process of aliasing twice is somewhat similar to tunneling or MPLS.

This process effectively separates location and identity for multihomed destinations, with PA addresses acting as locators and PI addresses acting as identifiers. The source site is required to learn from the destination site the mapping between the destination site's PI prefix and one or more corresponding PA prefixes.

The PI infrastructure is capable of forwarding (unaliased) packets to the PI address space but is only used in a limited way:

o  To discover the mapping between a PI prefix and the PA prefixes, the source site sends a request to the destination site. This request is forwarded over the PI infrastructure.

o  To improve traffic latency while the discovery is in progress, a limited amount of unaliased traffic is forwarded by the PI infrastructure.

o  During deployment and to support senders that are not MHAP-aware, other traffic may be forwarded by the PI infrastructure.

The PI infrastructure potentially contains large routing tables, but carries a relatively light traffic load. Conversely, the PA infrastructure carries heavy traffic load, but contains small, highly aggregated, routing tables.

Distributing information
------------------------

In most proposals sites, hosts or routers need to communicate multihoming-related information. Depending on the proposal, one or more of the following kinds of information must be distributed:

o  Alternative addresses available to address a host.

o  A locator/identifier mapping.

o  Routing information (reachability, metrics, etc.).

This section summarises the mechanisms deployed.

BGP
---

All proposals rely on BGP in one way or another. In particular the IPv4-style multihoming proposals depend on BGP for providing routes to and from a multihomed site, as is the case today.

To select between multiple addresses or between multiple locators, [E2E-MULTIHOMING] places full BGP tables on every host. BGP routing information can then guide the selection process. The assumption is that under IPv6 aggressive aggregation will be possible and full routing tables will not be very large.

Measurement
-----------

As described earlier ('Multiple prefixes'), selecting an address or locator out of a number of alternatives affects path selection and is therefore a routing decision. A number of drafts propose guiding the address selection process by taking (end-to-end) measurements of paths and/or detecting ICMP messages for failed paths. A simple implementation is for a host to try different addresses in turn until a satisfactory path is found.

DNS
---

DNS may be used to carry multihoming information. In fact DNS is already capable of carrying multiple addresses for a given hostname. For Loc/ID solutions, DNS can be extended to contain locator-identifier associations.

The advantage of using DNS is that the infrastructure is already there and a host initiating communication usually needs to access DNS anyway to map a hostname to an IP address. Additional multihoming information can be retrieved at the same time.

In [GSE], [HOSTS], [SCTP-MULTIHOME] and [MULTI6-MNM] an initial set of alternative destination addresses is looked up in DNS.

In [2PI1A] a client uses DNS to find multiple IPv6 addresses of a multihomed server. One of these addresses is the identifier of the server. The remainder are locators.

[E2E-MULTIHOMING] proposes to use DNS for mapping a hostname to a set of addresses, and for mapping an identifier of a host to a hostname. Since an identifier forms part of an address, the two mappings together enable a mapping from one address to alternative addresses.

Source address
--------------

The source address of a packet is normally used as the destination address of reverse traffic.
This property can be used to signal a (preferred) destination address to be used by the remote peer [GSE].

In [HOSTS] one host picks a pair of endpoint IP addresses and the same pair is used by the peer. [NAROS] appears to use this method too.

In [2PI1A] locators and identifiers are IPv6 addresses. One of the locators of a host may act as the identifier. As described above, a server can have any number of locators registered in DNS. When sending a packet to a client, the server may indicate its 'preferred' locator using the source address.

In addition, [2PI1A] uses a clever trick that allows a multihomed client to signal a pair of locators (one of which is an identifier) to another host without using DNS. The locator pair is constructed in such a way that either locator can be transformed into the other. The client need only fill in either of the locators in the source address of a packet in order to transmit both locators.

Peer updates
------------

A multihomed host may inform its peer of alternative address(es) to use after communication between the hosts has started. If no locator/identifier distinction is used, then some other mechanism is needed that allows transport connections to survive address changes. Likely candidate mechanisms are mobility mechanisms, e.g. the 'Binding Update' of MIPv6. [HIP-MM] [SCTP-MULTIHOME] [MULTI6-MNM]

Special Servers
---------------

Some drafts introduce special servers (or 'agents') that keep track of locations. In most cases these servers belong to a multihomed receiver and tell the sender what destination address to use [E2E-MULTIHOMING]. In other cases receiver-owned servers perform relaying of traffic [HIP-MM].

In [NAROS] a server is associated with a multihomed site and tells senders in that site what source address (and thereby what provider) to use for outbound traffic. This enables outbound traffic engineering.
It is unclear whether the peer is expected to use this source address as its destination address (as under 'Source address'). If so, NAROS is also an inbound traffic engineering solution.

Special Infrastructure
----------------------

[MHAP] proposes a separate BGP based infrastructure that not only maps identifiers to locators but also performs limited routing based on identifiers. MHAP was discussed in more detail above.

Where to implement
------------------

Multihoming can be implemented at the network, transport or application level. The end-to-end argument implies a preference for higher layers. However, all transport protocols and applications can benefit from a single network layer based implementation. A transport layer implementation can be done for TCP but appears impractical for UDP.

Another choice is whether to implement multihoming in routers, hosts, separate servers (agents or proxies), or some combination.

Security of bindings
--------------------

Distinguishing location and identity implies a mapping between the two which should be secured. Examples of possible attacks are:

o  Diverting traffic along an unintended path by changing the set of locators in the mapping of a victim's identifier.

o  Performing a DoS attack by allocating an identifier, mapping it to a locator belonging to the victim, and starting a high-volume session for the identifier at a third party server.

Multiple prefixes have similar security issues.

Current atoms proposal
======================

This section assumes the reader is familiar with the atoms proposal and is therefore very brief.

Multihomed ASes (sites or providers) partition their set of prefixes into equivalence classes defined by routing properties (reachability through upstream providers, other routing properties). Each such set of prefixes constitutes an atom. The AS 'declares' an atom by announcing a per-atom route and specifying what prefixes are part of the atom.

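The partitioning step can be sketched in Python as follows. This is only a minimal illustration, assuming that the routing properties of a prefix are fully described by the set of upstream providers it is announced through; the function name, provider labels and prefixes are hypothetical, and the atoms proposal does not prescribe this representation.

```python
from collections import defaultdict

# Hypothetical prefixes of one multihomed AS, each with the set of upstream
# providers through which it is announced (its only routing property here).
PREFIX_PROPERTIES = {
    "192.0.2.0/25": {"ISP-A", "ISP-B"},
    "192.0.2.128/25": {"ISP-A", "ISP-B"},
    "198.51.100.0/24": {"ISP-A"},
}


def declare_atoms(prefix_properties):
    """Partition prefixes into atoms: one atom per distinct set of properties."""
    classes = defaultdict(list)
    for prefix, upstreams in prefix_properties.items():
        # Prefixes with identical routing properties fall into the same class.
        classes[frozenset(upstreams)].append(prefix)
    atoms = [
        {"upstreams": sorted(upstreams), "prefixes": sorted(prefixes)}
        for upstreams, prefixes in classes.items()
    ]
    # Sort for a stable, readable result.
    atoms.sort(key=lambda atom: atom["prefixes"])
    return atoms


if __name__ == "__main__":
    for atom in declare_atoms(PREFIX_PROPERTIES):
        print(atom["upstreams"], "->", atom["prefixes"])
```

In this example three prefixes collapse into two atoms, so the AS would announce two per-atom routes instead of three per-prefix routes.
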
For the purpose of routing an atom is represented by a special prefix, called an 'atom ID'.

Atoms are units of global routing and carried by the DFZ. Prefixes that are part of an atom are not carried as individual routes by the DFZ. However the AS is allowed to make announcements of such prefixes separate from the atom announcements. These per-prefix routes are dropped as they reach the DFZ. The per-prefix routes allow more specific routing without affecting the global routing table. They are carried by ASes upstream of the origin AS or by customers of the upstream ASes.

A mapping between atoms and contained prefixes is maintained by 'edge routers' (routers at the edge of the DFZ). As traffic enters the DFZ, the edge router that accepts it maps the destination address to the corresponding atom ID, encapsulates the packet by adding an IP header containing an address from the atom ID prefix as destination address, and forwards it. The packet is effectively tunneled along the atom route to the destination AS. At the destination AS the packet is decapsulated.

In the DFZ, only edge routers need to carry the prefix->atom ID mapping. Other DFZ routers contain a reduced, atom-based routing table. Furthermore, all routing decisions and computations in the DFZ are based on per-atom routes.

Based on observations through Routeviews (July 16 2003 12:00 data) we estimate the number of atoms to be 24430 (covering 122659 prefixes).

We are able to further reduce the number of globally advertised objects by defining the prefix equivalence classes at the level of the immediate providers of ASes, as follows. A prefix is defined to be equivalent to another prefix (regardless of which AS it belongs to) if it is announced to the same set of immediate providers, and has the same routing properties (other than the origin AS). The immediate providers are responsible for declaring atoms corresponding to these equivalence classes.
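
The difference between the two equivalence definitions can be illustrated with a small Python sketch. It assumes each route is described by a prefix, an origin AS, a set of immediate providers and an opaque 'properties' label; the data, the function name and the AS numbers (taken from the documentation range) are hypothetical and do not come from the Routeviews measurements.

```python
from collections import defaultdict

# Hypothetical routes: (prefix, origin AS, immediate providers, other properties).
ROUTES = [
    ("192.0.2.0/24", 64496, {64500, 64501}, "default"),
    ("198.51.100.0/24", 64497, {64500, 64501}, "default"),
    ("203.0.113.0/24", 64498, {64500}, "default"),
]


def group_atoms(routes, per_origin):
    """Group prefixes into atoms.

    per_origin=True  : classes are formed within each origin AS.
    per_origin=False : the origin AS is ignored, so prefixes of different
                       ASes that are announced to the same immediate providers
                       with the same properties share one atom.
    """
    atoms = defaultdict(list)
    for prefix, origin, providers, properties in routes:
        key = (frozenset(providers), properties)
        if per_origin:
            key = (origin,) + key
        atoms[key].append(prefix)
    return sorted(sorted(members) for members in atoms.values())


if __name__ == "__main__":
    print("per-origin classes:  ", group_atoms(ROUTES, per_origin=True))
    print("per-provider classes:", group_atoms(ROUTES, per_origin=False))
```

With this toy data the per-origin grouping yields three atoms, while the provider-level grouping merges the first two prefixes and yields two.
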
If we apply this method to prefixes originated by stub ASes (in the Routeviews-based AS graph) the number of atoms is reduced to 12773.

To distinguish the two kinds of atoms we call the former origin-declared atoms and the latter provider-declared atoms.

Comparing multi6 and atoms
==========================

The atoms proposal falls into the category of IPv4-style multihoming. It assumes current IPv4 practices for assignment of provider-independent and provider-aggregatable address space to sites. It is implemented entirely by routers as a modification to BGP and is transparent to hosts. It does not affect long-lived transport connections. Security considerations have not been fully analysed.

Multi6 proposals are generally specific to IPv6, with the possible exception of the IPv4-style multihoming proposals. Atoms was designed for IPv4 but is agnostic to the IP version.

Multi6 specifically addresses multihoming. However multihoming is only one of the causes of global routing table growth. [BGP-GROWTH] identifies the following causes:

o  Multihoming: the practice of announcing a global route through several providers.

o  Inbound traffic engineering: cases where prefixes are announced non-identically. Note that [BGP-GROWTH] refers to this category as 'load balancing'.

o  Failure to aggregate: cases where aggregation can be performed but isn't.

o  Address fragmentation: cases where a site is unable to aggregate address space because the address space does not form a single CIDR block.

Multi6 handling of these causes is as follows:

o  Multihoming is clearly addressed by multi6, as discussed.

o  Inbound traffic engineering is handled outside of BGP in most multi6 proposals, e.g. by having different senders use different destination addresses when sending to a multihomed site.

o  Failure to aggregate is not specifically addressed by multi6. However it is unclear to me how prevalent failure to aggregate will be in an IPv6 world.
   If most address space is allocated out of provider-aggregatable address space, this may not be an issue.

o  Address fragmentation is also not specifically addressed by multi6. The IPv6 answer to this issue is that renumbering is much more feasible than under IPv4 and can unfragment address space. However, it appears that even IPv6 renumbering remains objectionable to many people.

Atoms:

o  Multihoming and inbound traffic engineering. Under origin-declared atoms (see 'Current atoms proposal' above), one or more routing entries for each multihomed AS will be present in the global routing table. However, under provider-declared atoms, several customers that multihome to their immediate providers in the same way can be summarised into fewer global routing table entries.

o  Failure to aggregate is not specifically dealt with under atoms. Failure to aggregate might translate to failure to atomise :)

o  Fragmented address space can be aggregated by atoms since an atom is not restricted to being a CIDR block.

Multi6 techniques
-----------------

A number of multi6 proposals employ techniques (some of which are IPv6-only) that may be of use to atoms. For example:

Offloading edge routers
-----------------------

In [MHAP] sender and receiver sites are responsible for aliasing. A sender site contacts a receiver site to discover the mapping from the receiver's provider-independent address space to the corresponding provider-aggregatable address space. The main function of the PI-DFZ is to forward discovery traffic from sender to receiver site as well as to carry a limited amount of traffic while discovery is in progress.

A somewhat similar technique could be applied under atoms in order to offload the DFZ-edge routers as follows. When an edge router receives the first packet from a sender site addressed to some destination site it sends a message back to the sender informing it of the atom ID corresponding to the destination.
As soon as the sender site has received this information, it rather than the DFZ-edge router becomes responsible for performing the encapsulation. Until that time, to reduce latency the DFZ-edge router will continue to encapsulate and forward a limited amount of traffic on behalf of the sender site. This offloads the edge routers at the expense of some transparency of atoms to the sender site.

Combined Loc/ID
---------------

There are known issues with tunnels, such as path MTU discovery and ICMP handling. Under IPv6, a 'Loc/ID'-style IPv6 address could replace the tunneling mechanism in atoms, similar to [GSE]. The first part of the IPv6 address (locator) contains an atom ID and is to be filled in by the DFZ-edge router. The second part (identifier) identifies the destination host in some way to be further specified and is set by the sending host. However, such a scheme can only work if a radically different address allocation mechanism is in place. This is probably beyond the scope of the atoms project.

References
==========

[GSE] Mike O'Dell, "GSE - An Alternate Addressing Architecture for IPv6," 1997.
   http://www.free.net/Docs/IETF/draft-ietf-ipngwg-gseaddr-00.txt

[ASN-PI] P. Savola, "Multihoming Using IPv6 Addressing Derived from AS Numbers," January 2003.
   http://www.ietf.org/internet-drafts/draft-savola-multi6-asn-pi-00.txt

[ISP-INT-AGGR] I. van Beijnum, "Provider-Internal Aggregation based on Geography to Support Multihoming in IPv6," June 30, 2003.
   http://www.ietf.org/internet-drafts/draft-van-beijnum-multi6-isp-int-aggr-01.txt

[HOSTS] C. Huitema, R. Draves, "Host-Centric IPv6 Multihoming," June 24, 2002.
   http://www.info.ucl.ac.be/people/delaunoi/paper/draft-huitema-multi6-hosts-01.txt

[EXPERIMENT] C. Huitema, D. Kessens, "Simple Dual Homing Experiment," June 20, 2003.
   http://www.ietf.org/internet-drafts/draft-huitema-multi6-experiment-00.txt

[MHAP] M. Py, "Multi Homing Aliasing Protocol (MHAP)," April 29, 2002.
   http://arneill-py.sacramento.ca.us/ipv6mh/draft-py-mhap-01a.txt

[HIP-ARCH] R. Moskowitz, P. Nikander, "Host Identity Protocol Architecture," April 2003.
   http://www.ietf.org/internet-drafts/draft-moskowitz-hip-arch-03

[HIP-MM] P. Nikander, J. Arkko, P. Jokela, "End-Host Mobility and Multi-Homing with Host Identity Protocol," June 17, 2003.
   http://www.ietf.org/internet-drafts/draft-nikander-hip-mm-00.txt

[SCTP-MULTIHOME] L. Coene, "Multihoming issues in the Stream Control Transmission Protocol," June 2003.
   http://www.ietf.org/internet-drafts/draft-coene-sctp-multihome-04.txt

[MULTI6-MNM] M. Bagnulo, A. Garcia-Martinez, I. Soto, "Application of the MIPv6 protocol to the multi-homing problem," February 25, 2003.
   http://www.ietf.org/internet-drafts/draft-bagnulo-multi6-mnm-00.txt

[2PI1A] I. van Beijnum, "Two Prefixes in One Address," May 11, 2003.
   http://www.muada.com/drafts/draft-van-beijnum-multi6-2pi1a.txt

[NAROS] C. de Launois, O. Bonaventure, "NAROS : Host-Centric IPv6 Multihoming with Traffic Engineering," May 15, 2003.
   http://www.ietf.org/internet-drafts/draft-de-launois-multi6-naros-00

[MULTIHOMED-ISPS] M. Ohta, "Multihomed ISPs and Policy Control," June 2003.
   http://www.ietf.org/internet-drafts/draft-ohta-multihomed-isps-00.txt

[E2E-MULTIHOMING] M. Ohta, "The Architecture of End to End Multihoming," June 2003.
   http://www.ietf.org/internet-drafts/draft-ohta-e2e-multihoming-05.txt

[ASSIGN-SELECT-E2E-MULTIHOME] K. Ohira, K. Ogata, A. Matsumoto, K. Fujikawa, Y. Okabe, "IPv6 Address Assignment and Route Selection for End-to-End Multihoming," June 30, 2003.
   http://www.ietf.org/internet-drafts/draft-ohira-assign-select-e2e-multihome-01.txt

[LIN6-MULTIHOME-API] Arifumi Matsumoto, Kenji Fujikawa, Yasuo Okabe, "Basic Socket API Extensions for LIN6 End-to-End Multihoming," 23 June 2003.
   http://www.ietf.org/internet-drafts/draft-arifumi-lin6-multihome-api-00.txt

[BGP-GROWTH] T. Bu, L. Gao and D. Towsley, "On characterizing BGP routing table growth," Proc.
   IEEE Global Telecommunications Conf. (GLOBECOM), pp. 2197-2201, Nov. 2002.