RFC 4271 - A Border Gateway Protocol 4 (BGP-4)



Network Working Group                                    Y. Rekhter, Ed.
Request for Comments: 4271                                    T. Li, Ed.
Obsoletes: 1771                                            S. Hares, Ed.
Category: Standards Track                                   January 2006

                  A Border Gateway Protocol 4 (BGP-4)

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document discusses the Border Gateway Protocol (BGP), which is
   an inter-Autonomous System routing protocol.

   The primary function of a BGP speaking system is to exchange network
   reachability information with other BGP systems.  This network
   reachability information includes information on the list of
   Autonomous Systems (ASes) that reachability information traverses.
   This information is sufficient for constructing a graph of AS
   connectivity for this reachability from which routing loops may be
   pruned, and, at the AS level, some policy decisions may be enforced.

   BGP-4 provides a set of mechanisms for supporting Classless Inter-
   Domain Routing (CIDR).  These mechanisms include support for
   advertising a set of destinations as an IP prefix, and eliminating
   the concept of network "class" within BGP.  BGP-4 also introduces
   mechanisms that allow aggregation of routes, including aggregation of
   AS paths.

   This document obsoletes RFC 1771.

Table of Contents

   1. Introduction
      1.1. Definition of Commonly Used Terms
      1.2. Specification of Requirements
   2. Acknowledgements
   3. Summary of Operation
      3.1. Routes: Advertisement and Storage
      3.2. Routing Information Base
   4. Message Formats
      4.1. Message Header Format
      4.2. OPEN Message Format
      4.3. UPDATE Message Format
      4.4. KEEPALIVE Message Format
      4.5. NOTIFICATION Message Format
   5. Path Attributes
      5.1. Path Attribute Usage
           5.1.1. ORIGIN
           5.1.2. AS_PATH
           5.1.3. NEXT_HOP
           5.1.4. MULTI_EXIT_DISC
           5.1.5. LOCAL_PREF
           5.1.6. ATOMIC_AGGREGATE
           5.1.7. AGGREGATOR
   6. BGP Error Handling
      6.1. Message Header Error Handling
      6.2. OPEN Message Error Handling
      6.3. UPDATE Message Error Handling
      6.4. NOTIFICATION Message Error Handling
      6.5. Hold Timer Expired Error Handling
      6.6. Finite State Machine Error Handling
      6.7. Cease
      6.8. BGP Connection Collision Detection
   7. BGP Version Negotiation
   8. BGP Finite State Machine (FSM)
      8.1. Events for the BGP FSM
           8.1.1. Optional Events Linked to Optional Session Attributes
           8.1.2. Administrative Events
           8.1.3. Timer Events
           8.1.4. TCP Connection-Based Events
           8.1.5. BGP Message-Based Events
      8.2. Description of FSM
           8.2.1. FSM Definition
                  8.2.1.1. Terms "active" and "passive"
                  8.2.1.2. FSM and Collision Detection
                  8.2.1.3. FSM and Optional Session Attributes
                  8.2.1.4. FSM Event Numbers
                  8.2.1.5. FSM Actions that are Implementation Dependent
           8.2.2. Finite State Machine
   9. UPDATE Message Handling
      9.1. Decision Process
           9.1.1. Phase 1: Calculation of Degree of Preference
           9.1.2. Phase 2: Route Selection
                  9.1.2.1. Route Resolvability Condition
                  9.1.2.2. Breaking Ties (Phase 2)
           9.1.3. Phase 3: Route Dissemination
           9.1.4. Overlapping Routes
      9.2. Update-Send Process
           9.2.1. Controlling Routing Traffic Overhead
                  9.2.1.1. Frequency of Route Advertisement
                  9.2.1.2. Frequency of Route Origination
           9.2.2. Efficient Organization of Routing Information
                  9.2.2.1. Information Reduction
                  9.2.2.2. Aggregating Routing Information
      9.3. Route Selection Criteria
      9.4. Originating BGP routes
   10. BGP Timers
   Appendix A.  Comparison with RFC 1771
   Appendix B.  Comparison with RFC 1267
   Appendix C.  Comparison with RFC 1163
   Appendix D.  Comparison with RFC 1105
   Appendix E.  TCP Options that May Be Used with BGP
   Appendix F.  Implementation Recommendations
                Appendix F.1.  Multiple Networks Per Message
                Appendix F.2.  Reducing Route Flapping
                Appendix F.3.  Path Attribute Ordering
                Appendix F.4.  AS_SET Sorting
                Appendix F.5.  Control Over Version Negotiation
                Appendix F.6.  Complex AS_PATH Aggregation
   Security Considerations
   IANA Considerations
   Normative References
   Informative References

1.  Introduction

   The Border Gateway Protocol (BGP) is an inter-Autonomous System
   routing protocol.

   The primary function of a BGP speaking system is to exchange network
   reachability information with other BGP systems.  This network
   reachability information includes information on the list of
   Autonomous Systems (ASes) that reachability information traverses.
   This information is sufficient for constructing a graph of AS
   connectivity for this reachability, from which routing loops may be
   pruned and, at the AS level, some policy decisions may be enforced.

   BGP-4 provides a set of mechanisms for supporting Classless Inter-
   Domain Routing (CIDR) [RFC1518, RFC1519].  These mechanisms include
   support for advertising a set of destinations as an IP prefix and
   eliminating the concept of network "class" within BGP.  BGP-4 also
   introduces mechanisms that allow aggregation of routes, including
   aggregation of AS paths.

   Routing information exchanged via BGP supports only the destination-
   based forwarding paradigm, which assumes that a router forwards a
   packet based solely on the destination address carried in the IP
   header of the packet.  This, in turn, reflects the set of policy
   decisions that can (and cannot) be enforced using BGP.  BGP can
   support only those policies conforming to the destination-based
   forwarding paradigm.

1.1.  Definition of Commonly Used Terms

   This section provides definitions for terms that have a specific
   meaning to the BGP protocol and that are used throughout the text.

   Adj-RIBs-In
      The Adj-RIBs-In contains unprocessed routing information that has
      been advertised to the local BGP speaker by its peers.

   Adj-RIBs-Out
      The Adj-RIBs-Out contains the routes for advertisement to specific
      peers by means of the local speaker's UPDATE messages.

   Autonomous System (AS)
      The classic definition of an Autonomous System is a set of routers
      under a single technical administration, using an interior gateway
      protocol (IGP) and common metrics to determine how to route
      packets within the AS, and using an inter-AS routing protocol to
      determine how to route packets to other ASes.  Since this classic
      definition was developed, it has become common for a single AS to
      use several IGPs and, sometimes, several sets of metrics within an
      AS.  The use of the term Autonomous System stresses the fact that,
      even when multiple IGPs and metrics are used, the administration
      of an AS appears to other ASes to have a single coherent interior
      routing plan, and presents a consistent picture of the
      destinations that are reachable through it.

   BGP Identifier
      A 4-octet unsigned integer that indicates the BGP Identifier of
      the sender of BGP messages.  A given BGP speaker sets the value of
      its BGP Identifier to an IP address assigned to that BGP speaker.
      The value of the BGP Identifier is determined upon startup and is
      the same for every local interface and BGP peer.

   BGP speaker
      A router that implements BGP.

   EBGP
      External BGP (BGP connection between external peers).

   External peer
      Peer that is in a different Autonomous System than the local
      system.

   Feasible route
      An advertised route that is available for use by the recipient.

   IBGP
      Internal BGP (BGP connection between internal peers).

   Internal peer
      Peer that is in the same Autonomous System as the local system.

   IGP
      Interior Gateway Protocol - a routing protocol used to exchange
      routing information among routers within a single Autonomous
      System.

   Loc-RIB
      The Loc-RIB contains the routes that have been selected by the
      local BGP speaker's Decision Process.

   NLRI
      Network Layer Reachability Information.

   Route
      A unit of information that pairs a set of destinations with the
      attributes of a path to those destinations.  The set of
      destinations are systems whose IP addresses are contained in one
      IP address prefix carried in the Network Layer Reachability
      Information (NLRI) field of an UPDATE message.  The path is the
      information reported in the path attributes field of the same
      UPDATE message.

   RIB
      Routing Information Base.

   Unfeasible route
      A previously advertised feasible route that is no longer available
      for use.

1.2.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Acknowledgements

   This document was originally published as [RFC1267] in October 1991,
   jointly authored by Kirk Lougheed and Yakov Rekhter.

   We would like to express our thanks to Guy Almes, Len Bosack, and
   Jeffrey C. Honig for their contributions to the earlier version
   (BGP-1) of this document.

   We would like to specially acknowledge numerous contributions by
   Dennis Ferguson to the earlier version of this document.

   We would like to explicitly thank Bob Braden for the review of the
   earlier version (BGP-2) of this document, and for his constructive
   and valuable comments.

   We would also like to thank Bob Hinden, Director for Routing of the
   Internet Engineering Steering Group, and the team of reviewers he
   assembled to review the earlier version (BGP-2) of this document.
   This team, consisting of Deborah Estrin, Milo Medin, John Moy, Radia
   Perlman, Martha Steenstrup, Mike St. Johns, and Paul Tsuchiya, acted
   with a strong combination of toughness, professionalism, and
   courtesy.

   Certain sections of the document borrowed heavily from IDRP
   [IS10747], which is the OSI counterpart of BGP.  For this, credit
   should be given to the ANSI X3S3.3 group chaired by Lyman Chapin and
   to Charles Kunzinger, who was the IDRP editor within that group.

   We would also like to thank Benjamin Abarbanel, Enke Chen, Edward
   Crabbe, Mike Craren, Vincent Gillet, Eric Gray, Jeffrey Haas, Dimitry
   Haskin, Stephen Kent, John Krawczyk, David LeRoy, Dan Massey,
   Jonathan Natale, Dan Pei, Mathew Richardson, John Scudder, John
   Stewart III, Dave Thaler, Paul Traina, Russ White, Curtis Villamizar,
   and Alex Zinin for their comments.

   We would like to specially acknowledge Andrew Lange for his help in
   preparing the final version of this document.

   Finally, we would like to thank all the members of the IDR Working
   Group for their ideas and the support they have given to this
   document.

3.  Summary of Operation

   The Border Gateway Protocol (BGP) is an inter-Autonomous System
   routing protocol.  It is built on experience gained with EGP (as
   defined in [RFC904]) and EGP usage in the NSFNET Backbone (as
   described in [RFC1092] and [RFC1093]).  For more BGP-related
   information, see [RFC1772], [RFC1930], [RFC1997], and [RFC2858].

   The primary function of a BGP speaking system is to exchange network
   reachability information with other BGP systems.  This network
   reachability information includes information on the list of
   Autonomous Systems (ASes) that reachability information traverses.
   This information is sufficient for constructing a graph of AS
   connectivity, from which routing loops may be pruned, and, at the AS
   level, some policy decisions may be enforced.
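
   As an illustrative sketch (not part of the protocol specification),
   the loop-pruning property of the AS list can be expressed as a
   membership test: a speaker that finds its own AS number in the list
   of ASes a route has traversed can discard that route.  The function
   names, the flat-list representation of the AS list, and the AS
   numbers below are assumptions made for this example only.

```python
def has_as_loop(as_path, local_as):
    """Return True if local_as already appears in as_path.

    as_path is a list of AS numbers the reachability information has
    traversed. The flat list is an assumption of this sketch, not the
    wire encoding.
    """
    return local_as in as_path


def accept_routes(candidates, local_as):
    """Keep only the (prefix, as_path) pairs that do not loop back."""
    return [(prefix, path) for prefix, path in candidates
            if not has_as_loop(path, local_as)]


# Example values only: the second route has already passed through the
# local AS (64496), so accepting it would create a loop.
routes = [
    ("192.0.2.0/24", [64500, 64501]),
    ("198.51.100.0/24", [64500, 64496, 64502]),
]
loop_free = accept_routes(routes, local_as=64496)
```

   With these inputs, only the 192.0.2.0/24 route remains in loop_free.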

   In the context of this document, we assume that a BGP speaker
   advertises to its peers only those routes that it uses itself (in
   this context, a BGP speaker is said to "use" a BGP route if it is the
   most preferred BGP route and is used in forwarding).  All other cases
   are outside the scope of this document.

   In the context of this document, the term "IP address" refers to an
   IP Version 4 address [RFC791].

   Routing information exchanged via BGP supports only the destination-
   based forwarding paradigm, which assumes that a router forwards a
   packet based solely on the destination address carried in the IP
   header of the packet.  This, in turn, reflects the set of policy
   decisions that can (and cannot) be enforced using BGP.  Note that
   some policies cannot be supported by the destination-based forwarding
   paradigm, and thus require techniques such as source routing (aka
   explicit routing) to be enforced.  Such policies cannot be enforced
   using BGP either.  For example, BGP does not enable one AS to send
   traffic to a neighboring AS for forwarding to some destination
   (reachable through but) beyond that neighboring AS, intending that
   the traffic take a different route to that taken by the traffic
   originating in the neighboring AS (for that same destination).  On
   the other hand, BGP can support any policy conforming to the
   destination-based forwarding paradigm.

   BGP-4 provides a new set of mechanisms for supporting Classless
   Inter-Domain Routing (CIDR) [RFC1518, RFC1519].  These mechanisms
   include support for advertising a set of destinations as an IP prefix
   and eliminating the concept of a network "class" within BGP.  BGP-4
   also introduces mechanisms that allow aggregation of routes,
   including aggregation of AS paths.
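
   The prefix arithmetic behind these mechanisms can be illustrated with
   Python's standard ipaddress module.  This is a sketch of CIDR prefix
   handling only; it does not model BGP path attributes or the
   aggregation of AS paths, and the addresses are example values.

```python
import ipaddress

# One prefix stands for a whole set of destination addresses, with no
# reference to a network "class".
prefix = ipaddress.ip_network("192.0.2.0/24")
covers = ipaddress.ip_address("192.0.2.77") in prefix

# Two adjacent /25 prefixes can be aggregated into a single /24.
halves = [
    ipaddress.ip_network("198.51.100.0/25"),
    ipaddress.ip_network("198.51.100.128/25"),
]
aggregate = list(ipaddress.collapse_addresses(halves))
```

   Here covers is True, and aggregate holds the single prefix
   198.51.100.0/24.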

   This document uses the term "Autonomous System" (AS) throughout.  The
   classic definition of an Autonomous System is a set of routers under
   a single technical administration, using an interior gateway protocol
   (IGP) and common metrics to determine how to route packets within the
   AS, and using an inter-AS routing protocol to determine how to route
   packets to other ASes.  Since this classic definition was developed,
   it has become common for a single AS to use several IGPs and,
   sometimes, several sets of metrics within an AS.  The use of the term
   Autonomous System stresses the fact that, even when multiple IGPs and
   metrics are used, the administration of an AS appears to other ASes
   to have a single coherent interior routing plan and presents a
   consistent picture of the destinations that are reachable through it.

   BGP uses TCP [RFC793] as its transport protocol.  This eliminates the
   need to implement explicit update fragmentation, retransmission,
   acknowledgement, and sequencing.  BGP listens on TCP port 179.  The
   error notification mechanism used in BGP assumes that TCP supports a
   "graceful" close (i.e., that all outstanding data will be delivered
   before the connection is closed).

   A TCP connection is formed between two systems.  They exchange
   messages to open and confirm the connection parameters.

   The initial data flow is the portion of the BGP routing table that is
   allowed by the export policy, called the Adj-RIBs-Out (see 3.2).
   Incremental updates are sent as the routing tables change.  BGP does
   not require a periodic refresh of the routing table.  To allow local
   policy changes to have the correct effect without resetting any BGP
   connections, a BGP speaker SHOULD either (a) retain the current
   version of the routes advertised to it by all of its peers for the
   duration of the connection, or (b) make use of the Route Refresh
   extension [RFC2918].

   KEEPALIVE messages may be sent periodically to ensure that the
   connection is live.  NOTIFICATION messages are sent in response to
   errors or special conditions.  If a connection encounters an error
   condition, a NOTIFICATION message is sent and the connection is
   closed.

   A peer in a different AS is referred to as an external peer, while a
   peer in the same AS is referred to as an internal peer.  Internal BGP
   and external BGP are commonly abbreviated as IBGP and EBGP.

   If a particular AS has multiple BGP speakers and is providing transit
   service for other ASes, then care must be taken to ensure a
   consistent view of routing within the AS.  A consistent view of the
   interior routes of the AS is provided by the IGP used within the AS.
   For the purpose of this document, it is assumed that a consistent
   view of the routes exterior to the AS is provided by having all BGP
   speakers within the AS maintain IBGP with each other.

   This document specifies the base behavior of the BGP protocol.  This
   behavior can be, and is, modified by extension specifications.  When
   the protocol is extended, the new behavior is fully documented in the
   extension specifications.

3.1.  Routes: Advertisement and Storage

   For the purpose of this protocol, a route is defined as a unit of
   information that pairs a set of destinations with the attributes of a
   path to those destinations.  The set of destinations are systems
   whose IP addresses are contained in one IP address prefix that is
   carried in the Network Layer Reachability Information (NLRI) field of
   an UPDATE message, and the path is the information reported in the
   path attributes field of the same UPDATE message.

   Routes are advertised between BGP speakers in UPDATE messages.
   Multiple routes that have the same path attributes can be advertised
   in a single UPDATE message by including multiple prefixes in the NLRI
   field of the UPDATE message.
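
   One way a sender might exploit this is to group its routes by their
   path attributes before building messages.  The following is a
   minimal sketch under stated assumptions: the function name is
   hypothetical, attributes are modeled as hashable tuples rather than
   in their wire encoding, and the maximum message size is not
   enforced.

```python
from collections import defaultdict


def group_by_attributes(routes):
    """Group prefixes that share identical path attributes.

    routes is an iterable of (prefix, attributes) pairs, where
    attributes is a hashable value (a tuple here). Each returned pair
    corresponds to one hypothetical UPDATE: a single attribute set plus
    the list of prefixes that would go in its NLRI field.
    """
    groups = defaultdict(list)
    for prefix, attributes in routes:
        groups[attributes].append(prefix)
    return list(groups.items())


# Example values only.
attrs_a = (("AS_PATH", (64500, 64501)), ("NEXT_HOP", "192.0.2.1"))
attrs_b = (("AS_PATH", (64500,)), ("NEXT_HOP", "192.0.2.1"))
updates = group_by_attributes([
    ("198.51.100.0/24", attrs_a),
    ("203.0.113.0/24", attrs_a),
    ("192.0.2.0/24", attrs_b),
])
```

   The three routes collapse into two groups: the two prefixes sharing
   attrs_a would travel in one UPDATE message, and the remaining prefix
   in another.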

   Routes are stored in the Routing Information Bases (RIBs): namely,
   the Adj-RIBs-In, the Loc-RIB, and the Adj-RIBs-Out, as described in
   Section 3.2.

   If a BGP speaker chooses to advertise a previously received route, it
   MAY add to, or modify, the path attributes of the route before
   advertising it to a peer.

   BGP provides mechanisms by which a BGP speaker can inform its peers
   that a previously advertised route is no longer available for use.
   There are three methods by which a given BGP speaker can indicate
   that a route has been withdrawn from service:

      a) the IP prefix that expresses the destination for a previously
         advertised route can be advertised in the WITHDRAWN ROUTES
         field in the UPDATE message, thus marking the associated route
         as being no longer available for use,

      b) a replacement route with the same NLRI can be advertised, or

      c) the BGP speaker connection can be closed, which implicitly
         removes all routes the pair of speakers had advertised to each
         other from service.

   Changing the attribute(s) of a route is accomplished by advertising a
   replacement route.  The replacement route carries new (changed)
   attributes and has the same address prefix as the original route.
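
   From the receiver's point of view, the three withdrawal methods and
   the replacement rule above can be sketched as operations on a
   per-peer table keyed by address prefix.  The class and method names
   are hypothetical, and the dictionary of attributes stands in for the
   path attributes of an UPDATE message.

```python
class PeerRoutes:
    """Routes learned from one peer, keyed by prefix (illustrative)."""

    def __init__(self):
        self.routes = {}  # prefix -> path attributes

    def advertise(self, prefix, attributes):
        # Method (b): a route with the same prefix replaces the old
        # one, which is also how a route's attributes are changed.
        self.routes[prefix] = attributes

    def withdraw(self, prefix):
        # Method (a): the prefix was listed in WITHDRAWN ROUTES.
        self.routes.pop(prefix, None)

    def connection_closed(self):
        # Method (c): closing the connection removes every route.
        self.routes.clear()


# Example values only.
peer = PeerRoutes()
peer.advertise("192.0.2.0/24", {"AS_PATH": [64500]})
peer.advertise("198.51.100.0/24", {"AS_PATH": [64500, 64501]})
peer.advertise("192.0.2.0/24", {"AS_PATH": [64502, 64500]})  # replaces
after_replace = dict(peer.routes)
peer.withdraw("198.51.100.0/24")
after_withdraw = dict(peer.routes)
peer.connection_closed()
```

   Keying the table by prefix makes the replacement implicit: no
   separate withdrawal is needed before the new attributes take effect.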

3.2.  Routing Information Base

   The Routing Information Base (RIB) within a BGP speaker consists of
   three distinct parts:

      a) Adj-RIBs-In: The Adj-RIBs-In stores routing information learned
         from inbound UPDATE messages that were received from other BGP
         speakers.  Their contents represent routes that are available
         as input to the Decision Process.

      b) Loc-RIB: The Loc-RIB contains the local routing information the
         BGP speaker selected by applying its local policies to the
         routing information contained in its Adj-RIBs-In.  These are
         the routes that will be used by the local BGP speaker.  The
         next hop for each of these routes MUST be resolvable via the
         local BGP speaker's Routing Table.

      c) Adj-RIBs-Out: The Adj-RIBs-Out stores information the local BGP
         speaker selected for advertisement to its peers.  The routing
         information stored in the Adj-RIBs-Out will be carried in the
         local BGP speaker's UPDATE messages and advertised to its
         peers.

   In summary, the Adj-RIBs-In contains unprocessed routing information
   that has been advertised to the local BGP speaker by its peers; the
   Loc-RIB contains the routes that have been selected by the local BGP
   speaker's Decision Process; and the Adj-RIBs-Out organizes the routes
   for advertisement to specific peers (by means of the local speaker's
   UPDATE messages).

   Although the conceptual model distinguishes between Adj-RIBs-In,
   Loc-RIB, and Adj-RIBs-Out, this neither implies nor requires that an
   implementation must maintain three separate copies of the routing
   information.  The choice of implementation (for example, 3 copies of
   the information vs 1 copy with pointers) is not constrained by the
   protocol.
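
   As a sketch of the "1 copy with pointers" alternative, the three
   conceptual RIBs can be presented as views over a single list of
   stored routes.  All names below are hypothetical, and the sketch
   makes simplifying assumptions: setting a flag stands in for the
   Decision Process, adding a peer name stands in for export policy,
   and a route is advertised with the same attributes it was received
   with.

```python
from dataclasses import dataclass, field


@dataclass
class RouteEntry:
    """One stored route plus markers for the RIB views it belongs to."""
    prefix: str
    attributes: dict
    learned_from: str            # peer that advertised the route
    in_loc_rib: bool = False     # selected by the Decision Process
    advertise_to: set = field(default_factory=set)  # outbound peers


class SingleCopyRib:
    """Single route store exposing the three conceptual RIBs as views."""

    def __init__(self):
        self.entries = []

    def adj_rib_in(self, peer):
        return [e for e in self.entries if e.learned_from == peer]

    def loc_rib(self):
        return [e for e in self.entries if e.in_loc_rib]

    def adj_rib_out(self, peer):
        return [e for e in self.entries if peer in e.advertise_to]


# Example values only.
rib = SingleCopyRib()
entry = RouteEntry("192.0.2.0/24", {"AS_PATH": [64500]},
                   learned_from="peer-a")
rib.entries.append(entry)
entry.in_loc_rib = True           # stand-in for the Decision Process
entry.advertise_to.add("peer-b")  # stand-in for export policy
```

   The same RouteEntry object is returned by all three views, so no
   routing information is duplicated.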

   Routing information that the BGP speaker uses to forward packets (or
   to construct the forwarding table used for packet forwarding) is
   maintained in the Routing Table.  The Routing Table accumulates
   routes to directly connected networks, static routes, routes learned
   from the IGP protocols, and routes learned from BGP.  Whether a
   specific BGP route should be installed in the Routing Table, and
   whether a BGP route should override a route to the same destination
   installed by another source, is a local policy decision, and is not
   specified in this document.  In addition to actual packet forwarding,
   the Routing Table is used for resolution of the next-hop addresses
   specified in BGP updates (see Section 5.1.3).

4.  Message Formats

   This section describes message formats used by BGP.

   BGP messages are sent over TCP connections.  A message is processed
   only after it is entirely received.  The maximum message size is 4096
   octets.  All implementations are required to support this maximum
   message size.  The smallest message that may be sent consists of a
   BGP header without a data portion (19 octets).

   All multi-octet fields are in network byte order.

4.1.  Message Header Format

   Each message has a fixed-size header.  There may or may not be a data
   portion following the header, depending on the message type.  The
   layout of these fields is shown below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +                                                               +
      |                                                               |
      +                                                               +
      |                           Marker                              |
      +                                                               +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Length               |      Type     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Marker:

         This 16-octet field is included for compatibility; it MUST be
         set to all ones.

      Length:

         This 2-octet unsigned integer indicates the total length of the
         message, including the header, in octets.  Thus, it allows one
         to locate the (Marker field of the) next message in the TCP
         stream.  The value of the Length field MUST always be at least
         19 and no greater than 4096, and MAY be further constrained,
         depending on the message type.  Padding of extra data after
         the message is not allowed.  Therefore, the Length field MUST
         have the smallest value required, given the rest of the
         message.

      Type:

         This 1-octet unsigned integer indicates the type code of the
         message.  This document defines the following type codes:

                              1 - OPEN
                              2 - UPDATE
                              3 - NOTIFICATION
                              4 - KEEPALIVE

         [RFC2918] defines one more type code.
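
   The header layout above can be exercised with Python's standard
   struct module.  This is a minimal sketch, not a reference
   implementation: the function and constant names are assumptions, and
   raising ValueError is a stand-in for the error handling that the
   protocol defines elsewhere in this document.

```python
import struct

HEADER_LEN = 19          # 16-octet Marker + 2-octet Length + 1-octet Type
MAX_MESSAGE_LEN = 4096
MARKER = b"\xff" * 16    # all ones
TYPE_NAMES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def build_header(msg_type, body=b""):
    """Return a header for body; Length covers header plus body."""
    length = HEADER_LEN + len(body)
    if length > MAX_MESSAGE_LEN:
        raise ValueError("message exceeds 4096 octets")
    # "!" selects network byte order with no padding between fields.
    return struct.pack("!16sHB", MARKER, length, msg_type)


def parse_header(data):
    """Return (length, type) from the first 19 octets of data."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 octets")
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != MARKER:
        raise ValueError("marker is not all ones")
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError("length out of range")
    return length, msg_type


# A header-only message with type code 4 (the smallest legal message).
keepalive = build_header(4)
length, msg_type = parse_header(keepalive)
```

   Because Length counts the whole message, a reader of the TCP stream
   can use the returned length to find where the next message begins.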

4.2.  OPEN Message Format

   After a TCP connection is established, the first message sent by each
   side is an OPEN message.  If the OPEN message is acceptable, a
   KEEPALIVE message confirming the OPEN is sent back.

   In addition to the fixed-size BGP header, the OPEN message contains
   the following fields:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+
       |    Version    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     My Autonomous System      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           Hold Time           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         BGP Identifier                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Opt Parm Len  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |             Optional Parameters (variable)                    |
       |