IMRG IPMP Review Team Report
----------------------------

September 2004

The Internet Measurement Research Group (within the IRTF) convened a
small team to review the materials related to the IP Measurement
Protocol (IPMP). The members of the group (listed at the end of this
report) discussed IPMP and several larger issues. In particular, the
team reviewed the following two Internet-Drafts:

    draft-mcgregor-ipmp-03.txt
    draft-bennett-ippm-ipmp-01.txt

The goal of this effort was to chart a strawman course for moving
forward with some sort of measurement protocol (if possible).

Note: This message represents the group's consensus. However, that
does not mean that each member of the team agrees with each point in
this note. The group reached rough agreement, not unanimity.

The following are the high-order bits from the discussion.

The fundamental challenge that measurement protocols attempt to
address is to give researchers and operators fine-grained information
about the network characteristics they want to understand, in a
lightweight fashion. To this end, we suggest that the IPMP effort
should aim to develop tools that are:

  - implementable in reasonable timeframes on existing equipment,
    which means that they should not depend on ASIC development or
    new equipment purchases

  - deployable; ISPs would ideally want them, and at minimum not turn
    them off

  - useful to the ISPs in terms of their business rules and the
    questions they ask about their own networks

If the procedures or protocols are useful to the ISPs, one can expect
that they will be willing to collect the data, and may, under
appropriate rules, also allow researchers to collect data or share
collected data with them.

In the above context, the team found the motivation for IPMP given in
both documents to be lacking --- to the point where the team did not
feel that the current proposals are viable. Several related and
supporting points were discussed:

  * From the perspective of a vendor developing equipment and
    protocols or an ISP deploying them, the IPMP proposals on the
    table do not look viable. The fundamental goal of IPMP is to
    expose the structure of a network and many of its fine-scale
    characteristics. This is information that a service provider does
    not share with anyone else except - maybe - under NDA. Given that
    the protocols to obtain the information are fairly complex and
    involve a fair number of memory writes, a vendor will implement
    them if and only if its ISP customers ask for them, and the ISPs
    are not asking.

  * Making a better ping or traceroute is, on the one hand, too
    narrow and mechanistic a focus, and yet also too focused on what
    researchers might find compelling rather than on what operators
    would.

  * A tool to reverse-engineer a network is not needed by the ISPs.
    They already know the structure of their own networks.

That said, the team *strongly* believes that there is much room for
improvement in the state of network troubleshooting and debugging.
In particular:

  * Some service providers are asking for a solution to a problem
    that may yield data researchers would find valuable. Within its
    own network, a service provider is generally interested in
    locating the links that introduce variability. It may view them
    as under-provisioned for the offered load, as inappropriately
    routed, or whatever, but it is in fact interested in locating
    links that require upgrading in some form.

  * Some service providers are asking (in TIA and related fora) how
    they can deploy SLAs that cross ISP boundaries. These may be
    among ISPs that form business coalitions, such as Teleglobe has
    tried to set up with its transit network customers, or among
    regional networks such as US RBOCs that view transitive SLAs as a
    rational approach. The watchword in such consortia is "trust but
    verify"; it is in their interest to have a procedure or protocol
    that will allow them to isolate issues that may prevent them from
    meeting SLA guarantees in something resembling real time. Since
    those SLAs are one-way, this means accurate one-way delay and
    jitter measurements host to host, POP to POP, or CPE to CPE (a
    sketch of such a measurement follows this list).
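To make that measurement concrete, here is a minimal sketch in C of
the standard smoothed interarrival-jitter estimator, in the style of
RFC 3550. The names and units are illustrative assumptions on our
part; nothing here comes from the IPMP drafts. The useful property is
that the unknown clock offset between the two hosts cancels when
consecutive transit times are differenced, so jitter, unlike one-way
delay itself, does not require synchronized clocks.

    #include <stdint.h>

    /* One probe as seen by the receiver: the sender's timestamp
       carried in the packet, plus the receiver's own arrival
       timestamp. The units must match (e.g. microseconds); the
       epochs need not. */
    struct probe {
        uint32_t send_ts;
        uint32_t recv_ts;
    };

    /* RFC 3550-style smoothed jitter update. Each transit time
       (recv_ts - send_ts) includes an unknown clock offset, but the
       offset cancels in the difference of consecutive transits. */
    static double update_jitter(double jitter,
                                const struct probe *prev,
                                const struct probe *cur)
    {
        int32_t d = (int32_t)((cur->recv_ts - cur->send_ts)
                            - (prev->recv_ts - prev->send_ts));
        if (d < 0)
            d = -d;
        return jitter + ((double)d - jitter) / 16.0;
    }

One-way delay itself is harder: it requires the endpoints' clocks to
be synchronized (e.g. via GPS or a well-disciplined NTP hierarchy),
which is one reason a consortium of cooperating ISPs is the natural
setting for such measurements.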
In addition, in looking at the protocols themselves, we found
ourselves wondering how much could be learned by clever inference
from fairly simple data collection and black-box measurement, as
opposed to explicit reporting of values. For example, we note that
the intention of such procedures as CalTech's FAST and MIT's XCP
protocols is to detect and measure variable delays in the network and
to cause traffic to be sent in such a way as to maximize throughput
while minimizing such delays. This fundamental question is a direct
corollary to that raised in
http://www.nwfusion.com/research/2002/1216isptestside1.html, and to
that raised in the context of transitive Tier 2 network SLAs. These
operators would like to be able to identify the existence of an SLA
failure or other disturbance in the Force on a route, report its
magnitude, and isolate the disturbing device. To that end, we wonder
what can be done with the numbers measured by Dina Katabi's XCP
protocol.

Finally, the team wondered if a protocol that carries less global
information but more precision would be more deployable. For example,
if the stamps just consisted of an opaque ID, a TTL, and a simple
32-bit counter running on "the most stable local frequency source",
then the ISP (with the engineering documentation for its own gear)
could use database techniques to compute everything carried by the
current protocol. The stamps are simple enough that we can, with a
straight face, ask for them in multiple places within one box: input
and output framers, bus DMA engines, etc. (a sketch of such a stamp
follows below). We can envision that this would be an extremely
valuable tool for an ISP to understand (and diagnose) certain QoS
properties of its own network.

Note that globally parsable metadata in the stamps probably has
negative value to most ISPs because it reduces an ISP's ability to
keep its assets private. The barrier to deployment is not so much the
cost of the implementation as the indirect cost of leaking
proprietary topology information. At the same time, external
researchers could use inference techniques to get some of the same
information, including most dynamic properties such as queue depths.
External users get much less topology information, unless they make
an explicit arrangement with the ISP to obtain the annotations
associated with the opaque IDs.
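As a rough illustration of how small such a stamp could be, here is a
sketch in C. The field names, widths, and layout are our assumptions
for the sake of illustration; neither draft defines such a format.

    #include <stdint.h>

    /* A minimal 12-byte stamp appended at each stamping point. The
       ID is opaque to everyone but the ISP, whose engineering
       database maps it to a specific interface, framer, or DMA
       engine. */
    struct ipmp_stamp {
        uint32_t id;        /* opaque identifier of the stamping
                               point */
        uint8_t  ttl;       /* packet's TTL when the stamp was
                               written */
        uint8_t  pad[3];    /* alignment padding */
        uint32_t counter;   /* free-running 32-bit counter driven by
                               the most stable local frequency
                               source; no global epoch or frequency
                               is implied */
    };

Because the counter has no globally interpretable units and the ID is
opaque, an outside observer learns little topology from the stamps
themselves; the ISP joins them against its own engineering
documentation offline, which is exactly the privacy property argued
for above.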
In summary, the team came to two points of consensus:

  1) the protocol is inadequately motivated by the proposals, even
     though ISPs would like to be able to measure their own and their
     neighbors' networks;

  2) the protocol's complexity and intrusiveness are inadequately
     justified with respect to other, potentially more lightweight
     approaches that may be easier to deploy.

The main point is that to get a protocol deployed, ISPs need to ask
for it loudly enough and router vendors need to be able to implement
it easily enough, and neither is argued by these proposals.

Review team members:

  Guy Almes (Internet2)
  Fred Baker (Cisco)
  Paul Barford (UWisc)
  Christophe Diot (Intel Research)
  Ralph Droms (Cisco)
  Larry Dunn (Cisco)
  Matt Mathis (PSC)
  David Moore (CAIDA)
  Jennifer Rexford (AT&T Research)
  Neil Spring (Univ. of Washington)

Scribe / team shepherd: Mark Allman (ICIR)