PANRG                                                         K. Meynell
Internet-Draft                                         SCION Association
Intended status: Informational                           J. García Pardo
Expires: 25 January 2025                                      T. Zäschke
                                                              ETH Zürich
                                                           N. Rustignoli
                                                       SCION Association
                                                            24 July 2024


                        SCION Research Questions
            draft-meynell-panrg-scion-research-questions-01

Abstract

   This draft summarizes the SCION-related questions that are deemed
   important to answer for SCION to work properly at Internet scale.
   The set of questions is not yet comprehensive; it is intended to
   initiate a discussion on which questions should remain here and
   which ones are missing.  Where appropriate, example solutions are
   provided so that the community can determine whether they are
   adequate.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://scionassociation.github.io/scion-research_I-D/draft-meynell-panrg-scion-research-questions.html.
   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-meynell-panrg-scion-research-questions/.

   Discussion of this document takes place on the Path Aware
   Networking Research Group (PANRG) mailing list
   (mailto:panrg@irtf.org), which is archived at
   https://datatracker.ietf.org/rg/panrg.  Subscribe at
   https://www.ietf.org/mailman/listinfo/panrg/.

   Source for this draft and an issue tracker can be found at
   https://github.com/scionassociation/scion-research_I-D.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

Meynell, et al.          Expires 25 January 2025                [Page 1]

Internet-Draft             scion-research_I-D                  July 2024

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 25 January 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Discovery, Distribution, and Trustworthiness of Path Properties
     3.1.  ISD, AS Identity
     3.2.  Beacon Selection Policies
     3.3.  Name Resolution and DNS Service Binding (SVCB)
     3.4.  Segment Dissemination
     3.5.  Periodic Beacon Propagation
     3.6.  Beacon Optimization and Extensibility
     3.7.  DRKey
     3.8.  SCMP Authentication
     3.9.  Proof of Transit
     3.10. NAT
   4.  Data Plane Stability
     4.1.  Link Load Balancing
     4.2.  Reverse Path Refreshment
       4.2.1.  Proposed Solutions (not comprehensive)
   5.  Interfacing SCION with Existing Technologies
   6.
       Implications of Path Awareness for the Transport and
       Application Layers
   7.  Naming
   8.  Security Considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Acknowledgments
   Authors' Addresses

1.  Introduction

   SCION is an inter-domain network architecture.  Its core components,
   as deployed by some of its early adopters, are specified in
   [I-D.dekater-scion-dataplane], [I-D.dekater-scion-pki], and
   [I-D.dekater-scion-controlplane], which are currently under ISE
   review.

   The goal of this draft is to explore how SCION and its early
   deployment experiences can help address open research questions in
   [RFC9217].  Specifically, there are still many open areas of
   research around path-aware networking to which SCION, with its early
   deployment experiences and research efforts, can contribute.  This
   can also be a starting point for discussions around long-term
   protocol evolution.

   This draft assumes the reader is familiar with some of the
   fundamental concepts defined in the components specification.

   *Note:* This is a very early version of the SCION research questions
   draft, and it merely contains a selection of potential topics to be
   further discussed in this draft.  Any feedback is welcome and much
   appreciated.  Thanks!

2.
Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Discovery, Distribution, and Trustworthiness of Path Properties

3.1.  ISD, AS Identity

   In the SCION specification, identifiers for ISDs and ASes are 16
   bits and 48 bits long respectively.  Whilst 48 bits for the AS will
   accommodate up to 2.81475e14 assignments, which is likely to be more
   than sufficient for the future, using 16 bits for the ISD only
   offers 65,536 possible assignments.  Further investigation is
   required on whether this is sufficient.

   The following questions arise (not comprehensive):

   *  How many ASes do we expect in the SCION network model?

   *  Can one AS belong to many ISDs?

   *  Are AS numbers globally unique, or only unique within each ISD?

   *  How many ISDs do we expect?

   *  What is the ontology of an ISD?

      -  Per geographic area?

      -  Per legal jurisdiction?

      -  Per capacity tier?

      -  Per "type"? (meaning: any grouping that makes sense to their
         members)

      -  Possible combinations of any previous "types"?

   *  Note that, to achieve global connectivity, ISDs require core
      links to other ISDs.  This reduces the number of ISDs to those
      that have core ASes that can directly connect to core ASes in
      other ISDs.  The number of ISDs is still super-exponential
      asymptotically.

3.2.  Beacon Selection Policies

   Path discovery between SCION ASes relies on Path Construction
   Beacons (PCBs).  In core beaconing, PCBs are propagated
   omnidirectionally, unless this would cause a loop or exceed the
   maximum _path length_.  A core AS selects a small number of paths
   to/from other core ASes, based on its beacon selection policy.
   It then propagates valid and policy-compliant paths to neighbor
   ASes.  The number of propagated beacons is limited to the _best PCBs
   set size_, so that the number of paths (and beacons) does not grow
   exponentially with core network size.

   Some potential questions are:

   *  Limiting the number of beacons to a _best PCBs set size_ per AS
      results in only a partial view of the network being available to
      endpoints.  To what extent does this affect endpoints' path-
      selection possibilities?

   *  What is a good, practical, general-purpose policy that can
      fulfill the conflicting requirements of both operators and
      endpoints, as highlighted in section 2.7 of [RFC9217]?

   *  What are the desirable path properties (e.g. diversity, PCB
      expiration time, how recently the same PCB was last forwarded)?

3.3.  Name Resolution and DNS Service Binding (SVCB)

   In current deployments, SCION addresses are added to TXT records.

   Example of current entry:

   $ dig +short ethz.ch TXT | grep scion
   "x-sciondiscovery=8041"
   "scion=64-2:0:9,129.132.230.98"

   Section 14.3 of the DNS Service Binding specification [RFC9460]
   allows a dedicated SCION Service Parameter to be specified.  Service
   Parameters allow the specification of alternative IP addresses or
   other parameters (such as ISD/AS numbers) for a given URL.  This
   would be more elegant than using DNS TXT records.

   With SVCB, this may look like this for https:

   $ dig +short https ethz.ch
   1 . alpn="h2" scion=64-2:0:9,129.132.230.98

   SVCB is also planned to be supported by Happy Eyeballs v3
   [I-D.draft-pauly-v6ops-happy-eyeballs-v3-01].

3.4.  Segment Dissemination

   A path look-up in SCION works similarly to a recursive DNS query
   (section 4 of [I-D.dekater-scion-controlplane]):

   *  The source endpoint queries the control service in its own AS.

   *  The local AS already has at least one segment to a core AS of its
      local ISD.  It uses this segment to query that core AS's control
      service for segments to core ASes of the remote ISD.

   *  The local AS now also has segments from its local ISD core ASes
      to core ASes in the remote ISD.  It queries the remote ISD's core
      ASes for segments to the destination AS.

   *  The local AS returns all these segments to the endpoint, to be
      combined.

   Control services may return a large number of path segments for some
   queries.  This can cost considerable bandwidth and, at the same
   time, overload clients with an unnecessarily large number of
   segments.

   *  This problem may be more acute in ASes with many endpoints (e.g.
      IoT), or for resource-constrained endpoints.

   *  Getting a full path to a remote endpoint may require three
      queries to control services.

   There are multiple possible and independent solution steps here:

   *  Predefine some policies that can be resolved by the control
      plane, e.g. ANY, BEST_LATENCY, BEST_BANDWIDTH, BEST_PRICE,
      BEST_CO2.  For these, the control plane could simply calculate
      5-10 compliant paths and return them.  Moreover, these could be
      cached for commonly requested remote ASes.  If a user requires a
      custom policy, they can still resort to requesting actual
      segments.

   *  Caching of paths.  An open question is when a control service
      should return a cached path, versus performing a recursive path
      lookup.

   An example including the number of core segments between different
   ISDs as of 2024-07-12 in the SCION production network is shown in
   Table 1.
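
   The predefined policies suggested in this section could be resolved
   by a control service roughly as in the following Python sketch.  It
   is only an illustration under stated assumptions: the Path record,
   its latency_ms and bandwidth_mbps fields, the select_paths function,
   and all AS identifiers other than 64-2:0:9 are hypothetical and are
   not part of any SCION specification or implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    # Subset of the policy names mentioned in the text.
    ANY = "any"
    BEST_LATENCY = "best_latency"
    BEST_BANDWIDTH = "best_bandwidth"


@dataclass(frozen=True)
class Path:
    # Hypothetical path summary; field names are illustrative only.
    hops: tuple
    latency_ms: float
    bandwidth_mbps: float


def select_paths(paths, policy, limit=5):
    """Return at most `limit` paths ranked according to `policy`."""
    if policy is Policy.BEST_LATENCY:
        ranked = sorted(paths, key=lambda p: p.latency_ms)
    elif policy is Policy.BEST_BANDWIDTH:
        ranked = sorted(paths, key=lambda p: p.bandwidth_mbps, reverse=True)
    else:
        # Policy.ANY: keep the order in which the paths were supplied.
        ranked = list(paths)
    return ranked[:limit]


# Made-up candidate paths between two ISDs (assumed metadata values).
candidates = [
    Path(("64-2:0:9", "64-0:0:1", "65-0:0:2"), 42.0, 1000.0),
    Path(("64-2:0:9", "64-0:0:3", "65-0:0:2"), 18.5, 400.0),
    Path(("64-2:0:9", "64-0:0:4", "65-0:0:2"), 30.0, 2500.0),
]
best = select_paths(candidates, Policy.BEST_LATENCY, limit=2)
```

   Returning a short ranked list instead of all segments would keep
   responses small for constrained endpoints, while a custom policy
   could still fall back to requesting the raw segments.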

              +=========+=========+===================+
              | src ISD | dst ISD | segments returned |
              +=========+=========+===================+
              | 64      | 64      | 337               |
              +---------+---------+-------------------+
              | 64      | 65      | 240               |
              +---------+---------+-------------------+
              | 64      | 67      | 60                |
              +---------+---------+-------------------+

                  Table 1: Core segment count examples

3.5.  Periodic Beacon Propagation

   The SCION control plane protocol specifies that beacons should be
   propagated periodically.  Is this really necessary?

   *  For path freshness, only the initial AS emitting the PCB needs to
      originate beacons periodically, and others can disseminate them
      immediately.

   *  In response to link failures or the availability of new paths,
      beacon services can react instantly.

   If periodic propagation is necessary neither for path freshness nor
   to respond to link failures, it would only be used for the discovery
   of new paths at each interval, enhancing scalability and path
   diversity.

3.6.  Beacon Optimization and Extensibility

   Communication requirements vary according to source, destination,
   and application.  Satisfying all these requirements requires either
   discovering all paths in the network, or optimizing the creation of
   paths during the beaconing process.  Selecting the 5 shortest paths
   per destination for each beaconing period may not satisfy all the
   requirements that different applications, on different endpoints and
   in different ASes, will have.

   The beacon selection process, the criteria and metrics that beacons
   carry, and the adaptability of all of these have a strong impact on
   the traffic engineering of individual ASes, and of inter-domain
   communication as a whole.  See question 2.7 of [RFC9217].

   *  What optimization functions should be applied to beacons and what
      metrics should be considered when propagating them?
      Is the set of properties composed of path length, peering ASes,
      path disjointness, PCB last reception, and path lifetime enough?

   *  How do we extend the metrics with new dimensions, such as
      bandwidth, latency, geo-position, etc.?

   *  Who should select these functions?

   *  How should the outcome of these functions be verified?

   *  How can multiple functions be applied concurrently, for different
      source and destination applications?

   *  How should end-ASes express their desired requirements to the
      inter-domain control plane?

   *  How do these requirements translate into concrete optimization
      functions?

   *  What would standardization of the functions look like?

   *  Functions may change over time:

      -  How can optimization functions adapt to incorporate these
         changes?

      -  How can fast adaptation of optimization functions be achieved?

   *  See also: IREC [TABAEIAGHDAEI2023]

3.7.  DRKey

   DRKey is a key distribution system that scales well with the number
   of endpoints in the network.  It relies on two things:

   *  Two sides of a key: a fast side and a slow side, sometimes called
      the fast and slow side of the derivation.  The ability to derive
      a key very quickly on the fast side is necessary for most of its
      use cases.

   *  A grouping of endpoints (such as ASes): The pieces necessary to
      derive a key, namely the L1 keys, are communicated to a keystore
      at each grouping (e.g., a keystore per AS).

   The questions related to DRKey are the following (not
   comprehensive):

   *  Should any authorization that is currently carried out in the
      data plane be moved to the control plane?  This could include
      e.g. authorization to deliver traffic depending on the source,
      but also information like port numbers/ranges per source, etc.

      -  Could this obsolete firewalls?  What else would be necessary?
      -  What do we mean when we say "authorize transit"?

   *  Could perfect forward secrecy DRKey be useful, and should we
      research it?

      -  What would be the trust model?  Do end-users trust their ASes
         to ephemerally create personal keys?

      -  What would be the attacker model?

      -  Which use cases are relevant?

   For more info: [I-D.garciapardo-panrg-drkey].

3.8.  SCMP Authentication

   In SCION, endpoints are responsible for selecting alternate paths in
   case of failure.  One approach to detecting failures is to rely on
   signals from the network, such as SCMP (SCION Control Message
   Protocol) messages.  These signals must be authenticated in order to
   be trusted by endpoints.  This reflects a question raised in section
   2.7 of [RFC9217].

   One option is to use DRKey as the mechanism to derive the
   authentication key, where the fast side would be on the
   infrastructure side (e.g. the border router in the case of an
   interface down type of message), and the slow side would be on the
   intended endpoint destination for that SCMP message (e.g. the
   endpoint receiving the SCMP interface down message).

   Another option is to leverage the control-plane PKI.

   However, we have identified a number of possible issues (not
   comprehensive):

   *  Denial of Service/Capability Attacks: If an endpoint receives
      (too) many SCMP messages, it will need (too) many resources just
      to authenticate their origin.  Such messages could be sent merely
      to exhaust the endpoint's processing capacity.

   *  Sending an SCMP message might in certain cases be an
      amplification factor: If a border router sends an SCMP message
      (e.g. interface down) in all cases, even in response to small
      packets, there is the risk of that border router sending a lot of
      traffic to a possibly unintended recipient, e.g. when the packet
      is not source validated.  In addition, validation may trigger
      additional requests for keys.

3.9.
Proof of Transit

   See FABRID [KRAHENBUHL2023] and EPIC [LEGNER2020].

3.10.  NAT

   Currently, the SCION implementation is not compatible out-of-the-box
   with NAT'ed devices, regardless of whether these devices are
   end-hosts or are themselves running SCION services.  This is because
   the NAT mechanism modifies the (UDP-IP) underlay, but not the
   internal destination SCION address.  Although this does not concern
   the SCION protocols themselves, we want to check that this will not
   be a problem.

   Critically, the SCION header needs to contain the SRC address as
   seen by the border router so that the border router can forward
   incoming response packets to the correct NAT device and port.

   Possible solutions:

   *  With IPv6 as an intra-domain protocol, this problem disappears.

   *  Introduce a mechanism so that the SCION border router can report
      the NATed address to an endpoint (similar to a STUN server).

4.  Data Plane Stability

4.1.  Link Load Balancing

   Links may get overloaded because the SCION distributed path
   selection process fails to distribute load properly over different
   links, resulting in uneven utilization.  There are several potential
   approaches to relieve an overloaded link:

   *  Introduce explicit congestion notifications from routers to
      endpoints.

   *  Optimize the path lookup process so that the control plane hands
      out paths that optimize load balancing.

4.2.  Reverse Path Refreshment

   When a client and server communicate, return traffic should usually
   follow the same path in reverse.  If the server uses that path for a
   long period of time, the path will eventually expire.  How should
   the server determine the path for return traffic, and at which
   layer?  How can this be prevented from becoming a vector for path
   hijacking?

   There are some relevant points for the discussion:

   *  In order to send data back to the client, the server needs to
      store the path locally (analogous to storing the client's 4-tuple
      in a TCP/IP scenario).

   *  More generally, if multiple paths are used to contact the server,
      which one of those would be used to reply?  Should this be the
      responsibility of the transport protocol, as in the case of
      QUIC-MP [I-D.ietf-quic-multipath]?

   *  How long before path expiration should the client and server
      still use a path?  The client can choose from paths that will not
      expire in a short period of time, but it does not control for how
      long the server will attempt to use the path (i.e. how long it
      will take the server to send the complete response).

4.2.1.  Proposed Solutions (not comprehensive)

   *  The server MUST ask the control plane for a path, regardless of
      the client's policy.

   *  The client SHOULD (somehow) send a new packet with a new path,
      prompting the server to use this path from now on.

   *  The client and server agree, via a path policy specification, on
      which kinds of paths are okay for the server to use.  This
      solution implies a standard specification of this path policy.

   *  Leave the solution to the application and transport layers.
      Transport protocols may require keep-alive messages, or already
      support multiple paths.  Applications should know for how long
      they are willing to read a response from a server.  With this
      knowledge, these two layers can easily determine when to send a
      new path (analogous to connection migration in QUIC [RFC9000]),
      so that the server is instructed to use it for the next replies.
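
   The server-side bookkeeping discussed in this section (storing a
   reverse path per client, adopting a fresher path when the client
   sends one, and checking whether a path outlives the response) can be
   sketched in Python as follows.  This is a minimal illustration, not
   a proposed mechanism: ReversePath, ReversePathTable, the
   safety_margin_s parameter, and the string path values are all
   hypothetical, and the sketch deliberately omits the authentication
   that would be needed to prevent path hijacking.

```python
import time
from dataclasses import dataclass


@dataclass
class ReversePath:
    # Hypothetical record; a real path would carry hop fields, not a string.
    raw_path: str
    expires_at: float  # absolute expiry time, in seconds since the epoch


class ReversePathTable:
    """Keeps the longest-lived known reverse path per client."""

    def __init__(self, safety_margin_s=10.0):
        self.safety_margin_s = safety_margin_s
        self._paths = {}

    def on_packet(self, client, path):
        # Adopt the incoming packet's path only if it outlives the stored one.
        current = self._paths.get(client)
        if current is None or path.expires_at > current.expires_at:
            self._paths[client] = path

    def usable_path(self, client, send_duration_s, now=None):
        # Return the stored path if it stays valid for the whole send plus
        # the safety margin; otherwise return None.
        now = time.time() if now is None else now
        path = self._paths.get(client)
        if path is None:
            return None
        if path.expires_at - now < send_duration_s + self.safety_margin_s:
            return None
        return path


table = ReversePathTable(safety_margin_s=10.0)
table.on_packet("client-a", ReversePath("path-1", expires_at=1000.0))
early = table.usable_path("client-a", send_duration_s=30.0, now=900.0)
late = table.usable_path("client-a", send_duration_s=30.0, now=970.0)
table.on_packet("client-a", ReversePath("path-2", expires_at=2000.0))
refreshed = table.usable_path("client-a", send_duration_s=30.0, now=970.0)
```

   When usable_path returns None, the server would have to fall back on
   one of the options listed above, for example asking the control
   plane for a path or waiting for the client to send a packet over a
   fresher one.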

   There are some nuances: Usually, the server's API will store the
   initial address of the client, to be used throughout the session.

   A related question: how close to its expiration should a path still
   be used?

   Is reverse path refresh a relevant problem?

   *  Contra: It is probably rare that a server needs to send data for
      a long time without the application layer protocol requiring the
      client to ever answer back.

   *  Pro: The client may happen to have an old-ish path.  If we can't
      refresh, the client always needs to consider whether a path is
      valid "long enough", which might only be possible to guess.

   *  Contra: Sending keep-alives sounds like a connection-based
      protocol.  It also means we need to figure out when to stop
      sending keep-alives.

   *  Contra: It may be better to solve this in the application layer
      or in the overlay protocol, where we know more about the
      potential length of the session, whether this is a singular
      request/answer type of exchange, or whether more frequent
      keep-alives are required anyway.

5.  Interfacing SCION with Existing Technologies

   The questions posed here are:

   *  What existing protocols/solutions should be compatible with SCION
      simultaneously?  How can ISPs offer traditional IP side by side
      with SCION?

   *  Could a future evolution of the SCION specification better reuse
      existing technologies?  This refers to the possibility of
      slightly changing an existing technology (e.g. IPv6) so that it
      can be used as part of SCION, replacing part of the ad-hoc
      specification that we have for SCION.

   *  What effort would be required to make them work?  This refers to
      ranking the effort according to the different types of parties
      involved: ISPs, vendors, application developers, etc.

   There are several existing protocols/technologies/solutions that may
   work for this purpose:

   *  IPv6 in the Data Plane.  Use an IPv6 routing header as specified
      in section 4.4 of [RFC8200].

   *  SCION IP Gateway.
      See section 3 of [I-D.rustignoli-panrg-scion-components]

   *  SCION-IP translation [SCIONIPTRLN]

   *  How can we interface with QUIC Multipath
      [I-D.ietf-quic-multipath]?

   *  How can we interface with, and what is the relationship to, TAPS
      [TAPS]?

6.  Implications of Path Awareness for the Transport and Application
    Layers

   This question relates to 2.5 in [RFC9217].

   *  How to express transport preferences and map them to SCION path
      properties?

7.  Naming

   To be discussed

8.  Security Considerations

   TODO Security

9.  IANA Considerations

   This document has no IANA actions.

10.  References

10.1.  Normative References

   [I-D.dekater-scion-controlplane]
              de Kater, C., Rustignoli, N., and S. Hitz, "SCION Control
              Plane", Work in Progress, Internet-Draft,
              draft-dekater-scion-controlplane-05, 21 July 2024.

   [I-D.dekater-scion-dataplane]
              de Kater, C., Rustignoli, N., and S. Hitz, "SCION Data
              Plane", Work in Progress, Internet-Draft,
              draft-dekater-scion-dataplane-02, 8 July 2024.

   [I-D.dekater-scion-pki]
              de Kater, C., Rustignoli, N., and S. Hitz, "SCION
              Control-Plane PKI", Work in Progress, Internet-Draft,
              draft-dekater-scion-pki-06, 8 July 2024.

   [I-D.ietf-quic-multipath]
              Liu, Y., Ma, Y., De Coninck, Q., Bonaventure, O.,
              Huitema, C., and M. Kühlewind, "Multipath Extension for
              QUIC", Work in Progress, Internet-Draft,
              draft-ietf-quic-multipath-10, 8 July 2024.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC9000]  Iyengar, J., Ed. and M.
              Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure
              Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021.

   [RFC9217]  Trammell, B., "Current Open Questions in Path-Aware
              Networking", RFC 9217, DOI 10.17487/RFC9217, March 2022.

   [TAPS]     IETF, "Transport Services Working Group", 2024.

10.2.  Informative References

   [I-D.draft-pauly-v6ops-happy-eyeballs-v3-01]
              Pauly, T., Schinazi, D., Jaju, N., and K. Ishibashi,
              "Happy Eyeballs Version 3: Better Connectivity Using
              Concurrency", Work in Progress, Internet-Draft,
              draft-pauly-v6ops-happy-eyeballs-v3-01, 4 March 2024.

   [I-D.garciapardo-panrg-drkey]
              de los Galanes, J. A., Krähenbühl, C., Rothenberger, B.,
              and A. Perrig, "Dynamically Recreatable Keys", Work in
              Progress, Internet-Draft,
              draft-garciapardo-panrg-drkey-03, 24 July 2022.

   [I-D.rustignoli-panrg-scion-components]
              Rustignoli, N. and C. de Kater, "SCION Components
              Analysis", Work in Progress, Internet-Draft,
              draft-rustignoli-panrg-scion-components-03, 10 September
              2023.

   [KRAHENBUHL2023]
              Krähenbühl, C., Wyss, M., Basin, D., Lenders, V., Perrig,
              A., and M. Strohmeier, "FABRID: Flexible
              Attestation-Based Routing for Inter-Domain Networks",
              2023.

   [LEGNER2020]
              Legner, M., Klenze, T., Wyss, M., Sprenger, C., and A.
              Perrig, "EPIC: Every Packet Is Checked in the Data Plane
              of a Path-Aware Internet", 2020.

   [RFC9460]  Schwartz, B., Bishop, M., and E. Nygren, "Service Binding
              and Parameter Specification via the DNS (SVCB and HTTPS
              Resource Records)", RFC 9460, DOI 10.17487/RFC9460,
              November 2023.

   [SCIONIPTRLN]
              Schulz, L., Gallrein, F., and D. Hausheer, "Unlocking
              Path Awareness for Legacy Applications through SCION-IP
              Translation in eBPF", ACM, Proceedings of the SIGCOMM
              Workshop on eBPF and Kernel Extensions,
              DOI 10.1145/3672197.3673437, August 2024.

   [TABAEIAGHDAEI2023]
              Tabaeiaghdaei, S., Wyss, M., Giuliari, G., van Bommel,
              J., Zehmakan, A. N., and A. Perrig, "Inter-domain Routing
              with Extensible Criteria", 2023.

Acknowledgments

   Many thanks go to Matthias Frei (SCION Association) and Seyedali
   Tabaeiaghdaei (ETH Zurich) for reviewing and contributing to this
   document.

Authors' Addresses

   Kevin Meynell
   SCION Association
   Email: kme@scion.org

   Juan A. García Pardo Giménez de los Galanes
   ETH Zürich
   Email: juan.garcia@inf.ethz.ch

   Tilmann Zäschke
   ETH Zürich
   Email: tilmann.zaeschke@inf.ethz.ch

   Nicola Rustignoli
   SCION Association
   Email: nic@scion.org