<?xml version='1.0' encoding='utf-8'?>

<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-ietf-dnsop-dns-error-reporting-08" number="9567" obsoletes="" updates="" xml:lang="en" category="std" consensus="true" submissionType="IETF" tocInclude="true" sortRefs="true" symRefs="true" version="3">

  <front>
    <title abbrev="DNS Error Reporting">DNS Error Reporting</title>
    <seriesInfo name="RFC" value="9567"/>
    <author initials="R." surname="Arends" fullname="Roy Arends">
      <organization>ICANN</organization>
      <address>
        <email>roy.arends@icann.org</email>
      </address>
    </author>
    <author initials="M." surname="Larson" fullname="Matt Larson">
      <organization>ICANN</organization>
      <address>
        <email>matt.larson@icann.org</email>
      </address>
    </author>
    <date year="2024" month="April"/>
    <area>OPS</area>
    <workgroup>dnsop</workgroup>

    <abstract>
      <t>DNS error reporting is a lightweight reporting mechanism that provides the operator of an authoritative server with reports on DNS resource records that fail to resolve or validate. A domain owner or DNS hosting organization can use these reports to improve domain hosting. The reports are based on extended DNS errors as described in RFC 8914.</t>
      <t>When a domain name fails to resolve or validate due to a misconfiguration or an attack, the operator of the authoritative server may be unaware of this. To mitigate this lack of feedback, this document describes a method for a validating resolver to automatically signal an error to a monitoring agent specified by the authoritative server. The error is encoded in the QNAME; thus, the very act of sending the query is to report the error.</t>
    </abstract>
  </front>
  <middle>
    <section anchor="introduction">
      <name>Introduction</name>
      <t>When an authoritative server serves a stale DNSSEC-signed zone, the cryptographic signatures over the resource record sets (RRsets) may have lapsed. A validating  resolver will fail to validate these resource records.</t>
      <t>Similarly, when there is a mismatch between the Delegation Signer (DS) records at a parent zone and the key signing key at the child zone, a validating resolver will fail to authenticate records in the child zone.</t>
      <t>These are two of several failure scenarios that may go unnoticed for some time by the operator of a zone.</t>
      <t>Today, there is no direct relationship between operators of validating resolvers and authoritative servers. Outages are often noticed indirectly by end users and reported via email or social media (if reported at all).</t>
      <t>When records fail to validate, there is no facility to report this failure in an automated way. If there is any indication that an error or warning has happened, it may be buried in log files of the resolver or not logged at all.</t>
      <t>This document describes a method that can be used by validating resolvers to report DNSSEC validation errors in an automated way.</t>

      <t>It allows an authoritative server to announce a monitoring agent to which validating resolvers can report issues if those resolvers are configured to do so.</t>
      <t>The burden to report a failure falls on the validating resolver. It is important that the effort needed to report failure is low, with minimal impact to its main functions. To accomplish this goal, the DNS itself is utilized to report the error.</t>
    </section>
    <section anchor="requirements-notation">
      <name>Requirements Notation</name>
        <t>
    The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>",
    "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>",
    "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>",
    "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
    "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be
    interpreted as described in BCP&nbsp;14 <xref target="RFC2119"/> <xref
    target="RFC8174"/> when, and only when, they appear in all capitals, as
    shown here.
        </t>
    </section>
    <section anchor="terminology">
      <name>Terminology</name>

      <t>This document uses DNS terminology defined in BCP 219 <xref target="RFC9499"/>. This document also defines and uses the following terms:</t>
      <dl>
        <dt>Reporting resolver:</dt>
        <dd>
          <t>A validating resolver that supports DNS error reporting.</t>
        </dd>
        <dt>Report query:</dt>
        <dd>
          <t>The DNS query used to report an error. A report query is for a DNS TXT resource record type. The content of the error report is encoded in the QNAME of a DNS request to the monitoring agent.</t>
        </dd>
        <dt>Monitoring agent:</dt>
        <dd>
          <t>An authoritative server that receives and responds to report queries. This facility is indicated by a domain name, referred to as the "agent domain".</t>
        </dd>
        <dt>Agent domain:</dt>
        <dd>
          <t>A domain name that is returned in the EDNS0 Report-Channel option and indicates where DNS resolvers can send error reports.</t>
        </dd>
      </dl>
    </section>
    <section anchor="overview">
      <name>Overview</name>
      <t>An authoritative server indicates support for DNS error reporting by including an EDNS0 Report-Channel option with OPTION-CODE 18 and the agent domain in the response. The agent domain is a fully qualified, uncompressed domain name in DNS wire format. The authoritative server <bcp14>MUST NOT</bcp14> include this option in the response if the configured agent domain is empty or is the null label (which would indicate the DNS root).</t>
      <t>The authoritative server includes the EDNS0 Report-Channel option unsolicited. That is, the option is included in a response despite the EDNS0 Report-Channel option being absent in the request.</t>

      <t>If the authoritative server has indicated support for DNS error reporting and there is an issue that can be reported via extended DNS errors, the reporting resolver encodes the error report in the QNAME of the report query. The reporting resolver builds this QNAME by concatenating the "_er" label, the QTYPE, the QNAME that resulted in failure, the extended DNS error code (as described in <xref target="RFC8914"/>), the label "_er" again, and the agent domain. See the example in <xref target="example"/> and the specification in <xref target="constructing-the-report-query"/>. Note that a regular RCODE is not included because the RCODE is not relevant to the extended DNS error code.</t>
      <t>The resulting report query is sent as a standard DNS query for a TXT DNS resource record type by the reporting resolver.</t>

      <t>The report query will ultimately arrive at the monitoring agent. A response is returned by the monitoring agent, which in turn can be cached by the reporting resolver. This caching is essential. It dampens the number of report queries sent by a reporting resolver for the same problem (that is, with caching, one report query per TTL is sent). However, certain optimizations, such as those described in <xref target="RFC8020"/> and <xref target="RFC8198"/>, may reduce the number of error report queries as well.</t>
      <t>This document gives no guidance on the content of the RDATA in the TXT resource
 record.</t>
      <section anchor="example">
        <name>Example</name>

        <t>A query for "broken.test.", type A, is sent by a reporting resolver.</t>
        <t>The domain "test." is hosted on a set of authoritative servers. One of these authoritative servers serves a stale version of the "test." zone. This authoritative server has an agent domain configured as "a01.agent-domain.example.".</t>
        <t>The authoritative server with the stale "test." zone receives the request for "broken.test.". It returns a response that includes the EDNS0 Report-Channel option with the domain name "a01.agent-domain.example.".</t>
        <t>The reporting resolver is unable to validate the "broken.test." RRset for type A (an RR type with value 1), due to an RRSIG record with an expired signature.</t>

	<t>The reporting resolver constructs the QNAME "_er.1.broken.test.7._er.a01.agent-domain.example." and resolves it. This QNAME indicates extended DNS error 7 occurred while trying to validate "broken.test." for a type A (an RR type with value 1) record.
</t>
        <t>When this query is received at the monitoring agent (the operators of the authoritative server for "a01.agent-domain.example."), the agent can determine the "test." zone contained an expired signature record (extended DNS error 7) for type A for the domain name "broken.test.". The monitoring agent can contact the operators of "test." to fix the issue.</t>
      </section>
    </section>
    <section anchor="edns0-option-specification">
      <name>EDNS0 Option Specification</name>
      <t>This method uses an EDNS0 <xref target="RFC6891"/> option to indicate the agent domain in DNS responses. The option is structured as follows:</t>
      <artwork><![CDATA[
                     1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1                     
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        OPTION-CODE = 18       |       OPTION-LENGTH           |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
/                         AGENT DOMAIN                          /
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
]]></artwork>
      <t>Field definition details:</t>
      <dl>
        <dt>OPTION-CODE:</dt>
        <dd>
          <t>2 octets; an EDNS0 code that is used in an EDNS0 option to indicate support for error reporting.  The name for this EDNS0 option code is Report-Channel.</t>
        </dd>
        <dt>OPTION-LENGTH:</dt>
        <dd>
          <t>2 octets; contains the length of the AGENT DOMAIN field in octets.</t>
        </dd>
        <dt>AGENT DOMAIN:</dt>
        <dd>
          <t>A fully qualified domain name <xref target="RFC9499"/> in uncompressed DNS wire format.</t>
        </dd>
      </dl>
    </section>
    <section anchor="dns-error-reporting-specification">
      <name>DNS Error Reporting Specification</name>
      <t>The various errors that a reporting resolver may encounter are listed in <xref target="RFC8914"/>. Note that not all listed errors may be supported by the reporting resolver. This document does not specify what is or is not an error.</t>
      <t>The DNS class is not specified in the error report.</t>
      <section anchor="reporting-resolver-specification">
        <name>Reporting Resolver Specification</name>

	<t>Care should be taken when additional DNS resolution is needed to resolve the QNAME that contains the error report. This resolution itself could trigger another error report to be created.
	A maximum expense or depth limit <bcp14>MUST</bcp14> be used to prevent
cascading errors.</t>
        <t>The EDNS0 Report-Channel option <bcp14>MUST NOT</bcp14> be included in queries.</t>

        <t>The reporting resolver <bcp14>MUST NOT</bcp14> use DNS error reporting if the authoritative server returned an empty AGENT DOMAIN field in the EDNS0 Report-Channel option.</t>

	<t>For the monitoring agent to gain more confidence that the report is not spoofed, the reporting resolver <bcp14>SHOULD</bcp14> send error reports over TCP
<xref target="RFC7766"/> or other connection-oriented protocols or <bcp14>SHOULD</bcp14> use DNS Cookies <xref target="RFC7873"/>.  This makes it harder to falsify the source address.</t>
        <t>A reporting resolver <bcp14>MUST</bcp14> validate responses received from the monitoring agent. There is no special treatment for responses to error-reporting queries. <xref target="security-considerations"/> ("Security Considerations") contains the rationale behind this.</t>
        <section anchor="constructing-the-report-query">
          <name>Constructing the Report Query</name>

          <t>The QNAME for the report query is constructed by concatenating the following elements:</t>
          <ul spacing="normal">
            <li>
              <t>A label containing the string "_er".</t>
            </li>
            <li>
              <t>The QTYPE that was used in the query that resulted in the extended DNS error, presented as a decimal value, in a single DNS label. If additional QTYPEs were present in the query, such as described in <xref target="I-D.ietf-dnssd-multi-qtypes"/>, they are represented as unique, ordered decimal values separated by a hyphen. As an example, if both QTYPE A and AAAA were present in the query, they are presented as the label "1-28".</t>
            </li>
            <li>
              <t>The list of non-null labels representing the query name that is the subject of the DNS error report.</t>
            </li>
            <li>
              <t>The extended DNS error code, presented as a decimal value, in a single DNS label.</t>
            </li>
            <li>
              <t>A label containing the string "_er".</t>
            </li>
            <li>
              <t>The agent domain. The agent domain as received in the EDNS0 Report-Channel option set by the authoritative server.</t>
            </li>
          </ul>

          <t>If the QNAME of the report query exceeds 255 octets, it
          <bcp14>MUST NOT</bcp14> be sent.</t>
          <t>The "_er" labels allow the monitoring agent to differentiate
          between the agent domain and the faulty query name. When the
          specified agent domain is empty, or is a null label (despite being
          not allowed in this specification), the report query will have "_er"
          as a top-level domain, and not the top-level domain from the query
          name that was the subject of this error report.  The purpose of the
          first "_er" label is to indicate that a complete report query has
          been received instead of a shorter report query due to query
          minimization.</t>
        </section>
      </section>
      <section anchor="authoritative-server-specification">
        <name>Authoritative Server Specification</name>
        <t>The authoritative server <bcp14>MUST NOT</bcp14> include more than one EDNS0 Report-Channel option in a response.</t>
        <t>The authoritative server includes the EDNS0 Report-Channel option unsolicited in responses. There is no requirement that the EDNS0 Report-Channel option be present in queries.</t>
      </section>
      <section anchor="monitoring-agent-specification">
        <name>Monitoring Agent Specification</name>

        <t>It is <bcp14>RECOMMENDED</bcp14> that the authoritative server for the agent domain reply with a positive response (i.e., not with NODATA or NXDOMAIN) containing a TXT record.</t>
        <t>The monitoring agent <bcp14>SHOULD</bcp14> respond to queries received over UDP that have no DNS Cookie set with a response that has the truncation bit (TC bit) set to challenge the resolver to requery over TCP.</t>
      </section>
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>IANA has assigned the following in the "DNS EDNS0 Option Codes (OPT)" registry:</t>

<table anchor="iana1"> 
  <name></name>    
  <thead>
    <tr>
      <th>Value</th>    
      <th>Name</th>
      <th>Status</th>
      <th>Reference</th>
    </tr>
  </thead>
  <tbody>          
    <tr>
      <td>18</td>
      <td>Report-Channel</td>
      <td>Standard</td>
      <td>RFC 9567</td>
    </tr>
  </tbody>
</table>
      
      <t>IANA has assigned the following in the "Underscored and Globally Scoped DNS Node Names" registry:</t>
<table anchor="iana2">
  <name></name>
  <thead>
    <tr>
      <th>RR Type</th>
      <th>_NODE NAME</th>
      <th>Reference</th>
    </tr>
  </thead>
  <tbody>          
    <tr>
      <td>TXT</td>
      <td>_er</td>
      <td>RFC 9567</td>
    </tr>
  </tbody>
</table>
      
    </section>
    <section anchor="operational-considerations">
      <name>Operational Considerations</name>
      <section anchor="choosing-an-agent-domain">
        <name>Choosing an Agent Domain</name>
        <t>It is <bcp14>RECOMMENDED</bcp14> that the agent domain be kept relatively short to allow for a longer QNAME in the report query. The agent domain <bcp14>MUST NOT</bcp14> be a subdomain of the domain it is reporting on. That is, if the authoritative server hosts the foo.example domain, then its agent domain <bcp14>MUST NOT</bcp14> end in foo.example.</t>
      </section>
      <section anchor="managing-caching-optimizations">
        <name>Managing Caching Optimizations</name>
        <t>The reporting resolver may utilize various caching optimizations that inhibit subsequent error reporting to the same monitoring agent.</t>
        <t>If the monitoring agent were to respond with NXDOMAIN (name error), <xref target="RFC8020"/> states that any name at or below that domain should be considered unreachable, and negative caching would prohibit subsequent queries for anything at or below that domain for a period of time, depending on the negative TTL <xref target="RFC2308"/>.</t>
        <t>Since the monitoring agent may not know the contents of all the zones for which it acts as a monitoring agent, the monitoring agent <bcp14>MUST NOT</bcp14> respond with NXDOMAIN for domains it is monitoring because that could inhibit subsequent queries. One method to avoid NXDOMAIN is to use a wildcard domain name <xref target="RFC4592"/> in the zone for the agent domain.</t>
        <t>When the agent domain is signed, a resolver may use aggressive negative caching (described in <xref target="RFC8198"/>). This optimization makes use of NSEC and NSEC3 (without opt-out) records and allows the resolver to do the wildcard synthesis. When this happens, the resolver does not send subsequent queries because it will be able to synthesize a response from previously cached material.</t>
        <t>A solution is to avoid DNSSEC for the agent domain. Signing the agent domain will incur an additional burden on the reporting resolver, as it has to validate the response. However, this response has no utility to the reporting resolver other than dampening the query load for error reports.</t>
      </section>
    </section>
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>Use of DNS error reporting may expose local configuration mistakes in the reporting resolver, such as stale DNSSEC trust anchors, to the monitoring agent.</t>
      <t>DNS error reporting <bcp14>SHOULD</bcp14> be done using DNS query name minimization <xref target="RFC9156"/> to improve privacy.</t>
      <t>DNS error reporting is done without any authentication between the reporting resolver and the authoritative server of the agent domain.</t>
      <t>Resolvers that send error reports <bcp14>SHOULD</bcp14> send them over TCP <xref target="RFC7766"/> or <bcp14>SHOULD</bcp14> use DNS Cookies <xref target="RFC7873"/>. This makes it hard to falsify the source address. The monitoring agent <bcp14>SHOULD</bcp14> respond to queries received over UDP that have no DNS Cookie set with a response that has the truncation bit (TC bit) set to challenge the resolver to requery over TCP.</t>
      <t>Well-known addresses of reporting resolvers can provide a higher level of confidence in the error reports and potentially enable more automated processing of these reports.</t>

      <t>Monitoring agents that receive error reports over UDP should consider that the source of the reports and the reports themselves may be false.</t>
      <t>The method described in this document will cause additional queries by the reporting resolver to authoritative servers in order to resolve the report query.</t>
      <t>This method can be abused by intentionally deploying broken zones with agent domains that are delegated to victims.  This is particularly effective when DNS requests that trigger error messages are sent through
open resolvers <xref target="RFC9499"/> or widely distributed network monitoring systems that perform distributed queries from around the globe.</t>
      <t>An adversary may create massive error report flooding to camouflage an attack.</t>

      <t>Though this document gives no guidance on the content of the RDATA in
      the TXT resource record, if the RDATA content is logged, the monitoring agent
      <bcp14>MUST</bcp14> assume the content can be malicious and take
      appropriate measures to avoid exploitation. One such method could be to
      log in hexadecimal. This would avoid remote code execution through
      logging string attacks, such as the vulnerability described in <xref target="CVE-2021-44228"/>.</t>
      <t>The rationale behind mandating DNSSEC validation for responses from a reporting agent, even if the agent domain is proposed to remain unsigned, is to mitigate the risk of a downgrade attack orchestrated by adversaries. In such an attack, a victim's legitimately signed domain could be deceptively advertised as an agent domain by malicious actors. Consequently, if the validating resolver treats it as unsigned, it is exposed to potential cache poisoning attacks. By enforcing DNSSEC validation, this vulnerability is preemptively addressed.</t>
    </section>
  </middle>
  <back>
    <displayreference target="I-D.ietf-dnssd-multi-qtypes" to="MULTI-QTYPES"/>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
      </references>
      <references>
        <name>Informative References</name>
        <reference anchor="CVE-2021-44228" target="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44228">
          <front>
            <title>CVE-2021-44228</title>
            <author>
              <organization>CVE</organization>
            </author>
            <date year="2021" month="November" day="26"/>
          </front>
        </reference>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8914.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9499.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8020.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8198.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6891.xml"/>

<!-- [I-D.ietf-dnssd-multi-qtypes] IESG state: I-D Exists as of 4/16/24 -->

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-dnssd-multi-qtypes.xml"/>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2308.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4592.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9156.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7766.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7873.xml"/>
      </references>
    </references>
    <section anchor="acknowledgements" numbered="false">
      <name>Acknowledgements</name>
     <t>This document is based on an idea by <contact fullname="Roy Arends"/> and <contact fullname="David Conrad"/>. The authors would like to thank <contact fullname="Peter van Dijk"/>, <contact fullname="Stephane Bortzmeyer"/>, <contact fullname="Shane Kerr"/>, <contact fullname="Vladimir Cunat"/>, <contact fullname="Paul Hoffman"/>, <contact fullname="Philip Homburg"/>, <contact fullname="Mark Andrews"/>, <contact fullname="Libor Peltan"/>, <contact fullname="Matthijs Mekking"/>, <contact fullname="Willem Toorop"/>, <contact fullname="Tom Carpay"/>, <contact fullname="Dick Franks"/>, <contact fullname="Ben Schwartz"/>, <contact fullname="Yaron Sheffer"/>, <contact fullname="Viktor Dukhovni"/>, <contact fullname="Wes Hardaker"/>, <contact fullname="James Gannon"/>, <contact fullname="Tim Wicinski"/>, <contact fullname="Warren Kumari"/>, <contact fullname="Gorry Fairhurst"/>, <contact fullname="Benno Overeinder"/>, <contact fullname="Paul Wouters"/>, and <contact fullname="Petr Spacek"/> for their contributions.</t>
    </section>
  </back>

</rfc>
