6-min read

How to Read DMARC XML Reports Without Losing Your Mind

Raw DMARC XML reports look chaotic. Here’s how to parse them by hand, what to look for, and when to stop and use a platform instead.

01

Introduction

DMARC XML reports are machine-readable on purpose. Trying to read them by hand is a known route to giving up on DMARC entirely. But early in a rollout — or when debugging a specific issue — you'll occasionally need to open one and find an answer. This article is the survival guide: structure, the fields that matter, and the smell tests that get you to a verdict fast.

02

Why this topic matters

A raw aggregate report from Google for a busy domain can be hundreds of records long. Reading every field is a waste of time; knowing which six to look at is the difference between "I can't deal with this" and "the answer is sender X, fix Y." Once you've learned the shortcuts, manual parsing is reasonable for one-off investigations.

For everything else, a platform reads them automatically.

03

The shape of an XML report

Three sections, in order:

  1. report_metadata: who sent it (Google, Yahoo, etc.) and the time window.
  2. policy_published: what your record looked like to them. Confirm this matches what you actually published; a mismatch usually means the record changed during the period or the receiver saw a cached copy.
  3. record entries: one per source IP and result combination, so a single IP can appear more than once. This is the data.

Each record has three sub-blocks: row (volume + disposition), identifiers (the From domain seen), and auth_results (raw SPF/DKIM outcomes).
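
As a rough illustration of that layout, here is a minimal Python sketch that pulls the three sections out of a report. The sample XML is made up and heavily trimmed, and the sketch assumes the report carries no XML namespace (some reporters add one, which would change the lookups).

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down report; real ones carry more fields and records.
SAMPLE = """<feedback>
  <report_metadata>
    <org_name>google.com</org_name>
    <date_range><begin>1735689600</begin><end>1735776000</end></date_range>
  </report_metadata>
  <policy_published>
    <domain>yourdomain.com</domain>
    <p>none</p>
  </policy_published>
  <record>
    <row>
      <source_ip>198.51.100.42</source_ip>
      <count>12</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
    <identifiers><header_from>yourdomain.com</header_from></identifiers>
    <auth_results>
      <dkim><domain>yourdomain.com</domain><result>pass</result></dkim>
      <spf><domain>bounce.example.net</domain><result>pass</result></spf>
    </auth_results>
  </record>
</feedback>"""


def summarize(xml_text):
    """Return the reporter, time window, published policy, and record rows."""
    root = ET.fromstring(xml_text)
    meta = root.find("report_metadata")
    policy = root.find("policy_published")
    records = []
    for rec in root.findall("record"):
        row = rec.find("row")
        records.append({
            "source_ip": row.findtext("source_ip"),
            "count": int(row.findtext("count")),
            "disposition": row.findtext("policy_evaluated/disposition"),
            "header_from": rec.findtext("identifiers/header_from"),
        })
    return {
        "reporter": meta.findtext("org_name"),
        "begin": int(meta.findtext("date_range/begin")),
        "end": int(meta.findtext("date_range/end")),
        "policy": policy.findtext("p"),
        "records": records,
    }


summary = summarize(SAMPLE)
print(summary["reporter"], summary["policy"], len(summary["records"]))
```

The begin and end values are Unix timestamps, which is why they are converted to integers rather than parsed as dates.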

04

The smell-test approach

Open the file. Skip metadata. Find the records. For each record, scan three things in order:

  1. source_ip — do I recognize it?
  2. count — is this volume normal for that sender?
  3. policy_evaluated/spf and dkim — both pass?

If source_ip is recognized, count is normal, and both pass → done. Move on.

If anything is off, dig into auth_results for the raw SPF and DKIM signing-domain detail.

That's the entire decision tree for 95% of records.
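
The decision tree above can be sketched in a few lines of Python. The allow-list, the volume ceiling, and the flat record dictionaries are all assumptions made for illustration; in practice the values would come from your own sender inventory and from parsed reports.

```python
# Hypothetical allow-list of IPs you already recognize.
KNOWN_SENDERS = {"198.51.100.42": "newsletter platform"}


def smell_test(record, known=KNOWN_SENDERS, max_count=1000):
    """Return the reasons to dig deeper; an empty list means move on."""
    reasons = []
    if record["source_ip"] not in known:
        reasons.append("unknown source_ip")
    # max_count is an assumed stand-in for "normal volume for this sender".
    if record["count"] > max_count:
        reasons.append("unusual volume")
    if record["spf"] != "pass" or record["dkim"] != "pass":
        reasons.append("spf/dkim not both pass")
    return reasons


records = [
    {"source_ip": "198.51.100.42", "count": 120, "spf": "pass", "dkim": "pass"},
    {"source_ip": "203.0.113.9", "count": 3, "spf": "fail", "dkim": "fail"},
]
for r in records:
    print(r["source_ip"], smell_test(r) or "ok")
```

Only records that come back with at least one reason need a trip into auth_results.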

05

Reverse-DNS the IP

For unknown IPs, run a quick reverse DNS lookup:

    dig +short -x 198.51.100.42

The hostname tells you the sender. mailgun.org is Mailgun, outlook.com is Microsoft, sendgrid.net is SendGrid. If the reverse DNS is generic (unassigned.example.com) or empty, the IP is suspicious — possibly an attacker.
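
If you have more than a couple of IPs to check, the same lookup can be scripted. This is a minimal sketch using Python's standard socket module; the suffix table only contains the three examples named above, and the hostname in the demonstration is invented so the example runs without network access.

```python
import socket

# Suffix-to-sender hints taken from the examples above; extend as needed.
SENDER_HINTS = {
    "mailgun.org": "Mailgun",
    "outlook.com": "Microsoft",
    "sendgrid.net": "SendGrid",
}


def reverse_dns(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def identify(ip, lookup=reverse_dns):
    """Map an IP to a sender name using its reverse DNS hostname."""
    host = lookup(ip)
    if not host:
        return "no reverse DNS: treat as suspicious"
    host = host.rstrip(".").lower()
    for suffix, sender in SENDER_HINTS.items():
        # Match on a label boundary so "notmailgun.org" is not a hit.
        if host == suffix or host.endswith("." + suffix):
            return sender
    return f"unrecognized host: {host}"


# Offline demonstration with a stand-in resolver (hypothetical hostname).
print(identify("198.51.100.42", lookup=lambda ip: "m42.mailgun.org"))
```

Calling identify(ip) without the lookup argument performs a real DNS query, so results depend on your resolver and network.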

"How to handle third-party senders during DMARC projects" covers the categorization once you've identified the sender.

06

What to ignore

A few patterns waste time:

  • Forwarder failures. SPF will often fail for forwarded mail because the forwarder's IP isn't in your SPF record. If DKIM passes, the message is fine: DKIM survives forwarding as long as the forwarder doesn't modify the signed headers or body.
  • Sub-1% volume from unknown IPs. Spoofing attempts are common; you don't need to investigate every single probe. Look for patterns, not single rows.
  • Discrepancies between policy_published and what you actually published. Receivers cache records; small lags between your DNS update and their next report are normal.
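
The first two patterns are easy to label mechanically. The sketch below is one possible way to do it in Python; the record dictionaries, the set of known IPs, and the 1% cut-off are assumptions for illustration, not a rule taken from any specification.

```python
def ignorable_reason(record, total, known_ips, noise_share=0.01):
    """Return why a record can be skipped, or None if it deserves attention."""
    if record["spf"] == "fail" and record["dkim"] == "pass":
        return "likely forwarder"
    unknown = record["source_ip"] not in known_ips
    if unknown and record["count"] / total < noise_share:
        return "low-volume unknown"
    return None


records = [
    {"source_ip": "198.51.100.42", "count": 980, "spf": "pass", "dkim": "pass"},
    {"source_ip": "192.0.2.7", "count": 15, "spf": "fail", "dkim": "pass"},
    {"source_ip": "203.0.113.9", "count": 5, "spf": "fail", "dkim": "fail"},
]
total = sum(r["count"] for r in records)
for r in records:
    print(r["source_ip"], ignorable_reason(r, total, {"198.51.100.42"}))
```

A label is a hint, not a verdict: the same low-volume IP showing up week after week is the kind of pattern worth a second look.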

07

The single XML quirk that trips everyone

Reports are compressed. The file arrives in email as an attachment, usually .xml.gz (some receivers send .zip instead); you have to decompress it before reading. On the command line:

    gunzip 'google.com!yourdomain.com!1735689600!1735776000.xml.gz'
    xmllint --format 'google.com!yourdomain.com!1735689600!1735776000.xml'

The single quotes stop an interactive bash shell from treating the ! characters in the filename as history expansion. xmllint --format pretty-prints the XML into a human-readable shape. Without it, the file is often one long line that few text editors handle well.
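
If xmllint isn't installed, the same two steps can be done with Python's standard library. This is a minimal sketch for the .xml.gz case only; the demonstration writes a tiny placeholder file to a temporary directory rather than using a real report.

```python
import gzip
import os
import tempfile
import xml.dom.minidom


def pretty_report(path):
    """Decompress a .xml.gz aggregate report and return indented XML."""
    with gzip.open(path, "rb") as fh:
        raw = fh.read()
    return xml.dom.minidom.parseString(raw).toprettyxml(indent="  ")


# Demonstration with a throwaway file; the name mimics the usual pattern.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "google.com!yourdomain.com!1735689600!1735776000.xml.gz")
with gzip.open(path, "wb") as fh:
    fh.write(b"<feedback><record><row><count>1</count></row></record></feedback>")

print(pretty_report(path))
```

Because the filename is passed as a Python string, the ! characters need no special quoting here.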

08

When to stop reading by hand

If you find yourself opening more than two reports per week, switch to a platform. Manual parsing is reasonable for occasional spot checks and unsustainable as a steady-state practice. The math: 5 receivers × daily reports = 35 files per week, and with 20 senders each file can carry dozens of records. Nobody reads that.

The DMARC AI ingestion pipeline accepts raw rua= mail, parses every report, and surfaces the data as a sender-by-sender dashboard. The 10 minutes of manual parsing becomes a 30-second glance.

09

Best practices

  • Keep XML around for 90 days, but skim weekly. Long-term storage is for incident investigations; weekly review is for trend-spotting.
  • Filter to receivers that matter. Google, Microsoft, and Yahoo cover most of your real traffic. Smaller receivers are noise unless you're a B2B sender targeting specific industries.
  • Track per-sender pass rate over time. A sender at 100% one week and 95% the next is drifting; investigate before enforcement breaks.
  • Don't try to dedupe IPs by hand. Senders often have IP pools that rotate. A platform handles this; you'll go mad doing it by hand.
  • Cross-reference unknown IPs with public data. AbuseIPDB, Spamhaus, and reverse DNS together identify most unknowns within a minute.
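
Tracking pass rate per sender is simple arithmetic once records are labelled with a sender name. The sketch below assumes you have already mapped IPs to senders and reduced each record to a (sender, count, passed) tuple; the sender names and the three-point drop threshold are illustrative choices.

```python
from collections import defaultdict


def pass_rates(rows):
    """rows: iterable of (sender, count, dmarc_pass). Returns sender -> rate."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for sender, count, ok in rows:
        totals[sender] += count
        if ok:
            passed[sender] += count
    return {s: passed[s] / totals[s] for s in totals}


def drifting(last_week, this_week, threshold=0.03):
    """Senders whose pass rate dropped by more than threshold week over week."""
    return sorted(
        s for s, rate in this_week.items()
        if s in last_week and last_week[s] - rate > threshold
    )


week1 = pass_rates([("mailgun", 1000, True), ("crm", 200, True)])
week2 = pass_rates([
    ("mailgun", 950, True),
    ("mailgun", 50, False),
    ("crm", 200, True),
])
print(drifting(week1, week2))
```

Weighting by count rather than by number of records keeps a single noisy row from distorting a high-volume sender's rate.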

10

If you've been manually parsing more than 2-3 reports per week, the next move is a platform. The ROI is immediate: 30 minutes per week recovered, and the data quality improves because you're looking at aggregates instead of individual files.

For MSPs running multiple client domains, platform tooling is mandatory — there is no version of manual XML parsing that scales past one client.

11

FAQ

Can I just turn off ruf= and ignore forensic reports?

Yes. Most domains don't need forensic reports. "DMARC failure reports" covers when they're useful (rare).

Why are some reports HTTPS POSTs instead of email attachments?

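The rua= tag takes a list of URIs, so a destination other than mailto: is syntactically possible. RFC 7489 only requires receivers to support mailto:, though, so delivery over https: depends on the receiver and the platform involved. Most receivers still default to email.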

How big are aggregate reports?

For a low-volume domain, a few KB. For a high-volume domain with many senders, sometimes hundreds of KB per receiver per day. Compressed transfer keeps total bandwidth modest.

Can I send aggregate reports somewhere other than my own domain?

Yes, a rua=mailto: address on another domain is fine. If the destination domain differs from your published domain, the destination needs to publish a TXT record at yourdomain.com._report._dmarc.<destination domain> authorising receipt. Platforms handle this for you.
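
As a sketch of the naming rule in RFC 7489 section 7.1, the authorisation record lives under the destination's domain and is prefixed with the domain being reported on. The helper name and the destination domain below are hypothetical.

```python
def external_report_record(policy_domain, report_domain):
    """Build the TXT record a third-party report destination would publish."""
    name = f"{policy_domain}._report._dmarc.{report_domain}"
    return name, "v=DMARC1"


# reports.example.net stands in for whatever domain receives your reports.
name, value = external_report_record("yourdomain.com", "reports.example.net")
print(name, "TXT", value)
```

You can confirm the record exists with a TXT lookup on the generated name before pointing rua= at the destination.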

What's the most common XML reading mistake?

Confusing the raw auth_results SPF/DKIM result with the alignment-aware policy_evaluated result. The first tells you whether SPF/DKIM passed at all; the second tells you whether they passed in a way that satisfies DMARC. The second is what matters for the policy.
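
The difference is easiest to see on a record where the two disagree. In this made-up example, SPF passes for a bounce domain that has nothing to do with the From domain, so the raw result is pass while the DMARC-evaluated result is fail. The alignment check is a deliberate simplification: real relaxed alignment compares organizational domains using the Public Suffix List.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: SPF passes, but for an unrelated domain.
RECORD = """<record>
  <row>
    <source_ip>203.0.113.9</source_ip>
    <count>4</count>
    <policy_evaluated>
      <disposition>none</disposition>
      <dkim>fail</dkim>
      <spf>fail</spf>
    </policy_evaluated>
  </row>
  <identifiers><header_from>yourdomain.com</header_from></identifiers>
  <auth_results>
    <spf><domain>bounce.example.net</domain><result>pass</result></spf>
  </auth_results>
</record>"""


def roughly_aligned(a, b):
    """Simplified relaxed alignment: equal, or one is a subdomain of the other."""
    a, b = a.lower().rstrip("."), b.lower().rstrip(".")
    return a == b or a.endswith("." + b) or b.endswith("." + a)


def explain_spf(record_xml):
    """Compare the raw SPF result with the alignment-aware evaluated result."""
    rec = ET.fromstring(record_xml)
    header_from = rec.findtext("identifiers/header_from")
    spf_domain = rec.findtext("auth_results/spf/domain")
    return {
        "raw": rec.findtext("auth_results/spf/result"),
        "evaluated": rec.findtext("row/policy_evaluated/spf"),
        "aligned": roughly_aligned(spf_domain, header_from),
    }


result = explain_spf(RECORD)
print(result)
```

When raw is pass and evaluated is fail, the usual culprit is alignment: the sender authenticates as its own domain instead of yours.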

12

Final thoughts

Reading DMARC XML by hand is a survival skill for the early phase of a rollout. The skill becomes obsolete once you have a platform reading them for you — and that handoff usually happens around the time you stop tolerating the manual workload. Don't fight the tooling; the goal is the data, not the parsing.

If you've read one report fully, you've essentially read them all. Lean on the smell test, escalate to deeper analysis only when something looks off, and put a platform in front of the firehose as soon as the per-week volume starts to bite.
