Scenario Map: TCP/IP — 中间设备丢包 (Packet Drop in Middle)

Product/Service: Windows TCP/IP Stack / Network Path
Scope: 数据包在源端和目标端之间的网络路径上被丢弃
Last Updated: 2026-03-11

核心概念

“Middle” 指源端和目标端之间的 ANY 设备：路由器、交换机、防火墙、负载均衡器、ISP 设备、IDS/IPS。
证明”中间丢包”的唯一方法是：在源端和目标端同时抓包，源端发出了、目标端没收到 → 中间丢了。

1. 子场景分类 (Sub-types)

mindmap
  root((Packet Drop<br/>in Middle))
    Firewall / ACL
      Stateful FW dropping packets
      Port-specific blocking
      SYN allowed but data blocked
    MTU Black Hole
      Packet too large
      ICMP Type3 Code4 blocked
      PMTUD failure
    Router / Switch Drop
      Buffer overflow
      CRC errors
      Interface flap
    ISP / Carrier Issue
      Peering problems
      Rate limiting
      Backbone congestion
    IDS / IPS Silent Drop
      Signature match drop
      Anomaly-based drop
      No RST sent to client
    Asymmetric Routing
      Return path different
      Stateful FW drops return
      Source on different subnet
    Congestion / QoS
      Lower priority traffic dropped
      Queue overflow
      Traffic shaping limits
    Black Hole Route
      Route to Null0 interface
      Summarization error
      Missing return route

2. 典型症状

#	症状	可能的原因
1	间歇性连通： ping 有时通有时不通，有 % loss	拥塞丢包、接口错误、不稳定链路
2	pathping 在特定跳显示丢包	该跳设备或其上游链路存在问题
3	tracert 在某跳超时，但后续跳正常响应	该路由器限制了 ICMP 回复速率（不一定是问题）
4	*tracert 从某跳开始全部 ` * `*	真正的阻断点 — 流量到此为止
5	大文件传输失败，但小请求正常	MTU Black Hole — PMTUD 失败
6	TCP 连接建立成功，但数据传输挂起（Window Size 变为 0）	中间设备篡改或丢弃数据包
7	单向通信：A→B 正常，B→A 失败	非对称路由 — 回程走了不同路径，有状态防火墙丢弃
8	只有特定端口/协议不通，其他正常	防火墙/ACL 规则精确匹配
9	Wireshark 中大量 TCP Retransmission	数据包在路径中被丢弃，TCP 层不断重传
10	TTL 值异常偏低	可能存在路由环路，包在环路中被消耗

3. 排查流程图 (Troubleshooting Flowchart)

flowchart TD
    START["🔴 数据包未到达目标端<br/>Packet not arriving at destination"] --> CAPTURE["📡 在源端和目标端<br/>同时抓包<br/>Simultaneous capture on BOTH ends"]
    
    CAPTURE --> LEAVES{"源端抓包显示<br/>数据包已发出？<br/>Packet leaves source?"}
    
    LEAVES -->|No| LOCAL["🔧 本地问题<br/>WFP drop / Driver issue /<br/>Local firewall<br/>→ 不是 middle drop 场景"]
    
    LEAVES -->|Yes| ARRIVES{"目标端抓包显示<br/>数据包到达？<br/>Packet arrives at dest?"}
    
    ARRIVES -->|"Yes but modified"| CHANGED["⚠️ 不同场景<br/>→ Packet Changed in Middle<br/>检查 NAT / proxy 修改"]
    
    ARRIVES -->|"Yes, identical"| NOTMIDDLE["✅ 不是中间丢包<br/>目标端应用层问题<br/>Check dest app/OS stack"]
    
    ARRIVES -->|No| DISAPPEAR["❌ 确认：中间设备丢包<br/>Packet dropped in middle!<br/>需要定位丢包位置"]
    
    DISAPPEAR --> TRACE["🔍 运行 tracert + pathping<br/>识别路径和丢包跳<br/>tracert dest / pathping dest"]
    
    TRACE --> PATHRESULT{"pathping 结果分析"}
    
    PATHRESULT --> SPECIFIC_HOP["📍 特定跳显示丢包<br/>Loss at specific hop"]
    PATHRESULT --> ALL_STAR["⛔ 从某跳开始全部超时<br/>All * * * from hop N onward"]
    PATHRESULT --> INTERMITTENT["📊 多跳间歇性小比例丢包<br/>Intermittent low % loss"]
    
    SPECIFIC_HOP --> WHAT_TRAFFIC{"哪些流量受影响？"}
    
    WHAT_TRAFFIC -->|"仅特定端口/协议"| FW_ACL["🛡️ 防火墙 / ACL 规则<br/>Firewall/ACL on that device<br/>→ 联系网络团队检查规则"]
    
    WHAT_TRAFFIC -->|"所有流量到该目标"| ROUTING["🗺️ 路由问题 / 设备故障<br/>Routing issue or device down<br/>→ 检查路由表和设备状态"]
    
    WHAT_TRAFFIC -->|"仅大包失败"| MTU_TEST["📏 MTU 问题<br/>测试: ping dest -f -l 1472"]
    
    MTU_TEST --> MTU_RESULT{"ping -f 结果"}
    MTU_RESULT -->|"1472 失败<br/>1400 成功"| MTU_HOLE["🕳️ MTU Black Hole!<br/>某设备 MTU < 1500<br/>且阻止了 ICMP Frag Needed<br/>→ 二分查找精确 MTU 值"]
    MTU_RESULT -->|"1472 成功"| NOT_MTU["不是 MTU 问题<br/>→ 检查 TCP MSS 协商<br/>或应用层分段"]
    
    ALL_STAR --> BLOCK_POINT["⛔ 真正的阻断点<br/>该跳之后的设备/链路完全阻断"]
    BLOCK_POINT --> BLOCK_CHECK{"检查阻断原因"}
    BLOCK_CHECK --> BH_ROUTE["🕳️ Black Hole Route<br/>路由指向 Null 接口"]
    BLOCK_CHECK --> DEVICE_DOWN["💀 设备宕机<br/>链路物理故障"]
    BLOCK_CHECK --> TOTAL_BLOCK["🚫 完全封锁<br/>防火墙 DROP ALL"]
    
    INTERMITTENT --> CONGESTION["📈 拥塞 / 接口错误<br/>检查设备接口计数器"]
    CONGESTION --> COUNTERS{"接口计数器显示？"}
    COUNTERS -->|"CRC errors / Input errors"| PHYSICAL["🔌 物理层问题<br/>更换线缆/SFP/端口"]
    COUNTERS -->|"Output drops / Queue full"| QOS["📊 拥塞/QoS 问题<br/>调整 QoS 策略或增加带宽"]
    COUNTERS -->|"计数器正常"| IDS_CHECK["🔍 检查 IDS/IPS 日志<br/>可能是安全设备静默丢弃"]
    
    style START fill:#ff6b6b,stroke:#333,color:#fff
    style LOCAL fill:#ffd93d,stroke:#333
    style CHANGED fill:#ffd93d,stroke:#333
    style NOTMIDDLE fill:#6bff6b,stroke:#333
    style DISAPPEAR fill:#ff6b6b,stroke:#333,color:#fff
    style FW_ACL fill:#74b9ff,stroke:#333
    style ROUTING fill:#74b9ff,stroke:#333
    style MTU_HOLE fill:#fd79a8,stroke:#333
    style PHYSICAL fill:#a29bfe,stroke:#333
    style QOS fill:#a29bfe,stroke:#333
    style BH_ROUTE fill:#ff6b6b,stroke:#333,color:#fff
    style DEVICE_DOWN fill:#ff6b6b,stroke:#333,color:#fff
    style TOTAL_BLOCK fill:#ff6b6b,stroke:#333,color:#fff

4. 详细排查步骤与关键命令

Step 1: 确认”中间丢包” — 双端同时抓包

这是最关键的一步。 没有双端抓包，无法证明是中间丢包。

# === 源端 (Source) ===
# 方法 1: netsh trace (推荐，内置，低开销)
netsh trace start capture=yes tracefile=c:\source_trace.etl maxsize=512
# ... 复现问题 ...
netsh trace stop

# 方法 2: pktmon (Windows Server 2019+ / Windows 10 2004+)
pktmon start --capture --comp all -f c:\source_pktmon.etl
# ... 复现问题 ...
pktmon stop

# === 目标端 (Destination) ===
# 同样的命令，同时启动
netsh trace start capture=yes tracefile=c:\dest_trace.etl maxsize=512
# ... 复现问题 ...
netsh trace stop

分析方法：

源端抓包有 TCP SYN → 目标端没有 → SYN 被中间丢了（防火墙最常见）
源端抓包有 Data 包 → 目标端没有对应 Data → 数据被中间丢了（MTU / IDS 最常见）
源端有大量 TCP Retransmission → 目标端完全没有对应包 → 确认中间丢包

Step 2: 定位丢包跳 (Identify the problematic hop)

# tracert — 显示路径（快速，但不显示丢包率）
tracert <destination_ip>
tracert -d <destination_ip>          # -d 不解析 DNS，更快

# pathping — 显示每一跳的丢包率（需等待约 5 分钟，结果更精确）
pathping <destination_ip>
pathping -n <destination_ip>         # -n 不解析 DNS，更快
pathping -q 100 <destination_ip>     # -q 指定每跳发送的查询数（默认 100）

# 持续 ping — 观察丢包模式
ping <destination_ip> -t             # 持续 ping，Ctrl+C 停止，查看 % loss
ping <destination_ip> -t -l 1400     # 用较大包持续 ping，更容易暴露问题

解读 pathping 输出：

                   Source to Here   This Node/Link
Hop  RTT    Lost/Sent = Pct  Lost/Sent = Pct  Address
  0                                           192.168.1.10
                                0/ 100 =  0%   |
  1   1ms     0/ 100 =  0%     0/ 100 =  0%  192.168.1.1
                                0/ 100 =  0%   |
  2  15ms     0/ 100 =  0%     0/ 100 =  0%  10.0.0.1
                               12/ 100 = 12%   |       ← 这条链路有 12% 丢包！
  3  45ms    12/ 100 = 12%     0/ 100 =  0%  10.1.1.1  ← 到这里累积 12% 丢包

“This Node/Link” 列 才是关键 — 显示该跳本身贡献的丢包率
“Source to Here” 列 是累积丢包率

Step 3: MTU 问题诊断

# MTU 测试：-f = Don't Fragment，-l = 包大小
# 标准以太网 MTU = 1500，减去 IP(20) + ICMP(8) 头 = 1472
ping <destination_ip> -f -l 1472    # 如果失败 → MTU < 1500
ping <destination_ip> -f -l 1400    # 缩小测试
ping <destination_ip> -f -l 1300    # 继续缩小
# 二分查找直到找到精确的最大可通过大小

# 失败时会看到:
# "Packet needs to be fragmented but DF set."

# 查看本地接口 MTU
netsh interface ipv4 show subinterfaces

# 临时降低本地 MTU 作为 workaround
netsh interface ipv4 set subinterface "Ethernet" mtu=1400 store=persistent

MTU Black Hole 的经典场景：

VPN 隧道封装额外头部（GRE +24, IPsec +50-80），实际可用 MTU 减少
PPPoE 连接 MTU = 1492（而非 1500）
某些云环境 / SD-WAN MTU 更低
中间设备阻止了 ICMP Type 3 Code 4 (Fragmentation Needed) → PMTUD 失败 → 大包永远传不过去

Step 4: 防火墙 / ACL 排查

# 测试特定端口连通性
Test-NetConnection <destination_ip> -Port 443
Test-NetConnection <destination_ip> -Port 80
Test-NetConnection <destination_ip> -Port 3389

# 对比：ICMP 通但 TCP 不通 → 防火墙按端口/协议过滤
ping <destination_ip>                          # 成功
Test-NetConnection <destination_ip> -Port 443  # 失败
# → 几乎肯定是中间防火墙阻止了 TCP 443

# 检查是 SYN 被丢还是 SYN-ACK 被丢（在源端 Wireshark 中）
# 过滤: tcp.flags.syn==1 && tcp.flags.ack==0
# 如果只看到 SYN 重传，没有 SYN-ACK → SYN 被中间丢弃
# 如果看到 SYN-ACK 但后续 ACK/Data 丢失 → 可能是 stateful FW 问题

Step 5: 非对称路由排查

# 从两端分别 tracert 到对方
# === 在 A 端 ===
tracert -d <B_ip>

# === 在 B 端 ===
tracert -d <A_ip>

# 如果路径不同（非对称），且中间有 stateful firewall
# → 防火墙只在一个方向看到了连接建立，另一个方向的包被当作"无效连接"丢弃

Step 6: 高级诊断

# pktmon 查看丢包组件（Windows Server 2019+）
pktmon start --capture --comp all --type drop
# ... 复现 ...
pktmon stop
pktmon format c:\PktMon.etl -o c:\pktmon_drops.txt
# 查看 drop reason 字段

# TTL 分析（在 Wireshark 中）
# 正常 TTL: Linux 起始 64，Windows 起始 128
# 如果收到的 TTL 为 120 → 经过了 8 跳 (128 - 120 = 8)
# 如果 tracert 显示只有 5 跳但 TTL 减了 8 → 有隐藏跳/可能路由环路

# TCP 重传分析（Wireshark 过滤）
# tcp.analysis.retransmission — 显示所有重传
# tcp.analysis.fast_retransmission — 快速重传（收到 3 个 DupACK）
# tcp.analysis.rto — 超时重传（更严重，通常是完全丢包）

5. 解决方案

根因	解决方案	实施方
防火墙/ACL 阻止	在特定设备上添加允许规则；确保 ICMP Type 3 Code 4 放行	网络/安全团队
MTU Black Hole	降低客户端 MTU (`netsh interface ipv4 set subinterface`)；修复阻止 ICMP 的设备；启用 TCP MSS Clamping	网络团队 / 客户端
路由器接口错误	检查接口健康状态；更换 SFP/线缆/端口；清除接口计数器后观察	网络团队
ISP 问题	提供 traceroute/pathping 证据，联系 ISP 报告其跳位丢包	ISP
非对称路由	修复路由使路径对称；或将 stateful FW 改为允许非对称流量	网络团队
拥塞/QoS	调整 QoS 策略提高关键流量优先级；升级带宽	网络团队
IDS/IPS 静默丢弃	检查 IDS/IPS 日志匹配规则，添加例外白名单	安全团队
Black Hole Route	修复路由表，移除指向 Null 接口的错误路由	网络团队

6. 排查经验与 Tips

💡 Tip 1: 必须在两端同时抓包 — 这是证明”中间丢包”的唯一方法。单端抓包只能看到重传，无法证明包在哪里丢的。

💡 Tip 2: pathping 优于 tracert 定位丢包点。 tracert 只显示路径，pathping 显示每跳的丢包率 %。但 pathping 需要等约 5 分钟完成统计。

💡 Tip 3: tracert 某跳超时 ≠ 该跳有问题！ 很多路由器限制 ICMP 回复速率，所以 tracert 在该跳显示 * * *，但实际数据流量正常通过。只有从该跳开始全部超时才说明是真正的阻断点。

💡 Tip 4: MTU Black Hole 的经典表现是 “ping 通但大文件传不了”。 立刻用 ping -f -l 1472 测试。如果 1472 失败但 1400 成功，就是 MTU 问题。

💡 Tip 5: Wireshark 中大量 TCP Retransmission = 中间设备丢包的强信号。 特别是 RTO (Retransmission Timeout) 类型的重传，通常意味着包完全丢失而非仅仅延迟。

💡 Tip 6: 比较收到包的 TTL 值。 如果 TTL 比预期低很多，可能存在额外跳（路由环路会快速消耗 TTL）。

💡 Tip 7: 如果只有 TCP SYN 被丢但 ICMP (ping) 正常 → 几乎肯定是防火墙规则。 防火墙通常按端口/协议过滤，ICMP 可能被放行但 TCP 特定端口被阻止。

💡 Tip 8: ISP 问题排查：从两个方向做 traceroute。 使用 Looking Glass 工具从 ISP 侧反向 traceroute，证明丢包点在 ISP 网络内部。

💡 Tip 9: 时间相关的丢包（工作时间丢包多，夜间正常）→ 拥塞。 不依赖时间的持续丢包 → 配置问题或硬件故障。

💡 Tip 10: pktmon drop monitoring (pktmon start --type drop) 可以捕获 Windows 协议栈内部的丢包原因，对于排查本地是否丢包非常有用。

7. 参考资料

暂无可验证的参考文档

Scenario Map: TCP/IP — Packet Drop in Middle (Network Path)

Product/Service: Windows TCP/IP Stack / Network Path
Scope: Packets being dropped on the network path between source and destination
Last Updated: 2026-03-11

Core Concept

“Middle” means ANY device between source and destination: routers, switches, firewalls, load balancers, ISP equipment, IDS/IPS.
The ONLY way to prove “dropped in the middle” is: capture on BOTH source and destination simultaneously — source sent it, destination never received it → dropped in the middle.

1. Sub-types (Mindmap)

mindmap
  root((Packet Drop<br/>in Middle))
    Firewall / ACL
      Stateful FW dropping packets
      Port-specific blocking
      SYN allowed but data blocked
    MTU Black Hole
      Packet too large
      ICMP Type3 Code4 blocked
      PMTUD failure
    Router / Switch Drop
      Buffer overflow
      CRC errors
      Interface flap
    ISP / Carrier Issue
      Peering problems
      Rate limiting
      Backbone congestion
    IDS / IPS Silent Drop
      Signature match drop
      Anomaly-based drop
      No RST sent to client
    Asymmetric Routing
      Return path different
      Stateful FW drops return
      Source on different subnet
    Congestion / QoS
      Lower priority traffic dropped
      Queue overflow
      Traffic shaping limits
    Black Hole Route
      Route to Null0 interface
      Summarization error
      Missing return route

2. Typical Symptoms

#	Symptom	Likely Cause
1	Intermittent connectivity: ping works sometimes, fails sometimes (% loss)	Congestion, interface errors, unstable link
2	pathping shows loss at a specific hop	That hop’s device or upstream link has issues
3	tracert times out at a specific hop but later hops respond	That router rate-limits ICMP replies (not necessarily a problem)
4	*tracert shows all ` * ` from a certain hop onward*	Real blocking point — traffic stops here
5	Large file transfers fail but small requests work	MTU Black Hole — PMTUD failure
6	TCP connections establish but data transfer hangs (Window Size drops to 0)	Middle device dropping or modifying data packets
7	One-way communication: A→B works, B→A fails	Asymmetric routing — return path goes through a different (blocking) device
8	Only specific port/protocol fails, everything else works	Firewall/ACL rule matching precisely
9	Massive TCP Retransmissions in Wireshark	Packets dropped on the path, TCP layer retransmitting
10	TTL value abnormally low	Possible routing loop consuming TTL hops

3. Troubleshooting Flowchart

flowchart TD
    START["🔴 Packet not arriving at destination"] --> CAPTURE["📡 Take simultaneous captures<br/>on BOTH source and destination"]
    
    CAPTURE --> LEAVES{"Packet leaves source?<br/>Visible in source-side capture?"}
    
    LEAVES -->|No| LOCAL["🔧 Local issue<br/>WFP drop / Driver issue /<br/>Local firewall<br/>→ Not a middle-drop scenario"]
    
    LEAVES -->|Yes| ARRIVES{"Packet arrives at destination?<br/>Visible in dest-side capture?"}
    
    ARRIVES -->|"Yes but modified"| CHANGED["⚠️ Different scenario<br/>→ Packet Changed in Middle<br/>Check NAT / proxy modification"]
    
    ARRIVES -->|"Yes, identical"| NOTMIDDLE["✅ Not a middle drop<br/>Issue is at destination app/OS stack"]
    
    ARRIVES -->|No| DISAPPEAR["❌ CONFIRMED: Packet dropped in middle!<br/>Need to locate the drop point"]
    
    DISAPPEAR --> TRACE["🔍 Run tracert + pathping<br/>to identify path and loss hop<br/>tracert dest / pathping dest"]
    
    TRACE --> PATHRESULT{"Analyze pathping results"}
    
    PATHRESULT --> SPECIFIC_HOP["📍 Loss at specific hop"]
    PATHRESULT --> ALL_STAR["⛔ All * * * from hop N onward"]
    PATHRESULT --> INTERMITTENT["📊 Intermittent low % loss<br/>across multiple hops"]
    
    SPECIFIC_HOP --> WHAT_TRAFFIC{"What traffic is affected?"}
    
    WHAT_TRAFFIC -->|"Specific port/protocol only"| FW_ACL["🛡️ Firewall / ACL rule<br/>on that device<br/>→ Contact network team to check rules"]
    
    WHAT_TRAFFIC -->|"All traffic to that destination"| ROUTING["🗺️ Routing issue / Device down<br/>→ Check routing table and device status"]
    
    WHAT_TRAFFIC -->|"Only large packets fail"| MTU_TEST["📏 MTU issue<br/>Test: ping dest -f -l 1472"]
    
    MTU_TEST --> MTU_RESULT{"ping -f result"}
    MTU_RESULT -->|"1472 fails<br/>1400 succeeds"| MTU_HOLE["🕳️ MTU Black Hole!<br/>A device has MTU < 1500<br/>AND blocks ICMP Frag Needed<br/>→ Binary search for exact MTU"]
    MTU_RESULT -->|"1472 succeeds"| NOT_MTU["Not an MTU issue<br/>→ Check TCP MSS negotiation<br/>or application-layer segmentation"]
    
    ALL_STAR --> BLOCK_POINT["⛔ Real blocking point<br/>Device/link after this hop is completely blocked"]
    BLOCK_POINT --> BLOCK_CHECK{"Check blocking cause"}
    BLOCK_CHECK --> BH_ROUTE["🕳️ Black Hole Route<br/>Route points to Null interface"]
    BLOCK_CHECK --> DEVICE_DOWN["💀 Device down<br/>Physical link failure"]
    BLOCK_CHECK --> TOTAL_BLOCK["🚫 Total block<br/>Firewall DROP ALL"]
    
    INTERMITTENT --> CONGESTION["📈 Congestion / Interface errors<br/>Check device interface counters"]
    CONGESTION --> COUNTERS{"Interface counters show?"}
    COUNTERS -->|"CRC errors / Input errors"| PHYSICAL["🔌 Physical layer issue<br/>Replace cable / SFP / port"]
    COUNTERS -->|"Output drops / Queue full"| QOS["📊 Congestion / QoS issue<br/>Adjust QoS policy or add bandwidth"]
    COUNTERS -->|"Counters normal"| IDS_CHECK["🔍 Check IDS/IPS logs<br/>Security device may be silently dropping"]
    
    style START fill:#ff6b6b,stroke:#333,color:#fff
    style LOCAL fill:#ffd93d,stroke:#333
    style CHANGED fill:#ffd93d,stroke:#333
    style NOTMIDDLE fill:#6bff6b,stroke:#333
    style DISAPPEAR fill:#ff6b6b,stroke:#333,color:#fff
    style FW_ACL fill:#74b9ff,stroke:#333
    style ROUTING fill:#74b9ff,stroke:#333
    style MTU_HOLE fill:#fd79a8,stroke:#333
    style PHYSICAL fill:#a29bfe,stroke:#333
    style QOS fill:#a29bfe,stroke:#333
    style BH_ROUTE fill:#ff6b6b,stroke:#333,color:#fff
    style DEVICE_DOWN fill:#ff6b6b,stroke:#333,color:#fff
    style TOTAL_BLOCK fill:#ff6b6b,stroke:#333,color:#fff

4. Detailed Troubleshooting Steps with Key Commands

Step 1: Confirm “Middle Drop” — Simultaneous Capture on Both Ends

This is the most critical step. Without dual-end captures, you cannot prove it’s a middle drop.

# === Source Side ===
# Method 1: netsh trace (recommended — built-in, low overhead)
netsh trace start capture=yes tracefile=c:\source_trace.etl maxsize=512
# ... reproduce the issue ...
netsh trace stop

# Method 2: pktmon (Windows Server 2019+ / Windows 10 2004+)
pktmon start --capture --comp all -f c:\source_pktmon.etl
# ... reproduce the issue ...
pktmon stop

# === Destination Side ===
# Same commands, started at the same time
netsh trace start capture=yes tracefile=c:\dest_trace.etl maxsize=512
# ... reproduce the issue ...
netsh trace stop

Analysis approach:

Source has TCP SYN → Destination doesn’t → SYN dropped in middle (most likely firewall)
Source has Data packets → Destination missing corresponding Data → Data dropped in middle (MTU / IDS most likely)
Source shows massive TCP Retransmissions → Destination has none of those packets → Confirmed middle drop

Step 2: Locate the Problematic Hop

# tracert — shows the path (fast, but no loss percentage)
tracert <destination_ip>
tracert -d <destination_ip>          # -d = no DNS resolution, faster

# pathping — shows loss % at each hop (wait ~5 minutes for full statistics)
pathping <destination_ip>
pathping -n <destination_ip>         # -n = no DNS resolution, faster
pathping -q 100 <destination_ip>     # -q = queries per hop (default 100)

# Continuous ping — observe loss pattern
ping <destination_ip> -t             # continuous ping, Ctrl+C to stop, check % loss
ping <destination_ip> -t -l 1400     # larger packets, more likely to expose issues

Reading pathping output:

                   Source to Here   This Node/Link
Hop  RTT    Lost/Sent = Pct  Lost/Sent = Pct  Address
  0                                           192.168.1.10
                                0/ 100 =  0%   |
  1   1ms     0/ 100 =  0%     0/ 100 =  0%  192.168.1.1
                                0/ 100 =  0%   |
  2  15ms     0/ 100 =  0%     0/ 100 =  0%  10.0.0.1
                               12/ 100 = 12%   |       ← This LINK has 12% loss!
  3  45ms    12/ 100 = 12%     0/ 100 =  0%  10.1.1.1  ← Cumulative 12% to here

“This Node/Link” column is the key — shows loss contributed by that specific hop
“Source to Here” column is cumulative loss

Step 3: MTU Issue Diagnosis

# MTU test: -f = Don't Fragment, -l = payload size
# Standard Ethernet MTU = 1500, minus IP(20) + ICMP(8) headers = 1472
ping <destination_ip> -f -l 1472    # If fails → MTU < 1500 somewhere on path
ping <destination_ip> -f -l 1400    # Try smaller
ping <destination_ip> -f -l 1300    # Keep narrowing down
# Binary search until you find the exact maximum passable size

# Failure message:
# "Packet needs to be fragmented but DF set."

# Check local interface MTU
netsh interface ipv4 show subinterfaces

# Lower local MTU as a workaround
netsh interface ipv4 set subinterface "Ethernet" mtu=1400 store=persistent

Classic MTU Black Hole scenarios:

VPN tunnels add encapsulation headers (GRE +24, IPsec +50-80), reducing effective MTU
PPPoE connections: MTU = 1492 (not 1500)
Some cloud environments / SD-WAN have even lower MTU
A middle device blocks ICMP Type 3 Code 4 (Fragmentation Needed) → PMTUD fails → large packets never get through

Step 4: Firewall / ACL Investigation

# Test specific port connectivity
Test-NetConnection <destination_ip> -Port 443
Test-NetConnection <destination_ip> -Port 80
Test-NetConnection <destination_ip> -Port 3389

# Compare: ICMP works but TCP doesn't → firewall filtering by port/protocol
ping <destination_ip>                          # Success
Test-NetConnection <destination_ip> -Port 443  # Failure
# → Almost certainly a middle firewall blocking TCP 443

# In source-side Wireshark, check if SYN or SYN-ACK is being dropped:
# Filter: tcp.flags.syn==1 && tcp.flags.ack==0
# Only SYN retransmissions, no SYN-ACK → SYN dropped in middle
# SYN-ACK received but subsequent ACK/Data lost → possible stateful FW issue

Step 5: Asymmetric Routing Investigation

# Tracert from BOTH directions
# === On Host A ===
tracert -d <B_ip>

# === On Host B ===
tracert -d <A_ip>

# If paths are different (asymmetric) AND there's a stateful firewall in between:
# → The firewall only sees the connection setup on one direction
# → Return traffic on the other path is treated as "invalid connection" and dropped

Step 6: Advanced Diagnostics

# pktmon drop monitoring (Windows Server 2019+)
pktmon start --capture --comp all --type drop
# ... reproduce ...
pktmon stop
pktmon format c:\PktMon.etl -o c:\pktmon_drops.txt
# Check the "drop reason" field

# TTL analysis (in Wireshark):
# Default starting TTL: Linux = 64, Windows = 128
# If received TTL = 120 → traversed 8 hops (128 - 120 = 8)
# If tracert shows only 5 hops but TTL decreased by 8 → hidden hops / possible routing loop

# TCP retransmission analysis (Wireshark filters):
# tcp.analysis.retransmission — all retransmissions
# tcp.analysis.fast_retransmission — fast retransmit (3 DupACKs received)
# tcp.analysis.rto — retransmission timeout (more severe, usually total drop)

5. Solutions

Root Cause	Solution	Owner
Firewall/ACL blocking	Add allow rule on the specific device; ensure ICMP Type 3 Code 4 is permitted	Network / Security team
MTU Black Hole	Lower client MTU (`netsh interface ipv4 set subinterface`); fix device blocking ICMP; enable TCP MSS Clamping	Network team / Client
Router interface errors	Check interface health; replace SFP/cable/port; clear counters and monitor	Network team
ISP issue	Provide traceroute/pathping evidence showing loss at their hop; escalate to ISP	ISP
Asymmetric routing	Fix routing to ensure symmetric paths; or configure stateful FW to allow asymmetric traffic	Network team
Congestion/QoS	Adjust QoS policy to prioritize critical traffic; upgrade bandwidth	Network team
IDS/IPS silent drop	Check IDS/IPS logs for matching rule; add exception/whitelist	Security team
Black hole route	Fix routing table; remove erroneous route pointing to Null interface	Network team

6. Troubleshooting Tips

💡 Tip 1: ALWAYS capture on BOTH ends simultaneously — this is the ONLY way to prove “dropped in middle.” Single-end capture only shows retransmissions, not where the drop happens.

💡 Tip 2: pathping is better than tracert for finding the loss point. tracert shows the path; pathping shows the loss percentage at each hop. However, pathping takes ~5 minutes to gather statistics.

💡 Tip 3: tracert timeout at a specific hop ≠ that hop is the problem! Many routers rate-limit ICMP replies, so tracert shows * * * at that hop, but actual data traffic passes through fine. Only when all hops from that point onward time out is it a real blocking point.

💡 Tip 4: MTU Black Hole classic sign: “ping works but large data transfers hang.” Immediately test with ping -f -l 1472. If 1472 fails but 1400 succeeds, it’s an MTU issue.

💡 Tip 5: Massive TCP Retransmissions in Wireshark = strong indicator of middle device dropping. Especially RTO (Retransmission Timeout) type retransmissions — these usually mean complete packet loss, not just delay.

💡 Tip 6: Compare TTL values of received packets. If TTL is much lower than expected, extra hops exist (routing loops rapidly consume TTL).

💡 Tip 7: If only TCP SYN is dropped but ICMP (ping) works → almost certainly a firewall rule. Firewalls typically filter by port/protocol — ICMP may be allowed while specific TCP ports are blocked.

💡 Tip 8: For ISP issues, traceroute from BOTH directions. Use Looking Glass tools to run a reverse traceroute from the ISP side, proving the loss point is within their network.

💡 Tip 9: Time-correlated loss (high during business hours, fine at night) → congestion. Time-independent, persistent loss → configuration issue or hardware failure.

💡 Tip 10: pktmon drop monitoring (pktmon start --type drop) captures drop reasons inside the Windows networking stack — invaluable for determining whether drops are local or truly in the middle.

7. References

No verified reference documents at this time.