Today we look at an important statistic shown on the chrome://webrtc-internals/ page: packet loss rate (fraction lost).

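What is packet loss rate
Packet loss rate is a familiar concept: it is the number of lost packets as a fraction of the packets sent. It is also a common network performance metric, and we often use it to judge current network quality. Loss comes in different kinds, however, such as congestion loss and random loss. With random loss, a measured loss rate does not mean the network is poor or congested, which is one reason many current congestion control algorithms do not use loss rate as their primary signal.
The ping command is the most common way to estimate packet loss. Below is an example of pinging Google; the packet loss figure in the summary at the bottom is the loss rate measured by this run.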
```
root@jeff:~# ping google.com
PING google.com (172.217.14.78) 56(84) bytes of data.
64 bytes from lax17s38-in-f14.1e100.net (172.217.14.78): icmp_seq=1 ttl=117 time=170 ms
64 bytes from lax17s38-in-f14.1e100.net (172.217.14.78): icmp_seq=2 ttl=117 time=169 ms
64 bytes from lax17s38-in-f14.1e100.net (172.217.14.78): icmp_seq=3 ttl=117 time=174 ms
64 bytes from lax17s38-in-f14.1e100.net (172.217.14.78): icmp_seq=4 ttl=117 time=169 ms
64 bytes from lax17s38-in-f14.1e100.net (172.217.14.78): icmp_seq=5 ttl=117 time=170 ms
64 bytes from lax17s38-in-f14.1e100.net (172.217.14.78): icmp_seq=6 ttl=117 time=174 ms
--- google.com ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5007ms
rtt min/avg/max/mdev = 168.651/171.173/174.369/2.262 ms
```
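Loss information in RTCP
In WebRTC, loss information is normally fed back through the Receiver Report (RR), which carries the loss-related statistics. The RR packet format, as defined in RFC 3550 [1], is: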
```
        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
header |V=2|P|    RC   |   PT=RR=201   |             length            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                     SSRC of packet sender                     |
       +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
report |                 SSRC_1 (SSRC of first source)                 |
block  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  1    | fraction lost |       cumulative number of packets lost       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           extended highest sequence number received           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                      interarrival jitter                      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         last SR (LSR)                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   delay since last SR (DLSR)                  |
       +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
report |                 SSRC_2 (SSRC of second source)                |
block  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  2    :                               ...                             :
       +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
       |                  profile-specific extensions                  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
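The fields related to loss statistics are:
- fraction lost: 8 bits, the loss rate for SSRC_n since the previous RR was sent
- cumulative number of packets lost: 24 bits, the total number of packets from SSRC_n lost since reception began
- extended highest sequence number received: 32 bits, the sequence number extended to 32 bits, identifying the highest packet sequence number received so far
Of these three fields, fraction lost can be derived from the other two. For the interval between two consecutive RRs, it is calculated as follows: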
```
fraction lost = (cumulative_loss_ - last_report_cumulative_loss_) / (received_seq_max_ - last_report_seq_max_)
```
In this formula:
- received_seq_max_: extended highest sequence number received
- last_report_seq_max_: extended highest sequence number received, as of the previous report
- cumulative_loss_: cumulative number of packets lost
- last_report_cumulative_loss_: cumulative number of packets lost, as of the previous report
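To make the formula concrete, here is a minimal sketch of the calculation between two reports. It is not WebRTC's code: `ReceiverSnapshot` and `ScaledFractionLost` are hypothetical names introduced for illustration, and the result is scaled to the 0..255 range of the 8-bit RR field.

```cpp
#include <cstdint>

// Hypothetical snapshot of the receiver's counters at the moment an RR is built.
struct ReceiverSnapshot {
  int64_t received_seq_max;  // extended highest sequence number received
  int64_t cumulative_loss;   // cumulative number of packets lost
};

// Fraction lost for the interval between two snapshots, scaled to 0..255
// (255 meaning 100% loss). Returns 0 when no packets were expected or when
// the interval shows no net loss.
uint8_t ScaledFractionLost(const ReceiverSnapshot& last,
                           const ReceiverSnapshot& now) {
  const int64_t expected = now.received_seq_max - last.received_seq_max;
  const int64_t lost = now.cumulative_loss - last.cumulative_loss;
  if (expected <= 0 || lost <= 0) return 0;
  const int64_t scaled = 255 * lost / expected;
  return static_cast<uint8_t>(scaled > 255 ? 255 : scaled);
}
```

For example, if 100 packets were expected in the interval and 5 were lost, the reported value is 255 * 5 / 100, which is 12 after integer division.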
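Next we look at how these values are computed in the receiver-side statistics code, which lives in StreamStatisticianImpl.
StreamStatisticianImpl code walkthrough
First, information is recorded from each incoming RTP packet: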
```
ReceiveStatisticsImpl::OnRtpPacket
                ↓
StreamStatisticianImpl::UpdateCounters
```
Then, when an RR is built, the loss statistics are computed from the recorded information:
```
RTCPSender::BuildRR
          ↓
RTCPSender::CreateReportBlocks
          ↓
ReceiveStatisticsImpl::RtcpReportBlocks
          ↓
StreamStatisticianImpl::MaybeAppendReportBlockAndReset
```
fraction lost
The calculation is done in StreamStatisticianImpl::MaybeAppendReportBlockAndReset:
```cpp
  // Calculate fraction lost.
  // Number of packets expected in this reporting interval.
  int64_t exp_since_last = received_seq_max_ - last_report_seq_max_;
  RTC_DCHECK_GE(exp_since_last, 0);

  // Packets lost in this interval, derived from the cumulative loss counter.
  int32_t lost_since_last = cumulative_loss_ - last_report_cumulative_loss_;
  if (exp_since_last > 0 && lost_since_last > 0) {
    // Scale 0 to 255, where 255 is 100% loss.
    stats.SetFractionLost(255 * lost_since_last / exp_since_last);
  }
```
cumulative number of packets lost
This field corresponds to cumulative_loss_ in the code.
```cpp
void StreamStatisticianImpl::UpdateCounters(const RtpPacketReceived& packet) {
  // Excerpt: now_ms is set in code omitted from this listing.
  // Every received packet first decrements the cumulative loss counter.
  --cumulative_loss_;

  int64_t sequence_number =
      seq_unwrapper_.UnwrapWithoutUpdate(packet.SequenceNumber());

  // Record information about the first packet received.
  if (!ReceivedRtpPacket()) {
    received_seq_first_ = sequence_number;
    last_report_seq_max_ = sequence_number - 1;
    received_seq_max_ = sequence_number - 1;
    receive_counters_.first_packet_time_ms = now_ms;
  // Out-of-order check:
  // For a retransmitted, reordered or duplicate packet this returns true and
  // we return immediately, leaving cumulative_loss_ decremented by 1. As a
  // result the final loss rate is not the raw (loss-only) loss rate: a packet
  // that was lost but later retransmitted, or that arrived late because of
  // reordering, does not count in the final cumulative_loss_ (it was counted
  // by the in-order check below and is subtracted again here).
  } else if (UpdateOutOfOrder(packet, sequence_number, now_ms)) {
    return;
  }
  // In order packet.
  // Loss check for in-order packets:
  // 1) With no loss, sequence_number - received_seq_max_ is 1, so
  //    cumulative_loss_ gets +1 and ends up unchanged.
  // 2) If N packets were lost in between, sequence_number - received_seq_max_
  //    is N + 1, so cumulative_loss_ gets +(N + 1); combined with the initial
  //    decrement, cumulative_loss_ increases by N.
  cumulative_loss_ += sequence_number - received_seq_max_;
  received_seq_max_ = sequence_number;
  seq_unwrapper_.UpdateLast(sequence_number);
}
```
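In StreamStatisticianImpl::UpdateOutOfOrder, a retransmitted, reordered or duplicate packet is recognised because its sequence_number is not greater than received_seq_max_. The simplified excerpt below shows that check: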
```cpp
bool StreamStatisticianImpl::UpdateOutOfOrder(const RtpPacketReceived& packet,
                                              int64_t sequence_number,
                                              int64_t now_ms) {
  if (sequence_number > received_seq_max_)
    return false;

  return true;
}
```
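As the code comments note, the resulting loss rate is not the raw loss rate but the loss rate after retransmitted packets have been counted, so the reported value is lower than the raw one. This can distort our judgement: if every lost packet is later retransmitted and received, the feedback may show a loss rate of 0. Because Sendside BWE uses this loss rate, the bandwidth estimate is affected as well. To obtain the raw loss rate, retransmitted packets must be kept out of these statistics. WebRTC provides the RTX mechanism for this: retransmissions are sent with a separate SSRC, so they do not interfere with the statistics of the original media packets.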
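The masking effect can be reproduced with a small stand-alone counter. This is a simplified sketch modelled on the bookkeeping described above, not WebRTC's class: `LossCounter` is a hypothetical name, and it assumes sequence numbers have already been unwrapped to 64 bits.

```cpp
#include <cstdint>

// Simplified stand-in for the receiver-side loss bookkeeping.
class LossCounter {
 public:
  // Feed one received packet, identified by its unwrapped sequence number.
  void OnPacket(int64_t seq) {
    --cumulative_loss_;  // every received packet subtracts one
    if (!started_) {
      started_ = true;
      received_seq_max_ = seq - 1;
    } else if (seq <= received_seq_max_) {
      return;  // late, duplicate or retransmitted: the decrement stays
    }
    // +1 for an in-order packet, +(N + 1) after a gap of N packets.
    cumulative_loss_ += seq - received_seq_max_;
    received_seq_max_ = seq;
  }

  int64_t cumulative_loss() const { return cumulative_loss_; }

 private:
  bool started_ = false;
  int64_t received_seq_max_ = 0;
  int64_t cumulative_loss_ = 0;
};
```

Feeding packets 1, 2, 4, 5 leaves the counter at 1, because packet 3 is missing. If packet 3 then arrives as a retransmission on the same SSRC, the counter drops back to 0 and the loss disappears from the report.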
extended highest sequence number received
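This is the sequence number of the newest in-order packet received. As mentioned earlier, the field is 32 bits wide, so the 16-bit RTP sequence number has to be extended.
StreamStatisticianImpl::UpdateCounters obtains a 64-bit sequence number: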
```cpp
int64_t sequence_number =
    seq_unwrapper_.UnwrapWithoutUpdate(packet.SequenceNumber());
received_seq_max_ = sequence_number;
```
StreamStatisticianImpl::MaybeAppendReportBlockAndReset then passes this sequence number to the ReportBlock:
```cpp
stats.SetExtHighestSeqNum(received_seq_max_);
```
The value is converted to 32 bits when it is passed to ReportBlock, because the setter takes a uint32_t:
```cpp
void ReportBlock::SetExtHighestSeqNum(uint32_t ext_highest_seq_num) {
  extended_high_seq_num_ = ext_highest_seq_num;
}
```
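For readers unfamiliar with sequence number extension, the sketch below shows one common way to unwrap a 16-bit RTP sequence number into a 64-bit value and then narrow it to the 32-bit RR field. `SeqUnwrapper` and `ToReportBlockField` are hypothetical names, and unlike the WebRTC unwrapper used above, this version updates its state on every call. It assumes each new value lies within half the 16-bit number space of the previous one.

```cpp
#include <cstdint>

// Hypothetical unwrapper: extends 16-bit RTP sequence numbers to 64 bits.
class SeqUnwrapper {
 public:
  int64_t Unwrap(uint16_t seq) {
    if (!has_last_) {
      has_last_ = true;
      last_ = seq;
      return last_;
    }
    // The signed 16-bit difference picks the shortest path around the wrap,
    // so 65535 -> 0 counts as +1 rather than -65535.
    const int16_t delta = static_cast<int16_t>(
        static_cast<uint16_t>(seq - static_cast<uint16_t>(last_)));
    last_ += delta;
    return last_;
  }

 private:
  bool has_last_ = false;
  int64_t last_ = 0;
};

// The RR field carries only the low 32 bits of the extended number.
inline uint32_t ToReportBlockField(int64_t extended_seq) {
  return static_cast<uint32_t>(extended_seq);
}
```

With this scheme, the sequence 65534, 65535, 0, 1 unwraps to 65534, 65535, 65536, 65537, so the difference between two extended values still gives the number of packets expected across a wrap.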
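Packet loss rate at the sender
In WebRTC the receiver collects reception statistics and feeds them back to the sender, which then computes a loss rate from them. At the sender, the loss rate is used mainly in Sendside BWE. The main code path is: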
```
RtpTransportControllerSend::OnReceivedRtcpReceiverReport
                     ↓
RtpTransportControllerSend::OnReceivedRtcpReceiverReportBlocks
                     ↓
GoogCcNetworkController::OnTransportLossReport
                     ↓
SendSideBandwidthEstimation::UpdatePacketsLost
                     ↓
SendSideBandwidthEstimation::UpdateEstimate
```
RtpTransportControllerSend::OnReceivedRtcpReceiverReportBlocks extracts the relevant values from the report blocks in the RTCP packet:
```cpp
void RtpTransportControllerSend::OnReceivedRtcpReceiverReportBlocks(
    const ReportBlockList& report_blocks,
    int64_t now_ms) {
  int total_packets_lost_delta = 0;
  int total_packets_delta = 0;

  // Compute the packet loss from all report blocks.
  for (const RTCPReportBlock& report_block : report_blocks) {
    auto it = last_report_blocks_.find(report_block.source_ssrc);
    if (it != last_report_blocks_.end()) {
      auto number_of_packets = report_block.extended_highest_sequence_number -
                               it->second.extended_highest_sequence_number;
      total_packets_delta += number_of_packets;
      auto lost_delta = report_block.packets_lost - it->second.packets_lost;
      total_packets_lost_delta += lost_delta;
    }
    last_report_blocks_[report_block.source_ssrc] = report_block;
  }

  int packets_received_delta = total_packets_delta - total_packets_lost_delta;
  TransportLossReport msg;
  // Loss statistics needed by the later stages.
  msg.packets_lost_delta = total_packets_lost_delta;
  msg.packets_received_delta = packets_received_delta;
}
```
SendSideBandwidthEstimation::UpdatePacketsLost computes the loss rate from the extracted values:
```cpp
void SendSideBandwidthEstimation::UpdatePacketsLost(int64_t packets_lost,
                                                    int64_t number_of_packets,
                                                    Timestamp at_time) {
  last_loss_feedback_ = at_time;
  if (first_report_time_.IsInfinite())
    first_report_time_ = at_time;

  // Check sequence number diff and weight loss report
  if (number_of_packets > 0) {
    int64_t expected =
        expected_packets_since_last_loss_update_ + number_of_packets;

    // Don't generate a loss rate until it can be based on enough packets.
    if (expected < kLimitNumPackets) {
      // Accumulate reports.
      expected_packets_since_last_loss_update_ = expected;
      lost_packets_since_last_loss_update_ += packets_lost;
      return;
    }

    has_decreased_since_last_fraction_loss_ = false;
    // The loss count is multiplied by 256 (<< 8) so that later calculations
    // can avoid floating point.
    int64_t lost_q8 = (lost_packets_since_last_loss_update_ + packets_lost)
                      << 8;
    last_fraction_loss_ = std::min<int>(lost_q8 / expected, 255);
  }
}
```
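The accumulate-then-compute pattern can be tried in isolation with the sketch below. It is an illustration under stated assumptions, not the WebRTC implementation: `SenderLossTracker` is a hypothetical name, the threshold of 20 packets stands in for kLimitNumPackets, and both the reset of the accumulators after each result and the clamp of a negative loss total to zero are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cstdint>

// Sketch of sender-side loss accumulation across several loss reports.
class SenderLossTracker {
 public:
  // Returns true when a new loss value has been produced.
  bool OnLossReport(int64_t packets_lost, int64_t number_of_packets) {
    if (number_of_packets <= 0) return false;
    expected_ += number_of_packets;
    lost_ += packets_lost;
    // Wait until the estimate can be based on enough packets.
    if (expected_ < kMinPackets) return false;
    // Q8 fixed point: multiply by 256 before dividing, then cap at 255.
    const int64_t lost_q8 = std::max<int64_t>(lost_, 0) * 256;
    fraction_loss_q8_ =
        static_cast<uint8_t>(std::min<int64_t>(lost_q8 / expected_, 255));
    expected_ = 0;  // assumption: start a fresh accumulation window
    lost_ = 0;
    return true;
  }

  uint8_t fraction_loss_q8() const { return fraction_loss_q8_; }

 private:
  static constexpr int64_t kMinPackets = 20;  // assumed threshold
  int64_t expected_ = 0;
  int64_t lost_ = 0;
  uint8_t fraction_loss_q8_ = 0;
};
```

Two reports of 10 packets with 1 loss each yield no value after the first report and 2 * 256 / 20, that is 25 (about 9.8%), after the second.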
Finally, SendSideBandwidthEstimation::UpdateEstimate adjusts the bitrate according to which tier the loss rate falls into.
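As a rough illustration of tiered adjustment, the sketch below follows the loss-based controller described in the Google Congestion Control draft: increase by 5% below about 2% loss, hold between about 2% and 10%, and reduce by half the loss fraction above about 10%. The function name `AdjustBitrateForLoss` is hypothetical, and the thresholds and factors are assumptions taken from that draft; the actual UpdateEstimate differs in its details.

```cpp
#include <cstdint>

// Tiered loss-based rate adjustment on a Q8 loss value (0..255).
// Integer arithmetic is used throughout, in keeping with the Q8 input.
int64_t AdjustBitrateForLoss(int64_t current_bps, uint8_t fraction_loss_q8) {
  if (fraction_loss_q8 <= 5) {
    // Roughly below 2% loss (5 / 256): probe upwards by 5%.
    return current_bps * 105 / 100;
  }
  if (fraction_loss_q8 <= 26) {
    // Roughly 2% to 10% loss (26 / 256): hold the current rate.
    return current_bps;
  }
  // Above roughly 10% loss: scale by (1 - 0.5 * loss), which in Q8 is
  // (512 - fraction_loss_q8) / 512.
  return current_bps * (512 - fraction_loss_q8) / 512;
}
```

At 1 Mbps, a loss value of 0 gives 1.05 Mbps, a value of 13 (about 5%) leaves the rate unchanged, and a value of 128 (50%) reduces it to 750 kbps.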
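Summary
This article covered the loss-related statistics in WebRTC, the problem introduced when lost packets are retransmitted without RTX, and, briefly, how the sender uses the loss statistics.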
References
[1] RFC 3550. https://tools.ietf.org/html/rfc3550