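After finishing an RTMP server, I recently completed an RTSP server as well. It can take national-standard (GB/T 28181) PS streams, as well as streams in other formats and protocols, and output them over RTSP. I tested with many players during development, VLC most of all, and ran into quite a few problems with it. This post records one of the stranger ones.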
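With RTP over UDP, playback was fine. With RTP over TCP, the VLC picture froze after a few dozen seconds, and VLC's debug messages showed nothing abnormal. ffplay, live555, and PotPlayer all played without problems. Trying other VLC versions made it even stranger: 3.0.0 and earlier, and 3.0.5 and later, all worked. So the affected versions had to be doing something special with RTP over TCP. Capturing the RTSP exchange showed that, at regular intervals, the affected versions sent not only an OPTIONS command but also a run of special bytes starting with '$', and the picture froze right after those were sent. Why would playback freeze? The only way to find out was to read the VLC source.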
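Reading the relevant source finally pinned down the cause: VLC's keep-alive mechanism. VLC uses live555 for RTSP, so the code in question is in modules/access/live555.cpp. I'll walk through the cause using that code.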
    static void TimeoutPrevention( void *p_data )
    {
        demux_t *p_demux = (demux_t *) p_data;
        demux_sys_t *p_sys = (demux_sys_t *)p_demux->p_sys;
        char *bye = NULL;

        if( var_GetBool( p_demux, "rtsp-tcp" ) )
            return;

        /* Protect Live555 from us calling their functions simultaneously
           with Demux() or Control() */
        vlc::threads::mutex_locker locker( p_sys->timeout_mutex );

        /* If the timer fires while the demuxer owns the lock, and the demuxer
         * then torns the session down, the pointers will become NULL. By the time
         * this timer callback obtains the callback, either a new session was
         * created and the timer is rescheduled, or the pointers are still NULL
         * and the timer is descheduled. In the second case, bail out (then wait
         * for the timer to be rescheduled or destroyed). In the first case, this
         * might send an early refresh - that´s harmless but suboptimal (FIXME). */
        if( p_sys->rtsp == NULL || p_sys->ms == NULL )
            return;

        bool use_get_param = p_sys->b_get_param;

        /* Use GET_PARAMETERS if supported. wmserver dialect supports
         * it, but does not report this properly. */
        if( var_GetBool( p_demux, "rtsp-wmserver" ) )
            use_get_param = true;

        if( use_get_param )
            p_sys->rtsp->sendGetParameterCommand( *p_sys->ms,
                                                  default_live555_callback, bye );
        else
            p_sys->rtsp->sendOptionsCommand( default_live555_callback, NULL );

        if( !wait_Live555_response( p_demux ) )
        {
            msg_Err( p_demux, "keep-alive failed: %s",
                     p_sys->env->getResultMsg() );
            /* Just continue, worst case is we get timed out later */
        }
    }
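The function above is VLC's RTSP timeout-prevention code. The affected VLC versions do not have these two lines:

    if( var_GetBool( p_demux, "rtsp-tcp" ) )
        return;

Let's treat these two lines as commented out and work through why the picture suddenly freezes.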
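1) At the start of the RTSP exchange, the VLC client sends an OPTIONS request, and the server must reply with the methods it supports. If the reply includes the (optional) GET_PARAMETER method, use_get_param is true and the keep-alive mechanism periodically calls sendGetParameterCommand; otherwise it calls sendOptionsCommand. My server does not support GET_PARAMETER, so it periodically receives OPTIONS requests from VLC. After sending the OPTIONS request, VLC calls wait_Live555_response(p_demux). Here is that function: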
    /* return true if the RTSP command succeeded */
    static bool wait_Live555_response( demux_t *p_demux, int i_timeout = 0 /* ms */ )
    {
        TaskToken task;
        demux_sys_t * p_sys = (demux_sys_t *)p_demux->p_sys;
        p_sys->event_rtsp = 0;
        if( i_timeout > 0 )
        {
            /* Create a task that will be called if we wait more than timeout ms */
            task = p_sys->scheduler->scheduleDelayedTask( i_timeout*1000,
                                                          TaskInterruptRTSP,
                                                          p_demux );
        }
        p_sys->event_rtsp = 0;
        p_sys->b_error = true;
        p_sys->i_live555_ret = 0;
        p_sys->scheduler->doEventLoop( &p_sys->event_rtsp );
        //here, if b_error is true and i_live555_ret = 0 we didn't receive a response
        if( i_timeout > 0 )
        {
            /* remove the task */
            p_sys->scheduler->unscheduleDelayedTask( task );
        }
        return !p_sys->b_error;
    }
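The i_timeout argument is left at its default value of 0, so no timeout task is scheduled and the call waits indefinitely for the server to answer the request.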
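To make step 1 concrete, here is a minimal sketch of the server side of that first OPTIONS exchange. It is not taken from my server, VLC, or live555: `buildOptionsResponse` and `advertisesGetParameter` are hypothetical helper names, and the assumption is that the client decides between the two keep-alive commands by looking for GET_PARAMETER in the reply's `Public` header.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build the reply to an RTSP OPTIONS request.
// The request's CSeq is echoed so the client can match reply to request.
std::string buildOptionsResponse(int cseq, const std::vector<std::string>& methods)
{
    std::string publicHeader;
    for (const std::string& m : methods) {
        if (!publicHeader.empty())
            publicHeader += ", ";
        publicHeader += m;
    }
    return "RTSP/1.0 200 OK\r\n"
           "CSeq: " + std::to_string(cseq) + "\r\n"
           "Public: " + publicHeader + "\r\n"
           "\r\n";
}

// Hypothetical client-side check (an assumption about how a client might
// decide): does the reply advertise GET_PARAMETER?
bool advertisesGetParameter(const std::string& response)
{
    return response.find("GET_PARAMETER") != std::string::npos;
}
```

A server that leaves GET_PARAMETER out of the list, as mine does, should then expect periodic OPTIONS requests as keep-alives and must answer each of them.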
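2) My server has a command parser class that only handles the standard commands (OPTIONS, DESCRIBE, PLAY, and so on). VLC periodically sends data starting with '$', and that data reached my parser mixed in with the OPTIONS request. The parser could not parse the result, so the server never replied to the keep-alive OPTIONS request. Looking at TimeoutPrevention again, on entry the function takes a lock: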
    vlc::threads::mutex_locker locker( p_sys->timeout_mutex );
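Because my server never replied to the OPTIONS request, TimeoutPrevention never returns from wait_Live555_response and keeps holding this lock. Let's see where else the lock is used: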
    /*****************************************************************************
     * Demux:
     *****************************************************************************/
    static int Demux( demux_t *p_demux )
    {
        demux_sys_t *p_sys = (demux_sys_t *)p_demux->p_sys;
        TaskToken task;

        bool b_send_pcr = true;
        int i;

        /* Protect Live555 from simultaneous calls in TimeoutPrevention()
           during pause */
        vlc::threads::mutex_locker locker( p_sys->timeout_mutex );

        for( i = 0; i < p_sys->i_track; i++ )
So while TimeoutPrevention is stuck waiting with the lock held, Demux cannot acquire the lock and stops running, which is why the picture freezes.
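The interaction between the two functions can be reproduced in isolation. The following is a minimal sketch using standard C++ threads, not VLC code: `Session`, `keepAlive`, `demuxOnce`, and `deliverResponse` are hypothetical stand-ins for `p_sys`, `TimeoutPrevention`, `Demux`, and the server's OPTIONS reply. It assumes the only thing that matters is that the keep-alive path waits for a reply, with no timeout, while holding the lock the demux path needs.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the shared demuxer state.
struct Session {
    std::timed_mutex timeoutMutex;           // plays the role of timeout_mutex
    std::mutex responseMutex;
    std::condition_variable responseCv;
    bool responseArrived = false;            // set when the server answers
    std::atomic<bool> keepAliveHoldsLock{false};
};

// Like TimeoutPrevention(): take the lock, then wait for the reply
// with no timeout while still holding it.
void keepAlive(Session& s)
{
    std::lock_guard<std::timed_mutex> lock(s.timeoutMutex);
    s.keepAliveHoldsLock = true;
    std::unique_lock<std::mutex> wait(s.responseMutex);
    s.responseCv.wait(wait, [&] { return s.responseArrived; });
}

// Like Demux(): needs the same lock before it can deliver any data.
// Returns false if the lock could not be taken within `patience`.
bool demuxOnce(Session& s, std::chrono::milliseconds patience)
{
    std::unique_lock<std::timed_mutex> lock(s.timeoutMutex, patience);
    return lock.owns_lock();
}

// The server finally answers the keep-alive request.
void deliverResponse(Session& s)
{
    {
        std::lock_guard<std::mutex> guard(s.responseMutex);
        s.responseArrived = true;
    }
    s.responseCv.notify_all();
}
```

If one thread runs `keepAlive` and the reply never comes, every call to `demuxOnce` times out, which corresponds to the frozen picture. Once `deliverResponse` is called, `keepAlive` returns, the lock is released, and `demuxOnce` succeeds again.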
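Newer VLC versions add

    if( var_GetBool( p_demux, "rtsp-tcp" ) )
        return;

which disables the keep-alive mechanism for RTP over TCP, so 3.0.5 and later do not show the problem. I later added handling for '$'-prefixed data to my RTSP server, and after retesting everything worked.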
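So what is the '$'-prefixed data for? I had only used it in my server when sending RTCP data, and did not expect the client to have a similar mechanism. RFC 2326 calls data starting with '$' (0x24) Embedded (Interleaved) Binary Data. Of all the players I tested, only VLC sent it from the client side. It is used only with RTP over TCP. What does it do? RFC 2326 section 10.12 describes it as follows: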
    10.12 Embedded (Interleaved) Binary Data

       Certain firewall designs and other circumstances may force a server
       to interleave RTSP methods and stream data. This interleaving should
       generally be avoided unless necessary since it complicates client and
       server operation and imposes additional overhead. Interleaved binary
       data SHOULD only be used if RTSP is carried over TCP.

       Stream data such as RTP packets is encapsulated by an ASCII dollar
       sign (24 hexadecimal), followed by a one-byte channel identifier,
       followed by the length of the encapsulated binary data as a binary,
       two-byte integer in network byte order. The stream data follows
       immediately afterwards, without a CRLF, but including the upper-layer
       protocol headers. Each $ block contains exactly one upper-layer
       protocol data unit, e.g., one RTP packet.

       The channel identifier is defined in the Transport header with the
       interleaved parameter(Section 12.39).

       When the transport choice is RTP, RTCP messages are also interleaved
       by the server over the TCP connection. As a default, RTCP packets are
       sent on the first available channel higher than the RTP channel. The
       client MAY explicitly request RTCP packets on another channel. This
       is done by specifying two channels in the interleaved parameter of
       the Transport header(Section 12.39).

       RTCP is needed for synchronization when two or more streams are
       interleaved in such a fashion. Also, this provides a convenient way
       to tunnel RTP/RTCP packets through the TCP control connection when
       required by the network configuration and transfer them onto UDP
       when possible.

         C->S: SETUP rtsp://foo.com/bar.file RTSP/1.0
               CSeq: 2
               Transport: RTP/AVP/TCP;interleaved=0-1

         S->C: RTSP/1.0 200 OK
               CSeq: 2
               Date: 05 Jun 1997 18:57:18 GMT
               Transport: RTP/AVP/TCP;interleaved=0-1
               Session: 12345678

         C->S: PLAY rtsp://foo.com/bar.file RTSP/1.0
               CSeq: 3
               Session: 12345678

         S->C: RTSP/1.0 200 OK
               CSeq: 3
               Session: 12345678
               Date: 05 Jun 1997 18:59:15 GMT
               RTP-Info: url=rtsp://foo.com/bar.file;
                 seq=232433;rtptime=972948234

         S->C: $\000{2 byte length}{"length" bytes data, w/RTP header}
         S->C: $\000{2 byte length}{"length" bytes data, w/RTP header}
         S->C: $\001{2 byte length}{"length" bytes RTCP packet}
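The framing described in the RFC text is small enough to show directly. Below is a minimal sketch of wrapping one RTP or RTCP packet in the 4-byte interleaved header; `frameInterleaved` is a hypothetical name of mine, not a live555 or VLC function, and the channel number is whatever was negotiated in the `interleaved` parameter of the Transport header.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Wrap one RTP or RTCP packet in the interleaved header of RFC 2326 10.12:
// '$' (0x24), a one-byte channel identifier, and the payload length as a
// two-byte integer in network byte order, followed by the packet itself.
std::vector<std::uint8_t> frameInterleaved(std::uint8_t channel,
                                           const std::vector<std::uint8_t>& packet)
{
    if (packet.size() > 0xFFFF)
        throw std::length_error("packet does not fit a 16-bit length field");

    std::vector<std::uint8_t> out;
    out.reserve(4 + packet.size());
    out.push_back(0x24);                                            // '$'
    out.push_back(channel);
    out.push_back(static_cast<std::uint8_t>(packet.size() >> 8));   // high byte
    out.push_back(static_cast<std::uint8_t>(packet.size() & 0xFF)); // low byte
    out.insert(out.end(), packet.begin(), packet.end());
    return out;
}
```

With `interleaved=0-1` as in the RFC example, RTP packets would be framed with channel 0 and RTCP packets with channel 1.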
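In RTP over TCP mode, a single socket carries both the control commands and the stream, unlike RTP over UDP, which opens separate UDP sockets for the data. Firewalls and other external constraints can therefore force RTSP methods and RTP stream data to be interleaved on one connection, and this framing exists so that the receiver can tell them apart. Each stream packet is framed as:

    '$' + channel identifier (1 byte; 0 or 1 here) + length (2 bytes, network byte order) + data

For more detail, see:
RTP over RTSP包混合发送的解决办法 (handling mixed RTP over RTSP packets): https://blog.csdn.net/myslq/article/details/79819179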
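VLC only starts sending Embedded (Interleaved) Binary Data after the server has answered PLAY and begun streaming. I had not noticed that, which is why my parser failed. Apart from VLC, none of the other players I tested sent Embedded (Interleaved) Binary Data. The server is the side pushing the stream, and it starts doing so as soon as the command exchange is done, so I don't think this mechanism is of much use on the client side.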
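For reference, the kind of fix I mean on the server side can be sketched as a small splitter that runs before the command parser. This is a simplified illustration, not my server's actual code: `InterleavedFrame`, `Demuxed`, and `splitRtspStream` are hypothetical names, and it assumes incoming RTSP requests carry no message body (true for the OPTIONS keep-alive, not necessarily for other methods).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct InterleavedFrame {
    std::uint8_t channel;
    std::vector<std::uint8_t> payload;
};

struct Demuxed {
    std::vector<std::string> requests;     // complete RTSP header blocks
    std::vector<InterleavedFrame> frames;  // '$'-framed RTP/RTCP packets
};

// Consume as many complete units as possible from `buf`. Incomplete data is
// left in `buf` until more bytes arrive from the socket.
// Assumption: RTSP requests have no body (Content-Length is not handled).
Demuxed splitRtspStream(std::vector<std::uint8_t>& buf)
{
    Demuxed out;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        if (buf[pos] == 0x24) {                        // '$': interleaved data
            if (buf.size() - pos < 4) break;           // header incomplete
            std::size_t len = (std::size_t(buf[pos + 2]) << 8) | buf[pos + 3];
            if (buf.size() - pos < 4 + len) break;     // payload incomplete
            InterleavedFrame f;
            f.channel = buf[pos + 1];
            f.payload.assign(buf.begin() + pos + 4, buf.begin() + pos + 4 + len);
            out.frames.push_back(f);
            pos += 4 + len;
        } else {                                       // text: RTSP request
            static const char kEnd[] = "\r\n\r\n";
            auto it = std::search(buf.begin() + pos, buf.end(), kEnd, kEnd + 4);
            if (it == buf.end()) break;                // headers incomplete
            std::size_t end = static_cast<std::size_t>(it - buf.begin()) + 4;
            out.requests.emplace_back(buf.begin() + pos, buf.begin() + end);
            pos = end;
        }
    }
    buf.erase(buf.begin(), buf.begin() + pos);
    return out;
}
```

Bytes read from the TCP socket are appended to `buf`, the function is called, and only the entries in `requests` are handed to the command parser. The entries in `frames` can be processed as RTCP or simply dropped, so the OPTIONS keep-alive always reaches the parser intact and gets its reply.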