boost::asio::io_service run crash (resolved)

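Problem: the program runs fine on x86 but crashes at random on ARM.

Cause: incompatible compiler version.

Fix: upgrade the ARM cross-compilation toolchain from 4.9 to 6.3. See: Setting up an ARM cross-compilation environment on Ubuntu (Ubuntu arm 交叉编译环境搭建)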
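Since the fix depends on which toolchain actually built the binary, it can help to have the program report its compiler version. The following is a minimal sketch, not part of the original project: BuildCompilerVersion and VersionAtLeast are hypothetical helper names, and the check assumes the 4.9 and 6.3 figures above refer to GCC versions.

```cpp
#include <string>

// Returns the GCC-style version of the compiler that built this translation unit.
// Note: Clang also defines __GNUC__, so there this reports its GCC-compatibility version.
std::string BuildCompilerVersion() {
#if defined(__GNUC__)
    return std::to_string(__GNUC__) + "." + std::to_string(__GNUC_MINOR__) + "." +
           std::to_string(__GNUC_PATCHLEVEL__);
#else
    return "unknown";
#endif
}

// True when version (major, minor) is at least (want_major, want_minor).
constexpr bool VersionAtLeast(int major, int minor, int want_major, int want_minor) {
    return major > want_major || (major == want_major && minor >= want_minor);
}
```

Logging BuildCompilerVersion() at startup on the target, or evaluating VersionAtLeast(__GNUC__, __GNUC_MINOR__, 6, 3) in the build, is one way to confirm that the upgraded toolchain is really the one in use.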

 

Below are notes from the debugging process and the approaches tried along the way.

Code:

virtual void Init() {
    threads_.create_thread(boost::bind(&CommChannel::HandleData, shared_from_this()));
    threads_.create_thread(boost::bind(&CommChannel::WorkThread, shared_from_this())); // io_service run
}
// Called from other threads
void TcpClient::Send_(uint8_t *buf, uint32_t len) {
    // send data by using tcp/ip protocol
    boost::asio::async_write(socket_, boost::asio::buffer(buf, len),
        strand_.wrap(boost::bind(&TcpClient::OnSend, GetSharedPtr(), boost::asio::placeholders::error,
                boost::asio::placeholders::bytes_transferred)));
    return;
}

Possible causes: out-of-bounds memory access, or a thread synchronization problem

Stack trace

*** Error in `/home/root/robot': corrupted size vs. prev_size: 0x0030c450 ***

Thread 14 "MotorCli" received signal SIGABRT, Aborted.
[Switching to Thread 0xb08f6450 (LWP 619)]
__libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
47 ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt
#0 __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
#1 0xb6cfd3cc in __libc_signal_restore_set (set=0xb08f4fa8) at ../sysdeps/unix/sysv/linux/nptl-signals.h:79
#2 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:48
#3 0xb6cfe0ba in __GI_abort () at abort.c:89
#4 0xb6d24cda in __libc_message (do_abort=do_abort@entry=2, fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:175
#5 0xb6d291ca in malloc_printerr (action=<optimized out>, str=0xb6da7330 "corrupted size vs. prev_size", ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5049
#6 0xb6d29f26 in _int_free (av=0xb6dc37a4 <main_arena>, p=0x30c3f0, have_lock=0) at malloc.c:4052
#7 0x000154e2 in boost::asio::detail::thread_info_base::deallocate (this_thread=0xb08f5534, pointer=0x30c3f8, size=88) at /usr/local/boost-arm/include/boost/asio/detail/thread_info_base.hpp:80
#8 0x00015526 in boost::asio::asio_handler_deallocate (pointer=0x30c3f8, size=88) at /usr/local/boost-arm/include/boost/asio/impl/handler_alloc_hook.ipp:67
#9 0x0010583a in boost_asio_handler_alloc_helpers::deallocate<boost::_bi::bind_t<void, boost::_mfi::mf2<void, rockbot::network::TcpClient, boost::system::error_code const&, unsigned int>, boost::_bi::list3<boost::_bi::value<std::shared_ptr<rockbot::network::TcpClient> >, boost::arg<1> (*)(), boost::arg<2> (*)()> > > (p=0x30c3f8, s=88, h=...) at /usr/local/boost-arm/include/boost/asio/detail/handler_alloc_helpers.hpp:48
#10 0x00105292 in boost::asio::detail::asio_handler_deallocate<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, rockbot::network::TcpClient, boost::system::error_code const&, unsigned int>, boost::_bi::list3<boost::_bi::value<std::shared_ptr<rockbot::network::TcpClient> >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running> (pointer=0x30c3f8, size=88, this_handler=0xb08f5470)
at /usr/local/boost-arm/include/boost/asio/detail/wrapped_handler.hpp:216
#11 0x00104cd6 in boost_asio_handler_alloc_helpers::deallocate<boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, rockbot::network::TcpClient, boost::system::error_code const&, unsigned int>, boost::_bi::list3<boost::_bi::value<std::shared_ptr<rockbot::network::TcpClient> >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running> > (p=0x30c3f8, s=88, h=...)
at /usr/local/boost-arm/include/boost/asio/detail/handler_alloc_helpers.hpp:48
#12 0x00105d42 in boost::asio::detail::asio_handler_deallocate<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, boost::asio::mutable_buffers_1, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, rockbot::network::TcpClient, boost::system::error_code const&, unsigned int>, boost::_bi::list3<boost::_bi::value<std::shared_ptr<rockbot::network::TcpClient> >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running> > (pointer=0x30c3f8, size=88, this_handler=0xb08f545c)
at /usr/local/boost-arm/include/boost/asio/impl/write.hpp:543
#13 0x00105a52 in boost_asio_handler_alloc_helpers::deallocate<boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, boost::asio::mutable_buffers_1, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, boost::_bi::bind_t<void, boost::_mfi::mf2<void, rockbot::network::TcpClient, boost::system::error_code const&, unsigned int>, boost::_bi::list3<boost::_bi::value<std::shared_ptr<rockbot::network::TcpClient> >, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::asio::detail::is_continuation_if_running> > > (p=0x30c3f8, s=88, h=...)
at /usr/local/boost-arm/include/boost/asio/detail/handler_alloc_helpers.hpp:48

...

Reference: https://stackoverflow.com/questions/23902046/crash-when-calling-run-on-boostasioio-service
Answer:

At quick glance, there are two potential problems:
The CallbackTimer class fails to guarantee the pre-condition that io_service::reset() is not invoked when there are unfinished calls to the run(), run_one(), poll() or poll_one(), resulting in unspecified behavior.
CallbackTimer::runCallback() may be invoked after the lifetime of the CallbackTimer instance has ended, invoking undefined behavior.
Both problems result from no synchronization with the thread running the io_service. The thread is detached after its creation in CallbackTimer::start(), and the if-statement within CallbackTimer::cancel() will always be false, causing no synchronization to occur. This allows for the following cases:
io_service::reset() is invoked while thread_ is within io_service::run().
The CallbackTimer that owns the io_service is deallocated while thread_ is within io_service::run().
To resolve this, consider no longer detaching from thread_ so that synchronization may occur.
For further debugging, a stack trace or an mcve could help in identifying the exact source of the crash.
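The advice in the answer (join the thread instead of detaching it) can be illustrated without Boost. This is only a sketch under assumptions: JoinedTimer is a hypothetical stand-in for the CallbackTimer in the question, and a plain loop with sleep_for replaces the io_service and its timer.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical stand-in for CallbackTimer: the owner keeps the thread joinable
// and joins it in the destructor, so the callback cannot run after destruction.
class JoinedTimer {
public:
    JoinedTimer(std::chrono::milliseconds period, std::function<void()> callback)
        : period_(period), callback_(std::move(callback)), thread_([this] { Loop(); }) {}

    ~JoinedTimer() {
        stop_.store(true);
        thread_.join();  // wait for Loop() to return before members are destroyed
    }

    JoinedTimer(const JoinedTimer&) = delete;
    JoinedTimer& operator=(const JoinedTimer&) = delete;

private:
    void Loop() {
        while (!stop_.load()) {
            callback_();
            std::this_thread::sleep_for(period_);
        }
    }

    std::chrono::milliseconds period_;
    std::function<void()> callback_;
    std::atomic<bool> stop_{false};
    std::thread thread_;  // declared last: it starts only after the members above exist
};
```

With a detached thread there is nothing to join, so the destructor could return while Loop() is still using period_ and callback_; that is the lifetime problem the answer describes.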

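Possible cause: the io_service thread and the send function run on different threads, so they are not synchronized?
Possible fix: use io_service::post to hand the send over to the io_service thread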

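void TcpClient::Send(uint8_t* buf, uint32_t len)
{
    if(!connected_)
    {
        SYS_WARN("[%s:%d] The connection is not established, msg not send!", ip_address_.c_str(), port_);
        return;
    }
    // Hand the write over to the io_service thread instead of starting it here.
    // Note: buf and this object must stay valid until Send_ has run and the
    // asynchronous write has completed.
    GetIoService()->post(boost::bind(&TcpClient::Send_, this, buf, len));
}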
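The idea behind this attempt, running every send on one thread while keeping the data alive until it is used, can be sketched with the standard library alone. Everything here is an assumption for illustration: SerialQueue and Post are a hypothetical stand-in for io_service and io_service::post, SendCopy is a hypothetical helper, and a std::vector named wire plays the role of the socket. It is not the Boost.Asio API and not the code of the project above.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for io_service: a single worker thread drains the queue,
// so every posted task runs on that one thread.
class SerialQueue {
public:
    SerialQueue() : worker_([this] { Run(); }) {}

    ~SerialQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();  // remaining tasks are drained before the queue goes away
    }

    // Safe to call from any thread.
    void Post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void Run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();  // run outside the lock so Post() is never blocked by a task
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last: it starts only after the members above exist
};

// Hypothetical send helper: copies the payload before queuing, so the task owns
// its data and the caller may free or reuse buf as soon as this returns.
// `wire` stands in for the socket and is only touched on the worker thread.
void SendCopy(SerialQueue& queue, const std::uint8_t* buf, std::uint32_t len,
              std::vector<std::uint8_t>& wire) {
    assert(buf != nullptr || len == 0);
    auto data = std::make_shared<std::vector<std::uint8_t>>(buf, buf + len);
    queue.Post([data, &wire] { wire.insert(wire.end(), data->begin(), data->end()); });
}
```

The copy into a shared_ptr is the main design choice: the queued task shares ownership of the bytes, so nothing depends on the caller keeping buf alive. In the real code the equivalent would be binding a shared pointer to both the client object and the data into the posted handler, rather than raw pointers.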