Intermittent segmentation fault possibly by custom WorldPlugin attaching and detaching child

My goal is to teleport a robot to an arbitrary pose in a 0-gravity world and keep the robot fixed at that pose after teleportation. I need to repeat the teleport many times.

I’ve tried many ways to do this. The approach below is the only one that accomplishes the task, but it produces intermittent segmentation faults after the simulation has been running for a while.

I do the teleport by attaching and detaching a joint between the world link and the base link of the robot. Attaching the robot to the world keeps the robot fixed. Detaching the joint allows the robot to be moved freely.

The teleport is done by a WorldPlugin. It handles a rosservice (among others) that requests a detach or an attach. It then calls physics::Joint::Attach(world_link, base_link) or physics::Joint::Detach(). The two links are found by physics::WorldPtr->GetByName() and dynamically cast to physics::LinkPtr.
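To make the lookup-and-cast step concrete, here is a minimal sketch that uses stand-in types rather than the real Gazebo classes. Base, Link, Model, World, FindLink and CanAttach are all hypothetical names for illustration, and std::shared_ptr stands in for whatever smart pointer the real API uses. The point it shows is that a dynamic cast from a by-name lookup yields a null pointer when the name is missing or refers to something that is not a link, so the result probably should be checked before attaching.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for the entity hierarchy; NOT the real Gazebo classes.
struct Base { virtual ~Base() = default; };
struct Link : Base {};
struct Model : Base {};

using BasePtr = std::shared_ptr<Base>;
using LinkPtr = std::shared_ptr<Link>;

// Stand-in for a world whose by-name lookup returns a base-class pointer.
struct World {
  std::map<std::string, BasePtr> entities;

  BasePtr GetByName(const std::string &name) const {
    auto it = entities.find(name);
    if (it == entities.end()) {
      return BasePtr();  // unknown name: null pointer
    }
    return it->second;
  }
};

// Returns a null LinkPtr if the name is unknown or names a non-Link entity.
LinkPtr FindLink(const World &world, const std::string &name) {
  return std::dynamic_pointer_cast<Link>(world.GetByName(name));
}

// True only when both lookups produced valid links, i.e. attaching is safe.
bool CanAttach(const World &world, const std::string &parent,
               const std::string &child) {
  LinkPtr parentLink = FindLink(world, parent);
  LinkPtr childLink = FindLink(world, child);
  return parentLink != nullptr && childLink != nullptr;
}
```

With this shape, a service handler would return a failure instead of attaching whenever CanAttach is false.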

The intermittent seg fault ALWAYS happens after a Detach() has been done and the rosservice has returned successfully. But the seg fault doesn’t happen within my rosservice handler function inside the WorldPlugin. It happens somewhere else in the simulation loop, which leaves me very puzzled, with nowhere to start debugging!

I tried many things to fix this, and I’m still stuck with one big question: what causes the seg fault? Is it really from calling Attach() and Detach() too many times? It happens after the simulation has been running for 5 to 10 minutes or more, during which I’ve attached and detached many times.

Here is a backtrace from gdb:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x290b of process 11318]
0x0000000100600612 in dQMultiply3 ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_ode.2.dylib

(gdb) bt
#0  0x0000000100600612 in dQMultiply3 ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_ode.2.dylib
#1  0x0000000100617541 in getHingeAngle(dxBody*, dxBody*, double*, double*) ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_ode.2.dylib
#2  0x000000010061569f in dxJointHinge::getInfo1(dxJoint::Info1*) ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_ode.2.dylib
#3  0x00000001005f392a in dxQuickStepper(dxWorldProcessContext*, dxWorld*, dxBody* const*, int, dxJoint* const*, int, double) ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_ode.2.dylib
#4  0x000000010060ea36 in dxProcessIslands(dxWorld*, double, void (*)(dxWorldProcessContext*, dxWorld*, dxBody* const*, int, dxJoint* const*, int, double)) ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_ode.2.dylib
#5  0x00000001005e8229 in dWorldQuickStep ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_ode.2.dylib
#6  0x00000001005202c1 in gazebo::physics::ODEPhysics::UpdatePhysics() ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_physics_ode.2.dylib
#7  0x000000010038483d in gazebo::physics::World::Update() ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_physics.2.dylib
#8  0x0000000100383864 in gazebo::physics::World::Step() ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_physics.2.dylib
#9  0x0000000100382d66 in gazebo::physics::World::RunLoop() ()
   from /usr/local/Cellar/gazebo2/2.2.6/lib/libgazebo_physics.2.dylib
#10 0x00000001019ec6c5 in boost::(anonymous namespace)::thread_proxy(void*) ()
   from /usr/local/lib/libboost_thread-mt.dylib
#11 0x00007fff98cab268 in _pthread_body ()
   from /usr/lib/system/libsystem_pthread.dylib
#12 0x00007fff98cab1e5 in _pthread_start ()
   from /usr/lib/system/libsystem_pthread.dylib
#13 0x00007fff98ca941d in thread_start ()
   from /usr/lib/system/libsystem_pthread.dylib
#14 0x0000000000000000 in ?? ()

gdb shows there are 42 threads, and I looked at each one’s backtrace. The thread running my WorldPlugin is NOT the one with the problematic backtrace. I suspect that my repeated attaching and detaching changes some state that only gets picked up later, when the simulation loop does an update, and that this is why the fault shows up in another thread. Is that plausible?
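If the suspicion above is right, one thing I could try is to stop touching the joint from the service thread at all, and only record the request there so that the simulation thread applies it at the start of its own update. Below is a minimal sketch of that idea using only the standard library. JointRequest and RequestQueue are hypothetical names, not Gazebo API, and I have not confirmed that this is the actual cause of the crash.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <vector>

// Hypothetical request type; in the plugin these would map to
// Attach() and Detach() calls made later on the simulation thread.
enum class JointRequest { Attach, Detach };

// Queue shared between the service-handler thread and the simulation thread.
class RequestQueue {
 public:
  // Called from the service handler: only records the request.
  void Push(JointRequest request) {
    std::lock_guard<std::mutex> lock(mutex_);
    pending_.push(request);
  }

  // Called from the simulation thread once per update, before physics steps.
  // Returns the recorded requests in arrival order and empties the queue.
  std::vector<JointRequest> Drain() {
    std::lock_guard<std::mutex> lock(mutex_);
    std::vector<JointRequest> out;
    while (!pending_.empty()) {
      out.push_back(pending_.front());
      pending_.pop();
    }
    return out;
  }

 private:
  std::mutex mutex_;
  std::queue<JointRequest> pending_;
};
```

The simulation-side callback would call Drain() and apply each request itself, so the joint is never modified while the physics step is reading it.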

Any help would be appreciated! Or is there an alternative, proper way to teleport that I haven't tried?

Thank you!