Message drops between plugins
In PX4 Gazebo we send messages between plugins. While trying to run the simulation faster than real time, we discovered that higher-rate messages (e.g. IMU updates at 2500 Hz) are quite often dropped between one plugin and the next. The drops seem to depend on CPU load and also increase when gzclient is running.
The publisher in one plugin is set up like this:
transport::PublisherPtr imu_pub_;
...
imu_pub_ = node_handle_->Advertise<sensor_msgs::msgs::Imu>("~/" + model_->GetName() + imu_topic_, 1);
And the message is published with:
imu_pub_->Publish(imu_message_);
The subscription in the other plugin is just:
transport::SubscriberPtr imu_sub_;
imu_sub_ = node_handle_->Subscribe("~/" + model_->GetName() + imu_sub_topic_, &GazeboMavlinkInterface::ImuCallback, this);
Using a sequence number in the message, I check for missed updates in the callback:
void GazeboMavlinkInterface::ImuCallback(ImuPtr& imu_message) {
...
What I found is that most of the time the callback is triggered correctly. However, sometimes a single message is skipped, and occasionally 10-20 messages are skipped in a row. This is with a publish rate of 2500 Hz.
I also tried a queue size larger than 1, but this did not fix the issue and drops still occurred. Looking at Wireshark, I don't see the communication between the plugins as loopback TCP traffic, so I'm assuming this is "local" communication?
Is this expected? Is there any setting or workaround I should try? I understand that this is a high rate, but that is not the point, because drops can also happen at a lower rate when CPU usage is high. Ultimately, we would like to run the simulation in lockstep with the controller, so we really should never miss a sample and should be able to track one timestamp all the way through the system.
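For reference, the kind of sequence check I mean can be sketched in plain C++ as below. This is only an illustration: `SequenceGapTracker` is a hypothetical helper, not part of Gazebo or PX4, and it assumes the publisher increments the sequence number by exactly one per message.

```cpp
#include <cstdint>

// Hypothetical helper (not a Gazebo or PX4 API): remembers the last
// sequence number seen and counts how many messages were skipped.
// Assumes the publisher increments the sequence by one per message.
class SequenceGapTracker {
 public:
  // Returns how many messages were missed immediately before `seq`.
  std::uint64_t Update(std::uint64_t seq) {
    std::uint64_t missed = 0;
    if (have_last_ && seq > last_seq_ + 1) {
      missed = seq - last_seq_ - 1;
    }
    total_missed_ += missed;
    last_seq_ = seq;
    have_last_ = true;
    return missed;
  }

  // Total number of messages missed since the first one was received.
  std::uint64_t total_missed() const { return total_missed_; }

 private:
  std::uint64_t last_seq_ = 0;
  std::uint64_t total_missed_ = 0;
  bool have_last_ = false;
};
```

In the subscriber callback, `Update()` would be called with the sequence number read from the incoming message, and a non-zero return value indicates a drop.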
I was able to work around it by adding a sequence number to the message and then sleeping in `OnUpdate` until the sequence number has changed, so we can be sure that we have received the latest message.
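A minimal sketch of that workaround in plain C++ follows. The names (`SequenceWaiter`, `OnMessage`, `WaitForNext`) are hypothetical and not taken from the actual plugin. It assumes the subscriber callback runs on a different thread than `OnUpdate`, that sequence numbers start at 1, and it adds a timeout (my addition here) so the wait cannot block forever.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper: the subscriber callback stores the newest sequence
// number, and the update step sleeps until it differs from the last one
// it consumed. Assumes sequence numbers start at 1 (0 means "none yet").
class SequenceWaiter {
 public:
  // Intended to be called from the subscriber callback.
  void OnMessage(std::uint64_t seq) {
    latest_seq_.store(seq, std::memory_order_release);
  }

  // Intended to be called from the update step. Sleeps in short intervals
  // until a new sequence number shows up or the timeout expires.
  // Returns true if a new sequence number was seen.
  bool WaitForNext(std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (latest_seq_.load(std::memory_order_acquire) == consumed_seq_) {
      if (std::chrono::steady_clock::now() >= deadline) {
        return false;  // no new message arrived in time
      }
      std::this_thread::sleep_for(std::chrono::microseconds(50));
    }
    consumed_seq_ = latest_seq_.load(std::memory_order_acquire);
    return true;
  }

  // Sequence number most recently accepted by WaitForNext().
  std::uint64_t consumed_seq() const { return consumed_seq_; }

 private:
  std::atomic<std::uint64_t> latest_seq_{0};
  std::uint64_t consumed_seq_ = 0;  // only touched by the update step
};
```

This only guarantees that the update step sees a newer message than last time. If more than one message arrives while it sleeps, the intermediate ones are still skipped, so it works around the symptom rather than fixing the drops themselves.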