4

Why does performance decrease when removing / inserting models?

asked 2015-11-08 07:15:30 -0500

Elte Hupkes

I've a world plugin that does the following:

  • Insert a model (in this contrived example it's a simple sphere)
  • Run for 5 seconds simulation time
  • Measure the elapsed real time, calculate and print the real time factor, and remove the model
  • Repeat

This is a rather contrived simplification of a legitimate use case. I'm doing this with the server only, i.e. no GUI, in Gazebo 6.1. Even though the performed simulation is exactly the same each time, the real time factor decreases rapidly. The decrease is actually superlinear (quadratic?), as evidenced by this plot of the real time factors:

[Figure: plot of the real time factor for successive runs]

It would therefore seem that something related to the old models lingers and slows everything down. I'd like to investigate and hopefully solve this issue, but I have no idea what could be causing this. My question, therefore, is whether anyone has an idea where this might come from?
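The real time factor measured in each cycle of the plugin described above is the simulated time divided by the elapsed wall-clock time. A minimal sketch in Python, assuming made-up timings rather than the measurements from this test (`real_time_factor` and `wall_times` are hypothetical names, not part of Gazebo):

```python
def real_time_factor(sim_seconds: float, wall_seconds: float) -> float:
    """Return simulated seconds advanced per wall-clock second."""
    if wall_seconds <= 0.0:
        raise ValueError("wall_seconds must be positive")
    return sim_seconds / wall_seconds


# Hypothetical wall-clock durations (seconds) for successive 5 s simulation runs.
wall_times = [2.5, 4.0, 10.0]
rtfs = [real_time_factor(5.0, w) for w in wall_times]
print(rtfs)  # [2.0, 1.25, 0.5]
```

A value above 1 means the simulation runs faster than real time; a falling series like the one above is the symptom described in the question.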


Comments

There are known issues with improper clean up when deleting models, both on the server and the client. Even though the problem goes beyond visuals, I think [this](https://bitbucket.org/osrf/gazebo/issues/1702/removing-models-leaves-behind-some-visuals) issue is the one capturing the problem.

chapulina ( 2015-11-08 19:14:02 -0500 )

Do you delete and add exactly the same model? If so, the runs should have the same performance.

djou07 ( 2015-11-09 01:18:00 -0500 )

Yes, it is exactly the same model, so the performance should be the same, but it is not. @chapulina I'm going to check that out.

Elte Hupkes ( 2015-11-09 08:35:06 -0500 )

I had this problem too. When doing evolution, if you keep the best individual of a generation, that individual should behave the same in the next generation for learning to work. If it does not, the evolution has no effect. To avoid this, I stopped and restarted the simulation at each generation...

djou07 ( 2015-11-09 09:05:07 -0500 )

@chapulina Are the visuals even loaded though when only the server is used?

Elte Hupkes ( 2015-11-09 10:19:09 -0500 )

On the server side, rendering is only used for sensors, like cameras.

chapulina ( 2015-11-09 11:55:40 -0500 )

I noticed that, but I'm not using any sensors for this test at the moment. I put some print statements in Visual just in case, and none are created, so it can't be that...

Elte Hupkes ( 2015-11-09 15:48:06 -0500 )

@Elte Hupkes Only the server, without visualisation.

djou07 ( 2015-11-10 01:50:52 -0500 )

@djou07 Similar use case then, since I'm also doing evolution. Restarting the simulation seems like a huge overhead... Also my requirements include some online evolution scenarios, so it's not really an option. Dead set on fixing this ;).

Elte Hupkes ( 2015-11-10 04:19:43 -0500 )

3 Answers

2

answered 2015-11-10 07:19:42 -0500

Elte Hupkes

updated 2015-11-10 14:46:22 -0500

Thanks @silvio.traversaro for suggesting the flame graphs, I think I'm now on to what is happening here. It seems that ODEPhysics::CreateLink() calls dSimpleSpaceCreate for every model, but I can find zero code which deletes these spaces when models are removed from the world. Every model created thus results in an extra space. This would cause a quadratic slowdown since collision code is probably roughly O(n^2).

I'll move on to verify this is indeed correct, file a bug report and mark this as the correct answer.

EDIT: This does indeed appear to be what's going on; I created issue https://bitbucket.org/osrf/gazebo/iss.... Unfortunately I'm still seeing performance degrade over time (though far less pronounced), likely caused by another issue. I'll continue to investigate, but I'll mark this as the correct answer.
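The scaling argument in this answer can be illustrated with a toy model. The sketch below is not Gazebo or ODE code: `ToyWorld` and `run_cycles` are hypothetical, and the cost measure assumes that every pair of registered spaces is examined once per step, which is the rough O(n^2) guess made above rather than a verified property of the collision code.

```python
class ToyWorld:
    """Toy stand-in for a physics world; one collision space per model."""

    def __init__(self, destroy_space_on_remove: bool):
        self.destroy_space_on_remove = destroy_space_on_remove
        self.spaces = {}  # model name -> list of geometries in its space

    def insert_model(self, name: str) -> None:
        self.spaces[name] = ["sphere"]

    def remove_model(self, name: str) -> None:
        if self.destroy_space_on_remove:
            del self.spaces[name]
        else:
            # Leak: the geometry is gone but the empty space stays registered.
            self.spaces[name] = []

    def space_pairs_per_step(self) -> int:
        # Assumed cost model: every pair of spaces is checked once per step.
        n = len(self.spaces)
        return n * (n - 1) // 2


def run_cycles(world: ToyWorld, cycles: int) -> list:
    """Insert a model, record the per-step cost, remove the model, repeat."""
    costs = []
    for i in range(cycles):
        name = f"sphere_{i}"
        world.insert_model(name)
        costs.append(world.space_pairs_per_step())
        world.remove_model(name)
    return costs


print(run_cycles(ToyWorld(destroy_space_on_remove=False), 5))  # [0, 1, 3, 6, 10]
print(run_cycles(ToyWorld(destroy_space_on_remove=True), 5))   # [0, 0, 0, 0, 0]
```

Under that assumption the leaking variant's per-step cost grows quadratically with the number of insert/remove cycles, while the variant that destroys the space stays flat.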

0

answered 2015-11-08 07:43:17 -0500

AndreiHaidu

Hi,

How do you remove the model?


Comments

I send an "entity_delete" request - the only way that won't regularly crash the whole simulator in my experience. If I boot the client I can visually confirm that the model is indeed deleted (and World::GetModelCount() returns 1 every time) so they do in fact disappear if that's what you're wondering ;).

Elte Hupkes ( 2015-11-08 13:47:26 -0500 )
0

answered 2015-11-08 19:16:00 -0500

silvio.traversaro

I have the impression that this kind of behavior is due to https://bitbucket.org/osrf/gazebo/iss... .


Comments

I've been working on changing some of these things, but haven't managed to get rid of the slowdown yet (dreading the moment I move into actual profiling ;]). What bugs me though is that it slows down quadratically. If this were a memory leak, I'd say it wouldn't affect performance much until memory runs out. I'm also not seeing memory usage increase in `top` when I run this, as I did with the SDF memory leak; it mostly stays constant. So whatever's left behind is tiny in memory, large in computation.

Elte Hupkes ( 2015-11-09 08:36:53 -0500 )

I always assumed that there is some kind of "list" of models that is not properly cleaned up during model deletion because of the shared pointer loop. An easy check is to look at the name the same model gets if you respawn it after deleting it. In the case of unit_sphere_0 the names will be unit_sphere_0_0, unit_sphere_0_1, unit_sphere_0_2, etc. Checking the logic of how these names are assigned could lead you to the data structure that is not properly cleared.

silvio.traversaro ( 2015-11-09 16:11:16 -0500 )

By the way, are you sure that the slowdown is actually superlinear? I assume that the RTF is proportional to the inverse of the physics loop computation time, so plotting the inverse of the RTF could give some more insight.

silvio.traversaro ( 2015-11-09 16:13:53 -0500 )
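The check suggested in this comment can be sketched as follows. If the computation time per run grows only linearly, the RTF falls off like 1/n and its plot still looks curved, so the growth is easier to judge on 1/RTF. This is a minimal sketch under the comment's assumption that RTF is inversely proportional to computation time; the function names and both data series are hypothetical, not the values from the plot in the question.

```python
def inverse_rtf(rtfs):
    """Wall-clock seconds spent per simulated second, one value per run."""
    return [1.0 / r for r in rtfs]


def differences(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def grows_faster_than_linear(rtfs, tol=1e-9):
    """True if 1/RTF increases by a larger amount on every successive run."""
    if len(rtfs) < 3:
        raise ValueError("need at least three runs to judge curvature")
    second = differences(differences(inverse_rtf(rtfs)))
    return all(d > tol for d in second)


# Hypothetical series: cost per run growing linearly vs. quadratically.
linear_cost = [1.0 / n for n in (1, 2, 3, 4, 5)]           # 1/RTF = 1, 2, 3, 4, 5
quadratic_cost = [1.0 / (n * n) for n in (1, 2, 3, 4, 5)]  # 1/RTF = 1, 4, 9, 16, 25

print(grows_faster_than_linear(linear_cost))     # False
print(grows_faster_than_linear(quadratic_cost))  # True
```

Real measurements are noisy, so in practice a fit or a plot of 1/RTF against the run index would be more robust than the strict per-step test used here.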

Anyway, doing some statistical profiling (for example using http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html) on your test should highlight the slow code path.

silvio.traversaro ( 2015-11-09 16:32:07 -0500 )