DRCSIM: how shall we jointly agree on simulation parameters? [closed]

asked 2013-02-08 18:29:08 -0500

cga

updated 2013-02-08 18:29:30 -0500

I realize there is still a lot of uncertainty (will we use Bullet for the VRC or not? How much will performance improve in the ODE version with better load sharing among threads and a shared memory interface?), but I would like to get agreement that we will maximize the number of ODE/Bullet iterations (iters) on the VRC cloud-based simulator. This will improve simulation robustness, the simulated IMU measurements, and the simulated force measurements.

For example, on my Xeon-based machine (dual Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz) with Quadro 6000 and Tesla K20 GPUs (this machine is a slightly newer version of the cloud machine we plan to use), I was able to get 125 iterations (vs. 40 in the distributed version) with a real time factor in the range 1.00-1.05 (it varies slowly; I disabled the real time limitation to improve performance) with the following parameter values:

DRCSIM 2.0.1 as is except (angle brackets removed to make this visible)

     update_rate 0 /update_rate
          iters 125 /iters

Turning real time control back on with an update_rate of 1000 gives a real time factor of 0.95.

I was not able to decrease dt to 0.0005 and get a real time factor near 1.0, but perhaps further improvements might make that possible. In that case, we will need to hunt for the best tradeoff of iters and dt as a group.
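To make the tradeoff concrete, here is a minimal sketch of the arithmetic involved. The function names are illustrative only and are not part of Gazebo or DRCSIM. It shows that halving dt halves the wall-clock time available for each physics step at a given real time factor, which is why a larger iters value gets harder to afford as dt shrinks.

```cpp
#include <cassert>

// Real time factor: simulated seconds advanced per wall-clock second.
double RealTimeFactor(double simSeconds, double wallSeconds)
{
  assert(wallSeconds > 0.0);
  return simSeconds / wallSeconds;
}

// Wall-clock seconds available for one physics step of size dt
// if the simulation is to reach targetRtf.
double WallBudgetPerStep(double dt, double targetRtf)
{
  assert(dt > 0.0 && targetRtf > 0.0);
  return dt / targetRtf;
}
```

For example, `WallBudgetPerStep(0.001, 1.0)` leaves 1 msec of wall-clock time per step, while `WallBudgetPerStep(0.0005, 1.0)` leaves only 0.5 msec for collision, physics, and plugins combined.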

It would be really good to nail down the exact dt and iters we will use well in advance of the VRC, say by the end of March, after we have some experience with Bullet and the shared memory interface.

I am cross posting this to the DARPA forum as well.

Thanks, Chris


Closed for the following reason question is not relevant or outdated by nkoenig
close date 2013-07-23 21:06:57.935422

3 Answers


answered 2013-02-08 19:13:27 -0500

nkoenig

Hi Chris,

Thanks for the input.

The first thing I want to mention is that a shared memory interface will not improve performance in the context of the VRC. Team code will not run on the same machine as simulation, so a network interface is mandatory.

Bullet is under rapid development. However, I don't want people to get the idea that it will magically solve performance issues.

On to the core of this question, which is the performance of Gazebo on cloud machines and what the fixed physics time step will be. Here are the items we need to take into account:

  • Performance (as measured by real-time factor) varies during simulation as objects move and interact with one another. This means we need to be clever about measuring performance.

  • The fidelity of sensor generation can greatly affect performance. Higher resolution cameras and faster sensor update rates, while nice, will lead to a slower simulation.

  • The fidelity of collision objects also affects performance. For example, using triangle meshes as collision objects can result in highly degraded performance.
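Because the real-time factor varies over a run, one option is to report a summary over many samples instead of a single reading. The sketch below is only an illustration of that idea. `RtfSummary` and `SummarizeRtf` are hypothetical names, and the samples are assumed to have been collected by the caller at regular intervals.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Mean and worst-case real-time factor over a run.
struct RtfSummary
{
  double mean;
  double worst;
};

RtfSummary SummarizeRtf(const std::vector<double> &samples)
{
  assert(!samples.empty());
  const double sum = std::accumulate(samples.begin(), samples.end(), 0.0);
  // The lowest sample is the slowest moment of the run.
  const double worst = *std::min_element(samples.begin(), samples.end());
  return {sum / samples.size(), worst};
}
```

Reporting the worst case alongside the mean shows whether a world only occasionally drops below real time (for example during heavy contact) or is slow throughout.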

So, we need to pick a dt that:

1) Allows Atlas and other objects to behave properly
2) Allows sensors to generate realistic-enough data
3) Gets as close to real-time as possible given #1 and #2.

We are actively testing Gazebo and DRC sim on cloud computers with environments that will be used during the VRC. Based on these tests, we will determine the appropriate simulation parameters. The first set of parameters will be released with the VRC Qualification worlds (the week of Feb 11th).

As usual, your feedback is most welcome and we'll try our best to accommodate your needs. Please keep in mind that many aspects of simulation and the VRC have to be balanced, and not everyone will get exactly what they want.



I am suggesting a shared memory interface between different simulator threads which might contain the simulator collision part (thread 1), the simulator physics part (thread 2), and the simulator Gazebo/ROS plugin part (thread 3), for example.

cga ( 2013-02-08 21:44:59 -0500 )

Collision and physics are in the same thread, and there is no need for shared memory between threads. Threads are in the same process and share the same heap.

nkoenig ( 2013-02-08 23:30:14 -0500 )

answered 2013-02-08 22:15:15 -0500

cga

updated 2013-02-09 11:11:14 -0500

A concrete measure of the effect of playing with the iteration (iters) parameter, for example, is to measure how much the robot sags due to joint constraint violations (not controller error). This can be measured by using

pose = links[ i ]->GetWorldPose();

to get "ground truth", and comparing the answer to what forward kinematics predicts using the root link pose and the measured joint angles. I am using a forward kinematics that uses the left foot as the root, so its "ground truth" error is always zero.
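The comparison can be sketched as follows. `Vec3` and `Quat` are hypothetical stand-in types, not Gazebo's pose classes, and the forward kinematics itself is assumed to be computed elsewhere. The sketch only shows how a position error (in meters) and an orientation error (in radians) might be derived from a ground-truth pose and a predicted pose.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Quat { double w, x, y, z; };  // assumed to be unit length

// Euclidean distance between the simulator's link position and the
// position predicted by forward kinematics.
double PositionError(const Vec3 &truth, const Vec3 &predicted)
{
  const double dx = truth.x - predicted.x;
  const double dy = truth.y - predicted.y;
  const double dz = truth.z - predicted.z;
  return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Angle of the rotation that takes one orientation to the other.
double OrientationError(const Quat &truth, const Quat &predicted)
{
  double dot = std::fabs(truth.w * predicted.w + truth.x * predicted.x +
                         truth.y * predicted.y + truth.z * predicted.z);
  dot = std::min(1.0, dot);  // guard acos against rounding above 1
  return 2.0 * std::acos(dot);
}
```

Taking the absolute value of the quaternion dot product treats q and -q as the same orientation, so the reported angle is always the smaller of the two possible rotations.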

These are typical measurements, with the robot in the standard start-up pose of standing with arms out. The robot is not moving. Feedforward torques and integral control are used to get the robot close to the desired joint angles (of zero).

With iters = 40, the neck joint is 2 mm off, the right ankle is 2.2 mm off, the left wrist is 2.6 mm off, and the right wrist is 1.2 mm off. Orientation errors range up to 0.0027 radians.

With iters = 125, the neck joint is 0.7 mm off, the right ankle is 0.8 mm off, the left wrist is 0.5 mm off, and the right wrist is 0.9 mm off. Orientation errors range up to 0.0008 radians.

With iters = 200, the neck joint is 0.4 mm off, the right ankle is 0.1 mm off, the left wrist is 0.1 mm off, and the right wrist is 0.5 mm off. Orientation errors range up to 0.0005 radians.

With iters = 400, joint constraint errors are in the 0.1 mm range. Orientation errors range up to 0.0002 radians.

I am working on comparable measures of joint constraint error for linear velocity and acceleration, and angular velocity and acceleration. I need a measure of joint acceleration to estimate the acceleration and angular acceleration errors around the robot due to joint constraint errors.

On another note, this person set iters to 3000 to get reasonable simulated force measurements:

Figure 3.5, document page 40.



To inform the discussion of how fast we need to run the simulator, we measured joint constraint errors during Atlas walking in Gazebo/ODE. The results are at:

cga ( 2013-02-13 12:35:26 -0500 )

answered 2013-02-09 08:27:37 -0500

cga

updated 2013-02-09 11:20:16 -0500

This is a comment, but I can't do formatting in a comment.

We know handling plugins is a significant load. Right now you have one thread doing

collision-and-physics (CP) / handle-plugins (PL) / CP / PL / CP / PL ...

One way to get more simulation per unit of real time is to split this into two threads.

thread 1: CP step 1  / CP step 2  / CP step 3  / CP step 4 / ...
thread 2: PL from 0  / PL from 1  / PL from 2  / PL from 3 / ...

These threads are synchronized through a shared memory flag.

It is true that with this scheme the controller can, at best, use information from step i to affect step i+2. Two points on that:

1) This matches reality, and the way we actually control our existing humanoid robot. It takes time to move data, and with individual joint controllers it takes a lot of UDP packets to do it. Moving a camera image from the imaging chip through some USB/Firewire/ethernet interface into computer memory takes time (usually tens of msec) and happens in parallel with the robot physically moving.
2) If the simulation time step (say 0.5 msec) is smaller than the control interval (say 1 msec), this effect is reduced.
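A minimal sketch of the two-thread scheme, using standard C++ threads rather than any Gazebo code. All names here are hypothetical, and the physics and plugin work are reduced to placeholders. The function records, for each physics step, which step's state produced the command in effect during that step, so the two-step delay is visible in the result.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Element k-1 of the result is the index of the state whose command was
// applied during physics step k (-1 means no command was available yet).
std::vector<int> RunPipelined(int numSteps)
{
  assert(numSteps >= 0);
  std::mutex mutex;
  std::condition_variable cv;
  bool stateReady = false, commandReady = false, done = false;
  int stateToProcess = 0;
  int pendingCommandSource = -1;

  // Persistent plugin thread: created once, woken by the shared flags.
  std::thread plugins([&]() {
    std::unique_lock<std::mutex> lock(mutex);
    for (;;)
    {
      cv.wait(lock, [&]() { return stateReady || done; });
      if (!stateReady)
        break;
      const int state = stateToProcess;
      stateReady = false;
      lock.unlock();
      // Placeholder: plugin/controller work on `state` goes here.
      lock.lock();
      pendingCommandSource = state;
      commandReady = true;
      cv.notify_all();
    }
  });

  std::vector<int> appliedSource;
  int activeCommandSource = -1;
  for (int step = 1; step <= numSteps; ++step)
  {
    {
      // Hand the previous step's state to the plugin thread.
      std::lock_guard<std::mutex> lock(mutex);
      stateToProcess = step - 1;
      stateReady = true;
      commandReady = false;
    }
    cv.notify_all();
    // Placeholder: the physics step runs here, overlapping the plugins.
    appliedSource.push_back(activeCommandSource);
    // Synchronize before the next step and pick up the new command.
    std::unique_lock<std::mutex> lock(mutex);
    cv.wait(lock, [&]() { return commandReady; });
    activeCommandSource = pendingCommandSource;
  }
  {
    std::lock_guard<std::mutex> lock(mutex);
    done = true;
  }
  cv.notify_all();
  plugins.join();
  return appliedSource;
}
```

Calling `RunPipelined(4)` yields the sources -1, 0, 1, 2 for steps 1 through 4, i.e. information from step i first takes effect in step i+2. In this sketch the physics thread waits for the plugin thread at the end of every step, so the gain is limited to overlapping one physics step with one plugin pass.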

Plugins that happen less often but take more computation (vision) can be handled in a different thread that is only triggered every N simulation steps, but otherwise is idle (avoid creating and destroying threads for each plugin invocation to reduce overhead):

thread 1: CP step 1   / CP step 2   / CP step 3   / CP step 4  / ...
thread 2: joint PL 0  / joint PL 1  / joint PL 2  / joint PL 3 / ...
thread 3: idle        / vision-plugin using data from step 1 / idle                     ... 

In fact, to process the plugins as fast as possible, they should each run in a separate thread once the relevant simulation information is available from the simulation step.
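The idle-until-triggered thread for slow plugins might look like the sketch below. It is an illustration with hypothetical names, not Gazebo code: one persistent worker sleeps on a condition variable and is woken every `period` physics steps, so no thread is created or destroyed per invocation.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Runs numSteps physics steps and wakes a persistent worker every
// `period` steps. Returns the step indices the slow plugin processed.
std::vector<int> RunWithSlowPlugin(int numSteps, int period)
{
  assert(numSteps >= 0 && period > 0);
  std::mutex mutex;
  std::condition_variable cv;
  int job = 0;
  bool hasJob = false, done = false;
  std::vector<int> processed;

  std::thread worker([&]() {
    std::unique_lock<std::mutex> lock(mutex);
    for (;;)
    {
      // Idle here until a job arrives or shutdown is requested.
      cv.wait(lock, [&]() { return hasJob || done; });
      if (!hasJob)
        break;
      const int step = job;
      lock.unlock();
      // Placeholder: expensive work (e.g. vision) on data from `step`.
      lock.lock();
      processed.push_back(step);
      hasJob = false;
      cv.notify_all();
    }
  });

  for (int step = 1; step <= numSteps; ++step)
  {
    // Placeholder: the physics step goes here.
    if (step % period == 0)
    {
      std::unique_lock<std::mutex> lock(mutex);
      // Wait only if the previous slow job has not finished yet.
      cv.wait(lock, [&]() { return !hasJob; });
      job = step;
      hasJob = true;
      cv.notify_all();
    }
  }
  {
    std::lock_guard<std::mutex> lock(mutex);
    done = true;
  }
  cv.notify_all();
  worker.join();
  return processed;
}
```

With `RunWithSlowPlugin(10, 4)` the worker handles steps 4 and 8. The physics loop blocks only if a slow job is still running when the next one is due. Dropping the new job instead of waiting would be an alternative design, at the cost of a less predictable sensor rate.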

I hope this clarifies what I am trying to say about threading the simulator more effectively.





Thank you for the clarification. Your proposal sounds worth investigating further. I've started a feature request for this:

nkoenig ( 2013-02-11 17:49:08 -0500 )

Asked: 2013-02-08 18:29:08 -0500

Seen: 914 times

Last updated: Feb 09 '13