Performance feedback [closed]

asked 2013-01-09 14:05:09 -0500

cga

updated 2013-01-09 15:14:28 -0500

I thought it might be useful to give the developers some more global performance feedback about how the DRCSIM simulation is running on our hardware. I urge all you DRC folks out there to comment (actually, add new answers) on this post with your own performance data.

On a Dell Precision T7600
Dual Eight Core XEON (E5-2687W, 3.1GHz, 20M, 8.0 GT/s, Turbo+)
128GB, DDR3 RDIMM Memory,1600MHz, ECC (8 x 16GB DIMMs)
6.0GB NVIDIA Quadro 6000, Dual MON, 2 DP & 1 DVI (glxinfo indicates the Quadro board is what is used for graphics).

DRCSIM 1.3.1 out of the box

Real Time Factor hovers around 0.8

ps augx says:

cga      11245  0.2  0.0 290808 18632 pts/5    Sl+  14:42   0:01 /usr/bin/python /opt/ros/fuerte/bin/roslaunch atlas_utils atlas.launch
cga      11261  0.2  0.0 570368 12960 ?        Ssl  14:42   0:01 /usr/bin/python /opt/ros/fuerte/bin/rosmaster --core -p 11311 __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69ff24/master.log
cga      11274  4.8  0.0 315416  6588 ?        Ssl  14:42   0:19 /opt/ros/fuerte/share/rosout/bin/rosout __name:=rosout __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69ff24/rosout-1.log
cga      11288  0.0  0.0   4404   620 ?        Ss   14:42   0:00 /bin/sh /usr/share/drcsim-1.3/ros/atlas_utils/scripts/run_gazebo atlas.world __name:=gazebo __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69ff24/gazebo-2.log
cga      11289  0.0  0.0 365848 30524 ?        Sl   14:42   0:00 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      11290  7.7  0.0 501876 10332 ?        Ssl  14:42   0:31 /opt/ros/fuerte/stacks/robot_model/robot_state_publisher/bin/state_publisher __name:=robot_state_publisher __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69ff24/robot_state_publ
cga      11292  198  0.2 5054196 271044 ?      Rl   14:42  13:37 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      11293 20.2  0.1 3528144 233880 ?      Sl   14:42   1:23 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      11306  5.0  0.0 391440  6700 ?        Ssl  14:42   0:20 /opt/ros/fuerte/stacks/pr2_mechanism/pr2_mechanism_diagnostics/bin/pr2_mechanism_diagnostics __name:=pr2_mechanism_diagnostics __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69f
cga      11334  0.1  0.0 362572 16388 ?        Ssl  14:42   0:00 python /opt/ros/fuerte/share/rostopic/scripts/rostopic pub /calibrated std_msgs/Bool true __name:=fake_joint_calibration __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69ff24/fa
cga      11335  8.1  0.0 543696 26480 ?        Ssl  14:42   0:33 /opt/ros/fuerte/stacks/geometry_experimental/tf2_ros/bin/buffer_server __name:=tf2_buffer_server __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69ff24/tf2_buffer_server-6.log
cga      11362  6.8  0.0 505796 16456 ?        Ssl  14:42   0:28 python /opt/ros/fuerte/stacks/pr2_mechanism/pr2_controller_manager/scripts/spawner --wait-for=/calibrated atlas_controller __name:=atlas_controller_spawner __log:=/home/cga/.ros/log/bd0c
cga      11365  5.1  0.0 2957208 12364 ?       Ssl  14:42   0:21 /opt/ros/fuerte/stacks/image_pipeline/stereo_image_proc/bin/stereo_image_proc __name:=stereo_proc __log:=/home/cga/.ros/log/bd0ce618-5a94-11e2-8bc6-90b11c69ff24/multisense_sl-stereo_proc
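
For anyone who wants to compare listings like the one above, here is a minimal Python sketch that totals the %CPU column per program. The sample rows are copied from the listing with the command lines shortened, the function name `cpu_by_program` is made up for illustration, and the parsing assumes the usual 11-column `ps aux` layout. Keep in mind that `ps` reports %CPU averaged over each process's lifetime, not an instantaneous load.

```python
from collections import defaultdict

# A few rows from the listing above (command lines shortened for readability).
PS_OUTPUT = """\
cga      11274  4.8  0.0 315416  6588 ?        Ssl  14:42   0:19 /opt/ros/fuerte/share/rosout/bin/rosout __name:=rosout
cga      11289  0.0  0.0 365848 30524 ?        Sl   14:42   0:00 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      11292  198  0.2 5054196 271044 ?      Rl   14:42  13:37 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      11293 20.2  0.1 3528144 233880 ?      Sl   14:42   1:23 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      11365  5.1  0.0 2957208 12364 ?       Ssl  14:42   0:21 /opt/ros/fuerte/stacks/image_pipeline/stereo_image_proc/bin/stereo_image_proc __name:=stereo_proc
"""


def cpu_by_program(ps_text):
    """Sum the %CPU column per program name (basename of the first command word)."""
    totals = defaultdict(float)
    for line in ps_text.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # blank or malformed row
        try:
            cpu = float(fields[2])
        except ValueError:
            continue  # header row ("%CPU" is not a number)
        # Note: scripts started as "python /path/to/script" are grouped under "python".
        program = fields[10].split()[0].rsplit("/", 1)[-1]
        totals[program] += cpu
    return dict(totals)


if __name__ == "__main__":
    ranked = sorted(cpu_by_program(PS_OUTPUT).items(), key=lambda kv: -kv[1])
    for program, cpu in ranked:
        print(f"{program:20s} {cpu:6.1f}")
```

On the sample rows, the three gazebo processes together account for far more CPU than everything else combined, which is the pattern the full listing shows as well.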

In a ROS-free version of DRCSIM (all plugins removed except one Gazebo controller plugin using full state feedback ... (more)

Closed for the following reason: question is not relevant or outdated (closed by nkoenig)
close date 2013-07-23 17:31:01.341403

Comments

Comments suck (300 char limit, no formatting), just add more answers instead.

cga (2013-01-09 14:52:37 -0500)

you can also edit the original post to add more info :-)

The only "formatting" option I've found in comments is that pressing Shift+Enter twice creates a newline.

hsu (2013-01-09 18:11:30 -0500)

4 Answers

answered 2013-01-10 23:34:21 -0500

cga

updated 2013-01-11 09:48:04 -0500

Okay, taking a look at various benchmarks begins to help us understand what is going on with performance, and gives us a better answer to a previous question (question 111) about what hardware works best. For example:

http://www.tomshardware.com/reviews/core-i7-3970x-sandy-bridge-e-benchmark,3348.html

Currently, Gazebo does not distribute its load well across threads. The simulation thread acts as a single heavy load. This means that CPUs with strong single-thread performance, such as the Ivy Bridge i7-3770K, work well. This pushes you towards single-CPU systems that you overclock as much as possible. You don't benefit from more CPUs or more cores, as these typically come with a lower effective clock rate, which reduces performance on the single rate-limiting simulation thread.
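
One way to put rough numbers on this argument is Amdahl's law. The sketch below is only an illustration: the serial fraction of 0.9 is a made-up value, not something measured from Gazebo, and scaling throughput linearly with the base clock ignores turbo, cache and architecture differences. The function names are mine.

```python
def amdahl_speedup(serial_fraction, cores):
    """Upper bound on speedup when `serial_fraction` of the work cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def relative_throughput(clock_ghz, serial_fraction, cores):
    """Crude figure of merit: clock rate times the Amdahl speedup (an assumption)."""
    return clock_ghz * amdahl_speedup(serial_fraction, cores)


if __name__ == "__main__":
    serial = 0.9  # hypothetical: 90% of the work sits in one thread
    machines = [
        ("4 cores @ 3.5 GHz", 3.5, 4),
        ("16 cores @ 3.1 GHz", 3.1, 16),
    ]
    for label, clock, cores in machines:
        print(label, round(relative_throughput(clock, serial, cores), 2))
```

Under these assumptions the 4-core machine with the higher clock comes out ahead of the 16-core machine, because the extra cores can only help with the small parallel part of the work.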

Let's test this hypothesis. Is anybody getting real time factors close to 1.0 out of the box on DRCSIM 1.3.1, running just

source /usr/share/drcsim/setup.sh
roslaunch atlas_utils atlas.launch

?
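
To make the numbers people report comparable, here is a small Python sketch of how a real time factor can be computed from paired timestamps: simulated seconds elapsed divided by wall-clock seconds elapsed. How you collect the pairs is up to you; one option (an assumption about your setup, not something this sketch does) is to log the wall-clock time each time a message arrives on the ROS `/clock` topic. The function name and the sample values are hypothetical.

```python
def real_time_factor(samples):
    """Average real time factor over a list of (wall_time_s, sim_time_s) pairs.

    The pairs must be ordered by wall time; only the first and last are used.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    wall_start, sim_start = samples[0]
    wall_end, sim_end = samples[-1]
    wall_elapsed = wall_end - wall_start
    if wall_elapsed <= 0:
        raise ValueError("wall time must increase between first and last sample")
    return (sim_end - sim_start) / wall_elapsed


if __name__ == "__main__":
    # Hypothetical log: the simulation advances 8 s while 10 s pass on the wall clock.
    log = [(0.0, 0.0), (5.0, 4.0), (10.0, 8.0)]
    print(real_time_factor(log))
```

Averaging over a longer window (tens of seconds) should smooth out the hovering that the instantaneous readout shows.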

Anybody with i7-39xx CPUs who can give timing feedback?

It will also be interesting to see what happens to the simulation thread loading with the Bullet dynamics engine, with its multi-threading and use of GPUs for dynamics. This may unload the simulation thread by splitting it into several threads or off-loading it to a GPU, and flip the performance criteria back to slower CPUs with more cores, or to a larger number of slower CPUs.

In the meantime, it would be very nice if the Gazebo folks could break the simulation thread into 2 or more threads, separating the dynamics code from the ROS code with a shared memory interface. This would allow the DRCSIM software to run on a wider range of machines out of the box.
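
To illustrate the kind of split I mean, here is a toy Python sketch: a dynamics loop that only ever writes its newest state into a single-slot buffer, and a second thread that reads whatever is newest at its own pace. This is not Gazebo code and not a proposal for its actual design; all names are hypothetical, it uses threads in one process rather than true inter-process shared memory, and in Python the two threads will not really run on separate cores. It only shows the data flow.

```python
import threading


class LatestState:
    """Single-slot buffer: the writer overwrites, readers get the newest snapshot."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = None

    def write(self, state):
        with self._lock:
            self._state = state

    def read(self):
        with self._lock:
            return self._state


def run_demo(num_steps=1000, dt=0.001):
    """Run a toy dynamics thread and a consumer thread; return (final_state, steps_seen)."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    shared = LatestState()
    done = threading.Event()
    seen = []

    def dynamics():
        position, velocity = 0.0, 1.0
        for step in range(1, num_steps + 1):
            position += velocity * dt  # stand-in for a physics update
            shared.write((step, position))
        done.set()

    def consumer():
        # Stand-in for plugin/ROS work: it samples the newest state and may
        # skip steps, but the dynamics loop never waits for it to finish.
        while not done.is_set():
            snapshot = shared.read()
            if snapshot is not None:
                seen.append(snapshot[0])
        seen.append(shared.read()[0])  # final state after the loop ends

    threads = [threading.Thread(target=dynamics), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared.read(), seen


if __name__ == "__main__":
    final_state, steps_seen = run_demo()
    print("final step:", final_state[0], "snapshots taken:", len(steps_seen))
```

The point of the single-slot buffer is that a slow consumer drops intermediate states instead of slowing the producer down, which is the behavior you would want from the rate-limiting simulation thread.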

Thanks, Chris

Comments

A shared memory interface is on our roadmap for an upcoming release. However, this is not a magic bullet that will make everything run faster. Most of ROS runs in separate threads/processes. Only the gazebo plugins run inline.

nkoenig (2013-01-11 17:30:35 -0500)

Okay, but how about having the Gazebo plugins run in a different thread? Eliminating the Gazebo plugins except for one controller plugin leaves the dynamics happily running in real time. So moving the Gazebo plugins would make the difference between real time and non-real time on many machines.

cga (2013-01-11 21:27:10 -0500)

answered 2013-01-09 18:09:18 -0500

hsu

updated 2013-01-09 23:12:54 -0500

Interesting that you're getting 0.8X real-time. For me, on my $1.5k desktop, out-of-the-box drcsim 1.3.1 with

roslaunch atlas_utils atlas.launch

yields ~0.97X real-time by default, but that's because the time throttling control is not perfect yet. Un-throttling simulation gets me about 1.4X real-time with the robot standing.

The big slowdown comes from subscribing to the images or point clouds. For example,

rostopic hz /multisense_sl/points2

yields about 0.8X real-time. This is due to GPU hardware limitations when rendering large amounts of data (it also stresses one of the cores through stereo_image_proc), but we plan to either reduce the camera image resolution or reduce update rates.

Tasks: 390 total,   5 running, 382 sleeping,   0 stopped,   3 zombie
Cpu(s): 48.2%us,  7.9%sy,  0.0%ni, 43.5%id,  0.0%wa,  0.0%hi,  0.4%si,  0.0%st
Mem:  16388844k total, 16127708k used,   261136k free,   155036k buffers
Swap: 16727612k total,  5218848k used, 11508764k free,  1180008k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                     
 5565 hsu       20   0 3419m 312m  65m R  211  2.0  28:55.79 gazebo -s libgazebo_ros_api_plugin.so atlas.world                                                                                                           
 5653 hsu       20   0 1403m 169m  12m S  135  1.1   4:27.68 /opt/ros/fuerte/stacks/image_pipeline/stereo_image_proc/bin/stereo_image_proc __name:=stereo_proc __log:=/home/hsu/.ros/log/05e97bcc-5ab7-11e2-9704-90e2ba19
 5566 hsu       20   0 1830m 242m  85m S   31  1.5   3:45.82 gazebo -s libgazebo_ros_api_plugin.so atlas.world                                                                                                           
 7088 hsu       20   0  581m  85m 3776 S   15  0.5   0:02.45 /usr/bin/python /opt/ros/fuerte/bin/rostopic hz /multisense_sl/points2                                                                                      
 5637 hsu       20   0  493m  16m 3684 S    7  0.1   0:57.15 python /opt/ros/fuerte/stacks/pr2_mechanism/pr2_controller_manager/scripts/spawner --wait-for=/calibrated atlas_controller __name:=atlas_controller_spawner 
 5563 hsu       20   0  490m  10m 8216 S    5  0.1   0:49.76 /opt/ros/fuerte/stacks/robot_model/robot_state_publisher/bin/state_publisher __name:=robot_state_publisher __log:=/home/hsu/.ros/log/05e97bcc-5ab7-11e2-9704
 5608 hsu       20   0  530m  27m 6112 S    5  0.2   0:46.94 /opt/ros/fuerte/stacks/geometry_experimental/tf2_ros/bin/buffer_server __name:=tf2_buffer_server __log:=/home/hsu/.ros/log/05e97bcc-5ab7-11e2-9704-90e2ba191
 5590 hsu       20   0  382m 6700 5648 S    4  0.0   0:38.88 /opt/ros/fuerte/stacks/pr2_mechanism/pr2_mechanism_diagnostics/bin/pr2_mechanism_diagnostics __name:=pr2_mechanism_diagnostics __log:=/home/hsu/.ros/log/05e
 5547 hsu       20   0  380m 6688 5628 S    4  0.0   0:35.76 /opt/ros/fuerte/share/rosout/bin/rosout __name:=rosout __log:=/home/hsu/.ros/log/05e97bcc-5ab7-11e2-9704-90e2ba191409/rosout-1.log                          

We'll continually watch performance as we progress.

$ cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 58
model name  : Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
stepping    : 9
microcode   : 0x12
cpu MHz     : 3501.000
cache size  : 8192 KB
physical id : 0
siblings    : 8
core id     : 0
cpu cores   : 4
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce ...

Comments

Now we have a detective story! What accounts for the difference? Higher effective clock rate? 32 vs. 64bit OS (which are you running?)? motherboard? Any other suggestions? GPU is irrelevant since we are only worried about the simulation thread. gzserver gives basically the same loading with no gra..

cga (2013-01-09 18:51:25 -0500)

answered 2013-01-09 15:04:22 -0500

cga

updated 2013-01-09 15:12:48 -0500

4x Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
4 GB memory
NVIDIA GeForce 9800 GT

DRCSIM 1.3.1 out of the box

Real Time Factor hovers around 0.55

ps augx says:

cga      24022  0.3  0.4 292896 18756 pts/9    Sl+  15:45   0:01 /usr/bin/python /opt/ros/fuerte/bin/roslaunch atlas_utils atlas.launch
cga      24038  0.3  0.3 498756 12948 ?        Ssl  15:45   0:01 /usr/bin/python /opt/ros/fuerte/bin/rosmaster --core -p 11311 __log:=/home/cga/.ros/log/90863ae6-5a9d-11e2-b3e8-00188b0ade60
cga      24051  2.6  0.1 317612  6824 ?        Ssl  15:45   0:11 /opt/ros/fuerte/share/rosout/bin/rosout __name:=rosout __log:=/home/cga/.ros/log/90863ae6-5a9d-11e2-b3e8-00188b0ade60/rosout
cga      24054  0.0  0.0   4404   620 ?        Ss   15:45   0:00 /bin/sh /usr/share/drcsim-1.3/ros/atlas_utils/scripts/run_gazebo atlas.world __name:=gazebo __log:=/home/cga/.ros/log/90863a
cga      24055  0.0  0.7 360836 30264 ?        Sl   15:45   0:00 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      24056  4.3  0.2 503988 10472 ?        Ssl  15:45   0:19 /opt/ros/fuerte/stacks/robot_model/robot_state_publisher/bin/state_publisher __name:=robot_state_publisher __log:=/home/cga/
cga      24057  3.1  0.1 393552  6836 ?        Ssl  15:45   0:13 /opt/ros/fuerte/stacks/pr2_mechanism/pr2_mechanism_diagnostics/bin/pr2_mechanism_diagnostics __name:=pr2_mechanism_diagnosti
cga      24058  0.1  0.4 358364 16380 ?        Ssl  15:45   0:00 python /opt/ros/fuerte/share/rostopic/scripts/rostopic pub /calibrated std_msgs/Bool true __name:=fake_joint_calibration __l
cga      24086  4.3  0.7 545808 31072 ?        Ssl  15:45   0:19 /opt/ros/fuerte/stacks/geometry_experimental/tf2_ros/bin/buffer_server __name:=tf2_buffer_server __log:=/home/cga/.ros/log/9
cga      24087  4.2  0.4 507916 16656 ?        Ssl  15:45   0:18 python /opt/ros/fuerte/stacks/pr2_mechanism/pr2_controller_manager/scripts/spawner --wait-for=/calibrated atlas_controller _
cga      24116  3.2  0.4 1053416 16952 ?       Ssl  15:45   0:14 /opt/ros/fuerte/stacks/image_pipeline/stereo_image_proc/bin/stereo_image_proc __name:=stereo_proc __log:=/home/cga/.ros/log/
cga      24156  174  6.1 3124276 249132 ?      Rl   15:46  12:44 gazebo -s libgazebo_ros_api_plugin.so atlas.world
cga      24157 21.7  5.1 1504336 210060 ?      Sl   15:46   1:35 gazebo -s libgazebo_ros_api_plugin.so atlas.world

In a ROS-free version of DRCSIM (all plugins removed except one Gazebo controller plugin using full state feedback (hint hint), model re-implemented in SDF 1.3)

Real Time Factor hovers around 0.95

ps augx says:

cga      25929  0.0  0.7 360836 30324 pts/1    Sl+  15:56   0:00 gazebo world
cga      25931 99.5  2.0 1036884 83340 pts/1   Sl+  15:56   2:25 gazebo world
cga      25932 24.2  4.6 1479344 189288 pts/1  Sl+  15:56   0:35 gazebo world

This was a substantial machine in its time. It got us through the Little Dog project, and we were proud of it. It can barely keep up with the ROS-free version.

answered 2013-01-09 15:22:45 -0500

nkoenig

In terms of performance, we are working on solutions to simplify the models used in simulation. It may help (slightly) to run without the Gazebo GUI (run gzserver instead of gazebo).

