Scene Labeling from Camera Plugin

Hello,

I am attempting to generate a dataset using the camera plugin that I have on the robot. The task is essentially to generate a semantic mask for each class of object while driving the robot around in random patterns.

My initial approach was the following:

  1. Use the /gazebo/link_states topic to retrieve the pose of each object in the scene.
  2. Knowing the camera properties from the camera_info topic, use a geometric approach to figure out where an object of class C at pose (X, Y, Z) falls on the image frame, and label it accordingly (a rough sketch of this step follows the list).
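
To make step 2 concrete, here is a minimal sketch of the projection I have in mind, written as a Python/rospy node. The topic names, the link name "my_object::link", and the world-to-camera transform T_world_cam are placeholders, not values verified on my setup; I also assume the camera optical-frame convention (z pointing forward) so that K can be applied directly:

```python
#!/usr/bin/env python
# Rough sketch only: project one object's pose into pixel coordinates.
# Placeholders (not verified on the VRX setup): the topic names, the link
# name "my_object::link", and T_world_cam, the transform from the world
# frame into the camera *optical* frame, which would have to come from tf
# or from the known camera mounting.
import numpy as np
import rospy
from gazebo_msgs.msg import LinkStates
from sensor_msgs.msg import CameraInfo

K = None                  # 3x3 intrinsic matrix, filled in from camera_info
T_world_cam = np.eye(4)   # placeholder world -> camera-optical transform

def camera_info_cb(msg):
    global K
    K = np.array(msg.K).reshape(3, 3)

def link_states_cb(msg):
    if K is None:
        return
    try:
        i = msg.name.index("my_object::link")   # hypothetical link name
    except ValueError:
        return
    p = msg.pose[i].position
    # Transform the object's origin into the camera frame, then project.
    p_cam = T_world_cam.dot([p.x, p.y, p.z, 1.0])
    if p_cam[2] <= 0.0:
        return                                  # behind the camera
    u, v, w = K.dot(p_cam[:3])
    rospy.loginfo("object origin at pixel (%.1f, %.1f)", u / w, v / w)

rospy.init_node("pose_projector")
rospy.Subscriber("/camera/camera_info", CameraInfo, camera_info_cb)
rospy.Subscriber("/gazebo/link_states", LinkStates, link_states_cb)
rospy.spin()
```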

The issue with this approach is that a pose alone isn't enough to generate a mask: it tells me where the object's origin lands in the image, but nothing about the object's extent in the image or about occlusions between objects.

I figured that there might be a way in which the world is projected/cast onto the camera plugin's image plane. I was hoping to understand how this projection works under the hood in the camera plugin, and to see whether there is a way to also extract the class of the object that ends up at each pixel.
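
From what I understand so far (please correct me if I am wrong), the camera sensor rasterizes the scene with OGRE using a pinhole model, with the focal length derived from the <horizontal_fov> and <width> fields of the sensor's SDF; by the time the plugin receives the image it is plain color data, with no per-pixel object identity left. A small sketch of the intrinsics relationship as I understand it:

```python
import math

def gazebo_pinhole_intrinsics(width, height, horizontal_fov):
    """Intrinsics implied by a Gazebo camera sensor's SDF fields.

    horizontal_fov is in radians; Gazebo assumes square pixels, so
    fy == fx, and the principal point sits at the image center.
    """
    fx = (width / 2.0) / math.tan(horizontal_fov / 2.0)
    return fx, fx, width / 2.0, height / 2.0  # fx, fy, cx, cy

# e.g. a 1280x720 sensor with an 80-degree horizontal field of view
print(gazebo_pinhole_intrinsics(1280, 720, math.radians(80)))
```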

I am wondering if this sounds like a viable plan, and if anyone has any information on how the camera plugin is implemented.

In addition, if anyone has experience with dataset generation/labeling using Gazebo, any insight would be greatly appreciated.
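
For reference, the capture side I have in mind is roughly the following; the image topic name and the output naming are assumptions for illustration, and in practice I would probably throttle or sample rather than save every frame:

```python
#!/usr/bin/env python
# Sketch: save each camera frame to disk, stamped so it can later be
# paired with the link states recorded at the same time.
import cv2
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()

def image_cb(msg):
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    cv2.imwrite("frame_%d.png" % msg.header.stamp.to_nsec(), frame)

rospy.init_node("dataset_capture")
rospy.Subscriber("/camera/image_raw", Image, image_cb)
rospy.spin()
```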

I am using Gazebo 11 for this task, and my robot comes from the VRX project: https://github.com/osrf/vrx

Thank you.
