[Discourse.ros.org] [Computer Vision / Perception] New Computer Vision Message Standards


[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users


First off: Thanks for this proposal! A standardized set of vision messages has been sorely missing for years.

Regarding [Detection3D](https://github.com/Kukanani/vision_msgs_proposal/blob/28acc935ddf6ef887fd5b3f5999cd7e14e8ee7e8/msg/Detection3D.msg):

I strongly believe we need a separate Pose for each object hypothesis. For example, when meshes are used to represent the object classes, the Pose specifies the pose of the mesh's reference frame in the `Detection3D.header.frame_id` frame. In the following mesh, the reference frame is at the bottom of the mug and in the center of the mug's round part, not at the center of the mesh's bounding box:

<img src="/uploads/ros/original/1X/f102dd3bd238347f63c920557e3baa161ee6da6e.png" width="500" height="500">

Without a Pose for each object class, we cannot express "this object could be either a laptop in its usual orientation, or a book lying flat (i.e., rotated by 90° if your mesh is of a book standing upright)".

My proposal would be to either include an array of Poses in a 3D-specific `CategoryDistribution` message, or (since we now have 3 arrays that must be the same size) as an array of `ObjectHypothesis` messages (or whatever we want to call it) that would have one `id`, `score` and `Pose` each.
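
To make the second option concrete, here is a rough Python sketch of one hypothesis carrying its own `id`, `score` and pose. The class and field names (`ObjectHypothesis`, `Detection3D`, `hypotheses`, `best`) and the `camera_link` frame are hypothetical and only mirror the wording above; they are not the definitions from the proposal repository.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Pose:
    # Position (x, y, z) and orientation quaternion (x, y, z, w) of the
    # mesh reference frame, expressed in the detection's header frame.
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    orientation: Tuple[float, float, float, float] = (0.0, 0.0, 0.0, 1.0)


@dataclass
class ObjectHypothesis:
    # One candidate class with its own score and its own pose (hypothetical).
    id: str
    score: float
    pose: Pose


@dataclass
class Detection3D:
    frame_id: str
    hypotheses: List[ObjectHypothesis] = field(default_factory=list)

    def best(self) -> ObjectHypothesis:
        # Highest-scoring hypothesis; raises ValueError if the list is empty.
        return max(self.hypotheses, key=lambda h: h.score)


# Quaternion (x, y, z, w) for a 90-degree rotation about the x axis.
HALF_SQRT2 = 2 ** 0.5 / 2

detection = Detection3D(
    frame_id="camera_link",
    hypotheses=[
        ObjectHypothesis("laptop", 0.6, Pose()),
        ObjectHypothesis(
            "book", 0.4, Pose(orientation=(HALF_SQRT2, 0.0, 0.0, HALF_SQRT2))
        ),
    ],
)
```

With one message per hypothesis there is a single array, so the three parallel arrays can no longer get out of sync.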

--------------------------------------------------

I was also sorry to see that [BoundingBox3D](https://github.com/Kukanani/vision_msgs_proposal/blob/2a1682f322dc08bcaf268d0833dd1fc4758aedaf/msg/BoundingBox3D.msg) was removed. (This was meant to represent a bounding box of the points in `source_cloud`, right?) I've always included this in my own message definitions, and I've found it extremely useful.

On the other hand, this information can be re-computed from the `source_cloud`, so I can live with that (although it's a bit wasteful). Also, other people might prefer to use a `shape_msgs/Mesh bounding_mesh`, like in [object_recognition_msgs/RecognizedObject](https://github.com/wg-perception/object_recognition_msgs/blob/d55acd8aa14fac162e19573ccce6a266b4df23c2/msg/RecognizedObject.msg#L25-L27), or something completely different, and it would overcomplicate the message if we'd include all possible kinds of extra information.
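
As a rough illustration of the re-computation, here is a minimal sketch that derives an axis-aligned box from a list of points. It assumes the cloud has already been unpacked into plain `(x, y, z)` tuples and that an axis-aligned box in the cloud's frame is what is wanted; the function name is made up for this example.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def axis_aligned_bounding_box(points: Iterable[Point]) -> Tuple[Point, Point]:
    """Return (center, size) of the axis-aligned box enclosing the points."""
    pts = list(points)
    if not pts:
        raise ValueError("cannot compute a bounding box of an empty cloud")
    # Per-axis minimum and maximum over all points.
    mins = tuple(min(p[i] for p in pts) for i in range(3))
    maxs = tuple(max(p[i] for p in pts) for i in range(3))
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    size = tuple(hi - lo for lo, hi in zip(mins, maxs))
    return center, size


cloud = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (0.5, 1.0, 1.0)]
center, size = axis_aligned_bounding_box(cloud)
```

This is a full pass over the cloud for every subscriber that needs the box, which is the "wasteful" part.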





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/21) or reply to this email to respond.


If you do not want to receive messages from ros-users please use the unsubscribe link below. If you use the one above, you will stop all of ros-users from receiving updates.
______________________________________________________________________________
ros-users mailing list
[hidden email]
http://lists.ros.org/mailman/listinfo/ros-users
Unsubscribe: <http://lists.ros.org/mailman//options/ros-users>

[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


You're right, when you have a CategoryDistribution, it is likely to change per pixel.

For most applications I've seen, a single label per pixel is fine. So then:

    string[] results
    sensor_msgs/Image mask
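
A small Python sketch of how a consumer might read such a message. It assumes each mask pixel stores an index into `results`, which the two-field definition above does not actually spell out, and it uses nested lists in place of a real `sensor_msgs/Image` buffer.

```python
from typing import Dict, List


def label_at(results: List[str], mask: List[List[int]], row: int, col: int) -> str:
    """Look up the class name for one pixel of an index mask."""
    return results[mask[row][col]]


def pixel_counts(results: List[str], mask: List[List[int]]) -> Dict[str, int]:
    """Count how many pixels were assigned to each label."""
    counts = {name: 0 for name in results}
    for row in mask:
        for index in row:
            counts[results[index]] += 1
    return counts


# Toy 2x3 mask; the values index into `results` (assumption of this sketch).
results = ["background", "mug", "table"]
mask = [
    [0, 0, 1],
    [2, 1, 1],
]
```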





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/22) or reply to this email to respond.



[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


What do you think about unifying the [BoundingBox2D.msg](https://github.com/Kukanani/vision_msgs_proposal/blob/master/msg/BoundingBox2D.msg) and [BoundingBox3D.msg](https://github.com/Kukanani/vision_msgs_proposal/blob/master/msg/BoundingBox3D.msg) messages?
We could use three fields (or a vector) for the bounding box size and, in the case of a 2D bounding box, the Z field would be zero. I think we would gain simplicity, in the same way that the generic pose message (geometry_msgs/Pose) serves both 2D and 3D poses, depending on what is set in Z.





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/23) or reply to this email to respond.



[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


Something like this, for both 2D and 3D bounding boxes:

    # position and rotation of the bounding box
    geometry_msgs/Pose pose

    # size of the bounding box
    geometry_msgs/Point size
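
A minimal Python sketch of how client code might treat such a unified box. The `BoundingBox` class, its method names, and the rule "size z equal to zero means 2D" are assumptions taken from the description above; orientation is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BoundingBox:
    # Position part of the pose (x, y, z); orientation omitted in this sketch.
    position: Tuple[float, float, float]
    # Extent along x, y, z; z == 0.0 marks a 2D box in this sketch.
    size: Tuple[float, float, float]

    def is_2d(self) -> bool:
        return self.size[2] == 0.0

    def extent(self) -> float:
        """Area for a 2D box, volume for a 3D box."""
        sx, sy, sz = self.size
        return sx * sy if self.is_2d() else sx * sy * sz


box_2d = BoundingBox(position=(10.0, 20.0, 0.0), size=(4.0, 6.0, 0.0))
box_3d = BoundingBox(position=(0.0, 0.0, 0.0), size=(1.0, 2.0, 3.0))
```

One consequence of this convention is that a genuinely flat 3D box cannot be told apart from a 2D box, so consumers would have to agree on how to interpret a zero Z size.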





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/24) or reply to this email to respond.



[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


In case somebody is still following this thread: The discussion seems to have moved to the [GitHub repository](https://github.com/Kukanani/vision_msgs_proposal/). I've just discovered that my proposed changes were silently accepted! :tada:





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/25) or reply to this email to respond.



[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


I am forwarding a comment from a colleague, Jordi Pages:

"
In general, the messages included so far look pretty good IMHO.

We could point to our own vision messages in https://github.com/pal-robotics/pal_msgs so you may have some new ideas.

For example, we could check how some specific detections like faces (along with person name/ID, emotions and facial expressions), actions, planar objects or legs (even though the latter is not really vision-based detection but laser-based) can be encoded in their messages or whether some additional fields might be required.

In some messages you include an optional field of type sensor_msgs/Image, I suppose for debugging, for monitoring which part of the input image has been processed, or for post-processing. Depending on the main purpose, you might consider using a sensor_msgs/CompressedImage instead (or both) to avoid penalizing remote subscribers, for whom the topic frequency might drop. Including an image is not strictly necessary: if the timestamp of the Header is set equal to the timestamp of the input image, a TimeSynchronizer can later be used to recover the same information using the BoundingBox2D. But I also like having it here, since it saves the burden of using a TimeSynchronizer...

An additional field I have found useful in some cases is a geometry_msgs/TransformStamped expressing the pose of the camera at the moment the processed image was acquired.

Regards
"

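
The timestamp-based pairing mentioned in the forwarded comment can be illustrated without ROS. The following is only a sketch of exact-stamp matching, not the `message_filters` implementation; `match_by_stamp`, the `(sec, nsec)` stamp tuples, and the string payloads are hypothetical stand-ins for real messages.

```python
from typing import Hashable, List, Tuple


def match_by_stamp(
    images: List[Tuple[Hashable, object]],
    detections: List[Tuple[Hashable, object]],
) -> List[Tuple[object, object]]:
    """Pair each detection with the image that has exactly the same stamp."""
    by_stamp = {stamp: image for stamp, image in images}
    pairs = []
    for stamp, detection in detections:
        # Detections whose stamp matches no image are silently dropped.
        if stamp in by_stamp:
            pairs.append((by_stamp[stamp], detection))
    return pairs


# Stamps are (seconds, nanoseconds) tuples in this sketch.
images = [((1, 0), "img_a"), ((1, 500), "img_b"), ((2, 0), "img_c")]
detections = [((1, 500), "det_b"), ((3, 0), "det_x")]
pairs = match_by_stamp(images, detections)
```

This only works if the detector copies the input image's stamp into its own header unchanged, which is the precondition stated in the comment.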




---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/26) or reply to this email to respond.



[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


I've been asked to provide overall feedback on the proposal. Please find my comments below.

In general, I find the effort very valuable and useful. However, I'd like to add some comments on the proposal:

BoundingBoxXD.msg:

- Do we need to agree on where the origin of the BB is (top-left, center, ...)? If so, say it explicitly in the message comments.

- It's not rigorous to give a pose (point + orientation) to something called "origin", which implicitly is just a point. Maybe calling it "pose" or "origin_pose" would be more adequate.

- Size in 2D is implemented with two ints, while in 3D it is a Vector3 of doubles. I'm trying to imagine whether there is some situation where a float 2D BB is required (subpixel approaches). Just a heads-up on that.


DetectionXD.msg

- Sometimes detectors also provide uncertainty on the pose-space of the detection. Providing just a BB for the spatial data of a detection does not allow conveying this valuable information, especially for fusion (e.g., tracking) algorithms that need to work with pose-space uncertainty.


Lack of Services

- Consider whether it would be useful to add some services to the package, mainly based on the proposed messages. This would allow detectors to work in a client-server mode, with customizable requests.


Best Regards, and thanks again for the effort!





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/27) or reply to this email to respond.



[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


[quote="andreucm, post:27, topic:1819"]
BoundingBoxXD.msg:
Do we need to agree on where the origin of the BB is (top-left, center, ...)? If so, say it explicitly in the message comments.
[/quote]

+1. All the message comments should be much more explicit anyway IMO.

If we use rotated bounding boxes with float width + height, the origin should be the center (see below). If we don't have rotation and use a uint32 width + height, the origin should be the upper left corner, otherwise we can't represent even-sized bounding boxes properly (the center would need to be at a "*\*.5*" position).
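
The even-size problem is easy to check with a bit of arithmetic. The sketch below assumes integer coordinates refer to whole pixels and that a box with upper-left corner `(x, y)` covers columns `x .. x + width - 1` and rows `y .. y + height - 1`; the function names are made up for this example.

```python
from fractions import Fraction
from typing import Tuple


def center_from_top_left(x: int, y: int, width: int, height: int) -> Tuple[Fraction, Fraction]:
    """Exact center of a box covering columns x..x+width-1 and rows y..y+height-1."""
    return (x + Fraction(width - 1, 2), y + Fraction(height - 1, 2))


def representable_with_integers(center: Tuple[Fraction, Fraction]) -> bool:
    """True if both center coordinates are whole numbers."""
    return all(c.denominator == 1 for c in center)


odd_box = center_from_top_left(10, 20, 5, 3)   # columns 10..14, rows 20..22
even_box = center_from_top_left(10, 20, 4, 3)  # columns 10..13, rows 20..22
```

Under these assumptions `odd_box` has a whole-pixel center, while the 4-pixel-wide `even_box` has its horizontal center halfway between columns 11 and 12, which an integer center field cannot store.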

[quote="andreucm, post:27, topic:1819"]
* It's not rigorous to give a pose (point + orientation) to something called "origin", which implicitly is just a point. Maybe calling it "pose" or "origin_pose" would be more adequate.

* Size in 2D is implemented with two ints, while in 3D it is a Vector3 of doubles. I'm trying to imagine whether there is some situation where a float 2D BB is required (subpixel approaches). Just a heads-up on that.
[/quote]

The 2D Bounding Box format is currently being discussed in [this PR](https://github.com/Kukanani/vision_msgs_proposal/pull/5); everyone, please feel free to contribute to that discussion.

The pose (in the current proposal) is actually meant to have an orientation; otherwise we shouldn't be using a Pose2D but a point, as you say. I'm unsure whether it's a good or a bad idea to have rotated bounding boxes and subpixel (float) resolution, see the discussion in the PR. Please chime in.

[quote="andreucm, post:27, topic:1819"]
Lack of Services

* Consider whether it would be useful to add some services to the package, mainly based on the proposed messages. This would allow detectors to work in a client-server mode, with customizable requests.
[/quote]

+1





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/28) or reply to this email to respond.



[Discourse.ros.org] [Computer Vision / Perception] Proposal - New Computer Vision Message Standards

William Woodall via ros-users
In reply to this post by William Woodall via ros-users


Good catch on including pose uncertainty information. I'll update the 3D poses to use a `geometry_msgs/PoseWithCovariance`.
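
For readers unfamiliar with that message: `geometry_msgs/PoseWithCovariance` carries a pose plus a row-major 6x6 covariance stored as 36 floats, with parameters ordered x, y, z, rotation about X, rotation about Y, rotation about Z. A detector that only knows independent per-axis variances could fill it roughly as follows; this is a plain-Python sketch, the helper name is hypothetical, and the numbers are made up.

```python
from typing import List, Sequence


def diagonal_covariance(variances: Sequence[float]) -> List[float]:
    """Build a row-major 6x6 covariance with the given variances on the diagonal."""
    if len(variances) != 6:
        raise ValueError("expected 6 variances: x, y, z, rot_x, rot_y, rot_z")
    covariance = [0.0] * 36
    for i, variance in enumerate(variances):
        # Element (i, i) of a row-major 6x6 matrix sits at index i * 6 + i.
        covariance[i * 6 + i] = float(variance)
    return covariance


# Illustrative values only: tighter in x/y than in z, loosest in yaw.
covariance = diagonal_covariance([0.01, 0.01, 0.04, 0.1, 0.1, 0.2])
```

The resulting list has the shape a tracker would expect in the message's `covariance` field; correlated uncertainties would need the off-diagonal entries as well.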





---
[Visit Topic](https://discourse.ros.org/t/proposal-new-computer-vision-message-standards/1819/29) or reply to this email to respond.

