[Discourse.ros.org] [Client Libraries] Python3 and strings


Dirk Thomas via ros-users


Hi everyone,

I am currently writing some python2/python3 libraries to work with ROS messages, and I am in need of some information.

How should the `string` message field be treated in Python 3?
There is no information about that at http://wiki.ros.org/msg, but in Python 3 we need to specify an encoder/decoder whenever we convert a string into a sequence of bytes and vice versa...

Is there any information about this that I missed somewhere? Thanks!





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/1) or reply to this email to respond.


If you do not want to receive messages from ros-users please use the unsubscribe link below. If you use the one above, you will stop all of ros-users from receiving updates.
______________________________________________________________________________
ros-users mailing list
[hidden email]
http://lists.ros.org/mailman/listinfo/ros-users
Unsubscribe: <http://lists.ros.org/mailman//options/ros-users>

Dirk Thomas via ros-users


That page does mention:
> unicode strings are currently not supported as a ROS data type. utf-8 should be used to be compatible with ROS string serialization. In python 2, this encoding is automatic for unicode objects, but decoding must be done manually. In python 3, both encoding and decoding are automatic.





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/2) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


It also says:

- Primitive Type: string
- Serialization: ascii string (4)
- C++: std::string
- Python: **str**

and

- Primitive Type: uint8[]
- Serialization: uint32 length prefix
- C++: std::vector
- Python: **bytes**


Also, in Python 3 both encoding and decoding are automatic, based on the platform you are running on, provided you use the right type (bytes or str).

If two communicating nodes run on platforms that use different encodings, then messages will probably arrive garbled if we intend to send a string.

On the other hand, if we do not send a string with an encoding, then we are sending **bytes**, just like for a `uint8[]` field.

- Is `string` the same as `uint8[]`? (and the Python type should be **bytes**)
- Or should ROS enforce some unicode encoding for `string`? (and the Python type can be **str**)

In any case, it seems the wiki page should list Python 2 and Python 3 separately to avoid confusion...
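
To make the garbling concern concrete, here is a minimal sketch in plain Python 3 with no ROS involved. The sample text and the choice of Latin-1 as the "other" encoding are assumptions made purely for illustration.

```python
# Illustration only: plain Python 3, no ROS message types involved.
text = "temp\u00e9rature"  # contains one non-ASCII character (e with acute accent)

# The same text produces different bytes under different encodings.
utf8_payload = text.encode("utf-8")
latin1_payload = text.encode("latin-1")
differs = utf8_payload != latin1_payload

# A receiver that assumes the wrong encoding gets garbled text back.
garbled = utf8_payload.decode("latin-1")

# Pure ASCII text is unaffected, because both encodings agree on ASCII.
ascii_is_safe = "abc".encode("utf-8") == "abc".encode("latin-1")

print(differs, repr(garbled), ascii_is_safe)
```

This is why the question of whether `string` carries an agreed encoding or raw bytes matters as soon as non-ASCII characters are involved.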





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/3) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


From the generated Python code for a msg, when serializing the message into a buffer to send it, ROS encodes the string field as a UTF-8 string (`x` is a string field in the ROS msg):

    _x = self.x
    if python3 or type(_x) == unicode:
      _x = _x.encode('utf-8')

Similarly, when deserializing the received buffer, the field is decoded from UTF-8 into a Python str (on Python 3 only):

    if python3:
      self.x = str[start:end].decode('utf-8')
    else:
      self.x = str[start:end]

So on the user side, you just need to make sure that the encoding for the string you're sending is utf-8.
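
As a rough sketch of that round trip in Python 3: the helper names `serialize_string` and `deserialize_string` are hypothetical, and the wire layout used here (a little-endian uint32 byte length followed by the encoded bytes) is an assumption for illustration rather than a copy of the generated code.

```python
import struct


def serialize_string(value):
    """Hypothetical helper: encode str as UTF-8, then prepend a uint32 byte length."""
    if isinstance(value, str):
        value = value.encode("utf-8")
    return struct.pack("<I", len(value)) + value


def deserialize_string(buffer, offset=0):
    """Hypothetical helper: read the length prefix, then decode the bytes as UTF-8."""
    (length,) = struct.unpack_from("<I", buffer, offset)
    start = offset + 4
    end = start + length
    return buffer[start:end].decode("utf-8"), end


payload = serialize_string("h\u00e9llo")
text, end = deserialize_string(payload)
print(len(payload), repr(text), end)
```

Note that the length prefix counts encoded bytes, not characters, so a non-ASCII character makes the payload longer than the text.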

With that in mind, that block from the msg wiki page seems sufficient to me:
> unicode strings are currently not supported as a ROS data type. utf-8 should be used to be compatible with ROS string serialization. In python 2, this encoding is automatic for unicode objects, but decoding must be done manually. In python 3, both encoding and decoding are automatic.





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/4) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


Interestingly, from this code, I understand the exact opposite of

> unicode strings are currently not supported


We are obviously using the Unicode codec `UTF-8` to encode and decode it, and the matching Python type is a [unicode string](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str). So looking at this code, I would say:
'A `string` field in a ROS message is a unicode string, and will be encoded/decoded using UTF-8 for serialization/deserialization'

And in that case the wiki should state:

- Primitive Type: string
- Serialization: **utf-8** string (4)
- C++: std::string
- Python3: str
- Python2: **unicode**

**_On the other hand_**, if this is not true and the ROS serialization only supports ASCII, then the matching Python type should be bytes, and the wiki should say:

- Primitive Type: string
- Serialization: ascii string (4)
- C++: std::string
- Python3: **bytes**
- Python2: str

and the serialization code needs to be fixed (no need to encode/decode, unicode is not supported).





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/5) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


Yes you're right, that statement doesn't seem to be correct.

As per your recommendation, I think mentioning utf-8 string as the serialization type would be fine (though not sure if that is the right thing with the C++ client library), but it would be better to use/recommend the str type for Python2 since there is no automatic decoding into a unicode string for Python 2. So it would just be type str for both Python 2 and 3.





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/6) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


But `str` in Python 3 is `unicode` in Python 2, and having different ways to serialize data between different versions of Python will break a few things in many places ("why is my message garbled on this node and not that one?").
We could do that, but it would require a "big warning" everywhere we mention this topic...

=> I **could not find any REP** specification regarding the message serialization, and how to match the types of the supported languages and integrate deserialization with it. It seems this is something we need in order to drive implementation (especially given ROS supports multiple languages) and prevent "incomplete features" as much as possible.

The current serialization code **breaks**:
- when we pass `bytes` in Python 3 (no `encode` method); fix attempt [here](https://github.com/ros/genpy/pull/85/files)
- when we pass a `unicode` in Python 2 (the receiving end loses the encoding)
- when we pass a `str` in Python 3 (a receiving Python 2 end loses the encoding)

=> we need a solution (design fix) that integrates properly for all supported languages...
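
The first failure in the list is easy to reproduce in plain Python 3. The sketch below also shows one possible tolerant helper; `encode_field` is a hypothetical name, and this is not necessarily what the linked fix attempt does.

```python
# Python 3: bytes objects have no encode() method, so encoding unconditionally fails.
try:
    b"abc".encode("utf-8")
    error = None
except AttributeError as exc:
    error = exc


def encode_field(value):
    """Hypothetical helper: accept either str or bytes for a string field."""
    if isinstance(value, bytes):
        return value  # already bytes, pass through untouched
    return value.encode("utf-8")


print(type(error).__name__)
print(encode_field(b"abc"), encode_field("abc"))
```

A helper like this only addresses the sending side; it does nothing for the second and third bullets, where the receiver no longer knows which encoding was used.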





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/7) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


I think fully supporting unicode strings would require a lot of effort, more so on the C++ client libraries.
Sadly not a solution, but for now the recommendation of just sticking to ASCII strings would prevent the issues mentioned in the second and third bullets from affecting user code.





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/8) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


Agreed. That means that the advised/documented Python 3 type should be `bytes`...
I went ahead and worked on an update to the wiki, to try to remove the confusion when talking about py2/py3.





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/9) or reply to this email to respond.



Dirk Thomas via ros-users
In reply to this post by Dirk Thomas via ros-users


Well, you can make both Python 2 & 3 string msg types be bytes with a note that bytes is the same as str in Python2.

    $ python2
    Python 2.7.13 (default, Jul 21 2017, 03:24:34)
    >>> a = str('123')
    >>> type(a)
    <type 'str'>
    >>> a = bytes('123')
    >>> type(a)
    <type 'str'>
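
For comparison, a small Python 3 sketch of the same experiment (this snippet is meant for Python 3 only). There, `bytes` and `str` are distinct types, and building `bytes` from text requires an explicit encoding; the choice of `"ascii"` below is an assumption for illustration.

```python
# Python 3 only: bytes and str are different types here.
same_type = bytes is str

# Constructing bytes from text without naming an encoding raises TypeError.
try:
    bytes("123")
    error = None
except TypeError as exc:
    error = exc

# With an explicit encoding the conversion works.
encoded = bytes("123", "ascii")

print(same_type, type(error).__name__, encoded)
```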





---
[Visit Topic](https://discourse.ros.org/t/python3-and-strings/2392/10) or reply to this email to respond.

