Ros on Arm Hanging

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Ros on Arm Hanging

Daniel Stonier
Hi all,

We're trying to get ros running on an armv6 core here and hitting a
few issues. If anyone has any insight on how to go about tracking down
the issues that would be great.

Some general details about the problem:

It's an armv6 core running a custom gentoo with cturtle. The example
programs I have to test with are a simple client-server pair. The
client pings the server with a timestamped rpc service request, the
server too, timestamps and then sends a response. When the response is
received by the client, it outputs and terminates.

So it's pretty simple, no more than a few lines of code in server and
client (I've added the code below). End result though, the server
catches the request, at which point both client and server hang and
server cpu % goes through the roof. Since the code is simple, I'm
figuring it must be behind the scenes somewhere (maybe in the rosdep
library builds or system configuration?)  I'm not sure how to track
down the issue though - anyone have any ideas?

Cheers,
Daniel.

***************************** client.cpp *****************************

class LatencyClient {
public:
    LatencyClient() :
        client(node.serviceClient< rpc_latency::ping_pong >("rpc_latency"))
    {}
    /**
     * Latency benchmarking service request handler. This initiates
the request and waits
     * for the response. Handling is done in main().
     * @return bool - success or failure of the request.
     */
    bool ping_pong() {
        rpc_latency::ping_pong srv;
        TimeStamp send_stamp;
        TimeStamp final_stamp;

        srv.request.ping_time = static_cast<double>(send_stamp.stamp());
        if ( client.call(srv) ) {
            final_stamp.stamp();
            ROS_INFO("Send   Time : %lf", srv.response.pong_time -
srv.request.ping_time);
            ROS_INFO("Return Time : %lf",
static_cast<double>(final_stamp) - srv.response.pong_time);
            ROS_INFO("Round  Trip : %lf",
static_cast<double>(final_stamp - send_stamp) );
            return true;
        } else {
            ROS_ERROR("Failed to call service rpc_latency");
            return false;
        }
    }

private:
    ros::NodeHandle node;
    ros::ServiceClient client;
};

int main(int argc, char **argv) {
    ros::init(argc, argv, "rpc_latency_client");

    LatencyClient latency;
    latency.ping_pong();
    return 0;
}

***************************** server.cpp *****************************

class LatencyServer {
public:
    LatencyServer() :
        service (node.advertiseService("rpc_latency",
&LatencyServer::ping_pong, this))
    {}

    void spin() {
        ros::spin();
    }
    bool ping_pong(rpc_latency::ping_pong::Request  &request,
                   rpc_latency::ping_pong::Response &response ) {
        response.pong_time = timestamp.stamp();
        ROS_INFO("TimeStamp: %lf", static_cast<double>(timestamp) );
        return true;
    }

private:
    TimeStamp timestamp;
    ros::NodeHandle node;
    ros::ServiceServer service;
};

int main(int argc, char **argv) {
    ros::init(argc, argv, "rpc_latency_server");

    LatencyServer latency_server;
    latency_server.spin();
    return 0;
}



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl
_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Josh Faust
Are you able to break into it in a debugger to see where it's using cpu?  Do topics work properly?

Josh

On Wed, Jun 30, 2010 at 5:46 PM, Daniel Stonier <[hidden email]> wrote:
Hi all,

We're trying to get ros running on an armv6 core here and hitting a
few issues. If anyone has any insight on how to go about tracking down
the issues that would be great.

Some general details about the problem:

It's an armv6 core running a custom gentoo with cturtle. The example
programs I have to test with are a simple client-server pair. The
client pings the server with a timestamped rpc service request, the
server too, timestamps and then sends a response. When the response is
received by the client, it outputs and terminates.

So it's pretty simple, no more than a few lines of code in server and
client (I've added the code below). End result though, the server
catches the request, at which point both client and server hang and
server cpu % goes through the roof. Since the code is simple, I'm
figuring it must be behind the scenes somewhere (maybe in the rosdep
library builds or system configuration?)  I'm not sure how to track
down the issue though - anyone have any ideas?

Cheers,
Daniel.

***************************** client.cpp *****************************

class LatencyClient {
public:
   LatencyClient() :
       client(node.serviceClient< rpc_latency::ping_pong >("rpc_latency"))
   {}
   /**
    * Latency benchmarking service request handler. This initiates
the request and waits
    * for the response. Handling is done in main().
    * @return bool - success or failure of the request.
    */
   bool ping_pong() {
       rpc_latency::ping_pong srv;
       TimeStamp send_stamp;
       TimeStamp final_stamp;

       srv.request.ping_time = static_cast<double>(send_stamp.stamp());
       if ( client.call(srv) ) {
           final_stamp.stamp();
           ROS_INFO("Send   Time : %lf", srv.response.pong_time -
srv.request.ping_time);
           ROS_INFO("Return Time : %lf",
static_cast<double>(final_stamp) - srv.response.pong_time);
           ROS_INFO("Round  Trip : %lf",
static_cast<double>(final_stamp - send_stamp) );
           return true;
       } else {
           ROS_ERROR("Failed to call service rpc_latency");
           return false;
       }
   }

private:
   ros::NodeHandle node;
   ros::ServiceClient client;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_client");

   LatencyClient latency;
   latency.ping_pong();
   return 0;
}

***************************** server.cpp *****************************

class LatencyServer {
public:
   LatencyServer() :
       service (node.advertiseService("rpc_latency",
&LatencyServer::ping_pong, this))
   {}

   void spin() {
       ros::spin();
   }
   bool ping_pong(rpc_latency::ping_pong::Request  &request,
                  rpc_latency::ping_pong::Response &response ) {
       response.pong_time = timestamp.stamp();
       ROS_INFO("TimeStamp: %lf", static_cast<double>(timestamp) );
       return true;
   }

private:
   TimeStamp timestamp;
   ros::NodeHandle node;
   ros::ServiceServer service;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_server");

   LatencyServer latency_server;
   latency_server.spin();
   return 0;
}



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl
_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users


_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Cedric Skybotix
In reply to this post by Daniel Stonier
Hi Daniel,

not sure if it is related, but when running ROS on the Gumstix Verdex, I
had to increase the timeout in roslaunch/launch.py
(_TIMEOUT_MASTER_START) from 10.0 seconds to 20.0 seconds. Maybe you're
also affected by some timeout problems or race conditions that happen
only on slow processors...

Best


On 07/01/2010 02:46 AM, Daniel Stonier wrote:

> Hi all,
>
> We're trying to get ros running on an armv6 core here and hitting a
> few issues. If anyone has any insight on how to go about tracking down
> the issues that would be great.
>
> Some general details about the problem:
>
> It's an armv6 core running a custom gentoo with cturtle. The example
> programs I have to test with are a simple client-server pair. The
> client pings the server with a timestamped rpc service request, the
> server too, timestamps and then sends a response. When the response is
> received by the client, it outputs and terminates.
>
> So it's pretty simple, no more than a few lines of code in server and
> client (I've added the code below). End result though, the server
> catches the request, at which point both client and server hang and
> server cpu % goes through the roof. Since the code is simple, I'm
> figuring it must be behind the scenes somewhere (maybe in the rosdep
> library builds or system configuration?)  I'm not sure how to track
> down the issue though - anyone have any ideas?
>
> Cheers,
> Daniel.
>
> ***************************** client.cpp *****************************
>
> class LatencyClient {
> public:
>      LatencyClient() :
>          client(node.serviceClient<  rpc_latency::ping_pong>("rpc_latency"))
>      {}
>      /**
>       * Latency benchmarking service request handler. This initiates
> the request and waits
>       * for the response. Handling is done in main().
>       * @return bool - success or failure of the request.
>       */
>      bool ping_pong() {
>          rpc_latency::ping_pong srv;
>          TimeStamp send_stamp;
>          TimeStamp final_stamp;
>
>          srv.request.ping_time = static_cast<double>(send_stamp.stamp());
>          if ( client.call(srv) ) {
>              final_stamp.stamp();
>              ROS_INFO("Send   Time : %lf", srv.response.pong_time -
> srv.request.ping_time);
>              ROS_INFO("Return Time : %lf",
> static_cast<double>(final_stamp) - srv.response.pong_time);
>              ROS_INFO("Round  Trip : %lf",
> static_cast<double>(final_stamp - send_stamp) );
>              return true;
>          } else {
>              ROS_ERROR("Failed to call service rpc_latency");
>              return false;
>          }
>      }
>
> private:
>      ros::NodeHandle node;
>      ros::ServiceClient client;
> };
>
> int main(int argc, char **argv) {
>      ros::init(argc, argv, "rpc_latency_client");
>
>      LatencyClient latency;
>      latency.ping_pong();
>      return 0;
> }
>
> ***************************** server.cpp *****************************
>
> class LatencyServer {
> public:
>      LatencyServer() :
>          service (node.advertiseService("rpc_latency",
> &LatencyServer::ping_pong, this))
>      {}
>
>      void spin() {
>          ros::spin();
>      }
>      bool ping_pong(rpc_latency::ping_pong::Request&request,
>                     rpc_latency::ping_pong::Response&response ) {
>          response.pong_time = timestamp.stamp();
>          ROS_INFO("TimeStamp: %lf", static_cast<double>(timestamp) );
>          return true;
>      }
>
> private:
>      TimeStamp timestamp;
>      ros::NodeHandle node;
>      ros::ServiceServer service;
> };
>
> int main(int argc, char **argv) {
>      ros::init(argc, argv, "rpc_latency_server");
>
>      LatencyServer latency_server;
>      latency_server.spin();
>      return 0;
> }
>
>
>
>    


--
Cedric - Specialising in cat herding
- http://www.asl.ethz.ch/people/cedricp
- http://www.skybotix.com

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Daniel Stonier
In reply to this post by Josh Faust
Ok, got waylaid with other jobs for a while, but having a look at it again.

Got into it with a debugger, and found it hanging at roslib/include/ros/serialization.h at

ROS_CREATE_SIMPLE_SERIALIZER(double);

I've got a backtrace obtained by attaching to the program as soon as it hung - pasted below. I also decided to test doubles in a more serious way on the board.

- Remade the srv's with int32 rather than float64 and it ran fine.
- Recompiled the entire ros tree and the original float64 program making sure I had the correct use flags for the board (-mfpu=vfp), still failed.
- Wrote a similar program passing std_msgs::Float64 back and forth between topics, that worked!

****************************************** Backtrace ******************************************

0x0001e378 in ros::serialization::Serializer<double>::write<ros::serialization::OStream> (
stream=..., v=304.18260199999997)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:198

warning: Source file is more recent than executable.
198 ROS_CREATE_SIMPLE_SERIALIZER(double);
(gdb) bt
#0 0x0001e378 in ros::serialization::Serializer<double>::write<ros::serialization::OStream> (
stream=..., v=304.18260199999997)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:198
#1 0x0001dcec in ros::serialization::serialize<double, ros::serialization::OStream> (stream=...,
t=@0x48af0) at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:149
#2 0x0001ed4c in next<double> (stream=..., m=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:741
#3 ros::serialization::Serializer<rpc_latency::float_ping_pongResponse_<std::allocator<void> > >::allInOne<ros::serialization::OStream, rpc_latency::float_ping_pongResponse_<std::allocator<void> > const&> (stream=..., m=...)
at /opt/iclebo/ycs/ycs_tools/rpc_latency/srv_gen/cpp/include/rpc_latency/float_ping_pong.h:304
#4 0x0001eab0 in ros::serialization::Serializer<rpc_latency::float_ping_pongResponse_<std::allocator<void> > >::write<ros::serialization::OStream, rpc_latency::float_ping_pongResponse_<std::allocator<void> > > (stream=..., t=...)
at /opt/iclebo/ycs/ycs_tools/rpc_latency/srv_gen/cpp/include/rpc_latency/float_ping_pong.h:307
#5 0x0001e2b4 in ros::serialization::serialize<rpc_latency::float_ping_pongResponse_<std::allocator<void> >, ros::serialization::OStream> (stream=..., t=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:149
#6 0x0001dae4 in ros::serialization::serializeServiceResponse<rpc_latency::float_ping_pongResponse_<std::allocator<void> > > (ok=true, message=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:831
#7 0x0001cd70 in ros::ServiceCallbackHelperT<ros::ServiceSpec<rpc_latency::float_ping_pongRequest_<std::allocator<void> >, rpc_latency::float_ping_pongResponse_<std::allocator<void> > > >::call (
this=0x45f20, params=...)
at /opt/iclebo/ros/core/roscpp/include/ros/service_callback_helper.h:189
#8 0x402336d0 in ros::ServiceCallback::call (this=0x49a10)
at /opt/iclebo/ros/core/roscpp/src/libros/service_publication.cpp:123
#9 0x40252c28 in ros::CallbackQueue::callOneCB (this=0x3b918, tls=0x48f90)
at /opt/iclebo/ros/core/roscpp/src/libros/callback_queue.cpp:360
#10 0x4025258c in ros::CallbackQueue::callAvailable (this=0x3b918, timeout=...)
at /opt/iclebo/ros/core/roscpp/src/libros/callback_queue.cpp:320


On 1 July 2010 14:17, Josh Faust <[hidden email]> wrote:
Are you able to break into it in a debugger to see where it's using cpu?  Do topics work properly?

Josh

On Wed, Jun 30, 2010 at 5:46 PM, Daniel Stonier <[hidden email]> wrote:
Hi all,

We're trying to get ros running on an armv6 core here and hitting a
few issues. If anyone has any insight on how to go about tracking down
the issues that would be great.

Some general details about the problem:

It's an armv6 core running a custom gentoo with cturtle. The example
programs I have to test with are a simple client-server pair. The
client pings the server with a timestamped rpc service request, the
server too, timestamps and then sends a response. When the response is
received by the client, it outputs and terminates.

So it's pretty simple, no more than a few lines of code in server and
client (I've added the code below). End result though, the server
catches the request, at which point both client and server hang and
server cpu % goes through the roof. Since the code is simple, I'm
figuring it must be behind the scenes somewhere (maybe in the rosdep
library builds or system configuration?)  I'm not sure how to track
down the issue though - anyone have any ideas?

Cheers,
Daniel.

***************************** client.cpp *****************************

class LatencyClient {
public:
   LatencyClient() :
       client(node.serviceClient< rpc_latency::ping_pong >("rpc_latency"))
   {}
   /**
    * Latency benchmarking service request handler. This initiates
the request and waits
    * for the response. Handling is done in main().
    * @return bool - success or failure of the request.
    */
   bool ping_pong() {
       rpc_latency::ping_pong srv;
       TimeStamp send_stamp;
       TimeStamp final_stamp;

       srv.request.ping_time = static_cast<double>(send_stamp.stamp());
       if ( client.call(srv) ) {
           final_stamp.stamp();
           ROS_INFO("Send   Time : %lf", srv.response.pong_time -
srv.request.ping_time);
           ROS_INFO("Return Time : %lf",
static_cast<double>(final_stamp) - srv.response.pong_time);
           ROS_INFO("Round  Trip : %lf",
static_cast<double>(final_stamp - send_stamp) );
           return true;
       } else {
           ROS_ERROR("Failed to call service rpc_latency");
           return false;
       }
   }

private:
   ros::NodeHandle node;
   ros::ServiceClient client;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_client");

   LatencyClient latency;
   latency.ping_pong();
   return 0;
}

***************************** server.cpp *****************************

class LatencyServer {
public:
   LatencyServer() :
       service (node.advertiseService("rpc_latency",
&LatencyServer::ping_pong, this))
   {}

   void spin() {
       ros::spin();
   }
   bool ping_pong(rpc_latency::ping_pong::Request  &request,
                  rpc_latency::ping_pong::Response &response ) {
       response.pong_time = timestamp.stamp();
       ROS_INFO("TimeStamp: %lf", static_cast<double>(timestamp) );
       return true;
   }

private:
   TimeStamp timestamp;
   ros::NodeHandle node;
   ros::ServiceServer service;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_server");

   LatencyServer latency_server;
   latency_server.spin();
   return 0;
}



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl
_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users


_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users




--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

rob wheeler-2
Just out of curiosity...Is your arm processor running in big-endian or little-endian mode?

-R

On Thu, Jul 8, 2010 at 3:43 AM, Daniel Stonier <[hidden email]> wrote:
Ok, got waylaid with other jobs for a while, but having a look at it again.

Got into it with a debugger, and found it hanging at roslib/include/ros/serialization.h at

ROS_CREATE_SIMPLE_SERIALIZER(double);

I've got a backtrace obtained by attaching to the program as soon as it hung - pasted below. I also decided to test doubles in a more serious way on the board.

- Remade the srv's with int32 rather than float64 and it ran fine.
- Recompiled the entire ros tree and the original float64 program making sure I had the correct use flags for the board (-mfpu=vfp), still failed.
- Wrote a similar program passing std_msgs::Float64 back and forth between topics, that worked!

****************************************** Backtrace ******************************************

0x0001e378 in ros::serialization::Serializer<double>::write<ros::serialization::OStream> (
stream=..., v=304.18260199999997)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:198

warning: Source file is more recent than executable.
198 ROS_CREATE_SIMPLE_SERIALIZER(double);
(gdb) bt
#0 0x0001e378 in ros::serialization::Serializer<double>::write<ros::serialization::OStream> (
stream=..., v=304.18260199999997)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:198
#1 0x0001dcec in ros::serialization::serialize<double, ros::serialization::OStream> (stream=...,
t=@0x48af0) at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:149
#2 0x0001ed4c in next<double> (stream=..., m=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:741
#3 ros::serialization::Serializer<rpc_latency::float_ping_pongResponse_<std::allocator<void> > >::allInOne<ros::serialization::OStream, rpc_latency::float_ping_pongResponse_<std::allocator<void> > const&> (stream=..., m=...)
at /opt/iclebo/ycs/ycs_tools/rpc_latency/srv_gen/cpp/include/rpc_latency/float_ping_pong.h:304
#4 0x0001eab0 in ros::serialization::Serializer<rpc_latency::float_ping_pongResponse_<std::allocator<void> > >::write<ros::serialization::OStream, rpc_latency::float_ping_pongResponse_<std::allocator<void> > > (stream=..., t=...)
at /opt/iclebo/ycs/ycs_tools/rpc_latency/srv_gen/cpp/include/rpc_latency/float_ping_pong.h:307
#5 0x0001e2b4 in ros::serialization::serialize<rpc_latency::float_ping_pongResponse_<std::allocator<void> >, ros::serialization::OStream> (stream=..., t=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:149
#6 0x0001dae4 in ros::serialization::serializeServiceResponse<rpc_latency::float_ping_pongResponse_<std::allocator<void> > > (ok=true, message=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:831
#7 0x0001cd70 in ros::ServiceCallbackHelperT<ros::ServiceSpec<rpc_latency::float_ping_pongRequest_<std::allocator<void> >, rpc_latency::float_ping_pongResponse_<std::allocator<void> > > >::call (
this=0x45f20, params=...)
at /opt/iclebo/ros/core/roscpp/include/ros/service_callback_helper.h:189
#8 0x402336d0 in ros::ServiceCallback::call (this=0x49a10)
at /opt/iclebo/ros/core/roscpp/src/libros/service_publication.cpp:123
#9 0x40252c28 in ros::CallbackQueue::callOneCB (this=0x3b918, tls=0x48f90)
at /opt/iclebo/ros/core/roscpp/src/libros/callback_queue.cpp:360
#10 0x4025258c in ros::CallbackQueue::callAvailable (this=0x3b918, timeout=...)
at /opt/iclebo/ros/core/roscpp/src/libros/callback_queue.cpp:320


On 1 July 2010 14:17, Josh Faust <[hidden email]> wrote:
Are you able to break into it in a debugger to see where it's using cpu?  Do topics work properly?

Josh

On Wed, Jun 30, 2010 at 5:46 PM, Daniel Stonier <[hidden email]> wrote:
Hi all,

We're trying to get ros running on an armv6 core here and hitting a
few issues. If anyone has any insight on how to go about tracking down
the issues that would be great.

Some general details about the problem:

It's an armv6 core running a custom gentoo with cturtle. The example
programs I have to test with are a simple client-server pair. The
client pings the server with a timestamped rpc service request, the
server too, timestamps and then sends a response. When the response is
received by the client, it outputs and terminates.

So it's pretty simple, no more than a few lines of code in server and
client (I've added the code below). End result though, the server
catches the request, at which point both client and server hang and
server cpu % goes through the roof. Since the code is simple, I'm
figuring it must be behind the scenes somewhere (maybe in the rosdep
library builds or system configuration?)  I'm not sure how to track
down the issue though - anyone have any ideas?

Cheers,
Daniel.

***************************** client.cpp *****************************

class LatencyClient {
public:
   LatencyClient() :
       client(node.serviceClient< rpc_latency::ping_pong >("rpc_latency"))
   {}
   /**
    * Latency benchmarking service request handler. This initiates
the request and waits
    * for the response. Handling is done in main().
    * @return bool - success or failure of the request.
    */
   bool ping_pong() {
       rpc_latency::ping_pong srv;
       TimeStamp send_stamp;
       TimeStamp final_stamp;

       srv.request.ping_time = static_cast<double>(send_stamp.stamp());
       if ( client.call(srv) ) {
           final_stamp.stamp();
           ROS_INFO("Send   Time : %lf", srv.response.pong_time -
srv.request.ping_time);
           ROS_INFO("Return Time : %lf",
static_cast<double>(final_stamp) - srv.response.pong_time);
           ROS_INFO("Round  Trip : %lf",
static_cast<double>(final_stamp - send_stamp) );
           return true;
       } else {
           ROS_ERROR("Failed to call service rpc_latency");
           return false;
       }
   }

private:
   ros::NodeHandle node;
   ros::ServiceClient client;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_client");

   LatencyClient latency;
   latency.ping_pong();
   return 0;
}

***************************** server.cpp *****************************

class LatencyServer {
public:
   LatencyServer() :
       service (node.advertiseService("rpc_latency",
&LatencyServer::ping_pong, this))
   {}

   void spin() {
       ros::spin();
   }
   bool ping_pong(rpc_latency::ping_pong::Request  &request,
                  rpc_latency::ping_pong::Response &response ) {
       response.pong_time = timestamp.stamp();
       ROS_INFO("TimeStamp: %lf", static_cast<double>(timestamp) );
       return true;
   }

private:
   TimeStamp timestamp;
   ros::NodeHandle node;
   ros::ServiceServer service;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_server");

   LatencyServer latency_server;
   latency_server.spin();
   return 0;
}



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl
_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users


_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users




--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

René Wagner
In reply to this post by Daniel Stonier
On Thu, 2010-07-08 at 19:43 +0900, Daniel Stonier wrote:
> Got into it with a debugger, and found it hanging at
> roslib/include/ros/serialization.h at
>
>
> ROS_CREATE_SIMPLE_SERIALIZER(double);

Depending on how the serializer is used, this may result in an
unaligned write. You can check whether an alignment error is involved
as follows:

echo 5 > /proc/cpu/alignment

and re-run the executable that's causing trouble. The program should
receive a signal (so you can see where exactly things go wrong in gdb)
and the kernel should report an "Alignment trap: ..." on the console or
in dmesg/syslog.

Cheers,

Rene

--
------------------------------------------------------------
Dipl.-Inf. René Wagner                     Junior Researcher
DFKI Bremen                           Enrique-Schmidt-Str. 5
Safe and Secure Cognitive Systems             D-28359 Bremen

Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
--- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
   Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                                          Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
                                Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern                          HRB 2313
------------------------------------------------------------

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Josh Faust
If this is the problem, the attached patch may help -- it turns the reinterpret_casts in the serialization of primitives into memcpys, which shouldn't require alignment.

Josh

On Thu, Jul 8, 2010 at 8:44 AM, René Wagner <[hidden email]> wrote:
On Thu, 2010-07-08 at 19:43 +0900, Daniel Stonier wrote:
> Got into it with a debugger, and found it hanging at
> roslib/include/ros/serialization.h at
>
>
> ROS_CREATE_SIMPLE_SERIALIZER(double);

Depending on how the serializer is used, this may result in an
unaligned write. You can check whether an alignment error is involved
as follows:

echo 5 > /proc/cpu/alignment

and re-run the executable that's causing trouble. The program should
receive a signal (so you can see where exactly things go wrong in gdb)
and the kernel should report an "Alignment trap: ..." on the console or
in dmesg/syslog.

Cheers,

Rene

--
------------------------------------------------------------
Dipl.-Inf. René Wagner                     Junior Researcher
DFKI Bremen                           Enrique-Schmidt-Str. 5
Safe and Secure Cognitive Systems             D-28359 Bremen

Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
--- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
  Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                                         Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
                               Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern                          HRB 2313
------------------------------------------------------------

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users


_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users

serialization_memcpy_instead_of_reinterpret.diff (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Daniel Stonier
In reply to this post by René Wagner
On 9 July 2010 00:44, René Wagner <[hidden email]> wrote:

>
> On Thu, 2010-07-08 at 19:43 +0900, Daniel Stonier wrote:
> > Got into it with a debugger, and found it hanging at
> > roslib/include/ros/serialization.h at
> >
> >
> > ROS_CREATE_SIMPLE_SERIALIZER(double);
>
> Depending on how the serializer is used, this may result in an
> unaligned write. You can check whether an alignment error is involved
> as follows:
>
> echo 5 > /proc/cpu/alignment
>
> and re-run the executable that's causing trouble. The program should
> receive a signal (so you can see where exactly things go wrong in gdb)
> and the kernel should report an "Alignment trap: ..." on the console or
> in dmesg/syslog.
>
> Cheers,
>
> Rene
>
> -

Thanks for the pointer Rene. Starting to actually find the right
documents when googling now. Some results:

INITIAL TESTS

- with 'echo 5 > proc/cpu/alignment' (i.e. signal + warning mode)
  - *both* my float64 and int32 test programs crash with a bus error
  - kernel message in both cases also gives an alignment trap msg :

Alignment trap: float_client (941) PC=0x40238ad4 Instr=0xe5933000
Address=0x00034ce1 FSR 0x011

- with 'echo 2 > proc/cpu/alignment' (i.e. fixup mode), both programs
work successfully
  - but this is an expensive workaround (I see google sets it to 4
(signal) on android to encourage people to program better)
- The output of 'cat /proc/cpu/alignment (should it be different to this?)

User: 2
System: 0
Skipped: 0
Half: 0
Word: 0
DWord: 0
Multi: 0
User faults: 5 (signal+warn)

AFTER PATCHING (MEMCPY)

- Patched (using the memcpy's rather than the reinterprets in the
SERIALIZER macro).
- In signal+warn mode (echo 5 > /proc/cpu/alignment)
  - (float64&int32) client gives a bus error and kernel outputs an
alignment trap message, but servers are fine
- In default mode (echo 0 > /proc/cpu/alignment)
  - (float64&int32) client and server function without hanging like before

So...its working, but is this the right fix?

 Cheers,
Daniel.

>
> -
> ------------------------------------------------------------
> Dipl.-Inf. René Wagner                     Junior Researcher
> DFKI Bremen                           Enrique-Schmidt-Str. 5
> Safe and Secure Cognitive Systems             D-28359 Bremen
>
> Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
> Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
> --- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
> Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>
> Geschäftsführung:
>   Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>                                          Dr. Walter Olthoff
> Vorsitzender des Aufsichtsrats:
>                                Prof. Dr. h.c. Hans A. Aukes
> Amtsgericht Kaiserslautern                          HRB 2313
> ------------------------------------------------------------
>
> _______________________________________________
> ros-users mailing list
> [hidden email]
> https://code.ros.org/mailman/listinfo/ros-users



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl
_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

René Wagner
On Fri, 2010-07-09 at 10:45 +0900, Daniel Stonier wrote:
> - Patched (using the memcpy's rather than the reinterprets in the
> SERIALIZER macro).
> - In signal+warn mode (echo 5 > /proc/cpu/alignment)
>   - (float64&int32) client gives a bus error and kernel outputs an
> alignment trap message, but servers are fine
> - In default mode (echo 0 > /proc/cpu/alignment)
>   - (float64&int32) client and server function without hanging like before
>
> So...its working, but is this the right fix?

You probably want to check where the unaligned memory access is
happening. Can you run your float_client from within gdb? This should
allow you to get a backtrace once the kernel has sent the signal (bus
error) from the alignment trap.

Cheers,

Rene

--
------------------------------------------------------------
Dipl.-Inf. René Wagner                     Junior Researcher
DFKI Bremen                           Enrique-Schmidt-Str. 5
Safe and Secure Cognitive Systems             D-28359 Bremen

Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
--- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
   Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                                          Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
                                Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern                          HRB 2313
------------------------------------------------------------

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Daniel Stonier
In reply to this post by rob wheeler-2


On 8 July 2010 23:51, Rob Wheeler <[hidden email]> wrote:
Just out of curiosity...Is your arm processor running in big-endian or little-endian mode?

-R


Little endian.
 
On Thu, Jul 8, 2010 at 3:43 AM, Daniel Stonier <[hidden email]> wrote:
Ok, got waylaid with other jobs for a while, but having a look at it again.

Got into it with a debugger, and found it hanging at roslib/include/ros/serialization.h at

ROS_CREATE_SIMPLE_SERIALIZER(double);

I've got a backtrace obtained by attaching to the program as soon as it hung - pasted below. I also decided to test doubles in a more serious way on the board.

- Remade the srv's with int32 rather than float64 and it ran fine.
- Recompiled the entire ros tree and the original float64 program making sure I had the correct use flags for the board (-mfpu=vfp), still failed.
- Wrote a similar program passing std_msgs::Float64 back and forth between topics, that worked!

****************************************** Backtrace ******************************************

0x0001e378 in ros::serialization::Serializer<double>::write<ros::serialization::OStream> (
stream=..., v=304.18260199999997)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:198

warning: Source file is more recent than executable.
198 ROS_CREATE_SIMPLE_SERIALIZER(double);
(gdb) bt
#0 0x0001e378 in ros::serialization::Serializer<double>::write<ros::serialization::OStream> (
stream=..., v=304.18260199999997)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:198
#1 0x0001dcec in ros::serialization::serialize<double, ros::serialization::OStream> (stream=...,
t=@0x48af0) at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:149
#2 0x0001ed4c in next<double> (stream=..., m=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:741
#3 ros::serialization::Serializer<rpc_latency::float_ping_pongResponse_<std::allocator<void> > >::allInOne<ros::serialization::OStream, rpc_latency::float_ping_pongResponse_<std::allocator<void> > const&> (stream=..., m=...)
at /opt/iclebo/ycs/ycs_tools/rpc_latency/srv_gen/cpp/include/rpc_latency/float_ping_pong.h:304
#4 0x0001eab0 in ros::serialization::Serializer<rpc_latency::float_ping_pongResponse_<std::allocator<void> > >::write<ros::serialization::OStream, rpc_latency::float_ping_pongResponse_<std::allocator<void> > > (stream=..., t=...)
at /opt/iclebo/ycs/ycs_tools/rpc_latency/srv_gen/cpp/include/rpc_latency/float_ping_pong.h:307
#5 0x0001e2b4 in ros::serialization::serialize<rpc_latency::float_ping_pongResponse_<std::allocator<void> >, ros::serialization::OStream> (stream=..., t=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:149
#6 0x0001dae4 in ros::serialization::serializeServiceResponse<rpc_latency::float_ping_pongResponse_<std::allocator<void> > > (ok=true, message=...)
at /opt/iclebo/ros/core/roslib/include/ros/serialization.h:831
#7 0x0001cd70 in ros::ServiceCallbackHelperT<ros::ServiceSpec<rpc_latency::float_ping_pongRequest_<std::allocator<void> >, rpc_latency::float_ping_pongResponse_<std::allocator<void> > > >::call (
this=0x45f20, params=...)
at /opt/iclebo/ros/core/roscpp/include/ros/service_callback_helper.h:189
#8 0x402336d0 in ros::ServiceCallback::call (this=0x49a10)
at /opt/iclebo/ros/core/roscpp/src/libros/service_publication.cpp:123
#9 0x40252c28 in ros::CallbackQueue::callOneCB (this=0x3b918, tls=0x48f90)
at /opt/iclebo/ros/core/roscpp/src/libros/callback_queue.cpp:360
#10 0x4025258c in ros::CallbackQueue::callAvailable (this=0x3b918, timeout=...)
at /opt/iclebo/ros/core/roscpp/src/libros/callback_queue.cpp:320


On 1 July 2010 14:17, Josh Faust <[hidden email]> wrote:
Are you able to break into it in a debugger to see where it's using cpu?  Do topics work properly?

Josh

On Wed, Jun 30, 2010 at 5:46 PM, Daniel Stonier <[hidden email]> wrote:
Hi all,

We're trying to get ros running on an armv6 core here and hitting a
few issues. If anyone has any insight on how to go about tracking down
the issues that would be great.

Some general details about the problem:

It's an armv6 core running a custom gentoo with cturtle. The example
programs I have to test with are a simple client-server pair. The
client pings the server with a timestamped rpc service request, the
server too, timestamps and then sends a response. When the response is
received by the client, it outputs and terminates.

So it's pretty simple, no more than a few lines of code in server and
client (I've added the code below). End result though, the server
catches the request, at which point both client and server hang and
server cpu % goes through the roof. Since the code is simple, I'm
figuring it must be behind the scenes somewhere (maybe in the rosdep
library builds or system configuration?)  I'm not sure how to track
down the issue though - anyone have any ideas?

Cheers,
Daniel.

***************************** client.cpp *****************************

class LatencyClient {
public:
   LatencyClient() :
       client(node.serviceClient< rpc_latency::ping_pong >("rpc_latency"))
   {}
   /**
    * Latency benchmarking service request handler. This initiates
the request and waits
    * for the response. Handling is done in main().
    * @return bool - success or failure of the request.
    */
   bool ping_pong() {
       rpc_latency::ping_pong srv;
       TimeStamp send_stamp;
       TimeStamp final_stamp;

       srv.request.ping_time = static_cast<double>(send_stamp.stamp());
       if ( client.call(srv) ) {
           final_stamp.stamp();
           ROS_INFO("Send   Time : %lf", srv.response.pong_time -
srv.request.ping_time);
           ROS_INFO("Return Time : %lf",
static_cast<double>(final_stamp) - srv.response.pong_time);
           ROS_INFO("Round  Trip : %lf",
static_cast<double>(final_stamp - send_stamp) );
           return true;
       } else {
           ROS_ERROR("Failed to call service rpc_latency");
           return false;
       }
   }

private:
   ros::NodeHandle node;
   ros::ServiceClient client;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_client");

   LatencyClient latency;
   latency.ping_pong();
   return 0;
}

***************************** server.cpp *****************************

class LatencyServer {
public:
   LatencyServer() :
       service (node.advertiseService("rpc_latency",
&LatencyServer::ping_pong, this))
   {}

   void spin() {
       ros::spin();
   }
   bool ping_pong(rpc_latency::ping_pong::Request  &request,
                  rpc_latency::ping_pong::Response &response ) {
       response.pong_time = timestamp.stamp();
       ROS_INFO("TimeStamp: %lf", static_cast<double>(timestamp) );
       return true;
   }

private:
   TimeStamp timestamp;
   ros::NodeHandle node;
   ros::ServiceServer service;
};

int main(int argc, char **argv) {
   ros::init(argc, argv, "rpc_latency_server");

   LatencyServer latency_server;
   latency_server.spin();
   return 0;
}



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl
_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users


_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users




--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users




--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Daniel Stonier
In reply to this post by René Wagner


On 12 July 2010 17:24, René Wagner <[hidden email]> wrote:
On Fri, 2010-07-09 at 10:45 +0900, Daniel Stonier wrote:
> - Patched (using the memcpy's rather than the reinterprets in the
> SERIALIZER macro).
> - In signal+warn mode (echo 5 > /proc/cpu/alignment)
>   - (float64&int32) client gives a bus error and kernel outputs an
> alignment trap message, but servers are fine
> - In default mode (echo 0 > /proc/cpu/alignment)
>   - (float64&int32) client and server function without hanging like before
>
> So...its working, but is this the right fix?

You probably want to check where the unaligned memory access is
happening. Can you run your float_client from within gdb? This should
allow you to get a backtrace once the kernel has sent the signal (bus
error) from the alignment trap.

Cheers,

Rene

Getting some more meaningful backtraces now - catch bus errors for both the float64's and the int32's in some cases, some with, some without that memcopy patch above. I'll put together some results when I get back to work in the morning.


 
-
------------------------------------------------------------
Dipl.-Inf. René Wagner                     Junior Researcher
DFKI Bremen                           Enrique-Schmidt-Str. 5
Safe and Secure Cognitive Systems             D-28359 Bremen

Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
--- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
  Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                                         Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
                               Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern                          HRB 2313
------------------------------------------------------------

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Daniel Stonier

This seems to be a bit of problem code with regards to the alignment issues:

roscpp/src/libros/service_server_link.cpp

*****************************************************************************************

void ServiceServerLink::onResponseOkAndLength(const ConnectionPtr& conn, const boost::shared_array<uint8_t>& buffer, uint32_t size, bool success)
{
ROS_ASSERT(conn == connection_);
ROS_ASSERT(size == 5);

if (!success)
return;

uint8_t ok = buffer[0];
uint32_t len = *((uint32_t*)(buffer.get() + 1));

*****************************************************************************************

That last line is where the signal is caught. I think the problem is, the arm core reads the uint32 values only at alignmed 4 byte boundaries - but that +1 generally shifts it off the 4 byte boundary and causes the signal.

Do intel machines even have an alignment trap like this arm core does? I noticed my machine does not have /proc/cpu/alignment.

Sorry, don't have a potential fix for it as I'm not familiar with this section of code yet.

More to come,
Regards,
Daniel.


On 12 July 2010 19:04, Daniel Stonier <[hidden email]> wrote:


On 12 July 2010 17:24, René Wagner <[hidden email]> wrote:
On Fri, 2010-07-09 at 10:45 +0900, Daniel Stonier wrote:
> - Patched (using the memcpy's rather than the reinterprets in the
> SERIALIZER macro).
> - In signal+warn mode (echo 5 > /proc/cpu/alignment)
>   - (float64&int32) client gives a bus error and kernel outputs an
> alignment trap message, but servers are fine
> - In default mode (echo 0 > /proc/cpu/alignment)
>   - (float64&int32) client and server function without hanging like before
>
> So...its working, but is this the right fix?

You probably want to check where the unaligned memory access is
happening. Can you run your float_client from within gdb? This should
allow you to get a backtrace once the kernel has sent the signal (bus
error) from the alignment trap.

Cheers,

Rene

Getting some more meaningful backtraces now - catch bus errors for both the float64's and the int32's in some cases, some with, some without that memcopy patch above. I'll put together some results when I get back to work in the morning.


 
-
------------------------------------------------------------
Dipl.-Inf. René Wagner                     Junior Researcher
DFKI Bremen                           Enrique-Schmidt-Str. 5
Safe and Secure Cognitive Systems             D-28359 Bremen

Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
--- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
  Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                                         Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
                               Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern                          HRB 2313
------------------------------------------------------------

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Josh Faust
x86 and x64 have no problems with unaligned read/write as far as I know -- which is why this hasn't come up.  Can you open a ticket and continue this discussion there?  It should be fairly easy to do some conditional changes based on CPU type.

Josh

On Mon, Jul 12, 2010 at 5:38 PM, Daniel Stonier <[hidden email]> wrote:

This seems to be a bit of problem code with regards to the alignment issues:

roscpp/src/libros/service_server_link.cpp

*****************************************************************************************

void ServiceServerLink::onResponseOkAndLength(const ConnectionPtr& conn, const boost::shared_array<uint8_t>& buffer, uint32_t size, bool success)
{
ROS_ASSERT(conn == connection_);
ROS_ASSERT(size == 5);

if (!success)
return;

uint8_t ok = buffer[0];
uint32_t len = *((uint32_t*)(buffer.get() + 1));

*****************************************************************************************

That last line is where the signal is caught. I think the problem is, the arm core reads the uint32 values only at alignmed 4 byte boundaries - but that +1 generally shifts it off the 4 byte boundary and causes the signal.

Do intel machines even have an alignment trap like this arm core does? I noticed my machine does not have /proc/cpu/alignment.

Sorry, don't have a potential fix for it as I'm not familiar with this section of code yet.

More to come,
Regards,
Daniel.


On 12 July 2010 19:04, Daniel Stonier <[hidden email]> wrote:


On 12 July 2010 17:24, René Wagner <[hidden email]> wrote:
On Fri, 2010-07-09 at 10:45 +0900, Daniel Stonier wrote:
> - Patched (using the memcpy's rather than the reinterprets in the
> SERIALIZER macro).
> - In signal+warn mode (echo 5 > /proc/cpu/alignment)
>   - (float64&int32) client gives a bus error and kernel outputs an
> alignment trap message, but servers are fine
> - In default mode (echo 0 > /proc/cpu/alignment)
>   - (float64&int32) client and server function without hanging like before
>
> So...its working, but is this the right fix?

You probably want to check where the unaligned memory access is
happening. Can you run your float_client from within gdb? This should
allow you to get a backtrace once the kernel has sent the signal (bus
error) from the alignment trap.

Cheers,

Rene

Getting some more meaningful backtraces now - catch bus errors for both the float64's and the int32's in some cases, some with, some without that memcopy patch above. I'll put together some results when I get back to work in the morning.


 
-
------------------------------------------------------------
Dipl.-Inf. René Wagner                     Junior Researcher
DFKI Bremen                           Enrique-Schmidt-Str. 5
Safe and Secure Cognitive Systems             D-28359 Bremen

Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
--- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
  Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                                         Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
                               Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern                          HRB 2313
------------------------------------------------------------

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users
Reply | Threaded
Open this post in threaded view
|

Re: Ros on Arm Hanging

Daniel Stonier


On 13 July 2010 09:42, Josh Faust <[hidden email]> wrote:
x86 and x64 have no problems with unaligned read/write as far as I know -- which is why this hasn't come up.  Can you open a ticket and continue this discussion there?  It should be fairly easy to do some conditional changes based on CPU type.

Josh


On Mon, Jul 12, 2010 at 5:38 PM, Daniel Stonier <[hidden email]> wrote:

This seems to be a bit of problem code with regards to the alignment issues:

roscpp/src/libros/service_server_link.cpp

*****************************************************************************************

void ServiceServerLink::onResponseOkAndLength(const ConnectionPtr& conn, const boost::shared_array<uint8_t>& buffer, uint32_t size, bool success)
{
ROS_ASSERT(conn == connection_);
ROS_ASSERT(size == 5);

if (!success)
return;

uint8_t ok = buffer[0];
uint32_t len = *((uint32_t*)(buffer.get() + 1));

*****************************************************************************************

That last line is where the signal is caught. I think the problem is, the arm core reads the uint32 values only at alignmed 4 byte boundaries - but that +1 generally shifts it off the 4 byte boundary and causes the signal.

Do intel machines even have an alignment trap like this arm core does? I noticed my machine does not have /proc/cpu/alignment.

Sorry, don't have a potential fix for it as I'm not familiar with this section of code yet.

More to come,
Regards,
Daniel.


On 12 July 2010 19:04, Daniel Stonier <[hidden email]> wrote:


On 12 July 2010 17:24, René Wagner <[hidden email]> wrote:
On Fri, 2010-07-09 at 10:45 +0900, Daniel Stonier wrote:
> - Patched (using the memcpy's rather than the reinterprets in the
> SERIALIZER macro).
> - In signal+warn mode (echo 5 > /proc/cpu/alignment)
>   - (float64&int32) client gives a bus error and kernel outputs an
> alignment trap message, but servers are fine
> - In default mode (echo 0 > /proc/cpu/alignment)
>   - (float64&int32) client and server function without hanging like before
>
> So...its working, but is this the right fix?

You probably want to check where the unaligned memory access is
happening. Can you run your float_client from within gdb? This should
allow you to get a backtrace once the kernel has sent the signal (bus
error) from the alignment trap.

Cheers,

Rene

Getting some more meaningful backtraces now - catch bus errors for both the float64's and the int32's in some cases, some with, some without that memcopy patch above. I'll put together some results when I get back to work in the morning.


 
-
------------------------------------------------------------
Dipl.-Inf. René Wagner                     Junior Researcher
DFKI Bremen                           Enrique-Schmidt-Str. 5
Safe and Secure Cognitive Systems             D-28359 Bremen

Phone: (+49) 421-218-64224       Fax: (+49) 421-218-98-64224
Web: http://www.informatik.uni-bremen.de/agebv/en/ReneWagner
--- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Deutsches Forschungszentrum für Künstliche Intelligenz  GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschäftsführung:
  Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                                         Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
                               Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern                          HRB 2313
------------------------------------------------------------

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl



--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users



_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users




--
Phone : +82-10-5400-3296 (010-5400-3296)
Home: http://snorriheim.dnsdojo.com/
Yujin Robot: http://www.yujinrobot.com/
Embedded Control Libraries: http://snorriheim.dnsdojo.com/redmine/wiki/ecl

_______________________________________________
ros-users mailing list
[hidden email]
https://code.ros.org/mailman/listinfo/ros-users