ntp and multi-machines

ntp and multi-machines

Enea Scioni
Hi all!

After getting the navigation stack to run successfully on my platform, I found that the embedded board I use is not powerful enough to run the whole stack. I am now trying to run only a few nodes (the driver nodes for my robot) on the embedded board and the "high-level" part of the navigation stack on my laptop.

To do this, my laptop (which also runs the master) communicates with the embedded board over wireless, and I launch all nodes from a .launch file in which the driver nodes are set up to run on the remote machine.
The nodes, parameters and everything else are the same as when I ran everything on my laptop. I have also already driven the robot remotely with keyboard control, so (I guess) I don't have multi-machine problems.

(But!) When I run the navigation stack, the move_base node can't start because it is waiting for the transform between the odom and map frames. In fact I get the warning: /map frame doesn't exist. After confirming that the map frame is not being published (amcl should publish it), I checked the amcl node and found a warning that 100% of messages are dropped on the /odom message filter. Usually that means something is wrong in the tf stream, but the nodes and the naming "stuff" are the same as before (and they worked before). So I thought it could be a time problem, and I read that the pr2 has a similar problem http://www.ros.org/wiki/pr2_computer_monitor#ntp_monitor.py , as does, in general, any robot that runs different nodes on different machines.
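
As a rough illustration of why a clock difference between machines could produce exactly this symptom, here is a toy model in Python 3. It is not the tf or message-filter implementation: `can_transform`, `dropped_fraction`, the 10-second cache length, and all timestamps are assumptions made for the sketch. The only idea it captures is that a stamped message can be transformed when its timestamp falls inside the window of transforms currently buffered, so messages stamped by a clock that is far off never match.

```python
def can_transform(stamp, newest_tf_stamp, cache_seconds=10.0):
    """Toy check: a message is usable only if its stamp lies inside the
    window of buffered transforms [newest - cache, newest]."""
    return newest_tf_stamp - cache_seconds <= stamp <= newest_tf_stamp


def dropped_fraction(message_stamps, newest_tf_stamp, cache_seconds=10.0):
    """Fraction of messages whose stamps fall outside the transform window."""
    if not message_stamps:
        return 0.0
    dropped = sum(
        1 for s in message_stamps
        if not can_transform(s, newest_tf_stamp, cache_seconds)
    )
    return dropped / len(message_stamps)


# Hypothetical numbers: transforms stamped by a machine with a correct clock,
# messages stamped either by a synchronized clock or by one decades behind.
newest_tf = 1273080000.0
synced_msgs = [newest_tf - 0.3, newest_tf - 0.2, newest_tf - 0.1]
unsynced_msgs = [600.0, 600.1, 600.2]

print(dropped_fraction(synced_msgs, newest_tf))    # 0.0
print(dropped_fraction(unsynced_msgs, newest_tf))  # 1.0
```

In this model a large enough offset drops every message, which would match a "100% dropped" warning, although other tf problems can produce the same warning.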

So I would like to ask whether anybody has already had (and resolved) the same problem, and whether the problem could be the time difference.
If so, how can I use the ntp_monitor.py script? I looked at the pr2.launch file in the pr2_bringup package and did the same in my launch file, but when I run rostopic echo diagnostics I get two different messages: one for my laptop's NTP offset, with "level 0", and one for my embedded board, with level 2 (Error running ntpupdate). On the embedded board I set the NTP host to my laptop, which also runs ntpd, and if I run "sudo ntpdate laptop" in a bash shell on the embedded board, the time is updated, so the NTP server works.

Thank you!!

Greetings,
Enea Scioni




Re: ntp and multi-machines

Kevin Watts
For viewing diagnostics messages, it might be easier to use the runtime_monitor (http://www.ros.org/wiki/runtime_monitor).

ntp_monitor runs "ntpdate -q <server>" and "ntpdate -q <hostname>" to check the clock against the configured NTP host and against the machine itself. If both commands succeed (return code 0), ntp_monitor should report the offset.
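
As a sketch of the kind of check described above (not the actual ntp_monitor source), the following Python 3 code runs `ntpdate -q` against a host, treats a non-zero return code as a failure, and otherwise pulls the offset out of the output. The function names and the regular expression are assumptions for illustration; the sample line has the format that `ntpdate -q` prints.

```python
import re
import subprocess

# Matches e.g. "offset -0.000012" in a "server ..." line of ntpdate output.
OFFSET_RE = re.compile(r"offset\s+([-+]?\d+(?:\.\d+)?)")


def parse_offset(ntpdate_output):
    """Return the first offset (in seconds) found in `ntpdate -q` output,
    or None if no offset is present."""
    match = OFFSET_RE.search(ntpdate_output)
    return float(match.group(1)) if match else None


def query_offset(host):
    """Run `ntpdate -q <host>`; return (return_code, offset_or_None).
    The offset is only reported when the command succeeded."""
    proc = subprocess.run(
        ["ntpdate", "-q", host],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    offset = parse_offset(proc.stdout) if proc.returncode == 0 else None
    return proc.returncode, offset


sample = "server 127.0.0.1, stratum 3, offset -0.000012, delay 0.02565"
print(parse_offset(sample))  # -1.2e-05
```

Calling `query_offset("laptop")` on the board would then show both the return code and the offset in one step, assuming `ntpdate` is installed there.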

It sounds like "sudo ntpdate <host>" works, so this might be some kind of configuration or permissions problem on your board.

Kevin



In reply to Enea Scioni's message of Thu, Apr 29, 2010 at 7:44 AM.

Re: ntp and multi-machines

Enea Scioni
I also tried logging in to the embedded board with the root account, but that didn't solve the problem...

Enea Scioni

Re: ntp and multi-machines

Kevin Watts
Can you post the output of "ntpdate -q <localhost>" and "ntpdate -q <server>", along with the return codes?

In reply to Enea Scioni's message of Wed, May 5, 2010 at 8:52 AM.

Re: ntp and multi-machines

Enea Scioni
Sure!
On my laptop the output is:
$ ntpdate -q 127.0.0.1
server 127.0.0.1, stratum 3, offset -0.000012, delay 0.02565
 5 May 19:22:28 ntpdate[5000]: adjust time server 127.0.0.1 offset -0.000012 sec

and

$ ntpdate -q 134.58.255.1
server 134.58.255.1, stratum 2, offset -0.000195, delay 0.02599
 5 May 19:22:57 ntpdate[5001]: adjust time server 134.58.255.1 offset -0.000195 sec

where 134.58.255.1 is the server IP.

On the embedded board, instead:
$ ntpdate -q 127.0.0.1
server 127.0.0.1, stratum 0, offset 0.000000, delay 0.00000
 1 Jan 01:08:09 ntpdate[1535]: no server suitable for synchronization found
 
(expected, because no NTP daemon is running on the embedded board), and
$ ntpdate -q 10.8.30.30
server 10.8.30.30, stratum 3, offset 1273079765.187633, delay 0.02666
 1 Jan 01:09:58 ntpdate[1541]: step time server 10.8.30.30 offset 1273079765.187633 sec

where 10.8.30.30 is the IP assigned to my PC on the LAN.

Enea Scioni

Re: ntp and multi-machines

Kevin Watts
First of all, it looks like the offset between your embedded board and your laptop is really, really high (about 1.27 billion seconds, going by your ntpdate output). You can probably use "chrony" to sync them: http://chrony.tuxfamily.org/
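
To put that number in perspective, a quick calculation (a sketch using only Python's standard library) adds the reported offset of 1273079765.187633 seconds to the Unix epoch. The result lands on 5 May 2010, the date of these messages, which suggests the board's clock is sitting close to 1 Jan 1970 rather than merely drifting.

```python
from datetime import datetime, timedelta, timezone

# Offset reported by `ntpdate -q 10.8.30.30` on the embedded board.
offset_seconds = 1273079765.187633

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
implied_time = epoch + timedelta(seconds=offset_seconds)

print(implied_time.date())                              # 2010-05-05
print(round(offset_seconds / (365.25 * 24 * 3600), 1))  # 40.3 (years)
```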

ntp_monitor.py checks against both the remote host and the local computer. If either of these commands fails, ntp_monitor.py won't work. I filed a ticket so that I can disable the check against the local computer in a future release: https://code.ros.org/trac/wg-ros-pkg/ticket/4277

Note that the command ntp_monitor.py actually runs uses the computer's hostname, not its IP.

Can you try:
$ python
>>> import socket
>>> socket.gethostname()
'MY_HOSTNAME'
>>> exit()
$ ntpdate -q MY_HOSTNAME
Can you post the output of that command and its return code (from "echo $?")? Thanks.
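
The same steps can be scripted. The Python 3 sketch below looks up the hostname with `socket.gethostname()`, runs `ntpdate -q` on it, and returns the output together with the return code (what `echo $?` would show). The names `run_query` and `main`, and the injectable `runner` argument, are assumptions made so the function can be exercised without `ntpdate` installed.

```python
import socket
import subprocess


def run_query(host=None, runner=subprocess.run):
    """Run `ntpdate -q <host>` and return (command, return_code, output).

    `host` defaults to this machine's hostname, as in the manual steps.
    """
    if host is None:
        host = socket.gethostname()
    command = ["ntpdate", "-q", host]
    result = runner(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return command, result.returncode, result.stdout


def main():
    """Print the command, its output, and its return code."""
    try:
        cmd, code, output = run_query()
    except FileNotFoundError:
        print("ntpdate is not installed on this machine")
        return
    print(" ".join(cmd))
    print(output, end="")
    print("return code:", code)
```

Calling `main()` on the embedded board would print everything requested above in one go.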

Kevin




In reply to Enea Scioni's message of Wed, May 5, 2010 at 10:41 AM.

Re: ntp and multi-machines

Enea Scioni
Yes, in fact the embedded board has no backup (button) battery for its clock, so on every reboot the date resets to 1 Jan 1970; that is why the offset is so large. I'll try chrony to fix this and let you know the result.
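
Since a board without a battery-backed clock starts near 1 Jan 1970, one possible safeguard is to hold off starting time-sensitive nodes until the clock looks plausible. This is an illustration only: `MIN_PLAUSIBLE_EPOCH`, `clock_looks_set`, and `wait_for_clock` are hypothetical names, and the threshold is an arbitrary choice.

```python
import time

# Hypothetical threshold: 1 Jan 2010 00:00:00 UTC as seconds since the epoch.
MIN_PLAUSIBLE_EPOCH = 1262304000


def clock_looks_set(now=None, minimum=MIN_PLAUSIBLE_EPOCH):
    """Return True if the system time is at or after `minimum`."""
    if now is None:
        now = time.time()
    return now >= minimum


def wait_for_clock(timeout=60.0, poll=1.0, now_fn=time.time, sleep_fn=time.sleep):
    """Poll until the clock looks set; return False if `timeout` seconds of
    polling pass first."""
    waited = 0.0
    while not clock_looks_set(now_fn()):
        if waited >= timeout:
            return False
        sleep_fn(poll)
        waited += poll
    return True
```

The elapsed time is counted in polls rather than read from the wall clock, because the wall clock is exactly what is expected to jump once the board is synchronized.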

The output is the same with both the IP and the hostname:
5 May 20:55:46 ntpdate[1558]: no server suitable for synchronization found

The hostname is not the problem: in my earlier post I ran the command with the IP, and the behaviour is the same either way.
(I also checked /etc/hosts.)

Enea Scioni

Re: ntp and multi-machines

Blaise Gassend
You will definitely want to use the initstepslew option in your chrony
configuration.
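
For reference, a chrony configuration using this option might contain lines like the ones the Python sketch below prints. `initstepslew` takes a step threshold in seconds followed by one or more servers: at startup the clock is stepped if the error exceeds the threshold and slewed otherwise. The 30-second threshold, the use of the laptop's LAN address 10.8.30.30 as the server, and the helper name `render_chrony_conf` are assumptions for illustration.

```python
def render_chrony_conf(server, step_threshold=30):
    """Return minimal chrony.conf text that names `server` as the time
    source and applies initstepslew with the given threshold at startup."""
    lines = [
        "server {}".format(server),
        "initstepslew {} {}".format(step_threshold, server),
    ]
    return "\n".join(lines) + "\n"


print(render_chrony_conf("10.8.30.30"), end="")
# server 10.8.30.30
# initstepslew 30 10.8.30.30
```

The location of the configuration file depends on the distribution, and the laptop still has to be running a time server that the board can reach.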

In reply to Enea Scioni's message of Wed, 2010-05-05 at 12:00 -0700.

Re: ntp and multi-machines

Kevin Watts
Thanks for the heads up. It sounds like ntp_monitor.py won't work on the embedded board until I close that ticket. Patches are always welcome.

Kevin

In reply to Blaise Gassend's message of Wed, May 5, 2010 at 12:13 PM.