I/O shows 'could not connect' after 254 update

7 replies

12 November, 2010 - 03:37

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

I just updated to 254 and now all my I/O shows 'could not connect'.

RF, relay, input, temp, and LCD. (LCD does display IP)

I've tried rebooting several times and got it back very briefly once, for a few seconds, but then it went again. Is this just a hardware failure that happens to coincide with the update, or could it be a firmware issue?

12 November, 2010 - 08:53

derek

Offline

Glasgow, United Kingdom

Joined: 26 Oct 2009

Death of a process

254 is good on my HAH. If the LCD is showing the IP address, it's unlikely that it's a hardware failure.

You will have a lot more xAP traffic on your LAN than either Brett or I. I'm thinking that perhaps something on your network is exposing a bug in the new xaplib2 and causing the HAH process that drives the interface to the UI to panic and die.

After you see the '?'s on the UI, can you telnet into the HAH, issue a 'ps' command and post the results here?

Then, we'll be able to narrow this one down.

Derek.

12 November, 2010 - 11:55

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

Arthur Miller says...

livebox login: root
Password:
# ps
  PID USER       VSZ STAT COMMAND
    1 root      2260 S    init
    2 root         0 SW   [keventd]
    3 root         0 SWN  [ksoftirqd_CPU0]
    4 root         0 SW   [kswapd]
    5 root         0 SW   [bdflush]
    6 root         0 SW   [kupdated]
    7 root         0 SW   [mtdblockd]
    8 root         0 SW   [khubd]
   29 root         0 SWN  [jffs2_gcd_mtd2]
   95 root      2252 S    udhcpc -T 10 -i br0
  110 root      1668 S    dropbear -p 22
  118 root      2244 S    telnetd -p 23
  128 root      1248 S    pure-ftpd (SERVER)
  135 root      2244 S    inetd
  141 root      1020 S    /usr/bin/xap-hub -i br0
  152 root      4956 S    /usr/bin/kloned
  153 root      4968 S    /usr/bin/kloned
  156 root      4992 S    /usr/bin/kloned
  181 root      2272 S    -ash
  182 root      2248 R    ps
#

12 November, 2010 - 12:16

derek

Offline

Glasgow, United Kingdom

Joined: 26 Oct 2009

Yup ... it's gone

OK. So, the xap-livebox process is indeed dead.

Next thing is to re-start it, in the foreground, in debug mode.

From a telnet session use ...

/usr/bin/xap-livebox -d 7 -s /dev/ttyS0 -i br0

Hopefully, this might give some info re why the process is having a panic.

12 November, 2010 - 13:00

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

xAP goes the distance.. and more

It's a long incoming xAP message. The sender keeps each packet within a UDP packet size of 1500 bytes and, if the report is larger, spreads the groups across multiple xAP packets, but I think it's still proving too long for the new library.

[dbg][rx.c:23:readXapData] Rx xAP packet
xap-header
{
v=13
hop=1
uid=FF.6E17:0000
class=lighting.info
source=UKUSA.GHgateway.C-Bus
}
Status.GroupState
{
Group00=Off
Group02=On
Group03=Off
Group04=Err
Group05=Off

<snip>

Group50=Off
Group51=Off
Group52=Off
Group53=Off
Group54=Er
Segmentation fault
#

The mi4 xAP News application and xAP TV applications bring it down as well..
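
For anyone wondering what "spreading the groups across multiple packets" amounts to on the sending side, here is a rough sketch in C. It is not taken from the C-Bus gateway or from xaplib2; `packets_needed`, `MAX_PACKET` and the `overhead` parameter (bytes used by the xAP header and block framing in each packet) are names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PACKET 1500  /* packet size limit quoted in the post above */

/*
 * Hypothetical helper: count how many packets are needed to carry n
 * "key=value" lines when every packet also spends `overhead` bytes on
 * its header and block framing. Returns -1 if a single line can never fit.
 */
static int packets_needed(const char *lines[], int n, size_t overhead)
{
    int packets = 0;
    size_t used = 0;

    assert(lines != NULL || n == 0);

    for (int i = 0; i < n; i++) {
        size_t len = strlen(lines[i]) + 1;  /* +1 for the newline */

        if (overhead + len > MAX_PACKET)
            return -1;                      /* line too long for any packet */

        /* Start a new packet for the first line, or when this line
         * would push the current packet over the limit. */
        if (packets == 0 || used + len > MAX_PACKET) {
            packets++;
            used = overhead;
        }
        used += len;
    }
    return packets;
}
```

Even with the sender splitting correctly, a single packet can still hold well over 50 short `GroupNN=Off` lines, so staying under the size limit says nothing about how many key/value pairs the receiver has to store.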

12 November, 2010 - 13:00

derek

Offline

Glasgow, United Kingdom

Joined: 26 Oct 2009

Killer xAP message on the wire

Yes. The thing to do would be to drop Brett an email with the exact message that is giving the problem.

Then, this issue should be reproducible and a fix could be rolled into xaplib2.

It's good to have somebody with a variety of xAP enabled kit helping out on the testing.

Derek.

12 November, 2010 - 13:17

brett

Offline

Providence, United States

Joined: 9 Jan 2010

Large packets

If you could capture the entire xAP packet and post it, I'll use it as a test case to find out why it's breaking. Failing that, I'll just make a large packet up and try it out.

12 November, 2010 - 14:28

brett

Offline

Providence, United States

Joined: 9 Jan 2010

A couple of bugs

The segmentation fault was due to me having a > instead of a >= in my parse code when detecting that I can't store any more key/value pairs. Having said that, I only stored 50, which wouldn't have been enough for your message anyway, so it would have silently overwritten your data, which would have been harder to figure out. So the SEGV was a lucky break in the end.

I've increased the number of key=value pairs to 150. When I run out of storage a message will be logged, so at least if this does happen you can find out why.
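
To illustrate the kind of off-by-one described above, here is a minimal sketch of a bounded key/value table in C. It is not the xaplib2 source: `kv_store`, `kv_store_add` and `MAX_FIELD` are hypothetical names, and only the 150-pair limit comes from this post.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PAIRS 150  /* limit mentioned in this post (254 stored 50) */
#define MAX_FIELD 64   /* assumed field width, not taken from xaplib2 */

struct kv_pair {
    char key[MAX_FIELD];
    char value[MAX_FIELD];
};

struct kv_store {
    struct kv_pair pairs[MAX_PAIRS];
    int count;
};

/* Append one pair. Returns 0 on success, -1 if the table is full. */
static int kv_store_add(struct kv_store *s, const char *key, const char *value)
{
    assert(s != NULL && key != NULL && value != NULL);

    /* Valid indices are 0..MAX_PAIRS-1, so the table is full once
     * count == MAX_PAIRS. Testing with '>' instead of '>=' would let
     * one extra write land past the end of the array. */
    if (s->count >= MAX_PAIRS) {
        fprintf(stderr, "kv_store: out of storage, dropping %s=%s\n", key, value);
        return -1;
    }

    /* snprintf truncates over-long fields instead of overflowing them. */
    snprintf(s->pairs[s->count].key, MAX_FIELD, "%s", key);
    snprintf(s->pairs[s->count].value, MAX_FIELD, "%s", value);
    s->count++;
    return 0;
}
```

A parser built this way would call `kv_store_add` once per `key=value` line and carry on with the rest of the message when it returns -1, so an oversized message gets logged and truncated rather than crashing the process.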

Pushed 255.

This bug only affects those with LARGE xAP messages which is why it went unnoticed during my testing.

Brett
