iserver crashing/freezing

21 replies
Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010

The iServer on my system is freezing quite often, at least a couple of times a day.

On the web portal, I get "Connection refused: check iServer is running".

On XFX viewer, I can see that the heartbeat stopped after just short of 3 hours. The Joggler then does not respond until I reboot the Livebox.

Any ideas how to diagnose this?

I am running Build: 289/1.0

processes running after crash are:

  PID USER       VSZ STAT COMMAND
    1 root      2256 S    init
    2 root         0 SW   [keventd]
    3 root         0 RWN  [ksoftirqd_CPU0]
    4 root         0 SW   [kswapd]
    5 root         0 SW   [bdflush]
    6 root         0 SW   [kupdated]
    7 root         0 SW   [mtdblockd]
    8 root         0 SW   [khubd]
   29 root         0 SWN  [jffs2_gcd_mtd2]
  107 root      1668 S    dropbear -p 22
  120 root      2240 S    inetd
  126 root      1028 S    /usr/bin/xap-hub -i br0
  129 root      3316 S    /usr/bin/xap-pachube -i br0
  130 root      1068 S    /usr/bin/xap-livebox -s /dev/ttyS0 -i br0
  140 root      5000 S    /usr/bin/kloned
  141 root      5012 S    /usr/bin/kloned
  142 root      2636 S    /usr/bin/xap-currentcost -s /dev/ttyUSB0 -i br0
  144 root      3628 S    /usr/bin/xap-twitter -i br0
  147 root      3708 S    lua /etc_ro_fs/plugboard/plugboard.lua
  157 root      5012 S    /usr/bin/kloned
  163 root      1740 S    dropbear -p 22
  164 root      2272 S    -ash
  165 root      2244 R    ps

 

Many thanks 

 

Alan

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
I can restart it by saving settings on web portal

Obviously not a solution, but it all kicks into life and reports two iServer processes on the console (server and client, maybe?):

 

 

171 root      1456 S    /usr/bin/iServer -i br0

172 root      1456 S    /usr/bin/iServer -i br0

 

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
Anyone help please?

brett
Offline
Providence, United States
Joined: 9 Jan 2010
Run in debug mode

Alan

The two processes are normal; one is a thread.
Can you run the iServer manually from the command line in debug mode and see if you get any interesting feedback?

# killall iServer
# iServer -i br0 -d 9

I've not had the iServer crash on me and I run it 24/7.  Do you run other devices on your network besides the joggler talking to it?

Brett

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
Thanks Brett

Running it now, it seems to be running ok. Not sure what I am looking for, but output is flying up the screen. Will wait until it freezes and take a look.

Is there a way to get the output from debug mode to save in a log file?

The only thing that talks to iServer is the Joggler, but I suppose I could run xAPFlash in a web browser. Would that prove anything?

Alan

derek
Offline
Glasgow, United Kingdom
Joined: 26 Oct 2009
Solid for me

My Joggler/iServer combo runs 24x7 without issue. I do a planned reboot of the HAH every fortnight.

You definitely need to run iServer in debug mode to get an insight into what might be going on.

Derek.

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
I am talking about 3 hours...

until it freezes typically.

Got the debug going - lots of output and still working so far.

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
It took a while but gave up after about 4 hours this morning and here is the debug output from the iserver.

 

{
v=12
hop=1
uid=FF00D800
class=xap-hbeat.alive
source=dbzoo.livebox.Plugboard
interval=60
port=3645
}
[dbg][parse.c:33:xapGetValueF] section=xap-header key=target
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][rx.c:23:readXapData] Rx xAP packet
xap-header
{
v=12
hop=1
uid=FF00DB80
class=xAPBSC.event
source=dbzoo.livebox.Controller:1wire.1
}
input.state
{
state=on
text=22.5
displaytext=Sensor 1 22.5
}

[dbg][parse.c:33:xapGetValueF] section=xap-header key=target
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][parse.c:42:xapGetValueF] found value=dbzoo.livebox.Controller:1wire.1
[dbg][filter.c:53:xapFilterAddrSubaddress] filterAddr=use.last.device addr=dbzoo.livebox.Controller:1wire.1
[dbg][filter.c:136:xapCompareFilters] match=0
[dbg][parse.c:33:xapGetValueF] section=xap-header key=source
[dbg][parse.c:42:xapGetValueF] found value=dbzoo.livebox.Controller:1wire.1
[dbg][filter.c:53:xapFilterAddrSubaddress] filterAddr=dbzoo.livebox.Controller:1wire.1 addr=dbzoo.livebox.Controller:1wire.1
[dbg][filter.c:136:xapCompareFilters] match=1
[dbg][main.c:121:sendToClient] 192.168.1.25 msg <xap>xap-header
{
v=12
hop=1
uid=FF00DB80
class=xapbsc.event
source=dbzoo.livebox.controller:1wire.1
}
input.state
{
state=on
text=22.5
displaytext=Sensor 1 22.5
}
</xap>
[err][main.c:97:sendAll] errno 145 (Connection timed out)
[err][main.c:97:sendAll] send
: Bad font file format

 

brett
Offline
Providence, United States
Joined: 9 Jan 2010
First solid bit of evidence

GREAT... this is the first bit of solid evidence that I've had of the iServer breaking.

errno 145 is ETIMEDOUT. It means the connection has been dropped, either because of a network failure or because the system on the other end went down without notice, i.e. your Joggler.

This could be the elusive bug that Kevin was hitting too and I could never reproduce or get more information on so I couldn't fix it.

Did you happen to notice if your joggler rebooted or if its network connection got dropped?  Are you running wireless to it?  I can fix the problem but I'm just curious how this is coming about.

Brett

brett
Offline
Providence, United States
Joined: 9 Jan 2010
Build 290

Update to build 290 and let me know if you still experience any problems with the iServer stability.

Brett

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
Great news that you can fix it Brett

Joggler is wireless, livebox is wired. Neither rebooted.

In fact nothing noticeable happened, apart from a lack of response or display on the Joggler end. I will run it again before I upgrade to 290 to see if I can spot when the heartbeats fail on the Joggler and the Livebox (i.e. see which drops first).
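
To see which side drops first, one option is to log whether the Joggler is reachable at regular intervals and compare the timestamps with the gap in the heartbeats. This is only a minimal sketch: the function name probe_once, the log path and the one-minute interval are all assumptions, and 192.168.1.25 is simply the client address that appears in the debug output above, assumed here to be the Joggler.

```shell
# probe_once HOST LOGFILE [PROBE...]
# Appends one timestamped "up" or "down" line for HOST to LOGFILE.
# PROBE defaults to a single ping. Hypothetical helper, not part of the HAH.
probe_once() {
    host=$1
    logfile=$2
    shift 2
    [ $# -gt 0 ] || set -- ping -c 1
    if "$@" "$host" > /dev/null 2>&1; then
        state=up
    else
        state=down
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') $host $state" >> "$logfile"
}

# Example: probe once a minute from the HAH or from a PC on the same network.
#   while true; do probe_once 192.168.1.25 /tmp/joggler-ping.log; sleep 60; done
```

A run of "down" lines that starts before the iServer error would point at the network or the Joggler rather than at the iServer itself.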

kevin
Offline
Huddersfield, United Kingdom
Joined: 17 May 2010
I'm now trying this firmware too. It'll take a couple of days to know, as the problem was infrequent.

My symptoms were similar, although it was a wired Ethernet connection to the Joggler and the HAH iServer heartbeats still continued even though iServer had died. I couldn't track down a rogue xAP message that might have caused this, so a socket issue seems a good possibility. I see Joggler reboots about once per week. Also, the Joggler iServer wasn't crashing and it's based on the same code, so again the socket break issue sounds a good candidate.

Brett - just a thought - would it be possible to have an 'update' button on the web screen that shows the firmware release?

 

K

BoxingOrange
Offline
United Kingdom
Joined: 11 Jun 2010
+1 for that suggestion

brett
Offline
Providence, United States
Joined: 9 Jan 2010
This is already implemented

All the firmware releases ARE on the web interface. I'm not sure what more you are after?

Firmware

See screenshot. The current AVR and BUILD are at the top of every page, and this tab reports what release is available vs. what you have installed.

If you want an UPDATE button here that auto-updates the firmware, sorry, that isn't high on my list of things to do.

Brett

kevin
Offline
Huddersfield, United Kingdom
Joined: 17 May 2010
I'm not a Linux software guy, and each time I have to upgrade my software I search your website for what I have to do. I just thought it might be more user friendly to be able to just click a button...

login via Telnet and then
# /etc/init.d/update

I know you live, breathe and dream these things, but hopefully for us lesser mortals it'll move up your list over time...

K

BodgeIT
Offline
London, United Kingdom
Joined: 10 Jun 2010
I get occasional HAH freezes. Not just iServer: the HAH is totally non-responsive, not even ssh. I also get occasional Joggler reboots, probably a bit more frequently than Kevin, 4 or 5 times weekly. Not running iServer on the Joggler.

 

Will upgrade and monitor.

kevin
Offline
Huddersfield, United Kingdom
Joined: 17 May 2010
Still OK

Well done, I think that fixed it Brett - iServer still running since I installed 290. 

K

kevin
Offline
Huddersfield, United Kingdom
Joined: 17 May 2010
Joggler reboots

My HAH doesn't ever freeze...been really solid for me.

On the Joggler side: is it wireless, and have you ever tried it wired? Does it improve? Originally I suspected xAPFlash, but I have seen reboots on Jogglers on which it isn't even installed. Network losses seem to trigger this, especially when using DHCP. Also the periodic firmware check, even if no update is needed. Maybe any app checking an internet resource (time, weather, news) that fails to connect might throw a network error which triggers a reboot.

There is a definite gremlin in the sound management of Jogglers (and xAPFlash uses that). All Joggler applications report that sound disintegrates into crackles after some time, until a Joggler or application restart. I am not sure if this is related to the time audio plays or just usage of the API. Regardless, something's wrong firmware/hardware wise there, but OP/O2 aren't interested in fixing it, which makes me think it's a hardware issue.

xAPFlash could very well have memory creep issues but I did monitor this for a while and it seems OK.  There are some design aspects I don't think are ideal but those are mainly one time as the XML is loaded.

  K

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
Crashed for me today :-(

but I wasn't running a debug console.

Any way to send the debug output to a log file?
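
One way to do this, sketched here rather than taken from the HAH documentation, is ordinary shell redirection of both stdout and stderr into a file. The helper name run_logged and the log path are assumptions. /tmp is assumed to be RAM-backed so that a chatty log does not wear the flash, and at debug level 9 the file will grow quickly, so keep an eye on free space.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND with stdout and stderr appended to LOGFILE.
# Hypothetical helper, not part of the HAH firmware.
run_logged() {
    logfile=$1
    shift
    "$@" >> "$logfile" 2>&1
}

# Example, on the HAH after "killall iServer":
#   run_logged /tmp/iserver.log iServer -i br0 -d 9 &
#   tail -n 100 /tmp/iserver.log    # last lines written before a freeze
```

The same effect can be had without the helper by appending ">> /tmp/iserver.log 2>&1" to the iServer command line used earlier in the thread.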

Alanmh
Offline
Reading, United Kingdom
Joined: 7 Jul 2010
That said....

it was 48 hours, not 4! :-)

kevin
Offline
Huddersfield, United Kingdom
Joined: 17 May 2010
I had a good run but my problems with iServer have returned. I've been running debug and here's one problem, although I'm not convinced this is my main one. I'll keep debug running.

[err][main.c:97:sendAll] errno 148 (No route to host)
[err][main.c:97:sendAll] send
: Unknown error 117
#

admin
Offline
Joined: 26 Oct 2009
That's strange

OK, this one is VERY weird, but the error is useful: it gives me something to investigate.

Can you give me a few more DEBUG lines around this error? The path it took to get there is as important as where it crashed. I do have a clue, though: perhaps your devices are renegotiating their IP addresses and disappearing off the network, and the iServer doesn't know this, so when it tries to forward a message the device has gone!

No route to host - would fit.  I can certainly fix this.

Brett
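
If the debug output has been saved to a file, the lines leading up to each send failure can be pulled out with grep. A sketch only: the function name error_context and the default of 40 lines are assumptions, and it needs a grep that supports -B (GNU grep does; a cut-down BusyBox build may not, in which case copy the log to a PC first).

```shell
# error_context LOGFILE [LINES]
# Prints every sendAll errno line together with the LINES lines before it
# (default 40). Hypothetical helper, not part of the HAH firmware.
error_context() {
    grep -F -B "${2:-40}" 'sendAll] errno' "$1"
}

# Example:
#   error_context /tmp/iserver.log 60
```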
