Losing serial connectivity/lockups.
I seem to be having an issue where the Livebox loses connectivity with my Jeenode and also the HAH board after a couple of days. Sometimes it's possible to ssh into the unit and see what is going on; at other times ssh will start to connect but will then freeze before login.
Using xFx Viewer, I can see xap messages for the hub and the attached CurrentCost but not for the Controller or Jeenode...
I managed to get the following output -
# date
Tue Nov 22 19:48:41 UTC 2011
# ps
PID USER VSZ STAT COMMAND
1 root 2256 S init
2 root 0 SW [keventd]
3 root 0 SWN [ksoftirqd_CPU0]
4 root 0 SW [kswapd]
5 root 0 SW [bdflush]
6 root 0 SW [kupdated]
7 root 0 SW [mtdblockd]
8 root 0 SW [khubd]
31 root 0 SWN [jffs2_gcd_mtd2]
97 root 2248 S udhcpc -T 10 -i br0
112 root 1668 S dropbear -p 22
125 root 2240 S inetd
131 root 1028 S /usr/bin/xap-hub -i br0
134 root 11536 R /usr/bin/xap-pachube -i br0
144 root 5000 S /usr/bin/kloned
145 root 5012 S /usr/bin/kloned
147 root 2608 S /usr/bin/xap-currentcost -s /dev/ttyUSB0 -i br0
162 root 5012 S /usr/bin/kloned
251 root 1748 S dropbear -p 22
252 root 2268 S -ash
262 root 2244 R ps
#
There were no additional messages in /var/log/messages other than the normal startup messages.
The web GUI showed question marks in all the locations where values would have been expected. Then the box stopped responding to SSH or Web access (I was going to try to restart the lost processes manually). Ping still worked; however, XAP messages also stopped.
The Pachube process is running and 'good' values can be seen in the XAP messages prior to the lockup, however, Pachube itself is reporting no data.
Are there any other logs I can look at to see what may be causing the problem here?
Many thanks,
Andy.
None of the C programs log. If you run them on the command line they will report any fatal exit conditions. Perhaps I might fix this and have them ALWAYS output to /var/log too, so if they fail fatally at least this will be logged. I'll do that.
I've been running them 24/7 for months and have yet to have one fail, esp so many at once.
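Until the daemons log for themselves, a small wrapper can capture whatever a program prints when it is run in the foreground. This is only a sketch: `run_logged` is a hypothetical helper name, and the log path in the example is an assumption (pick somewhere that survives a reboot if you have it). The xap-currentcost command line is copied from the ps listing above; stop the copy that is already running before trying it, since two processes cannot share the serial port.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the foreground, appends its stdout and
# stderr to LOGFILE, and records the exit status once the command stops.
run_logged() {
    log=$1
    shift
    echo "$(date) starting: $*" >> "$log"
    status=0
    "$@" >> "$log" 2>&1 || status=$?
    echo "$(date) exited with status $status: $*" >> "$log"
    return "$status"
}

# Example (log path is an assumption, command line is from the ps output):
# run_logged /tmp/xap-currentcost.log /usr/bin/xap-currentcost -s /dev/ttyUSB0 -i br0
```

If the process dies, the last lines of the log file should show its final message and exit status.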
If the lua (plugboard) process dies it does write to /var/log/xap-plugboard.log; however, you *must* check this file before rebooting, as it's in the memory filesystem and so will not persist past a reboot.
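One way to keep that log is to copy it somewhere persistent, with a timestamp, before the reboot. A minimal sketch follows; `save_log` is a hypothetical helper and the destination directory in the example is an assumption, so substitute whatever persistent storage your box actually has.

```shell
#!/bin/sh
# save_log SRC DESTDIR
# Hypothetical helper: copies a log from the memory filesystem into DESTDIR,
# adding a timestamp suffix so earlier copies are not overwritten.
save_log() {
    src=$1
    destdir=$2
    if [ ! -f "$src" ]; then
        echo "no log found at $src" >&2
        return 1
    fi
    mkdir -p "$destdir" || return 1
    cp "$src" "$destdir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
}

# Example (destination is an assumption):
# save_log /var/log/xap-plugboard.log /mnt/usb/logs
```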
Brett
Like Brett, I've been running a few JeeNodes for quite a while now. I do a 'planned reboot' of my HAH every fortnight, but apart from that, I've not seen the sort of process chaos that you describe.
It might be interesting to see if your Livebox is happy when running in a 'minimal configuration' e.g. just the xap-livebox process, no Lua, no xap-serial. If this is good, it helps to rule out the possibility of a hardware glitch on the box.
One other thought ... the BaseNode firmware can run either in a mode where 'junk' packets of data are reported to the HAH or in a mode where they are not passed on. Since the strings of 'junk' data can be pretty long and give the xap-serial process a lot more work to do, it might be worth running with these suppressed (this is the mode that I use).
Derek.
I've posted up a replacement for the RF12demo that you can run on the base HAHnode. This operates by default in QUIET mode. It also won't accept A-Z as commands to change the node ID, which can mess with your configuration.
Flashing from the HAH would be like this:
/etc/init.d/xap stop xap-serial
stty -F /dev/ttyUSB0 hupcl
avrdude -v -c arduino -p m328p -P /dev/ttyUSB0 -b 57600 -Uflash:w:HAHcentral.hex
reboot
Where has the xap-serial process gone? Also, where is the xap-livebox process? Wait, where is the plugboard process too?
You've got some sort of massive failure here .... Perhaps some sort of serial misconfiguration?
xap-serial is REQUIRED to be up to talk to the base JeeNode, which will probably be on /dev/ttyUSB1, given currentcost is on /dev/ttyUSB0.
Is this how you configured the jeenodeApplet.lua settings?
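To spot this kind of failure quickly, a ps listing can be checked against the processes you expect to see. The sketch below reads the listing on stdin; `missing_procs` is a hypothetical helper. The names xap-hub, xap-serial and xap-livebox come from this thread, while xap-plugboard is an assumption based on the log file name, so adjust the list to match your install.

```shell
#!/bin/sh
# missing_procs NAME...
# Hypothetical helper: reads a `ps` listing on stdin and prints each NAME that
# does not appear as an executable path (".../NAME" followed by a space or end
# of line) in that listing.
missing_procs() {
    listing=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$listing" | grep -Eq "/$name( |\$)"; then
            echo "$name"
        fi
    done
}

# Example on the box (xap-plugboard is an assumed process name):
# ps | missing_procs xap-hub xap-serial xap-livebox xap-plugboard
```

Run against the listing posted earlier, this would report xap-serial and xap-livebox (and the assumed plugboard name) as missing.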