Release 257

25 November, 2010 - 02:59

#1

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

Whooah that was fast :-)

Great stuff Brett,

Could you double-check that the pacing is working? I am seeing about 30 messages sent in 20ms according to Viewer.

When I tried various speeds in iServer I settled on 10/sec, i.e. a 100ms gap between messages. I'm sure we can go a bit faster, and I'll try it out on the various devices here to see at what point it starts to lose efficiency, where devices can't respond within the gaps.

I also have one device that won't synch now, but it is responding with xapbsc.info messages to the queries being sent, so I'm guessing its message format is slightly different, or maybe it's a subnet thing again and HAH can't hear it. I'll investigate and see if anything is being passed back to xAPFlash...

[UPDATE] Ahh.. Is it possible that HAH doesn't like addresses / sub-addresses that contain spaces? It is changing the query target for, say,
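
A minimal sketch of the kind of pacing described above, assuming a fixed minimum gap between sends (100ms by default). The names `send_paced`, `send`, `clock` and `sleep` are made up for illustration and are not taken from iServer or HAH.

```python
import time
from typing import Callable, Iterable


def send_paced(messages: Iterable[bytes],
               send: Callable[[bytes], None],
               gap_s: float = 0.1,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> int:
    """Send each message, leaving at least gap_s seconds between sends.

    `send` is a hypothetical callback that transmits one message.
    Returns the number of messages sent.
    """
    sent = 0
    next_slot = clock()  # earliest time the next message may go out
    for msg in messages:
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)  # wait out the remainder of the gap
        send(msg)
        sent += 1
        next_slot = clock() + gap_s
    return sent
```

Passing `clock` and `sleep` as parameters is only there so the gap can be checked without real waiting; in normal use the defaults apply.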

source=A.B.C:Contains space

and sending

target=A.B.C:

which is invalid, and it is also not passing responses from these devices back to xAPFlash (spaces are allowable in source addresses).
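
As an illustration only, here is a sketch of deriving a query target from a source line while keeping everything after the first `=`, spaces included. It does not claim to show where HAH actually loses the sub-address; `parse_pair` and `query_target` are hypothetical helper names.

```python
from typing import Tuple


def parse_pair(line: str) -> Tuple[str, str]:
    """Split a 'name=value' line on the first '=' only.

    The value is kept as-is apart from the line ending, so embedded
    spaces (allowed in xAP source addresses) survive.
    """
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError("not a name=value line: %r" % line)
    return key.strip().lower(), value.rstrip("\r\n")


def query_target(source_line: str) -> str:
    """Build the target= line for a query from a source= line."""
    key, value = parse_pair(source_line)
    if key != "source":
        raise ValueError("expected a source= line")
    return "target=" + value
```

For the example above, `query_target("source=A.B.C:Contains space")` keeps the full sub-address, whereas any tokenising on whitespace would cut it short.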

K

PS All @xapautomation.org email was interrupted and potentially undelivered for most of yesterday (Weds) - so if anyone emailed on that domain please send again.

25 November, 2010 - 03:21

#2

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

Ahh - there's one more -

Ahh - there's one more - Within the BSC schema (only), iServer lowercases the state= parameter value to 'off', 'on' or '?' before passing messages back to xAPFlash; the HAH version is passing them unchanged. This applies to all BSC .info, .event and .cmd messages. Again, this was a downstream optimisation to cater for iServer clients lacking a string lowercase function. I can of course alter this in xAPFlash, but it's there for those other clients too.

As a neat feature request, it would be great to see a list of the connected client names (or at least the # of clients) in the HAH webpage. I was going to suggest you could report the # of clients in iServer's heartbeat too, but you can't have bodies on heartbeats in xAP v1.2 (OK in xAP v1.3), so it would have to be a separate message.
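
A rough sketch of that normalisation, assuming the message is available as newline-separated text and that any class beginning `xAPBSC.` counts as BSC. `lowercase_bsc_state` is a hypothetical name, not a function from iServer or HAH.

```python
def lowercase_bsc_state(message: str) -> str:
    """Lowercase the value of state= lines, but only in xAPBSC messages."""
    lines = message.split("\n")
    is_bsc = any(
        line.strip().lower().startswith("class=xapbsc.") for line in lines
    )
    if not is_bsc:
        return message  # other schemas are passed through untouched
    out = []
    for line in lines:
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "state":
            line = key + "=" + value.lower()  # 'ON' -> 'on', '?' stays '?'
        out.append(line)
    return "\n".join(out)
```

Only the state= value is touched; other parameters such as text= keep their case.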

K

25 November, 2010 - 14:01

#3

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

All quiet on the Eastern front

This morning HAH has died. Overnight I'm seeing quite a few xAP messages originating from Joggler clients via HAH that are truncated and flagged as errors in Viewer. After around 5 hours of running, and immediately after such a truncation, HAH died. The truncations are in varying places within a message. All heartbeats have ceased, the web interface is down and a Telnet session only creates a blank screen. Ping does work.

I don't think there's anything useful I can do, is there, other than just reboot it (as I have no command line access)? I don't think it's concatenated messages within the iServer socket, but could it be caused by <xap>...</xap> messages that span a TCP packet boundary?

Actually, on examining a few messages from Jogglers relayed by HAH, they are mostly missing the end of the message block (the closing }). Viewer doesn't flag these as errors, which is surprising.

Also, I notice another device taking a long time to synch. I have a feeling this might be a device issue, but I'm just checking there isn't a limit on the number of dot hierarchies in either the main or sub-address, or on the lengths of either. This is a long address:

source=Idratek.Cortex.SERVER:World.Gledholt.Downstairs.LivingRoom.Sitting.LightLevel

K

26 November, 2010 - 08:26

#4

BodgeIT

Offline

London, United Kingdom

Joined: 10 Jun 2010

Not dead

My HAH died overnight on 256. After upgrading to 257 and waiting for morning, it's not dead but very poorly. It takes about 5-10 secs to come out of Screensaver and seems very groggy and confused. SSH is still active:

Top shows:

Mem: 13340K used, 540K free, 0K shrd, 0K buff, 1416K cached
CPU: 1% usr 2% sys 0% nice 95% idle 0% io 0% irq 0% softirq
Load average: 0.07 0.12 0.08

Strangely, I'm seeing two iServer entries in the list:

166     1 root     S     6048 44%   1% /usr/bin/iServer -i br0
136     1 root     S     1052   8%   0% /usr/bin/xap-livebox -s /dev/ttyS0 -i br0
134     1 root     S     3532 25%   0% /usr/bin/xap-pachube -i br0
156     1 root     S     5780 42%   0% /usr/bin/xap-googlecal -i br0
163     1 root     S     3660 26%   0% /usr/bin/xap-plugboard -i br0
151     1 root     S     2604 19%   0% /usr/bin/xap-currentcost -s /dev/ttyUSB1 -i br0
177   166 root     S     6048 44%   0% /usr/bin/iServer -i br0
143   141 root     S     5084 37%   0% /usr/bin/kloned
146   141 root     S     5084 37%   0% /usr/bin/kloned
141     1 root     S     5072 37%   0% /usr/bin/kloned

Hope this provides a clue?

G.

26 November, 2010 - 09:20

#5

derek

Offline

Glasgow, United Kingdom

Joined: 26 Oct 2009

Ill patient

Thanks for the info.

Having a box that is not 'dead' but 'sick' is exactly what is needed to find out the cause of the illness. Not sure why there are two instances. It might be a simple enough change to check for and disallow a second instance.

It seems that this issue, caused by iServer, only manifests itself in environments where there are 'lots' of BSC Endpoints. I'm adding more to my test environment to see if I can replicate the issue that others are seeing.

Derek

26 November, 2010 - 13:23

#6

brett

Offline

Providence, United States

Joined: 9 Jan 2010

It does provide some clues

The memory consumption has gone from 1MB, which is what it should use, to 6MB, so there is a memory leak somewhere, which is why the box dies over time. Thanks, I'll see if I can track it down. I've made a number of changes and pushed 258. There are still things I need to work on, but while I'm doing this I can get some feedback on these changes. Thanks for your patience while I sort these out. If it was easy everybody would be doing it!

26 November, 2010 - 13:32

#7

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

Joggler waffles...

BSC TextBox endpoints are much larger than others, especially if they contain a lot of text. The endpoint reports the content in both plain text and html. If you have a few, they can report back to back and are likely to become concatenated within the TCP stream, traversing a TCP packet boundary. They are delimited to allow iServer to preserve the message boundaries. This is where I think the problem may be.

Initially, when the BSC endpoint feature was introduced, periodic reporting within xAPFlash was pretty verbose, which also won't help but should just result in needless traffic. This was reworked, but that change may not be in the current released beta. Checks were also put in to stop any xAP message ever exceeding 1500 bytes. This could happen if, for example, text boxes contained a lot of text. Initially the html representation is discarded and replaced with 'Removed', and then the text itself if necessary.
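
To make the packet-boundary concern concrete, here is a minimal sketch of reassembling `<xap>...</xap>` messages from a TCP byte stream, where a single read may hold several messages or only part of one. `XapFramer` is a hypothetical class for illustration; it is not how iServer on HAH is implemented.

```python
from typing import List

OPEN, CLOSE = b"<xap>", b"</xap>"


class XapFramer:
    """Buffers raw TCP reads and yields complete <xap>...</xap> payloads."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add one read's worth of bytes; return any complete payloads."""
        self._buf += chunk
        messages = []
        while True:
            start = self._buf.find(OPEN)
            if start < 0:
                # No opening tag yet: keep only a tail that could be
                # the beginning of a tag split across reads.
                keep = len(OPEN) - 1
                del self._buf[:max(0, len(self._buf) - keep)]
                break
            end = self._buf.find(CLOSE, start + len(OPEN))
            if end < 0:
                # Message is incomplete: drop junk before it and wait.
                del self._buf[:start]
                break
            messages.append(bytes(self._buf[start + len(OPEN):end]))
            del self._buf[:end + len(CLOSE)]
        return messages
```

The point of the buffer is that nothing is forwarded until the closing tag has arrived, so a message split across two reads is never relayed half-finished.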
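
A sketch of that size check, assuming the body is held as a dictionary of fields before it is rendered. The field names `html` and `text`, and the helpers `render` and `fit_to_limit`, are assumptions for illustration; the real xAPFlash names are not given in this thread.

```python
MAX_BYTES = 1500  # limit quoted above


def render(header: str, fields: dict) -> bytes:
    """Render a (hypothetical) pre-built header plus one body block."""
    body = "".join("%s=%s\n" % (k, v) for k, v in fields.items())
    return (header + "{\n" + body + "}\n").encode("utf-8")


def fit_to_limit(header: str, fields: dict, limit: int = MAX_BYTES) -> bytes:
    """Replace html, then text, with 'Removed' until the message fits."""
    fields = dict(fields)  # do not modify the caller's copy
    for key in (None, "html", "text"):
        if key is not None and key in fields:
            fields[key] = "Removed"
        data = render(header, fields)
        if len(data) <= limit:
            return data
    raise ValueError("message still exceeds %d bytes" % limit)
```

The first pass through the loop changes nothing, so a message that already fits is sent untouched.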

Although the above exacerbates the issue it's not the cause - I'm seeing the issue in the latest build still and I don't think users typically have a lot of text within text boxes.

xFX Viewer doesn't flag all these errors - you only see them if you inspect a few messages. These are messages shown as originating from xAPFlash, not iServer, although iServer is actually originating them on the client's behalf, as Flash can't send UDP.

xap-header
{
v=13
hop=1
uid=FF.6996:1029
class=xAPBSC.info
source=UKUSA.xAPFlash.CS4:Button.State.Spots
}
input.state
{
state=?
level=0

This shows a truncation in the middle of the level parameter value. No error was flagged in Viewer, but a truncation within a key name or the header is flagged.
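
A small sketch of a check that would catch the missing closing brace in the message above, assuming each `{` and `}` sits on its own line as in the example. `is_complete` is a hypothetical helper, and it says nothing about how Viewer decides what to flag.

```python
def is_complete(message: str) -> bool:
    """Return True if every '{' block in the message is closed by '}'."""
    inside = False  # are we currently inside a { ... } block?
    blocks = 0      # number of fully closed blocks seen
    for raw in message.splitlines():
        line = raw.strip()
        if line == "{":
            if inside:
                return False  # nested block: not valid here
            inside = True
        elif line == "}":
            if not inside:
                return False  # closing brace with no opener
            inside = False
            blocks += 1
    return blocks > 0 and not inside
```

This only catches truncations that lose a brace; a value cut short inside an otherwise closed block would still pass.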

K

26 November, 2010 - 14:45

#8

brett

Offline

Providence, United States

Joined: 9 Jan 2010

iServer memory leak solved

What do you mean they are delimited? Is there some other token being injected into the stream I'm not aware of? I found the memory leak BTW. It was in the protocol tokenizer, which is why after leaving the system on overnight it would be dead in the morning. The busier your network, the quicker the iServer memory would leak until the unit would lock up.

I've pushed 259 for this issue, which will stabilize it while I figure out why some messages get truncated. For small messages we should be OK now.

26 November, 2010 - 14:58

#9

brett

Offline

Providence, United States

Joined: 9 Jan 2010

It's just a thread

There are two instances as I start another thread to handle the BSC query when an initial connection is made. The Linux kernel reports a thread as a separate process, so it looks like two are running. Rest assured there aren't.

26 November, 2010 - 15:24

#10

kevin

Offline

Huddersfield, United Kingdom

Joined: 17 May 2010

By delimited I meant the

By delimited I meant the <xap></xap> element tags. Within the TCP socket stream, however, this is essentially transparent.

I am thinking about supporting STX ETX as an alternative, so that it would be more in keeping with intentions within a TCP hub and also avoid any confusion that XML <xap> tags included within a xAP message could create, e.g. in a text or displaytext field.
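
A sketch of what STX/ETX framing could look like, assuming the ASCII control bytes 0x02 and 0x03 as delimiters and that they never occur inside a message. `frame` and `split_frames` are hypothetical names; nothing here reflects an agreed iServer format.

```python
from typing import List, Tuple

STX, ETX = b"\x02", b"\x03"  # ASCII start-of-text / end-of-text


def frame(payload: bytes) -> bytes:
    """Wrap one xAP message in STX ... ETX."""
    if STX in payload or ETX in payload:
        raise ValueError("payload must not contain STX or ETX")
    return STX + payload + ETX


def split_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Return (complete payloads, unconsumed remainder of the buffer)."""
    frames = []
    pos = 0
    while True:
        start = buffer.find(STX, pos)
        if start < 0:
            return frames, b""  # nothing left worth keeping
        end = buffer.find(ETX, start + 1)
        if end < 0:
            return frames, buffer[start:]  # partial frame: keep for later
        frames.append(buffer[start + 1:end])
        pos = end + 1
```

Because the delimiters are single control bytes, a literal `<xap>` inside a text field is just payload and cannot be mistaken for a message boundary.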

K

26 November, 2010 - 18:07

#11

BodgeIT

Offline

London, United Kingdom

Joined: 10 Jun 2010

Japanese...

...sounds like the easy alternative at the moment. You guys are out there!
