Anyone fancy a challenge?! (WebSockets server)

BlitzSupport(Posted 2012) [#1]
Hi all,

I'm trying to write a simple WebSockets server but have run into a specific problem that's completely baffling me. This is a long shot, I know, as it requires some understanding of networking and WebSockets (or the desire to learn about them!), and there are probably not many people concerned with this particular area...

Anyway, just in case, here's a quick summary of WebSockets and what's happening:

1) a WebSocket client sends 'frames' to the server (my program) in a specific format;

2) the frame contains some information about the data contained within, plus a 'mask' that the client has XOR-applied to this frame of data, plus the data itself;

3) the length of the data may be recorded as 0-125 bytes, but can be a special code denoting that the actual length is different: if it's 126, then the next TWO bytes contain the actual length of the data; if it's 127, the next EIGHT bytes contain the length of the data;

4) regardless of the length of data, you then read each byte of the data and XOR it with each of the 4 mask bytes in turn. (In BlitzMax, the ~ symbol operates as XOR.)
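For reference, steps 1-4 can be sketched as below. This is Python rather than BlitzMax, and it is only a minimal sketch of the frame layout in RFC 6455 section 5.2, not the code from the zip. The names `decode_frame` and `encode_frame` are made up for illustration, and `encode_frame` exists only to build test frames the way a client would.

```python
import struct

def decode_frame(frame: bytes) -> bytes:
    """Return the unmasked payload of one masked client-to-server frame."""
    # Byte 0 holds FIN/opcode; byte 1 holds the MASK bit plus a 7-bit length code.
    if not frame[1] & 0x80:
        raise ValueError("client frames are expected to be masked")
    length = frame[1] & 0x7F
    offset = 2
    if length == 126:
        # Real length is in the next TWO bytes, network (big-endian) order.
        (length,) = struct.unpack_from(">H", frame, offset)
        offset += 2
    elif length == 127:
        # Real length is in the next EIGHT bytes, network order.
        (length,) = struct.unpack_from(">Q", frame, offset)
        offset += 8
    # The 4 mask bytes are used in the order they arrive: 0, 1, 2, 3, 0, ...
    mask = frame[offset:offset + 4]
    offset += 4
    data = frame[offset:offset + length]
    return bytes(b ^ mask[i % 4] for i, b in enumerate(data))

def encode_frame(payload: bytes, mask: bytes) -> bytes:
    """Hypothetical helper: build a masked text frame for testing."""
    n = len(payload)
    if n <= 125:
        header = bytes([0x81, 0x80 | n])
    elif n <= 0xFFFF:
        header = bytes([0x81, 0x80 | 126]) + struct.pack(">H", n)
    else:
        header = bytes([0x81, 0x80 | 127]) + struct.pack(">Q", n)
    masked = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return header + mask + masked
```

Note that the mask is treated as four separate bytes here, never as one 32-bit integer, so byte order only matters for the two length fields.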

This works perfectly for strings 0-125 bytes in length, but for reasons I just cannot fathom, fails for larger strings. (I'm testing where the data length's 'special code' of 126 kicks in.)

I'm receiving the correct data length from the relevant two bytes, but the data simply isn't decoded correctly despite using the correct mask offsets.

Here's the project as it stands, and the relevant code is in the loop marked "WEB SOCKET LOOP:" (Ctrl-F it!), while the specific decoding part is marked "XOR DATA:"...

https://dl.dropbox.com/u/3592022/blitz/websocketserver.zip [UPDATE: NOW WORKING!]

To test:

1) run the program and visit http://www.websocket.org/echo.html using a WebSockets-compliant browser (latest Chrome or Firefox ought to do it);

2) enter Location as "ws://localhost" (no quotes) and hit Connect;

3) enter a short string in the Message box;

4) hit Send.

That should return the correct string. However, if the string is longer than 125 bytes, you'll run into the problem described above and get utter garbage back!

I thought I might be reading from an incorrect byte offset, but it seems to be correct and there are no bytes left over after reading.

What confuses me further, based on this, is that typing in a 'short' string after a 'long' string also returns garbage, yet it 'should' be receiving valid data and mask at this point, given that there are no extra bytes in the stream.

Anyway, as I said, this is pretty obscure stuff, so I'm not expecting much of a response, but if anyone fancies a proper challenge, please take a look!

My best guess is that I'm somehow reading the data from the wrong offset, but I've been through it over and over, and all appears correct...

References:

http://tools.ietf.org/html/rfc6455#section-5.2
http://stackoverflow.com/questions/8125507/how-can-i-send-and-receive-websocket-messages-on-the-server-side
http://www.websocket.org/echo.html

Last edited 2012


BlitzSupport(Posted 2012) [#2]
God, that looks horrible! If anyone even bothers to try solving this, I'll nominate them for a Nobel prize of some sort!


skidracer(Posted 2012) [#3]
Only skimmed your post but just wanted you to check you were applying network order to your byte sequences.


col(Posted 2012) [#4]
Hiya,

I don't know too much about network etc :P but when the datalength variable is 126, using

Local maskbyte:Byte = maskptr [3 - loop Mod 4] ' Use mask order 3,2,1,0 instead of 0,1,2,3

the mask byte order decodes the message lovely. I wonder why that works in reverse as opposed to datalength <= 125 which decodes with using the mask order 0,1,2,3. I'm not sure where the real bug is, as this seems like an endian issue, however changing the endian in the code doesn't seem to work.

Nobel prize nomination eh ? :D

Last edited 2012


Floyd(Posted 2012) [#5]
So byte order is the problem, but I'm still baffled by my results.

I tried a 150 byte message consisting of 0123456789 repeated fifteen times.

Running from the BlitzMax IDE I see the default message and the new one displayed as:

MESSAGE RECEIVED: Rock it with HTML5 WebSock
MESSAGE RECEIVED: 21  65  :3  07  4; ( and this is repeated )

And when I try to copy/paste into Notepad it gets turned into:

MESSAGE RECEIVED: Rock it with HTML5 WebSock
MESSAGE RECEIVED: 2165:3074; ( repeated )

When I try to save this there is a warning about Unicode characters being lost.

I had to type in some of that by hand. Trying to copy/paste the BlitzMax output into this forum produced yet another variation:
MESSAGE RECEIVED: &#148;21&#151;65&#147;&#156;:3&#149;&#150;07&#145;&#146;4;&#148;21&#151;65&#147;&#156;:3&#149;&#150;07&#145;&#146;4;&#148;21&#151;65&#147;&#156;:3&#149;&#150;07&#145;&#146;4;&#148;21&#151;65&#147;&#156;:3&#149;&#150;07&#145;&#146;4;&#148;21&#151;65&#147;&#156;:3&#149;&#150;07&#145;&#146;4;&#148;21&#151;65&#147;&#156;:3&#149;&#150;07&#145;&#146;4;&#148;21&#151;65&#147;&#156;:3&#149;&#150;07&#145;&#146;4;&#148;21&#151;65&#147;&#156;:
The incomprehensible thing is that output in BlitzMax, and output pasted into Notepad, are short repeated patterns.
The BlitzMax output pasted directly into this forum is not.


col(Posted 2012) [#6]
There's also another problem, maybe with the logic.

Using the temporary fix of [3 - loop mod 4]

As a test I did this...

The first string sent is '0123456789' - this is scrambled.
The second string, '0123456789' repeated 13 times, is clean.
The third string sent, '0123456789' again, is also clean!!
All messages are decoded properly after this :/

Sorry I don't have more real time to look into it. I'm just hacking about while waiting for something else :P

Last edited 2012


BlitzSupport(Posted 2012) [#7]
@skidracer: yes, using network byte order for the multi-byte sequences, as per spec. However, that doesn't seem to be required for the "non-length" results (the ReadInt on the mask works there for < 126 lengths), and switching endian-ness on the mask doesn't work when applied to multi-byte results either!

@col: sadly too drunk at this point to test (and, more sadly still, out of lager, or I'd try a little more), but if that's the case then I will be your best friend forever! (Makes no sense, though!) Will try that tomorrow!

@Floyd: again, unfortunately can't interpret until sober, but it certainly sounds like I'm doing what's expected, and kinda glad I'm not the only one baffled!

I'll have to try what col sez tomorrow... kinda tricky as you can't see what the client is sending as a mask, which would help greatly. I have tried with an alternative client (a Chrome plugin) but got similar results, so I'm bravely/naively assuming it's not the client sending dodgy data/mask.

Thanks, all, as I didn't expect so many responses on this one and really appreciate it! Will pick up again tomorrow and update... hoping col has it, though, regardless of whether or not the spec agrees!

Last edited 2012


BlitzSupport(Posted 2012) [#8]
EDIT: Just saw your new post, col... will parse tomorrow, but thanks in advance!


Htbaa(Posted 2012) [#9]
Maybe this module can be of help to you. Since the web socket spec changed a lot I doubt this one is fully compliant since it hasn't been updated in over a year.

https://github.com/FWeinb/websocket.mod


BlitzSupport(Posted 2012) [#10]

the mask byte order decodes the message lovely. I wonder why that works in reverse as opposed to datalength <= 125 which decodes with using the mask order 0,1,2,3. I'm not sure where the real bug is, as this seems like an endian issue, however changing the endian in the code doesn't seem to work.


Wow, not only does it work for large strings, it also works on the <= 125 strings. How the heck can they be decoding both ways?! I'm also getting the weird problem you mention later, where a short string doesn't decode unless I've decoded a long string.

Totally bizarre... will require some head-scratching/banging! It's obviously an endianness problem somewhere, but...

@Htbaa, thanks, I've had a look but it just closes the connection with the Chat sample as soon as it connects. However, I will try your unMask function against some sample data I receive in my program and see what happens -- it appears to do the same thing, though! I'll also study your processMessage and see what's different. Interesting to see a different (and more comprehensive) approach to the same thing!


BlitzSupport(Posted 2012) [#11]
GAAAH! Got it! Bloody endianness, of course.

Turns out I just needed to read the entire stream as Big Endian from the start, not bother with switching the stream's endianness for multi-byte values, and use col's 3 - loop Mod 4 thing.
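One possible reading of why the reversed index is needed, sketched in Python rather than BlitzMax: it assumes the mask is read as a single 32-bit integer (the ReadInt mentioned earlier) and then indexed byte by byte through a pointer on a little-endian machine. The thread doesn't show that code, so treat this as a guess at the mechanism. `bytes_in_memory` is a made-up name.

```python
import struct

def bytes_in_memory(wire: bytes, read_as: str, host: str = "<") -> bytes:
    """Read 4 wire bytes as one 32-bit integer using byte order `read_as`
    ('>' big-endian, '<' little-endian), then return that integer's bytes
    as they would sit in memory on the host (little-endian by default)."""
    (value,) = struct.unpack(read_as + "I", wire)
    return struct.pack(host + "I", value)

wire_mask = bytes([0xAB, 0x06, 0x05, 0x0F])  # mask bytes in arrival order

# Big-endian read on a little-endian host: the bytes end up reversed in memory,
# so index 3 - i % 4 is what recovers arrival order.
reversed_mask = bytes_in_memory(wire_mask, ">")

# Little-endian read on the same host: arrival order is preserved.
same_mask = bytes_in_memory(wire_mask, "<")
```

Under that assumption, a stream left in little-endian mode gives the mask in order 0,1,2,3, while a stream switched to Big Endian gives 3,2,1,0, which would fit the different behaviour seen before and after the length bytes were read.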

What I still don't get is a) why other languages apparently don't need to do this (perhaps they default to Big Endian for their web streams?), and b) why it didn't work anyway, since endianness should only affect the multi-byte values.

Anyway, seems to be working fine now, so thanks all!

https://dl.dropbox.com/u/3592022/blitz/websocketserver.zip


Floyd(Posted 2012) [#12]
James: your post appeared as I was doing some more experiments. So you can ignore everything but my remark about LongBin and LongHex.


I noticed a comment in the source code that Bin() only accepts 32-bit values. There are separate functions LongBin and LongHex.

Now back to the confusion. In my 0123456789... test the output contained this:

	DATA		-	MASK

	10010100	-	Byte 0: 10101011
	00110010	-	Byte 1: 00000110
	00110001	-	Byte 2: 00000101
	10010111	-	Byte 3: 00001111
	10010000	-	Byte 0: 10101011
	00110110	-	Byte 1: 00000110
	00110101	-	Byte 2: 00000101
	10010011	-	Byte 3: 00001111
	10011100	-	Byte 0: 10101011
	00111010	-	Byte 1: 00000110
	00110011	-	Byte 2: 00000101
	10010101	-	Byte 3: 00001111
Now here's a mystery. My original plain text would have ASCII values less than 128 even if I had not used only the characters 0-9. So in every case the high order ( leftmost ) bit would be 0. When you see a leftmost 1-bit in the received data it must have come from a 1 in the mask. Of the four mask bytes only one of them begins with a 1. Thus, no matter what the order, only one in four of the received data bytes should begin with a 1. But in fact half of them do.

Last edited 2012


BlitzSupport(Posted 2012) [#13]
Completely forgot about LongBin/LongHex, and nice point about the leftmost bit! Many thanks for taking the time to look into this anyway, much appreciated.


Floyd(Posted 2012) [#14]
nice point about the leftmost bit

Yep, I was this close ( imagine my thumb and index finger about a millimeter apart ) to figuring it out. The pattern of leftmost 1s meant that mask0 was being applied to the first and fourth data bytes. That happened because of the endian confusion.

Encryption and decryption had been done with the masks in opposite orders. Thus ( mask0 ~ mask3 ) had been applied to the first and fourth data bytes. Likewise with ( mask1 ~ mask2 ) being applied to the second and third data bytes.
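This can be checked against the numbers in the table above. The sketch below is Python rather than BlitzMax, and `xor_mask` is a made-up helper. It masks the 150-byte test string with the four mask bytes from the table in one order, then unmasks in the opposite order. Which order was actually on the wire isn't known from the thread, but the result is the same either way because XOR is symmetric.

```python
def xor_mask(data: bytes, mask: bytes, reverse: bool = False) -> bytes:
    """XOR each data byte with mask bytes 0,1,2,3,... or 3,2,1,0,... if reverse."""
    return bytes(
        b ^ mask[(3 - i % 4) if reverse else (i % 4)]
        for i, b in enumerate(data)
    )

# Mask bytes 0..3 as listed in the table.
mask = bytes([0b10101011, 0b00000110, 0b00000101, 0b00001111])
plain = b"0123456789" * 15

# Mask in one order, unmask in the other.
garbled = xor_mask(xor_mask(plain, mask), mask, reverse=True)

first_four = [format(b, "08b") for b in garbled[:4]]
# first_four == ['10010100', '00110010', '00110001', '10010111']

# Bytes 0 and 3 of every group of four pick up mask0 ~ mask3, whose top bit is 1,
# so half of the output bytes have the high bit set.
high_bits = sum(b >> 7 for b in garbled)
```

The first four values match the DATA column, and the middle two bytes come out as "21", as in the garbled output shown earlier in the thread.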


BlitzSupport(Posted 2012) [#15]
I would quite like to go back and try to figure out why I got the results I did, as I'm pretty sure it should work out the same. It may be that I just didn't hit on the exact combination when manually switching between endiannesses (is that a word?) and on reading/applying the mask.

Might just brush all that under the carpet for my sanity, though...

I like to dabble in some of this bit-level of coding now and then, but for some reason the logic has just never fallen naturally into place for me, even though I understand most of it when I'm reading about it!


BlitzSupport(Posted 2012) [#16]
Finally dawned on me that the other languages are reading byte-by-byte and shift-combining them manually as they go, rather than reading a short, reading an int, reading a long, etc -- hence no mention of the endianness problem! Gah!
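A minimal sketch of that byte-by-byte approach, in Python rather than BlitzMax, with `read_uint` as a made-up name. Because each byte is shifted into place by hand, the result doesn't depend on the machine's or the stream's endianness setting.

```python
import io

def read_uint(stream, size: int) -> int:
    """Read `size` bytes one at a time and combine them most-significant first."""
    value = 0
    for _ in range(size):
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended before the full value was read")
        value = (value << 8) | chunk[0]
    return value

# Length code 126 is followed by a two-byte length; here 0x00 0x96 is 150.
two_byte_length = read_uint(io.BytesIO(bytes([0x00, 0x96])), 2)

# Length code 127 is followed by an eight-byte length; here it encodes 65536.
eight_byte_length = read_uint(io.BytesIO(bytes([0, 0, 0, 0, 0, 1, 0, 0])), 8)
```

The mask can then be read as four individual bytes and used in arrival order, so no reversed index is needed.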