axe.lua String conversion bug

Rozek(Posted 2007) [#1]
Hello!

I might have found a bug in the Lua interface library (axe.lua)

Problem: Lua strings may contain embedded zeros (as may BlitzMAX strings). However, a conversion using String.fromCString stops at the first zero byte/char found in a (C) string, because it uses the C function "strlen" to calculate the length of that string.

Test:
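  local Text:String = "abc~0def"   ' "~0" is the BlitzMAX escape for a zero character
  local CText:byte ptr = Text.toCString()
  local Text2:String = String.fromCString(CText)
  MemFree CText   ' toCString allocates memory which the caller has to release
  print Text2.length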

will display 3 instead of 7
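
The same effect can be reproduced in plain C, without BlitzMAX or Lua. The following is only a minimal sketch of the underlying behaviour: the helper name length_seen_by_strlen and the sample data are made up for illustration and are not part of axe.lua.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the length a strlen-based conversion would see. */
size_t length_seen_by_strlen(const char *bytes) {
    assert(bytes != NULL);
    return strlen(bytes);   /* stops at the first zero byte */
}

/* Seven payload bytes with an embedded zero, plus the usual terminator. */
static const char sample[] = "abc\0def";

/* sizeof counts the payload and the terminator, so the payload is sizeof - 1. */
static const size_t sample_length = sizeof sample - 1;
```

Here length_seen_by_strlen(sample) yields 3 while sample_length is 7, which is why the length has to travel alongside the pointer instead of being recomputed from the bytes.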

A possible bug fix (in axe.mod/lua.mod/lua.bmx) could be:

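  ' external declaration: returns the raw C pointer and stores the string length in "size"
  function lua_tolstring:byte ptr (lua_state:byte ptr, index:int, size:int ptr)

  ' BlitzMAX wrapper: builds the result from pointer AND length, so embedded zeros survive
  function lua_tostring:String (lua_state:byte ptr, index:int)
    local Length:int
    local StringPtr:byte ptr = lua_tolstring(lua_state, index, VarPtr Length)
    if (StringPtr = null) then
      return null   ' the value at "index" could not be converted to a string
    else
      return String.fromBytes(StringPtr,Length)
    end if
  end function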

While this fix "breaks" the lua_tolstring function (which no longer returns a BlitzMAX string), it lets the lua_tostring function work as expected (and as described in the Lua Reference Manual).

lua_typename could remain unmodified as no Lua type has any ~0 in its name.

luaL_checklstring, luaL_gsub, luaL_optlstring and luaL_prepbuffer cannot be modified directly unless it is possible to define "aliases" for external functions and then define a BlitzMAX wrapper for each external function under its original name. luaL_optstring may be modified as shown above for lua_tostring.

Any comments?


Rozek(Posted 2007) [#2]
Simon, you've got mail ;-)

Just in case my mail gets lost (for whatever reason): I've built a new version of the axe.lua module - with the following changes:

- uses Lua 5.1.2 (instead of 5.1.1)
- fixes a few bugs concerning strings with embedded zeros
- adds a new "lua_tobytearray" function which retrieves a Lua string and yields a BlitzMAX byte array with the same contents (I need that function for my Lua-BlitzMAX interfaces)

I've tested the code under WinXP - and it seems to work. As I haven't added any platform-specific code, it should also work under MacOSX and Linux - but you are more of an expert in that area than I am...

The module (and all related (source) files) may be found at

http://www.andreas-rozek.de/BlitzMAX/lua.mod.zip

If you think it's worth publishing, please feel free to add it to your BlitzMAX module server...


skidracer(Posted 2007) [#3]
Andreas, it looks like my email has been down for the last couple of days due to a new blitzbasic.com server, please resend.


Rozek(Posted 2007) [#4]
Good morning - and Happy Easter!

I had a little time during the last few days and looked over axe.lua again - making the following (hopefully useful) decisions:

- wherever the Lua API expects string parameters without an additional length parameter, I left the "$z" suffix behind the string parameter name (you won't be able to provide strings with embedded zeros anyway)
- wherever the Lua API expects string parameters together with an additional length parameter, I changed the type of the "string" parameter back to "byte ptr" in order to avoid any conversions on that parameter
- wherever the Lua API returns a string without providing additional length information (using a modifiable length parameter), I left the "$z" suffix behind the function name (you won't be able to get strings with embedded zeros anyway)
- functions returning strings with additional length information (e.g., tolstring, checklstring, optlstring) usually have a companion without that length parameter (e.g., tostring, checkstring, optstring). Those "companion" functions have been implemented in BlitzMax, using BlitzMax strings (which *may* contain embedded zeros, as they carry their own length information): each one internally calls the "l" function of the Lua API and uses the returned length to construct the BlitzMax string result.

As a consequence you now get the whole functionality of the Lua API together with the comfort of BlitzMax:

- strings may act as "buffers" for binary data (as foreseen by Lua). Just use the "l" API functions in order to avoid any conversions on the string
- you may still provide strings with embedded zeros (e.g., because of a Unicode encoding) which act as containers for text rather than binary stuff - just use the normal string functions (those without "l")

Additionally, I provided "lua_tobytearray" and "lua_pushbytearray" functions which internally call "lua_tolstring" or "lua_pushlstring", respectively, but use byte arrays on the BlitzMax side - such arrays are probably better suited for "buffers" than strings.
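
The core of such a byte array function is a copy that trusts the reported length rather than a terminator. The following C fragment is only a sketch of that idea and not the code from the module: the type byte_array and the function bytes_from_buffer are hypothetical names, and the (pointer, size) pair merely has the same shape as what lua_tolstring reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte array type, standing in for a BlitzMax byte array. */
typedef struct {
    unsigned char *data;   /* owned by the caller; release with free() */
    size_t         count;  /* number of valid bytes in data */
} byte_array;

/* Copies exactly `size` bytes, embedded zeros included.
   strlen is never consulted, so nothing is cut off at a zero byte. */
byte_array bytes_from_buffer(const char *source, size_t size) {
    byte_array result = { NULL, 0 };
    assert(source != NULL || size == 0);
    if (size == 0) return result;
    result.data = malloc(size);
    if (result.data == NULL) return result;   /* allocation failed: empty result */
    memcpy(result.data, source, size);
    result.count = size;
    return result;
}
```

Called as bytes_from_buffer("abc\0def", 7), the sketch returns all seven bytes; the caller is responsible for freeing the data member afterwards.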

Well, I also

- applied several bug fixes
- added Lua API functions, which were still missing in the interface
- made modifications for Lua 5.1.2 (which is now part of axe.lua)

The complete module sources can (again) be found at

http://www.andreas-rozek.de/BlitzMAX/lua.mod.zip

The axe.lua file itself is also available at

http://www.andreas-rozek.de/BlitzMAX/lua.bmx

What I still don't know, however, is how BlitzMax converts its (internal) Unicode strings to/from Lua ASCII strings - are there any UTF-8 functions? If there are (i.e., if one can control the conversion), it will probably be worth looking at the Lua API functions with string parameters/results again and providing versions with and without UTF-8 conversion ("without" for all 7-bit ASCII users or those using an 8-bit character set such as ISO Latin-1, "with" for all other users or for internationalized software)
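
To make the "with UTF-8 conversion" variant more concrete, here is a rough C sketch of the encoding step. It rests on assumptions that are not confirmed anywhere in this thread: that the internal strings consist of 16-bit code units, and that only characters of the Basic Multilingual Plane occur (surrogate pairs are not combined). The function name utf16_bmp_to_utf8 is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes `count` 16-bit code units as UTF-8.
   `out` must provide room for 3 * count bytes.
   Returns the number of bytes written. Surrogate pairs are NOT combined. */
size_t utf16_bmp_to_utf8(const uint16_t *in, size_t count, unsigned char *out) {
    assert(in != NULL && out != NULL);
    size_t written = 0;
    for (size_t i = 0; i < count; i++) {
        uint16_t c = in[i];
        if (c < 0x80) {            /* 7-bit ASCII: one byte, unchanged */
            out[written++] = (unsigned char)c;
        } else if (c < 0x800) {    /* two-byte sequence */
            out[written++] = (unsigned char)(0xC0 | (c >> 6));
            out[written++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {                   /* three-byte sequence */
            out[written++] = (unsigned char)(0xE0 | (c >> 12));
            out[written++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[written++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return written;
}
```

Note that a zero character is encoded as a single zero byte, so even UTF-8 output may contain embedded zeros and its length still has to be passed on explicitly (for example to lua_pushlstring).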

Thus, the current version won't be the last one (sigh)


Rozek(Posted 2007) [#5]
Sorry for the many "deleted" postings:

I tried to post a message, but the server did not respond - surprisingly, I could *preview* my posting, but actually *posting* it did not work. Additionally, it did not appear in the thread (when I looked at it from a different computer), thus, I thought it would not have reached the server and tried to post again...and again after some minutes...and again...

If I could, I would completely delete the extra messages, but I could only delete their contents.

(Edit: some helpful ghost deleted those "extra messages", thus, this posting is no longer relevant - "thanks" to the "ghost")


Rozek(Posted 2007) [#6]
Now another "real" message:

the current axe.lua module has now been tested under WinXP and MacOSX - previously it had been tested under WinXP only.