String class and UTF8

secondgear(Posted 2013) [#1]
When I load a UTF-8 encoded text file, the iOS target doesn't handle 2-byte characters properly. Printing the string to the console, or performing any string manipulation on it (case conversion, substring, etc.), shows something else in place of these characters.
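For anyone unsure what "2-byte characters" means here: in UTF-8, each Cyrillic letter is stored as two bytes, so any code that treats one byte as one character will mangle it. Below is a minimal sketch of a decoder that turns UTF-8 bytes into code points. The name decodeUtf8 is made up for illustration and is not part of Monkey's lang.cpp, and the sketch assumes well-formed input with no validation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from Monkey's lang.cpp): decode UTF-8 bytes
// into Unicode code points. Assumes well-formed input; no validation.
std::vector<std::uint32_t> decodeUtf8(const std::string &bytes) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp;   // code point being assembled
        std::size_t extra;  // number of continuation bytes that follow
        if (b < 0x80)      { cp = b;        extra = 0; }  // 1-byte (ASCII)
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }  // 2-byte lead
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }  // 3-byte lead
        else               { cp = b & 0x07; extra = 3; }  // 4-byte lead
        ++i;
        // Each continuation byte contributes its low 6 bits.
        for (; extra > 0 && i < bytes.size(); --extra, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

Feeding it the 14 bytes of "русский" gives 7 code points, starting with U+0440. Every letter takes exactly two bytes.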

I don't know if the problem lies in String<->NSString conversion or the Monkey string class itself, but with my limited knowledge of C++ and Objective-C I cannot figure it out.

Does anybody have a solution? I'm still using ver.66, sorry :(


secondgear(Posted 2013) [#2]
Ok, here is a partial solution to my problem, in case anybody is interested.

Problem 1 (printing to console)
Already solved by Mark. I copied the Print implementation from ver.71c's lang.cpp, and console output is now what it's supposed to be.

Problem 2 (uppercase/lowercase conversion)
If you try something like
Print "русский".ToUpper()

it will print русский, unchanged.

Here is my ugly patch for the String implementation in lang.cpp (inefficient, but the best I could come up with)
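	String ToUpper()const{
#if __OBJC__
		// Objective-C++ builds (e.g. the iOS target): let NSString do the Unicode-aware case mapping.
		NSString *nss = this->ToNSString();
		NSString *uc = [nss uppercaseString];
		return String(uc);
#else
		// Other targets: the original implementation, which relies on C toupper().
		for( int i=0;i<rep->length;++i ){
			Char t=toupper( rep->data[i] );
			if( t==rep->data[i] ) continue;
			// First character that changes: copy the unchanged prefix, then convert the rest.
			Rep *p=Rep::alloc( rep->length );
			Char *q=p->data;
			t_memcpy( q,rep->data,i );
			for( q[i++]=t;i<rep->length;++i ){
				q[i]=toupper( rep->data[i] );
			}
			return String( p );
		}
		// Nothing changed, so return the original string.
		return *this;
#endif
	}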


With this patch, the example above actually prints РУССКИЙ.
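The #else branch still goes through C toupper(), which in the default "C" locale only maps a-z, so non-Objective-C targets would presumably still leave Cyrillic untouched. Here is a rough sketch of a portable fallback that works on already-decoded code points. The names upperCodePoint and upperString are made up for illustration and are not part of lang.cpp. It only covers ASCII and the Russian letters, so it is nowhere near a full Unicode case mapping.

```cpp
#include <string>

// Hypothetical helper: uppercase a single code point.
// Covers only ASCII a-z and Russian Cyrillic; everything else
// passes through unchanged.
char32_t upperCodePoint(char32_t c) {
    if (c >= U'a' && c <= U'z') return c - 0x20;      // a-z -> A-Z
    if (c >= 0x0430 && c <= 0x044F) return c - 0x20;  // U+0430..U+044F -> U+0410..U+042F
    if (c == 0x0451) return 0x0401;                   // small io -> capital io
    return c;
}

// Hypothetical helper: uppercase a whole string of code points.
std::u32string upperString(const std::u32string &s) {
    std::u32string out(s);
    for (char32_t &c : out) c = upperCodePoint(c);
    return out;
}
```

Calling upperString on the code points of "русский" gives the code points of "РУССКИЙ". Anything outside those two ranges (accented Latin, Greek, and so on) would need a real case-mapping table or a platform API like the NSString call above.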