Jump to content

Unicode in Leadwerks 5

Josh

1,527 views

I've begun implementing unicode in Leadwerks Game Engine 5.  It's not quite as simple as "switch all string variables to another data type".

First, I will give you a simple explanation of what unicode is.  I am not an expert so feel free to make any corrections in the comments below.

When computers first started drawing text we used a single byte for each character.  One byte can describe 256 different values and the English language only has 26 letters, 10 numbers, and a few other characters for punctuation so all was well.  No one cared or thought about supporting other languages like German with funny umlauts or the thousands of characters in the Chinese language.

Then some people who were too smart for their own good invented a really complicated system called unicode.  Unicode allows characters beyond the 256 character limit of a byte because it can use more than one byte per character.  But unicode doesn't really store a letter, because that would be too easy.  Instead it stores a "code point" which is an abstract concept.  Unfortunately the people who originally invented unicode were locked away in a mental asylum where they remain to this day, so no one in the real world actually understands what a code point is.

There are several kinds of unicode but the one favored by nerds who don't write software is UTF-8.  UTF-8 uses just one byte per character, unless it uses two, or sometimes four.  Because each character can be a different length there is no way to quickly get a single letter of a string.  It would be like trying to get a single byte of a compressed zip file; you have to decompress the entire file to read a byte at a certain position.  This means that commands like Replace(), Mid(), Upper(), and basically any other string manipulation commands simply will not work with UTF-8 strings.

Nonetheless, some people still promote UTF-8 religiously because they think it sounds cool and they don't actually write software.  There is even a UTF-8 Everywhere Manifesto.  You know who else had a manifesto?  This guy, that's who:

Karl_Marx.thumb.jpg.1ece92f31b80f2b01f9a2b0bd51b0381.jpg

Typical UTF-8 proponent.

Here's another person with a "manifesto":

Theodore_Kaczynski.jpg.c537d9a7ae01f7b53e773b19a7c0bc49.jpg

The Unabomber (unibomber? Coincidence???)

The fact is that anyone who writes a manifesto is evil, therefore UTF-8 proponents are evil and should probably be imprisoned for crimes against humanity.  Microsoft sensibly solved this problem by using something called a "wide string" for all the windows internals.  A C++ wide string (std::wstring) is a string made up of wchar_t values instead of char values.  (The std::string data type is sometimes called a "narrow string").  In C++ you can set the value of a wide string by placing a letter "L" (for long?) in front of the string:

std::wstring s = L"Hello, how are you today?";

The C++11 specification defines a wchar_t value as being composed of two bytes, so these strings work the same across different operating systems.  A wide string cannot display a character with an index greater than 65535, but no one uses those characters so it doesn't matter.  Wide strings are basically a different kind of unicode called UTF-16 and these will actually work with string manipulation commands (yes there are exceptions if you are trying to display ancient Vietnamese characters from the 6th century but no one cares about that).

For more detail you can read this article about the technical details and history of unicode (thanks @Einlander).

First Pass

At first I thought "no problem, I will just turn all string variables into wstrings and be done with it".  However, after a couple of days it became clear that this would be problematic.  Leadwerks interfaces with a lot of third-party libraries like Steamworks and Lua that make heavy use of strings.  Typically these libraries will accept a chr* value for the input, which we know might be UTF-8 or it might not (another reason UTF-8 is evil).  The engine ended up with a TON of string conversions that I might be doing for no reason.  I got the compiler down to 2991 errors before I started questioning whether this was really needed.

Exactly what do we need unicode strings for?  There are three big uses:

  • Read and save files.
  • Display text in different languages.
  • Print text to the console and log.

Reading files is mostly an automatic process because the user typically uses relative file paths.  As long as the engine internally uses a wide string to load files the user can happily use regular old narrow strings without a care in the world (and most people probably will).

Drawing text to the screen or on a GUI widget is very important for supporting different languages, but that is only one use.  Is it really necessary to convert every variable in the engine to a wide string just to support this one feature?

Printing strings is even simpler.  Can't we just add an overload to print a wide string when one is needed?

I originally wanted to avoid mixing wide and narrow strings, but even with unicode support most users are probably not even going to need to worry about using wide strings at all.  Even if they have language files for different translations of their game, they are still likely to just load some strings automatically without writing much code.  I may even add a feature that does this automatically for displayed text.  So with that in mind, I decided to roll everything back and convert only the parts of the engine that would actually benefit from unicode and wide strings.

Second Try + Global Functions

To make the API simpler Leadwerks 5 will make use of some global functions instead of trying to turn everything into a class.  Below are the string global functions I have written:

std::string String(const std::wstring& s);
std::string Right(const std::string& s, const int length);
std::string Left(const std::string& s, const int length);
std::string Replace(const std::string& s, const std::string& from, const std::string& to);
int Find(const std::string& s, const std::string& token);
std::vector<std::string> Split(const std::string& s, const std::string& sep);
std::string Lower(const std::string& s);
std::string Upper(const std::string& s);

There are equivalent functions that work with wide strings.

std::wstring StringW(const std::string& s);
std::wstring Right(const std::wstring& s, const int length);
std::wstring Left(const std::wstring& s, const int length);
std::wstring Replace(const std::wstring& s, const std::wstring& from, const std::wstring& to);
int Find(const std::string& s, const std::wstring& token);
std::vector<std::wstring> Split(const std::wstring& s, const std::wstring& sep);
std::wstring Lower(const std::wstring& s);
std::wstring Upper(const std::wstring& s);

The System::Print() command has become a global Print() command with a couple of overloads for both narrow and wide strings:

void Print(const std::string& s);
void Print(const std::wstring& s);

The file system commands are now global functions as well.  File system commands can accept a wide or narrow string, but any functions that return a path will always return a wide string:

std::wstring SpecialDir(const std::string);
std::wstring CurrentDir();
bool ChangeDir(const std::string& path);
bool ChangeDir(const std::wstring& path);
std::wstring RealPath(const std::string& path);
std::wstring RealPath(const std::wstring& path);

This means if you call ReadFile("info.txt") with a narrow string the file will still be loaded even if it is located somewhere like "C:/Users/约书亚/Documents" and it will work just fine.  This is ideal since Lua 5.3 doesn't support wide strings, so your game will still run on computers around the world as long as you just use local paths like this:

LoadModel("Models/car.mdl");

Or you can specify the full path with a wide string:

LoadModel(CurrentDir() + L"Models/car.mdl");

The window creation and text drawing functions will also get an overload that accepts wide strings.  Here's a window created with a Chinese title:

Image1.jpg.987ae08e739ae771c6a6bda0bef35ec0.jpg

So in conclusion, unicode will be used in Leadwerks and will work for the most part without you needing to know or do anything different, allowing games you develop (and Leadwerks itself) to work correctly on computers all across the world.



5 Comments


Recommended Comments

Great stuff. It'll be useful for multiplayer, especially game applications that uses the Steamworks API since a lot of Steam users have wingdings on their names, the symbols disappear when their names show on chat box in the current version of LE. 

Share this comment


Link to comment

Worked all day on it, and now both these lines of code will successfully load a model;.  The first uses the wide string function overload and the second uses a UTF-8 narrow string which is then converted into a wide string by the engine:

auto model = LoadModel(L"Models/汽车/formation1.mdl");
auto model = LoadModel(u8"Models/汽车/formation1.mdl");

This code, however, will not work:

auto model = LoadModel("Models/汽车/formation1.mdl");

But this will:

ChangeDir(L"Models/汽车");
auto model = LoadModel("formation1.mdl");

 

Share this comment


Link to comment

Are you wrapping your definitions in #extern "C"{ }?

Would be cool to use something like Golang with Leadwerks. Performance might suck, but it would be worth a shot IMO.

Share this comment


Link to comment

It would be interesting if the asian market latches on to Leadwerks since it is a lower long term cost than Unity and Unreal. Thats a huge market. Them and russia.

Share this comment


Link to comment
2 hours ago, Einlander said:

It would be interesting if the asian market latches on to Leadwerks since it is a lower long term cost than Unity and Unreal. Thats a huge market. Them and russia.

Russia is actually our #2 or 3 market after the U.S, if you don't count western Europe as one region.

Share this comment


Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Add a comment...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Blog Entries

    • By Josh in Josh's Dev Blog 4
      Previously I described how multiple cameras can be combined in the new renderer to create an unlimited depth buffer. That discussion lead into multi-world rendering and 2D drawing. Surprisingly, there is a lot of overlap in these features, and it makes sense to solve all of it at one time.
      Old 2D rendering systems are designed around the idea of storing a hierarchy of state changes. The renderer would crawl through the hierarchy and perform commands as it went along, rendering all 2D elements in the order they should appear. It made sense for the design of the first graphics cards, but this style of rendering is really inefficient on modern graphics hardware. Today's hardware works best with batches of objects, using the depth buffer to handle which object appears on top. We don't sort 3D objects back-to-front because it would be monstrously inefficient, so why should 2D graphics be any different?
      We can get much better results if we use the same fast rendering techniques we use for 3D graphics and apply it to 2D shapes. After all, the only difference between 3D and 2D rendering is the shape of the camera projection matrix. For this reason, Turbo Engine will use 2D-in-3D rendering for all 2D drawing. You can render a pure 2D scene by setting the camera projection mode to orthographic, or you can create a second orthographic camera and render it on top of your 3D scene. This has two big implications:
      Performance will be incredibly fast. I predict 100,000 uniquely textured sprites will render pretty much instantaneously. In fact anyone making a 2D PC game who is having trouble with performance will be interested in using Turbo Engine. Advanced 3D effects will be possible that we aren't used to seeing in 2D. For example, lighting works with 2D rendering with no problems, as you can see below. Mixing of 3D and 2D elements will be possible to make inventory systems and other UI items. Particles and other objects can be incorporated into the 2D display.
      The big difference you will need to adjust to is there are no 2D drawing commands. Instead you have persistent objects that use the same system as the 3D rendering.
      Sprites
      The primary 2D element you will work with is the Sprite entity, which works the same as the 3D sprites in Leadwerks 4. Instead of drawing rectangles in the order you want them to appear, you will use the Z position of each entity and let the depth buffer take care of the rest, just like we do with 3D rendering. I also am adding support for animation frames and other features, and these can be used with 2D or 3D rendering.

      Rotation and scaling of sprites is of course trivial. You could even use effects like distance fog! Add a vector joint to each entity to lock the Z axis in the same direction and Newton will transform into a nice 2D physics system.
      Camera Setup
      By default, with a zoom value of 1.0 an orthographic camera maps so that one meter in the world equals one screen pixel. We can position the camera so that world coordinates match screen coordinates, as shown in the image below.
      auto camera = CreateCamera(world); camera->SetProjectionMode(PROJECTION_ORTHOGRAPHIC); camera->SetRange(-1,1); iVec2 screensize = framebuffer->GetSize(); camera->SetPosition(screensize.x * 0.5, -screensize.y * 0.5); Note that unlike screen coordinates in Leadwerks 4, world coordinates point up in the positive direction.

      We can create a sprite and reset its center point to the upper left hand corner of the square like so:
      auto sprite = CreateSprite(world); sprite->mesh->Translate(0.5,-0.5,0); sprite->mesh->Finalize(); sprite->UpdateBounds(); And then we can position the sprite in the upper left-hand corner of the screen and scale it:
      sprite->SetColor(1,0,0); sprite->SetScale(200,50); sprite->SetPosition(10,-10,0);
      This would result in an image something like this, with precise alignment of screen pixels:

      Here's an idea: Remember the opening sequence in Super Metroid on SNES, when the entire world starts tilting back and forth? You could easily do that just by rotating the camera a bit.
      Displaying Text
      Instead of drawing text with a command, you will create a text model. This is a series of rectangles of the correct size with their texture coordinates set to display a letter for each rectangle. You can include a line return character in the text, and it will create a block of multiple lines of text in one object. (I may add support for text made out of polygons at a later time, but it's not a priority right now.)
      shared_ptr<Model> CreateText(shared_ptr<World> world, shared_ptr<Font> font, const std::wstring& text, const int size) The resulting model will have a material with the rasterized text applied to it, shown below with alpha blending disabled so you can see the mesh background. Texture coordinates are used to select each letter, so the font only has to be rasterized once for each size it is used at:

      Every piece of text you display needs to have a model created for it. If you are displaying the framerate or something else that changes frequently, then it makes sense to create a cache of models you use so your game isn't constantly creating new objects. If you wanted, you could modify the vertex colors of a text model to highlight a single word.

      And of course all kinds of spatial transformations are easily achieved.

      Because the text is just a single textured mesh, it will render very fast. This is a big improvement over the DrawText() command in Leadwerks 4, which performs one draw call for each character.
      The font loading command no longer accepts a size. You load the font once and a new image will be rasterized for each text size the engine requests internally:
      auto font = LoadFont("arial.ttf"); auto text = CreateText(foreground, font, "Hello, how are you today?", 18); Combining 2D and 3D
      By using two separate worlds we can control which items the 3D camera draws and which item 2D camera draws: (The foreground camera will be rendered on top of the perspective camera, since it is created after it.) We need to use a second camera so that 2D elements are rendered in a second pass with a fresh new depth buffer.
      //Create main world and camera auto world = CreateWorld(); auto camera = CreateCamera(world); auto scene = LoadScene(world,"start.map"); //Create world for 2D rendering auto foreground = CreateWorld() auto fgcam = CreateCamera(foreground); fgcam->SetProjection(PROJECTION_ORTHOGRAPHIC); fgcam->SetClearMode(CLEAR_DEPTH); fgcam->SetRange(-1,1); auto UI = LoadScene(foreground,"UI.map"); //Combine rendering world->Combine(foreground); while (true) { world->Update(); world->Render(framebuffer); } Overall, this will take more work to set up and get started with than the simple 2D drawing in Leadwerks 4, but the performance and additional control you get are well worth it. This whole approach makes so much sense to me, and I think it will lead to some really cool possibilities.
      As I have explained elsewhere, performance has replaced ease of use as my primary design goal. I like the results I get with this approach because I feel the design decisions are less subjective.
    • By Josh in Josh's Dev Blog 25
      Current generation graphics hardware only supports up to a 32-bit floating point depth buffer, and that isn't adequate for large-scale rendering because there isn't enough precision to make objects appear in the correct order and prevent z-fighting.

      After trying out a few different approaches I found that the best way to support large-scale rendering is to allow the user to create several cameras. The first camera should have a range of 0.1-1000 meters, the second would use the same near / far ratio and start where the first one left off, with a depth range of 1000-10,000 meters. Because the ratio of near to far ranges is what matters, not the actual distance, the numbers can get very big very fast. A third camera could be added with a range out to 100,000 kilometers!
      The trick is to set the new Camera::SetClearMode() command to make it so only the furthest-range camera clears the color buffer. Additional cameras clear the depth buffer and then render on top of the previous draw. You can use the new Camera::SetOrder() command to ensure that they are drawn in the order you want.
      auto camera1 = CreateCamera(world); camera1->SetRange(0.1,1000); camera1->SetClearMode(CLEAR_DEPTH); camera1->SetOrder(1); auto camera2 = CreateCamera(world); camera2->SetRange(1000,10000); camera2->SetClearMode(CLEAR_DEPTH); camera2->SetOrder(2); auto camera3 = CreateCamera(world); camera3->SetRange(10000,100000000); camera3->SetClearMode(CLEAR_COLOR | CLEAR_DEPTH); camera3->SetOrder(3); Using this technique I was able to render the Earth, sun, and moon to-scale. The three objects are actually sized correctly, at the correct distance. You can see that from Earth orbit the sun and moon appear roughly the same size. The sun is much bigger, but also much further away, so this is exactly what we would expect.

      You can also use these features to render several cameras in one pass to show different views. For example, we can create a rear-view mirror easily with a second camera:
      auto mirrorcam = CreateCamera(world); mirrorcam->SetParent(maincamera); mirrorcam->SetRotation(0,180,0); mirrorcam=>SetClearMode(CLEAR_COLOR | CLEAR_DEPTH); //Set the camera viewport to only render to a small rectangle at the top of the screen: mirrorcam->SetViewport(framebuffer->GetSize().x/2-200,10,400,50); This creates a "picture-in-picture" effect like what is shown in the image below:

      Want to render some 3D HUD elements on top of your scene? This can be done with an orthographic camera:
      auto uicam = CreateCamera(world); uicam=>SetClearMode(CLEAR_DEPTH); uicam->SetProjectionMode(PROJECTION_ORTHOGRAPHIC); This will make 3D elements appear on top of your scene without clearing the previous render result. You would probably want to move the UI camera far away from the scene so only your HUD elements appear in the last pass.
    • By Marcousik in Marcousik's Creations Blog 5
      I updated the moto I was working on and now the physics let having fun with this.
       
×
×
  • Create New...