Jump to content

Getting Started with Vulkan



The latest design of my OpenGL renderer using bindless textures has some problems, and although these can be resolved, I think I have hit the limit on how useful an initial OpenGL implementation will be for the new engine. I decided it was time to dive into the Vulkan API. This is sort of scary, because I feel like it sets me back quite a lot, but at the same time the work I do with this will carry forward much better. A Vulkan-based renderer can run on Windows, Linux, Mac, iOS, Android, PS4, and Nintendo Switch.

So far my impressions of the API are pretty good. Although it is very verbose, it gives you a lot of control over things that were previously undefined or vendor-specific hacks. Below is code that initializes Vulkan and chooses a rendering device, with a preference for discrete GPUs over integrated graphics.

VkInstance inst;
VkResult res;
VkDevice device;

VkApplicationInfo appInfo = {};
appInfo.pApplicationName = "MyGame";
appInfo.applicationVersion = VK_MAKE_VERSION(1, 0, 0);
appInfo.pEngineName = "TurboEngine";
appInfo.engineVersion = VK_MAKE_VERSION(1, 0, 0);
appInfo.apiVersion = VK_API_VERSION_1_0;

// Get extensions
uint32_t extensionCount = 0;
vkEnumerateInstanceExtensionProperties(nullptr, &extensionCount, nullptr);
std::vector<VkExtensionProperties> availableExtensions(extensionCount);
vkEnumerateInstanceExtensionProperties(nullptr, &extensionCount, availableExtensions.data());
std::vector<const char*> extensions;

VkInstanceCreateInfo createInfo = {};
createInfo.pApplicationInfo = &appInfo;
createInfo.enabledExtensionCount = (uint32_t)extensions.size();
createInfo.ppEnabledExtensionNames = extensions.data();

#ifdef DEBUG
createInfo.enabledLayerCount = 1;
const char* DEBUG_LAYER = "VK_LAYER_LUNARG_standard_validation";
createInfo.ppEnabledLayerNames = &DEBUG_LAYER;

res = vkCreateInstance(&createInfo, NULL, &inst);
	std::cout << "cannot find a compatible Vulkan ICD\n";
else if (res)
	std::cout << "unknown error\n";

//Enumerate devices
uint32_t gpu_count = 1;
std::vector<VkPhysicalDevice> devices;
res = vkEnumeratePhysicalDevices(inst, &gpu_count, NULL);
if (gpu_count > 0)
	res = vkEnumeratePhysicalDevices(inst, &gpu_count, &devices[0]);
	assert(!res && gpu_count >= 1);

//Sort list with discrete GPUs at the beginning
std::vector<VkPhysicalDevice> sorteddevices;
for (int n = 0; n < devices.size(); n++)
	VkPhysicalDeviceProperties deviceprops = VkPhysicalDeviceProperties{};
	vkGetPhysicalDeviceProperties(devices[n], &deviceprops);
	if (deviceprops.deviceType == VkPhysicalDeviceType::VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU)
devices = sorteddevices;

VkDeviceQueueCreateInfo queue_info = {};
unsigned int queue_family_count;

for (int n = 0; n < devices.size(); ++n)
	vkGetPhysicalDeviceQueueFamilyProperties(devices[n], &queue_family_count, NULL);
	if (queue_family_count >= 1)
		std::vector<VkQueueFamilyProperties> queue_props;
		vkGetPhysicalDeviceQueueFamilyProperties(devices[n], &queue_family_count, queue_props.data());

		if (queue_family_count >= 1)
			bool found = false;
			for (int i = 0; i < queue_family_count; i++)
				if (queue_props[i].queueFlags & VK_QUEUE_GRAPHICS_BIT)
					queue_info.queueFamilyIndex = i;
					found = true;
			if (!found) continue;

			float queue_priorities[1] = { 0.0 };
			queue_info.pNext = NULL;
			queue_info.queueCount = 1;
			queue_info.pQueuePriorities = queue_priorities;

			VkDeviceCreateInfo device_info = {};
			device_info.pNext = NULL;
			device_info.queueCreateInfoCount = 1;
			device_info.pQueueCreateInfos = &queue_info;
			device_info.enabledExtensionCount = 0;
			device_info.ppEnabledExtensionNames = NULL;
			device_info.enabledLayerCount = 0;
			device_info.ppEnabledLayerNames = NULL;
			device_info.pEnabledFeatures = NULL;

			res = vkCreateDevice(devices[n], &device_info, NULL, &device);
			if (res == VK_SUCCESS)
				VkPhysicalDeviceProperties deviceprops = VkPhysicalDeviceProperties{};
				vkGetPhysicalDeviceProperties(devices[n], &deviceprops);
				std::cout << deviceprops.deviceName;

				vkDestroyDevice(device, NULL);

vkDestroyInstance(inst, NULL);


  • Like 3


Recommended Comments

Do you already know if it will be faster than OpenGL and if so, by how much?  Or is that subject to testing?

Share this comment

Link to comment
5 hours ago, gamecreator said:

Do you already know if it will be faster than OpenGL and if so, by how much?  Or is that subject to testing?

Our new engine makes speed gains in a different way and I don't think Vulkan will make anything faster at all, honestly, over what the engine could do with OpenGL 4.3. We already have zero driver overhead, but Vulkan is more widely supported and the brand is synonymous with speed in the customer's mind. I know this because when I talk to people in person the first thing they say is "does it use Vulkan?"

It will be a lot faster than Leadwerks 4 but so was the OpenGL implementation.

Using Vulkan does not automatically make anything faster. No one will listen to me if I try to argue with that, which is fine, because I am just going to let everyone else mess up and I will make the fastest engine.

  • Confused 1

Share this comment

Link to comment

The best wishes in this project will surely be very pleasant results for users like us who are not real programmers.  :)

Share this comment

Link to comment

No, I don't consider myself a real programmer, that is to say the magic is done by the engine, I have no idea how to create a light, I have no idea how that light casts shadows on a static or dynamic mesh, I think that in this era of globalization we are made to believe that we are programmers when we use powerful tools such as Leadwerks, where the greatest work is already done, on the other hand we are users, creators of possible games, but the programmer in all this riddle is really you, the one who has the necessary tools for people like us to think we are programmers and create something fun.  

Translated with www.DeepL.com/Translator

Share this comment

Link to comment
1 hour ago, Josh said:

Using Vulkan does not automatically make anything faster.

This could be true.  I did some searches and one response said that it depends on if you can put it to use for your engine or game.  Wiki says something similar:


Vulkan is intended to offer higher performance and more balanced CPU/GPU usage ... Vulkan is said to induce anywhere from a marginal to polynomial speedup in run time relative to other APIs if implemented properly on the same hardware


Share this comment

Link to comment

This is really complicated stuff, for really no good reason. I'm 600 lines into it and still can't even make a blue screen. :blink:

  • Confused 1

Share this comment

Link to comment

That confirms my suspicions, you are a programmer, I play to believe that I am, I know that you will succeed. 

Share this comment

Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Add a comment...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Blog Entries

    • By Josh in Josh's Dev Blog 0
      Textures in Leadwerks don't actually store any pixel data in system memory. Instead the data is sent straight from the hard drive to the GPU and dumped from memory, because there is no reason to have all that data sitting around in RAM. However, I needed to implement texture saving for our terrain system so I implemented a simple "Pixmap" class for handling image data:
      class Pixmap : public SharedObject { VkFormat m_format; iVec2 m_size; shared_ptr<Buffer> m_pixels; int bpp; public: Pixmap(); const VkFormat& format; const iVec2& size; const shared_ptr<Buffer>& pixels; virtual shared_ptr<Pixmap> Copy(); virtual shared_ptr<Pixmap> Convert(const VkFormat format); virtual bool Save(const std::string& filename, const SaveFlags flags = SAVE_DEFAULT); virtual bool Save(shared_ptr<Stream>, const std::string& mimetype = "image/vnd-ms.dds", const SaveFlags flags = SAVE_DEFAULT); friend shared_ptr<Pixmap> CreatePixmap(const int, const int, const VkFormat, shared_ptr<Buffer> data); friend shared_ptr<Pixmap> LoadPixmap(const std::wstring&, const LoadFlags); }; shared_ptr<Pixmap> CreatePixmap(const int width, const int height, const VkFormat format = VK_FORMAT_R8G8B8A8_UNORM, shared_ptr<Buffer> data = nullptr); shared_ptr<Pixmap> LoadPixmap(const std::wstring& path, const LoadFlags flags = LOAD_DEFAULT); You can convert a pixmap from one format to another in order to compress raw RGBA pixels into BCn compressed data. The supported conversion formats are very limited and are only being implemented as they are needed. Pixmaps can be saved as DDS files, and the same rules apply. Support for the most common formats is being added.
      As a result, the terrain system can now save out all processed images as DDS files. The modern DDS format supports a lot of pixel formats, so even heightmaps can be saved. All of these files can be easily viewed in Visual Studio itself. It's by far the most reliable DDS viewer, as even the built-in Windows preview function is missing support for DX10 formats. Unfortunately there's really no modern DDS viewer application like the old Windows Texture Viewer.

      Storing terrain data in an easy-to-open standard texture format will make development easier for you. I intend to eliminate all "black box" file formats so all your game data is always easily viewable in a variety of tools, right up until the final publish step.
    • By Josh in Josh's Dev Blog 1
      I wanted to see if any of the terrain data can be compressed down, mostly to reduce GPU memory usage. I implemented some fast texture compression algorithms for BC1, BC3, BC4, BC5, and BC7 compression. BC6 and BC7 are not terribly useful in this situation because they involve a complex lookup table, so data from different textures can't be mixed and matched. I found two areas where texture compression could be used, in alpha layers and normal maps. I implemented BC3 compression for terrain alpha and could not see any artifacts. The compression is very fast, always less than one second even with the biggest textures I would care to use (4096 x 4096).
      For normals, BC1 (DXT1 and BC3 (DXT5) produce artifacts: (I accidentally left tessellation turned on high in these shots, which is why the framerate is low):

      BC5 gives a better appearance on this bumpy area and closely matches the original uncompressed normals. BC5 takes 1 byte per pixel, one quarter the size of uncomompressed RGBA. However, it only supports two channels, so we need one texture for normals and another for tangents, leaving us with a total 50% reduced size.

      Here are the results:
      2048 x 2048 Uncompressed Terrain:
      Heightmap = 2048 * 2048 * 2 = 8388608 Normal / tangents map = 16777216 Secret sauce = 67108864 Secret sauce 2 = 16777216 Total = 104 MB 2048 x 2048 Compressed Terrain:
      Heightmap = 2048 * 2048 * 2 = 8388608 Normal map = 4194304 Tangents = 4194304 Secret sauce = 16777216 Secret sauce 2 = 16777216 Total = 48 MB Additionally, for editable terrain an extra 32 MB of data needs to be stored, but this can be dumped once the terrain is made static. There are other things you can do to reduce the file size but it would not change the memory usage, and processing time is very high for "super-compression" techniques. I investigated this thoroughly and found the best compression methods for this situation that are pretty much instantaneous with no noticeable loss of quality, so I am satisfied.
    • By jen in jen's Blog 0
      My small project will be called Foregate, it will be a dark medieval Diablo style single player action RPG.
      The graphics will be simple, no PBR, 256x256 map, reasonably low-res models.
      Camera style? Top-down-ish I think? Like in Diablo exactly - and because the camera is not directly in-front of the 3 models, I can get away with low-resolution assets - bonus. Also, with top-down view, I won't have to worry about high resolution sky-boxes. 
      What's my plan for this project?
      I plan to make this project as small and as simple as possible, possibly release it as open-source, and have fun with it of course.
      My previous experience with game development (1-2 years ago?) was amateurish I think, still is now. I want to give it a go again, this time with experience although my skill in C++ is not really that good? Maybe I can improve it in this project.
      More about the game
      The content is not set in stone yet but I have a general idea of how the mechanics is going to look and feel - Diablo-ish obviously. It'll have monsters (ancient & mythical probably), loot when killing a monster, gold as in-game currency, visual grid inventory, player stats (level, strength, agility, vitality, energy, &c.). 
      The game will be single-player. Possibly a coop multiplayer also? I don't have any interest in making massive multi-player. 
      I started my development yesterday with the basic preparations (setting up project environment, &c.), today I made my first step in developing the core components; worker class, game state, task class.
      I have a game state that keeps a single source of truth for the entire application; all game data will be stored in this class as "states". 
      I also have a "Worker" which will do the processing of tasks in the game.
      I also have "object" class, this can be a monster, the player, a weapon, a prop, or an NPC.
      So the idea is to have a CQRS type of interaction between the classes and the data. Any action in the game will be interpreted as "Task" for the Worker class. The worker class iterates through the Task. Tasks can be created by any class interfaced with the Worker class trough "addNewTask" and the new tasks can be of a certain type i.e.: ATTACK, IDLE, SAVE_GAME, EXIT_GAME, the new task will also have a payload data and it's processed according to its task type e.g. an ATTACK with payload "{ Damage: 10, Target: MonsterA }" will reduce the health of MonsterA by 10 - the worker class will change the game state; find MonsterA in MonsterState and reduce its health by 10. 
      I think it's advantageous to have this type of centralized module where all actions are processed; I can do all sorts of procedures during the processes, maybe debug data, filter actions, mutate payloads, and such.
      How much time am I going to put into this?
      A couple of hours a day for 3 days a week maybe.
      So it's all a rough sketch for now and it's heading the right direction. I'll have more to report later on. 

      This is Forgate Castle, minus the castle, in the map Forgate; the starting location for the player. The fortification will have merchants, and quest givers.
  • Create New...