Pigeon Devlog #2: A Cleaner Core, Faster Draws, and Fighting Bugs

Hello again!

For the last two weeks, I have been focused on the heart of my Vulkan renderer, "Pigeon." My goal was to refactor the core engine to make it cleaner, easier to understand, and ready for new features.

When you work with Vulkan, it is very easy to make a complicated mess. I wanted to fix this early.

My new design is much simpler:

  • Context and Device: All the Vulkan setup (like choosing the GPU and creating the logical device) lives in one place.
  • Resources and Lifetime: All the logic for creating and destroying buffers, images, and other resources is now managed separately.
  • The Frame Loop: The main loop is now very clean. It only contains the logic for rendering a frame, not managing resources.

This new structure makes it much easier to think about synchronization (making sure the CPU and GPU work together) and memory.

For the swapchain, I am using triple buffering. Each of the three swapchain images has its own command buffer and its own set of uniform buffers. To manage the uniform buffers, I am using a ring buffer. This is a great way to give each frame in flight its own copy of the data without reallocating buffers every frame.
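To make the ring buffer idea concrete, here is a minimal sketch in plain C++ with no Vulkan calls. The names `FrameRing` and `FrameUniforms` are made up for this post and are not Pigeon's real types. In the real renderer each slot would wrap a buffer handle and its mapped memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical per-frame payload; a real slot would hold a VkBuffer
// and a pointer to its mapped memory instead of raw floats.
struct FrameUniforms {
    float viewProj[16];
};

// Fixed-size ring: one slot per frame in flight.
template <typename T, std::size_t N>
class FrameRing {
    static_assert(N > 0, "ring needs at least one slot");

public:
    // Slot owned by the frame that is currently being recorded.
    T& current() { return slots_[index_]; }

    // Direct access to a slot, mostly useful for debugging.
    T& slot(std::size_t i) {
        assert(i < N);
        return slots_[i];
    }

    std::size_t index() const { return index_; }

    // Must be called exactly once per frame, after the frame is submitted.
    // Skipping this call means the next frame writes into a slot that the
    // GPU may still be reading.
    void advance() { index_ = (index_ + 1) % N; }

private:
    std::array<T, N> slots_{};
    std::size_t index_ = 0;
};
```

With `FrameRing<FrameUniforms, 3>`, the frame loop writes into `current()`, submits, and then calls `advance()`. After three frames the index wraps back to slot 0, which is only safe to reuse once the fence for that older frame has signaled.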

Visual Parts

With a stable core, I could finally add some visuals!

  • Lighting: I implemented basic Blinn-Phong shading. It is a classic and fast lighting model. Right now, it works with one point light. The materials are simple, but the pipeline is stable.
    Bistro Scene without Blinn-Phong Shading
    Bistro Scene with Blinn-Phong Shading
  • Shadows: I added a shadow map pass. This is done with a "depth-only" pipeline, which is a fast pass that only writes depth information from the light's point of view. I checked that my shadow projection and bias settings are correct, so the shadows look right.
    Bistro Scene without Shadow
    Bistro Scene with Shadow
  • Anti-Aliasing (MSAA): I enabled Multisample Anti-Aliasing (MSAA) in the main render pass. This makes the edges of objects look much smoother.

No MSAA

MSAA 8x

This MSAA part is a good example of a "Vulkan rule." You cannot sample a multisampled image like a regular texture in a shader. You must first resolve it. This means the multisampled image is resolved into a normal, single-sampled image. Then, I can use that resolved image for post-processing or sampling later.
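Going back to the lighting bullet above, the Blinn-Phong terms for one light can be sketched on the CPU like this. It is only an illustration of the math, not my actual shader. The `Vec3` helpers and the `blinnPhong` function are made up for this example, and it leaves out light color, attenuation, and material parameters other than shininess.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns the input unchanged if its length is zero, to avoid dividing by zero.
Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(dot(v, v));
    if (len == 0.0f) return v;
    return {v.x / len, v.y / len, v.z / len};
}

struct Lighting {
    float diffuse;   // N.L term
    float specular;  // (N.H)^shininess term
};

// n: surface normal, toLight: surface -> light, toView: surface -> camera.
Lighting blinnPhong(Vec3 n, Vec3 toLight, Vec3 toView, float shininess) {
    const Vec3 N = normalize(n);
    const Vec3 L = normalize(toLight);
    const Vec3 V = normalize(toView);

    const float nDotL = std::max(dot(N, L), 0.0f);

    // Blinn-Phong uses the half vector between light and view directions
    // instead of the reflection vector used by classic Phong.
    const Vec3 H = normalize(L + V);
    const float nDotH = std::max(dot(N, H), 0.0f);

    // No highlight on surfaces that face away from the light.
    const float spec = (nDotL > 0.0f) ? std::pow(nDotH, shininess) : 0.0f;
    return {nDotL, spec};
}
```

The half vector is the reason the model is cheap: there is no reflection vector to compute, only one extra normalize and a dot product per light.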

Speeding Up with MDI and Async Texture Uploads

This is the part I am most excited about. I wanted to reduce the CPU work and really use the GPU.

  • Multi-Draw Indirect (MDI)

Before, to draw 1000 objects, my CPU had to issue 1000 separate draw calls. This is a lot of "chatter" and keeps the CPU busy.

Vulkan lets us do better with Multi-Draw Indirect (MDI).

Now, I do this instead:

  1. On the CPU, I create a buffer (a list) of all the draw commands.
  2. I send this one buffer to the GPU.
  3. I tell the GPU, "Here is a list of 1000 things to draw. You iterate the list and do the work."

The CPU is now free! It just builds the list once and submits it. The GPU does the rest.
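Step 1 above can be sketched without any Vulkan headers. The struct below mirrors the field layout of `VkDrawIndexedIndirectCommand`. The `MeshRange` type and the `buildDrawList` function are names I made up for this example. It also assumes all meshes live in one shared vertex buffer and one shared index buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same fields, in the same order, as VkDrawIndexedIndirectCommand.
// Redeclared here only so the sketch compiles without the Vulkan headers.
struct DrawIndexedIndirectCommand {
    std::uint32_t indexCount;
    std::uint32_t instanceCount;
    std::uint32_t firstIndex;
    std::int32_t  vertexOffset;
    std::uint32_t firstInstance;
};
static_assert(sizeof(DrawIndexedIndirectCommand) == 20,
              "indirect commands must be tightly packed");

// Hypothetical record of where one mesh lives inside the shared buffers.
struct MeshRange {
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  vertexOffset;
};

// Builds one indirect command per mesh. The result is what gets copied
// into the GPU buffer that the indirect draw reads from.
std::vector<DrawIndexedIndirectCommand> buildDrawList(
        const std::vector<MeshRange>& meshes) {
    std::vector<DrawIndexedIndirectCommand> cmds;
    cmds.reserve(meshes.size());
    for (std::size_t i = 0; i < meshes.size(); ++i) {
        const MeshRange& m = meshes[i];
        // firstInstance is used as a per-draw ID so the shader can look up
        // per-object data (transform, material) for this draw.
        cmds.push_back({m.indexCount, 1u, m.firstIndex, m.vertexOffset,
                        static_cast<std::uint32_t>(i)});
    }
    return cmds;
}
```

On the Vulkan side, this list goes into a buffer created with `VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT`, and one `vkCmdDrawIndexedIndirect` call consumes it with `drawCount` set to the number of commands and `stride` set to the struct size. A `drawCount` above 1 needs the `multiDrawIndirect` device feature, and a non-zero `firstInstance` needs `drawIndirectFirstInstance`.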

The performance difference is huge: in my simple test scene at 1920x1080, the normal "bindful" approach gave me ~560 FPS, but with MDI, it jumped to ~1300 FPS! (Without MSAA, of course.) This fits the Vulkan philosophy: record work once, submit efficiently, and keep the CPU out of the hot path.

  • Asynchronous Data Uploads

The next bottleneck was uploading new textures. This could cause Pigeon to "stall" or freeze for a moment.

Vulkan gives us a powerful solution: Queues. Think of queues like different lanes in a supermarket.

  • The Graphics Queue is the main lane, doing all the drawing and rendering.
  • The Transfer Queue is a special, separate lane just for moving data.

I built an async data uploader that uses this Transfer Queue. When I need to upload a new texture, I put the work in this separate lane. It happens in the background, and the main Graphics Queue never stops. The main thread is not blocked, and uploading data no longer stalls the frame.
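Finding that separate lane is the first step. Here is a minimal sketch of how a transfer queue family could be picked. It works on plain flag values so it compiles without Vulkan headers. The constants have the same numeric values as `VK_QUEUE_GRAPHICS_BIT`, `VK_QUEUE_COMPUTE_BIT`, and `VK_QUEUE_TRANSFER_BIT`. The function name `pickTransferFamily` is made up, and this is not necessarily the exact selection logic in Pigeon.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same numeric values as the matching VkQueueFlagBits entries.
constexpr std::uint32_t kGraphicsBit = 0x1;
constexpr std::uint32_t kComputeBit  = 0x2;
constexpr std::uint32_t kTransferBit = 0x4;

// familyFlags[i] holds the queueFlags of queue family i, as reported by
// vkGetPhysicalDeviceQueueFamilyProperties.
// Returns the index of the chosen family, or -1 if nothing can transfer.
int pickTransferFamily(const std::vector<std::uint32_t>& familyFlags) {
    int fallback = -1;
    for (std::size_t i = 0; i < familyFlags.size(); ++i) {
        const std::uint32_t flags = familyFlags[i];

        // Graphics and compute families can always run transfer commands,
        // even when they do not report the transfer bit.
        const bool canTransfer =
            (flags & (kTransferBit | kGraphicsBit | kComputeBit)) != 0;
        if (!canTransfer) continue;

        // Best case: a family that only does transfers, so uploads do not
        // compete with rendering work.
        const bool dedicated = (flags & kTransferBit) != 0 &&
                               (flags & (kGraphicsBit | kComputeBit)) == 0;
        if (dedicated) return static_cast<int>(i);

        if (fallback < 0) fallback = static_cast<int>(i);
    }
    return fallback;
}
```

Not every GPU exposes a dedicated transfer family, which is why the sketch falls back to the first family that can transfer at all. In that case the uploads share a lane with other work and the benefit is smaller.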


The Bugs I Fought (And What I Learned)

Of course, it was not all easy. I spent a lot of time debugging!

  • The Fence Bug: This was the worst. I accidentally used a global "wait" command (waitUntilAllSubmitsAreComplete) instead of the correct per-submit command (waitUntilSubmitIsComplete). This caused a hidden stall that looked like a deadlock. The GPU was waiting for the CPU, and the CPU was waiting for the GPU! After I found the correct function, the frame loop was smooth.
  • The ImGui Bug: I added ImGui to help me debug. But I kept getting "unresolved symbol" errors. I learned I needed to define a special macro: IMGUI_IMPL_VULKAN_NO_PROTOTYPES. A small thing that took hours to find!
  • The Blinking Bug: My ring buffer seemed to "blink." Resources were disappearing. This was a simple, stupid mistake: I forgot to advance the buffer to the next index for the new frame. I was accidentally overwriting resources that the GPU was still using.

  • The Descriptor Set Trap: My shaders were seeing garbage data. I had mixed up the initialization order and the binding numbers in my descriptor sets. I had to clean up all my descriptor layouts and writing sequences to fix it.
  • The Shadow Map Bug: This bug took me four days to solve. My shadows looked completely wrong, like they were rendering in clip space. It turned out I was passing the wrong light space matrix to the final shader where the shadow calculation happens. A classic matrix bug!
LightMatrix issue
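To show why the light-space matrix matters so much for the shadow map bug, here is a small CPU sketch of the lookup the final shader has to do. The types and the `toShadowCoord` function are made up for this example. It assumes a column-major matrix and Vulkan's clip-space depth range of 0 to 1.

```cpp
#include <array>
#include <cassert>

struct Vec4 { float x, y, z, w; };

// Column-major 4x4 matrix: element (row, col) is stored at m[col * 4 + row].
using Mat4 = std::array<float, 16>;

Vec4 mul(const Mat4& m, const Vec4& v) {
    return {
        m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12] * v.w,
        m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13] * v.w,
        m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14] * v.w,
        m[3] * v.x + m[7] * v.y + m[11] * v.z + m[15] * v.w,
    };
}

struct ShadowCoord {
    float u, v;    // where to read the shadow map
    float depth;   // depth to compare against the stored value
    bool inside;   // false if the point is outside the light's frustum
};

// lightViewProj must be the SAME matrix that was used to render the
// shadow map. Passing a different matrix gives coordinates that do not
// match what the depth-only pass wrote.
ShadowCoord toShadowCoord(const Mat4& lightViewProj,
                          float wx, float wy, float wz) {
    const Vec4 clip = mul(lightViewProj, {wx, wy, wz, 1.0f});
    if (clip.w <= 0.0f) return {0.0f, 0.0f, 0.0f, false};

    // Perspective divide, then map x and y from [-1, 1] to [0, 1].
    // Depth is already in [0, 1] under Vulkan conventions.
    const float u = (clip.x / clip.w) * 0.5f + 0.5f;
    const float v = (clip.y / clip.w) * 0.5f + 0.5f;
    const float d = clip.z / clip.w;

    const bool inside = u >= 0.0f && u <= 1.0f &&
                        v >= 0.0f && v <= 1.0f &&
                        d >= 0.0f && d <= 1.0f;
    return {u, v, d, inside};
}
```

The shadow test itself is then a comparison between `depth` (minus a small bias) and the value stored in the shadow map at `(u, v)`.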
