Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Project 5: Alan Qiao #15

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
564 changes: 564 additions & 0 deletions .gitignore

Large diffs are not rendered by default.

61 changes: 56 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,61 @@ Vulkan Grass Rendering

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Alan Qiao
* Tested on: Windows 11 22H2, Intel Xeon W-2145 @ 3.70GHz 64GB, RTX 3070 8GB (Dorm Workstation)

### (TODO: Your README)
## Introduction

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
![](img/grass_physics.gif)

This project implements a grass simulation using Vulkan, based on the paper "[Responsive Real-time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf)". This project follows the Vulkan graphics pipeline employing in order the following shaders:

Vertex → Tessellation Control → Tessellation Evaluation → Compute → Fragment

Each grass blade is modeled as a bezier curve tesselated into a grass shape. Using compute shaders, basic physics simulation of the impact of gravity and a varying wind field are applied.

## Features
### Tessellation and Rendering
Each Grass blade is modeled by three control points, width and height of blade, an up vector for vertical direction, and an angle for blade orientation. The following diagram, sourced from the reference paper, provides a clear illustration of this setup.

![](img/blade_model.jpg)

This quadratic bezier curve model is memory efficient, but requires tesselation to procedurally generate renderable spline geometry.

![](img/grass_render.png)

The basic shape that would be transformed according to the bezier curve in this implementation is a quadratic. This provided a nice basis for creating a grass blade that converges at the tip.

### Physics Simulation
In this implementation, three forces are considered: gravity, "recovery force", and wind. Gravity is implemented as a constant downwards vector applied to the tip of the grass blade. The "recovery force" represents the resistance against deformation from gravity, and is proportional to the degree of deformation (measured by distance from upright position) multipled by a stiffness coefficient. Lastly, wind is modeled as a two dimensional force field that is simulated using Perlin noise, and is sampled at the each blade's position to determine direction and strength of wind, as well as how it would interact with the blade given its orientation and deformation.
![](img/grass_physics.gif)

### Culling
![](img/grass_culling.gif)
To increase performance in larger scenes, blades are culled by three heuristic tests.
1. Orientation Test
If the blade is approximately parallel to the camera's view direction, it would be culled since the blades aren't modeled with thickness and thus would not be visible from the side.
2. View Frustrum Test
if the blade lies outside the view frustum, it would be culled because it would not be visible in the final render either way.
3. Distance Test
Since objects further away are smaller, further objects take up less space on the screen and thus may need less detail to look convincing. Thus, decreasing number of blades are actually rendered the further away from the camera.

## Performance Analysis
For consistency, a single camera angle (see below) was chosen for all performance renders. The angle was chosen such that all three of the culling techniques could be in effect to some extent.

![](img/grass_test_angle.png)

### Number of Blades
![](img/number_fps.png)

As expected, performance decreases with the number of blends to render. Up until $2^{12}$ blades, there isn't a clear advantage for culling. This is likely because the rendering task is sufficiently light that the additional process time of culling have greater impact. As the number of blades increases beyond that threshold, the performance improve boemces signifcant, with the no culling implementation consistently performing below the culling implementation. The benefit of culling, however, is difficult to quantify over all possible configurations as these culling heuristics may not be triggered at certain camera configurations.

### Culling
![](img/culling_fps.png)

This chart was computed using the same camera angle specificed at the top of this section with $2^{16}$ grass blades. It is clear that all culling methods make contributions reducing the computation load, but to a different degree depending on the camera setup.

The impact of orientation culling is the least, and this may perhaps be a result of my wind function exhibiting rather random winds that change directions often, making the chance of a blade being parallel to the camera's view very low.
The distance contribution is more significant but also limited. This may be caused by the scene plane covering only a 15 by 15 space, which is smaller than the specified maximum culling distance. As a result, the number of blades that would be culled at the visible distances would be relatively few.
The frustum culling method made the most significant contribution, and this is expected because the chosen angle omits grass from three of the four corners of the plane for a extended range. As a result, there is a significant number of blades that don't need to be rendered, hence the major improvement.
Finally, the impact of combining all culling methods is provided as a top reference.
Binary file added img/culling_fps.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added img/grass_culling.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added img/grass_physics.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added img/grass_render.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added img/grass_test_angle.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added img/number_fps.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion src/Blades.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode
indirectDraw.firstInstance = 0;

BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory);
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory);
}

Expand Down
2 changes: 1 addition & 1 deletion src/Blades.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
#include <array>
#include "Model.h"

constexpr static unsigned int NUM_BLADES = 1 << 13;
constexpr static unsigned int NUM_BLADES = 1 << 16;
constexpr static float MIN_HEIGHT = 1.3f;
constexpr static float MAX_HEIGHT = 2.5f;
constexpr static float MIN_WIDTH = 0.1f;
Expand Down
9 changes: 6 additions & 3 deletions src/Camera.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,15 +10,17 @@

Camera::Camera(Device* device, float aspectRatio) : device(device) {
r = 10.0f;
theta = 0.0f;
phi = 0.0f;
cameraBufferObject.viewMatrix = glm::lookAt(glm::vec3(0.0f, 1.0f, 10.0f), glm::vec3(0.0f, 1.0f, 0.0f), glm::vec3(0.0f, 1.0f, 0.0f));
theta = 40.0f;
phi = -30.0f;

cameraBufferObject.projectionMatrix = glm::perspective(glm::radians(45.0f), aspectRatio, 0.1f, 100.0f);
cameraBufferObject.projectionMatrix[1][1] *= -1; // y-coordinate is flipped

BufferUtils::CreateBuffer(device, sizeof(CameraBufferObject), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, buffer, bufferMemory);
vkMapMemory(device->GetVkDevice(), bufferMemory, 0, sizeof(CameraBufferObject), 0, &mappedData);
memcpy(mappedData, &cameraBufferObject, sizeof(CameraBufferObject));

UpdateOrbit(0, 0, 0);
}

VkBuffer Camera::GetBuffer() const {
Expand All @@ -37,6 +39,7 @@ void Camera::UpdateOrbit(float deltaX, float deltaY, float deltaZ) {
glm::mat4 finalTransform = glm::translate(glm::mat4(1.0f), glm::vec3(0.0f)) * rotation * glm::translate(glm::mat4(1.0f), glm::vec3(0.0f, 1.0f, r));

cameraBufferObject.viewMatrix = glm::inverse(finalTransform);
cameraBufferObject.invViewMatrix = finalTransform;

memcpy(mappedData, &cameraBufferObject, sizeof(CameraBufferObject));
}
Expand Down
1 change: 1 addition & 0 deletions src/Camera.h
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
struct CameraBufferObject {
glm::mat4 viewMatrix;
glm::mat4 projectionMatrix;
glm::mat4 invViewMatrix;
};

class Camera {
Expand Down
132 changes: 129 additions & 3 deletions src/Renderer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -198,6 +198,28 @@ void Renderer::CreateComputeDescriptorSetLayout() {
// TODO: Create the descriptor set layout for the compute pipeline
// Remember this is like a class definition stating why types of information
// will be stored at each binding
std::vector<VkDescriptorSetLayoutBinding> bindings;
for (int i = 0; i < 3; ++i)
{
VkDescriptorSetLayoutBinding layoutBinding = {};
layoutBinding.binding = i;
layoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
layoutBinding.descriptorCount = 1;
layoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
layoutBinding.pImmutableSamplers = nullptr;

bindings.push_back(layoutBinding);
}

VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
layoutInfo.pBindings = bindings.data();

if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS)
{
throw std::runtime_error("Failed to create descriptor set layout");
}
}

void Renderer::CreateDescriptorPool() {
Expand All @@ -216,6 +238,7 @@ void Renderer::CreateDescriptorPool() {
{ VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 },

// TODO: Add any additional types and counts of descriptors you will need to allocate
{ VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, static_cast<uint32_t>(scene->GetBlades().size()) * 3 }
};

VkDescriptorPoolCreateInfo poolInfo = {};
Expand Down Expand Up @@ -320,6 +343,39 @@ void Renderer::CreateModelDescriptorSets() {
void Renderer::CreateGrassDescriptorSets() {
// TODO: Create Descriptor sets for the grass.
// This should involve creating descriptor sets which point to the model matrix of each group of grass blades
grassDescriptorSets.resize(scene->GetBlades().size());

VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout };
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size());
allocInfo.pSetLayouts = layouts;

if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate descriptor set");
}

std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
VkDescriptorBufferInfo modelBufferInfo = {};
modelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer();
modelBufferInfo.offset = 0;
modelBufferInfo.range = sizeof(ModelBufferObject);

descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[i].dstSet = grassDescriptorSets[i];
descriptorWrites[i].dstBinding = 0;
descriptorWrites[i].dstArrayElement = 0;
descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
descriptorWrites[i].descriptorCount = 1;
descriptorWrites[i].pBufferInfo = &modelBufferInfo;
descriptorWrites[i].pImageInfo = nullptr;
descriptorWrites[i].pTexelBufferView = nullptr;
}

vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateTimeDescriptorSet() {
Expand Down Expand Up @@ -360,6 +416,69 @@ void Renderer::CreateTimeDescriptorSet() {
void Renderer::CreateComputeDescriptorSets() {
// TODO: Create Descriptor sets for the compute pipeline
// The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades
computeDescriptorSets.resize(scene->GetBlades().size());

VkDescriptorSetLayout layouts[] = { computeDescriptorSetLayout };
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size());
allocInfo.pSetLayouts = layouts;

if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate compute descriptor set");
}

std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
VkDescriptorBufferInfo bladesBufferInfo = {};
bladesBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer();
bladesBufferInfo.offset = 0;
bladesBufferInfo.range = NUM_BLADES * sizeof(Blade);

descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 0].dstBinding = 0;
descriptorWrites[3 * i + 0].dstArrayElement = 0;
descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 0].descriptorCount = 1;
descriptorWrites[3 * i + 0].pBufferInfo = &bladesBufferInfo;
descriptorWrites[3 * i + 0].pImageInfo = nullptr;
descriptorWrites[3 * i + 0].pTexelBufferView = nullptr;

VkDescriptorBufferInfo culledBladesBufferInfo = {};
culledBladesBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer();
culledBladesBufferInfo.offset = 0;
culledBladesBufferInfo.range = NUM_BLADES * sizeof(Blade);

descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 1].dstBinding = 1;
descriptorWrites[3 * i + 1].dstArrayElement = 0;
descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 1].descriptorCount = 1;
descriptorWrites[3 * i + 1].pBufferInfo = &culledBladesBufferInfo;
descriptorWrites[3 * i + 1].pImageInfo = nullptr;
descriptorWrites[3 * i + 1].pTexelBufferView = nullptr;

VkDescriptorBufferInfo numBladesBufferInfo = {};
numBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer();
numBladesBufferInfo.offset = 0;
numBladesBufferInfo.range = NUM_BLADES * sizeof(BladeDrawIndirect);

descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 2].dstBinding = 2;
descriptorWrites[3 * i + 2].dstArrayElement = 0;
descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 2].descriptorCount = 1;
descriptorWrites[3 * i + 2].pBufferInfo = &numBladesBufferInfo;
descriptorWrites[3 * i + 2].pImageInfo = nullptr;
descriptorWrites[3 * i + 2].pTexelBufferView = nullptr;
}

vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateGraphicsPipeline() {
Expand Down Expand Up @@ -717,7 +836,7 @@ void Renderer::CreateComputePipeline() {
computeShaderStageInfo.pName = "main";

// TODO: Add the compute dsecriptor set layout you create to this list
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout };
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout };

// Create pipeline layout
VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
Expand Down Expand Up @@ -884,6 +1003,10 @@ void Renderer::RecordComputeCommandBuffer() {
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr);

// TODO: For each group of blades bind its descriptor set and dispatch
for (uint32_t i = 0; i < scene->GetBlades().size(); i++) {
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr);
vkCmdDispatch(computeCommandBuffer, (NUM_BLADES + WORKGROUP_SIZE - 1) / WORKGROUP_SIZE, 1, 1);
}

// ~ End recording ~
if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) {
Expand Down Expand Up @@ -974,15 +1097,17 @@ void Renderer::RecordCommandBuffers() {

for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) {
VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() };
// VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetBladesBuffer() };
VkDeviceSize offsets[] = { 0 };
// TODO: Uncomment this when the buffers are populated
// vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);

// TODO: Bind the descriptor set for each grass blades model
vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr);

// Draw
// TODO: Uncomment this when the buffers are populated
// vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
}

// End render pass
Expand Down Expand Up @@ -1057,6 +1182,7 @@ Renderer::~Renderer() {
vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr);

vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr);

Expand Down
3 changes: 3 additions & 0 deletions src/Renderer.h
Original file line number Diff line number Diff line change
Expand Up @@ -56,12 +56,15 @@ class Renderer {
VkDescriptorSetLayout cameraDescriptorSetLayout;
VkDescriptorSetLayout modelDescriptorSetLayout;
VkDescriptorSetLayout timeDescriptorSetLayout;
VkDescriptorSetLayout computeDescriptorSetLayout;

VkDescriptorPool descriptorPool;

VkDescriptorSet cameraDescriptorSet;
std::vector<VkDescriptorSet> modelDescriptorSets;
VkDescriptorSet timeDescriptorSet;
std::vector<VkDescriptorSet> grassDescriptorSets;
std::vector<VkDescriptorSet> computeDescriptorSets;

VkPipelineLayout graphicsPipelineLayout;
VkPipelineLayout grassPipelineLayout;
Expand Down
4 changes: 4 additions & 0 deletions src/Scene.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,10 @@ VkBuffer Scene::GetTimeBuffer() const {
return timeBuffer;
}

float Scene::GetDeltaTime() const {
return time.deltaTime;
}

Scene::~Scene() {
vkUnmapMemory(device->GetVkDevice(), timeBufferMemory);
vkDestroyBuffer(device->GetVkDevice(), timeBuffer, nullptr);
Expand Down
1 change: 1 addition & 0 deletions src/Scene.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ high_resolution_clock::time_point startTime = high_resolution_clock::now();
void AddBlades(Blades* blades);

VkBuffer GetTimeBuffer() const;
float GetDeltaTime() const;

void UpdateTime();
};
Loading