https://github.com/jiabaodan/Direct12BookReadingNotes
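In the earlier code we called D3DApp::FlushCommandQueue at the end of every frame to synchronize the CPU and GPU. This works, but it is inefficient: at the start of a frame the GPU sits idle until the CPU has built and submitted commands, and at the end of the frame the CPU sits idle until the GPU has finished them.
One solution is to create a circular array of the resources the CPU updates every frame. We call these frame resources, and the array usually has three elements. After the CPU submits the commands for one frame, it moves on to the next available frame resource (one the GPU is not currently using) and keeps updating. With three elements the CPU can get up to two frames ahead of the GPU, which keeps the GPU busy. The example below is the one used in the Shapes demo; the CPU only needs to update constant buffers there, so the per-frame data consists only of constant buffers: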
// Stores the resources needed for the CPU to build the command lists
// for a frame. The contents here will vary from app to app based on
// the needed resources.
struct FrameResource
{
public:
    FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount);
    FrameResource(const FrameResource& rhs) = delete;
    FrameResource& operator=(const FrameResource& rhs) = delete;
    ~FrameResource();

    // We cannot reset the allocator until the GPU is done processing the
    // commands. So each frame needs their own allocator.
    Microsoft::WRL::ComPtr<ID3D12CommandAllocator> CmdListAlloc;

    // We cannot update a cbuffer until the GPU is done processing the
    // commands that reference it. So each frame needs their own cbuffers.
    std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
    std::unique_ptr<UploadBuffer<ObjectConstants>> ObjectCB = nullptr;

    // Fence value to mark commands up to this fence point. This lets us
    // check if these frame resources are still in use by the GPU.
    UINT64 Fence = 0;
};

FrameResource::FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount)
{
    ThrowIfFailed(device->CreateCommandAllocator(
        D3D12_COMMAND_LIST_TYPE_DIRECT,
        IID_PPV_ARGS(CmdListAlloc.GetAddressOf())));

    PassCB = std::make_unique<UploadBuffer<PassConstants>>(device, passCount, true);
    ObjectCB = std::make_unique<UploadBuffer<ObjectConstants>>(device, objectCount, true);
}

FrameResource::~FrameResource() { }
In the application we use a std::vector to instantiate three frame resources and keep track of the current one:
// Named gNumFrameResources to match its use in BuildFrameResources and RenderItem.
const int gNumFrameResources = 3;
std::vector<std::unique_ptr<FrameResource>> mFrameResources;
FrameResource* mCurrFrameResource = nullptr;
int mCurrFrameResourceIndex = 0;

void ShapesApp::BuildFrameResources()
{
    for(int i = 0; i < gNumFrameResources; ++i)
    {
        // One pass constant buffer element and one object element per render item.
        mFrameResources.push_back(std::make_unique<FrameResource>(
            md3dDevice.Get(), 1,
            (UINT)mAllRitems.size()));
    }
}
For frame N, the CPU then runs the following algorithm:
void ShapesApp::Update(const GameTimer& gt)
{
    // Cycle through the circular frame resource array.
    mCurrFrameResourceIndex = (mCurrFrameResourceIndex + 1) % gNumFrameResources;
    mCurrFrameResource = mFrameResources[mCurrFrameResourceIndex].get();

    // Has the GPU finished processing the commands of the current frame
    // resource? If not, wait until the GPU has completed commands up to
    // this fence point.
    if(mCurrFrameResource->Fence != 0 &&
        mFence->GetCompletedValue() < mCurrFrameResource->Fence)
    {
        HANDLE eventHandle = CreateEventEx(nullptr, nullptr, 0, EVENT_ALL_ACCESS);
        ThrowIfFailed(mFence->SetEventOnCompletion(
            mCurrFrameResource->Fence, eventHandle));
        WaitForSingleObject(eventHandle, INFINITE);
        CloseHandle(eventHandle);
    }

    // […] Update resources in mCurrFrameResource (like cbuffers).
}

void ShapesApp::Draw(const GameTimer& gt)
{
    // […] Build and submit command lists for this frame.

    // Advance the fence value to mark commands up to this fence point.
    mCurrFrameResource->Fence = ++mCurrentFence;

    // Add an instruction to the command queue to set a new fence point.
    // Because we are on the GPU timeline, the new fence point won't be
    // set until the GPU finishes processing all the commands prior to
    // this Signal().
    mCommandQueue->Signal(mFence.Get(), mCurrentFence);

    // Note that GPU could still be working on commands from previous
    // frames, but that is okay, because we are not touching any frame
    // resources associated with those frames.
}
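This scheme does not eliminate waiting entirely. If one processor gets through frames much faster than the other, it still has to wait for the other one to catch up.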
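To make the waiting behaviour concrete, here is a minimal CPU-only sketch of the ring, with the GPU replaced by a counter that is advanced by hand. FrameSlot, RingSim and FramesBeforeStall are hypothetical names invented for this illustration; no Direct3D call is involved, and the real sample blocks on a fence event rather than returning a flag.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for FrameResource: only the fence value matters here.
struct FrameSlot {
    std::uint64_t fence = 0;  // 0 means this slot has never been submitted
};

// CPU-only model of the ring. None of these names come from the sample.
struct RingSim {
    static constexpr int kNumFrameResources = 3;

    std::array<FrameSlot, kNumFrameResources> slots{};
    int currIndex = 0;
    std::uint64_t currentFence = 0;    // last value the CPU asked the queue to signal
    std::uint64_t completedFence = 0;  // last value the simulated GPU has reached

    // Start of Update(): step to the next slot in the ring.
    void Advance() { currIndex = (currIndex + 1) % kNumFrameResources; }

    // True if the CPU would have to block before reusing the current slot.
    bool MustWait() const {
        const FrameSlot& slot = slots[currIndex];
        return slot.fence != 0 && completedFence < slot.fence;
    }

    // End of Draw(): tag the current slot with a new fence value.
    void Submit() { slots[currIndex].fence = ++currentFence; }

    // The simulated GPU finishes everything up to and including `value`.
    void GpuCompletesUpTo(std::uint64_t value) {
        assert(value <= currentFence);
        completedFence = value;
    }
};

// Number of frames the CPU can record while the GPU makes no progress at all.
int FramesBeforeStall() {
    RingSim sim;
    int frames = 0;
    for (;;) {
        sim.Advance();
        if (sim.MustWait()) break;
        sim.Submit();
        ++frames;
    }
    return frames;
}
```

With a GPU that never advances, FramesBeforeStall() returns 3: the frame the GPU is stuck on plus the two frames the CPU is allowed to run ahead. The fourth frame would reuse the first slot, so the CPU has to wait for it.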
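Drawing an object requires setting many parameters: binding the vertex and index buffers, binding the object's constant buffer, setting the primitive topology, and specifying the DrawIndexedInstanced arguments. Once we draw more than a few objects, it helps to define a lightweight structure that stores all of this data. We call the set of data needed to submit one complete draw call a render item. In the current demo the RenderItem structure looks like this: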
// Lightweight structure stores parameters to draw a shape. This will
// vary from app-to-app.
struct RenderItem
{
    RenderItem() = default;

    // World matrix of the shape that describes the object's local space
    // relative to the world space, which defines the position,
    // orientation, and scale of the object in the world.
    XMFLOAT4X4 World = MathHelper::Identity4x4();

    // Dirty flag indicating the object data has changed and we need
    // to update the constant buffer. Because we have an object
    // cbuffer for each FrameResource, we have to apply the
    // update to each FrameResource. Thus, when we modify object data we
    // should set NumFramesDirty = gNumFrameResources so that each frame
    // resource gets the update.
    int NumFramesDirty = gNumFrameResources;

    // Index into GPU constant buffer corresponding to the ObjectCB
    // for this render item.
    UINT ObjCBIndex = -1;

    // Geometry associated with this render-item. Note that multiple
    // render-items can share the same geometry.
    MeshGeometry* Geo = nullptr;

    // Primitive topology.
    D3D12_PRIMITIVE_TOPOLOGY PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;

    // DrawIndexedInstanced parameters.
    UINT IndexCount = 0;
    UINT StartIndexLocation = 0;
    int BaseVertexLocation = 0;
};
The application keeps lists of render items grouped by how they are drawn; render items that need different PSOs go into different lists:
// List of all the render items.
std::vector<std::unique_ptr<RenderItem>> mAllRitems;
// Render items divided by PSO.
std::vector<RenderItem*> mOpaqueRitems;
std::vector<RenderItem*> mTransparentRitems;
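The ownership pattern behind these lists can be sketched without any Direct3D types. In the sketch below, Layer, Item and Scene are hypothetical names used only for illustration: one vector owns every item through unique_ptr, and the per-PSO lists hold non-owning raw pointers into it. Because each item lives in its own heap allocation, those raw pointers stay valid even when the owning vector reallocates.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical tag standing in for "which PSO does this item need".
enum class Layer { Opaque, Transparent };

// Hypothetical stand-in for RenderItem.
struct Item {
    Layer layer;
    int id;
};

struct Scene {
    std::vector<std::unique_ptr<Item>> all;  // owns every item
    std::vector<Item*> opaque;               // non-owning views, one per PSO
    std::vector<Item*> transparent;

    // Creates an item, stores it in the owning list, and files a raw
    // pointer to it under the list that matches its layer.
    Item* Add(Layer layer, int id) {
        all.push_back(std::make_unique<Item>(Item{layer, id}));
        Item* raw = all.back().get();
        if (layer == Layer::Opaque) opaque.push_back(raw);
        else transparent.push_back(raw);
        // Every item is in exactly one of the per-PSO lists.
        assert(all.size() == opaque.size() + transparent.size());
        return raw;
    }
};
```

Drawing then becomes one loop per list: set the PSO once, then iterate over that list's pointers.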
Earlier, in FrameResource, we introduced a new constant buffer:
std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
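It holds constants that are the same for every object in a rendering pass, such as the eye position, the view and projection matrices, information about the render target dimensions, and timing data. The demo does not need all of them yet, but having them available is convenient and costs very little extra space. The render target size, for example, is useful for post-processing effects: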
cbuffer cbPass : register(b1)
{
    float4x4 gView;
    float4x4 gInvView;
    float4x4 gProj;
    float4x4 gInvProj;
    float4x4 gViewProj;
    float4x4 gInvViewProj;
    float3 gEyePosW;
    float cbPerObjectPad1;
    float2 gRenderTargetSize;
    float2 gInvRenderTargetSize;
    float gNearZ;
    float gFarZ;
    float gTotalTime;
    float gDeltaTime;
};
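The cbPerObjectPad1 member is there for layout reasons. HLSL packs constant buffer members into 16-byte registers and does not let a member straddle a register boundary, so gRenderTargetSize starts a new register after gEyePosW. The explicit padding float keeps a tightly packed C++ struct in step with that layout. The sketch below checks this with a plain-float mirror of cbPass; PassConstantsMirror and AlignTo256 are hypothetical names, and the float arrays are an assumption standing in for the DirectXMath types the sample uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical plain-float mirror of cbPass (the sample uses DirectXMath types).
struct PassConstantsMirror {
    float View[16];
    float InvView[16];
    float Proj[16];
    float InvProj[16];
    float ViewProj[16];
    float InvViewProj[16];
    float EyePosW[3];
    float cbPerObjectPad1;        // fills the 4th component of the EyePosW register
    float RenderTargetSize[2];
    float InvRenderTargetSize[2];
    float NearZ;
    float FarZ;
    float TotalTime;
    float DeltaTime;
};

// Six 4x4 matrices occupy 6 * 64 bytes; everything after that must start
// on a 16-byte register boundary to match the HLSL layout.
static_assert(offsetof(PassConstantsMirror, EyePosW) == 6 * 64, "matrices first");
static_assert(offsetof(PassConstantsMirror, RenderTargetSize) % 16 == 0, "new register");
static_assert(offsetof(PassConstantsMirror, NearZ) % 16 == 0, "new register");
static_assert(sizeof(PassConstantsMirror) == 432, "108 floats in total");

// Rounds a byte size up to the next multiple of 256, the size granularity
// Direct3D 12 requires for constant buffer views.
std::uint32_t AlignTo256(std::uint32_t byteSize) {
    assert(byteSize <= 0xFFFFFFFFu - 255u);  // guard against wrap-around
    return (byteSize + 255u) & ~255u;
}
```

The struct is 432 bytes, so each element of the pass constant buffer would occupy AlignTo256(432) = 512 bytes in the upload buffer.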
We also need to modify the per-object constant buffer. For now it only stores the world matrix:
cbuffer cbPerObject : register(b0)
{
    float4x4 gWorld;
};
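The point of this split is to group constants by how often they change. Per-pass constants need to be updated once per rendering pass; object constants only need to be updated when the object's world matrix changes; for a static object they only need to be set once at initialization. The demo implements the following methods to update the constant buffers, and Update calls each of them once per frame: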
void ShapesApp::UpdateObjectCBs(const GameTimer& gt)
{
    auto currObjectCB = mCurrFrameResource->ObjectCB.get();
    for(auto& e : mAllRitems)
    {
        // Only update the cbuffer data if the constants have changed.
        // This needs to be tracked per frame resource.
        if(e->NumFramesDirty > 0)
        {
            XMMATRIX world = XMLoadFloat4x4(&e->World);

            ObjectConstants objConstants;
            XMStoreFloat4x4(&objConstants.World, XMMatrixTranspose(world));

            currObjectCB->CopyData(e->ObjCBIndex, objConstants);

            // Next FrameResource need to be updated too.
            e->NumFramesDirty--;
        }
    }
}

void ShapesApp::UpdateMainPassCB(const GameTimer& gt)
{
    XMMATRIX view = XMLoadFloat4x4(&mView);
    XMMATRIX proj = XMLoadFloat4x4(&mProj);
    XMMATRIX viewProj = XMMatrixMultiply(view, proj);

    // XMMatrixInverse takes a pointer to the determinant, so keep each
    // determinant in a named local instead of taking the address of a temporary.
    XMVECTOR viewDet = XMMatrixDeterminant(view);
    XMVECTOR projDet = XMMatrixDeterminant(proj);
    XMVECTOR viewProjDet = XMMatrixDeterminant(viewProj);
    XMMATRIX invView = XMMatrixInverse(&viewDet, view);
    XMMATRIX invProj = XMMatrixInverse(&projDet, proj);
    XMMATRIX invViewProj = XMMatrixInverse(&viewProjDet, viewProj);

    XMStoreFloat4x4(&mMainPassCB.View, XMMatrixTranspose(view));
    XMStoreFloat4x4(&mMainPassCB.InvView, XMMatrixTranspose(invView));
    XMStoreFloat4x4(&mMainPassCB.Proj, XMMatrixTranspose(proj));
    XMStoreFloat4x4(&mMainPassCB.InvProj, XMMatrixTranspose(invProj));
    XMStoreFloat4x4(&mMainPassCB.ViewProj, XMMatrixTranspose(viewProj));
    XMStoreFloat4x4(&mMainPassCB.InvViewProj, XMMatrixTranspose(invViewProj));
    mMainPassCB.EyePosW = mEyePos;
    mMainPassCB.RenderTargetSize = XMFLOAT2((float)mClientWidth, (float)mClientHeight);
    mMainPassCB.InvRenderTargetSize = XMFLOAT2(1.0f / mClientWidth, 1.0f / mClientHeight);
    mMainPassCB.NearZ = 1.0f;
    mMainPassCB.FarZ = 1000.0f;
    mMainPassCB.TotalTime = gt.TotalTime();
    mMainPassCB.DeltaTime = gt.DeltaTime();

    auto currPassCB = mCurrFrameResource->PassCB.get();
    currPassCB->CopyData(0, mMainPassCB);
}
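The NumFramesDirty counter is easy to misread as a boolean flag, so here is a stripped-down sketch of why it counts down from the number of frame resources. Item, PerFrameCopies, UpdateObject and SetWorld are hypothetical names, and a single float stands in for the world matrix; the only assumption carried over from the demo is that there are three frame resources.

```cpp
#include <array>
#include <cassert>

constexpr int kNumFrameResources = 3;  // assumed to match gNumFrameResources

// Hypothetical stand-in for RenderItem: one float plays the world matrix.
struct Item {
    float world = 0.0f;
    int numFramesDirty = kNumFrameResources;
};

// One copy of the object constant per frame resource.
using PerFrameCopies = std::array<float, kNumFrameResources>;

// Mirrors UpdateObjectCBs for a single item: copy only while dirty.
// Returns true if a copy happened for this frame resource.
bool UpdateObject(Item& item, PerFrameCopies& copies, int frameIndex) {
    assert(frameIndex >= 0 && frameIndex < kNumFrameResources);
    if (item.numFramesDirty <= 0) return false;
    copies[frameIndex] = item.world;
    --item.numFramesDirty;
    return true;
}

// Changing the object marks the copy in every frame resource as stale.
void SetWorld(Item& item, float world) {
    item.world = world;
    item.numFramesDirty = kNumFrameResources;
}
```

After SetWorld, the next three frames each refresh a different frame resource's copy and then the copying stops. A plain boolean cleared after the first update would leave the other two copies holding the old value.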
We update the vertex shader to match the new constant buffers:
VertexOut VS(VertexIn vin)
{
    VertexOut vout;

    // Transform to homogeneous clip space.
    float4 posW = mul(float4(vin.PosL, 1.0f), gWorld);
    vout.PosH = mul(posW, gViewProj);

    // Just pass vertex color into the pixel shader.
    vout.Color = vin.Color;

    return vout;
}
The extra vector-matrix multiplication per vertex is negligible on today's powerful GPUs.
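The two-step transform gives the same result as multiplying by a single pre-combined matrix, because matrix multiplication is associative. The sketch below checks this on the CPU using the row-vector convention of HLSL's mul(v, M). It assumes the alternative being compared is one world-view-projection matrix per object; all names in it (Vec4, Mat4, Mul, and so on) are hypothetical helpers, not DirectXMath.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;

Mat4 Identity() {
    Mat4 m{};  // zero-initialized
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

Mat4 Scaling(float s) {
    Mat4 m = Identity();
    m[0][0] = m[1][1] = m[2][2] = s;
    return m;
}

// Translation lives in the last row under the row-vector convention.
Mat4 Translation(float x, float y, float z) {
    Mat4 m = Identity();
    m[3][0] = x; m[3][1] = y; m[3][2] = z;
    return m;
}

// Row vector times matrix, like HLSL mul(v, M).
Vec4 Mul(const Vec4& v, const Mat4& m) {
    Vec4 r{};
    for (int c = 0; c < 4; ++c)
        for (int k = 0; k < 4; ++k)
            r[c] += v[k] * m[k][c];
    return r;
}

// Matrix times matrix.
Mat4 Mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Checks that (p * world) * viewProj equals p * (world * viewProj).
void CheckTwoStepMatchesCombined() {
    const Vec4 p{1.0f, 2.0f, 3.0f, 1.0f};
    const Mat4 world = Scaling(2.0f);
    const Mat4 viewProj = Translation(10.0f, 0.0f, 0.0f);  // stand-in matrix
    const Vec4 twoStep = Mul(Mul(p, world), viewProj);
    const Vec4 oneStep = Mul(p, Mul(world, viewProj));
    for (int i = 0; i < 4; ++i) assert(twoStep[i] == oneStep[i]);
}
```

The trade-off is where the matrix product happens: combining on the CPU costs one matrix multiply per object per frame, while the two-step shader costs one extra vector-matrix multiply per vertex but lets the view-projection matrix be uploaded once per pass.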
The resources the shaders expect have changed, so the root signature has to be updated to take two descriptor tables:
CD3DX12_DESCRIPTOR_RANGE cbvTable0;
cbvTable0.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 0);

CD3DX12_DESCRIPTOR_RANGE cbvTable1;
cbvTable1.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 1);

// Root parameter can be a table, root descriptor or root constants.
CD3DX12_ROOT_PARAMETER slotRootParameter[2];

// Each root parameter is a descriptor table holding one CBV:
// table 0 maps to register b0 (per object), table 1 to b1 (per pass).
slotRootParameter[0].InitAsDescriptorTable(1, &cbvTable0);
slotRootParameter[1].InitAsDescriptorTable(1, &cbvTable1);

// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(2, slotRootParameter, 0, nullptr,
    D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);
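Do not use too many constant buffers in a shader. For performance, [Thibieroz13] recommends keeping the number below five.
This section shows how to build the geometry for ellipsoids, spheres, cylinders, and cones. These shapes are useful for drawing sky domes, debugging, visualizing collision detection, and deferred rendering.
We put the procedural geometry code in the GeometryGenerator class (GeometryGenerator.h/.cpp). The class only creates the data in system memory, so we still have to copy it into vertex and index buffers. MeshData is a simple structure nested inside GeometryGenerator that stores a vertex list and an index list: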
class GeometryGenerator
{
public:
    using uint16 = std::uint16_t;
    using uint32 = std::uint32_t;

    struct Vertex
    {
        Vertex(){}
        Vertex(
            const DirectX::XMFLOAT3& p,
            const DirectX::XMFLOAT3& n,
            const DirectX::XMFLOAT3& t,
            const DirectX::XMFLOAT2& uv) :
            Position(p), Normal(n), TangentU(t), TexC(uv){}
        Vertex(
            float px, float py, float pz,
            float nx, float ny, float nz,
            float tx, float ty, float tz,
            float u, float v) :
            Position(px,py,pz), Normal(nx,ny,nz),
            TangentU(tx, ty, tz), TexC(u,v){}

        DirectX::XMFLOAT3 Position;
        DirectX::XMFLOAT3 Normal;
        DirectX::XMFLOAT3 TangentU;
        DirectX::XMFLOAT2 TexC;
    };

    struct MeshData
    {
        std::vector<Vertex> Vertices;
        std::vector<uint32> Indices32;

        std::vector<uint16>& GetIndices16()
        {
            if(mIndices16.empty())
            {
                mIndices16.resize(Indices32.size());
                for(size_t i = 0; i < Indices32.size(); ++i)
                    mIndices16[i] = static_cast<uint16>(Indices32[i]);
            }
            return mIndices16;
        }

    private:
        std::vector<uint16> mIndices16;
    };

    …
};
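GetIndices16 narrows each index with a static_cast and does not check the range, so it is only safe when every index fits in 16 bits, which in practice means a mesh with at most 65,536 vertices. The sketch below shows one way to make that precondition explicit; FitsIn16Bits and ToIndices16 are hypothetical helpers and not part of GeometryGenerator.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// True when every 32-bit index can be stored in 16 bits without loss.
bool FitsIn16Bits(const std::vector<std::uint32_t>& indices32) {
    for (std::uint32_t index : indices32) {
        if (index > std::numeric_limits<std::uint16_t>::max()) return false;
    }
    return true;
}

// Same narrowing as GetIndices16, but the precondition is asserted first
// so an oversized mesh fails loudly in debug builds instead of wrapping.
std::vector<std::uint16_t> ToIndices16(const std::vector<std::uint32_t>& indices32) {
    assert(FitsIn16Bits(indices32));
    std::vector<std::uint16_t> indices16;
    indices16.reserve(indices32.size());
    for (std::uint32_t index : indices32) {
        indices16.push_back(static_cast<std::uint16_t>(index));
    }
    return indices16;
}
```

For small demo meshes, 16-bit indices halve the index buffer size; a mesh that fails the check would keep using Indices32 with a 32-bit index format.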
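We define a cylinder mesh by its bottom and top radii, its height, and its slice and stack counts. As the accompanying figure shows, we split the cylinder into three parts: the side, the bottom cap, and the top cap.
The cylinder is centered at the origin with its axis along the y-axis, and all of its vertices lie on rings. A cylinder has stackCount + 1 rings, and each ring has sliceCount unique vertices. The radius changes by (topRadius - bottomRadius)/stackCount from one ring to the next. The basic idea is therefore to iterate over the rings and generate the vertices of each one: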
GeometryGenerator::MeshData
GeometryGenerator::CreateCylinder(
    float bottomRadius, float topRadius,
    float height, uint32 sliceCount, uint32 stackCount)
{
    MeshData meshData;

    //
    // Build Stacks.
    //
    float stackHeight = height / stackCount;
// Amount t
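The ring layout described in the text can be sketched on its own, independent of GeometryGenerator. The function below is a hypothetical illustration that generates positions only (no normals, tangents, texture coordinates, or indices) and uses exactly sliceCount vertices per ring as the text states. It is not the body of CreateCylinder, which may lay out its vertices differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical position-only vertex.
struct Point3 { float x, y, z; };

// Builds the side rings of a cylinder centered at the origin with its axis
// along y. Ring i sits at height -height/2 + i * stackHeight and has radius
// bottomRadius + i * radiusStep.
std::vector<Point3> BuildCylinderRings(float bottomRadius, float topRadius,
                                       float height, std::uint32_t sliceCount,
                                       std::uint32_t stackCount) {
    assert(sliceCount > 0 && stackCount > 0);
    const float kTwoPi = 6.28318530718f;
    const float stackHeight = height / stackCount;
    const float radiusStep = (topRadius - bottomRadius) / stackCount;
    const float dTheta = kTwoPi / sliceCount;

    std::vector<Point3> points;
    points.reserve((stackCount + 1) * sliceCount);

    // stackCount + 1 rings, from the bottom ring (i = 0) to the top ring.
    for (std::uint32_t i = 0; i <= stackCount; ++i) {
        const float y = -0.5f * height + i * stackHeight;
        const float r = bottomRadius + i * radiusStep;
        // sliceCount vertices evenly spaced around the ring.
        for (std::uint32_t j = 0; j < sliceCount; ++j) {
            const float theta = j * dTheta;
            points.push_back({r * std::cos(theta), y, r * std::sin(theta)});
        }
    }
    return points;
}
```

Setting topRadius to 0 turns the same loop into a cone, which is why a single routine can serve both shapes.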