This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Merge pull request #460 from parjohns/patch-1
10102.md
Jeehut authored Dec 5, 2023
2 parents 670236c + 176eb93 commit 083ebed
Showing 3 changed files with 142 additions and 0 deletions.
142 changes: 142 additions & 0 deletions content/notes/wwdc22/10102.md
@@ -0,0 +1,142 @@
---
contributors: parjohns
Speakers: Galo Avila, engineering manager in GPU Software, and Eylon Caspi.
---

# Offline Compilation
Offline compilation can help reduce app stutters, first launch, and new level load times. This is accomplished by moving GPU binary generation to project build time.

## Previous Way to Generate GPU Binaries
- At runtime, a Metal library is instantiated from source, which generates AIR (Apple's Intermediate Representation). This operation is CPU intensive.
- Library generation can be moved to build time by precompiling the source into a Metal library file and instantiating the library from that file.
- When the Metal library is in memory, a Pipeline State Descriptor and Pipeline State Object (PSO) are created.
- Creating a PSO is CPU intensive.
- After the PSO is created, just-in-time GPU binary generation takes place.
- When PSOs are created, Metal stores the GPU binaries in its file system cache.
- Binary archives let users control when and where GPU binaries are cached.
- PSO creation can become a lightweight operation by using PSO descriptors to cache GPU binaries in an archive.
![Runtime][runtime]
## What's New
Offline binary generation takes a new artifact at project build time: the Metal pipeline script. The script is equivalent to a collection of Pipeline State Descriptors in the API. The output is a binary archive that can be loaded at runtime to accelerate PSO creation.
![psodescriptor][psodescriptor]

### Creating Metal Pipeline Script
A Metal pipeline script is a JSON-formatted description of one or more API Pipeline State Descriptors. It can be written in a JSON editor or harvested from binary archives.


**Using a JSON Editor**
1. Specify API Metal library file path.
2. Add API render descriptor function names as render pipeline properties.
3. Add pipeline state information (such as raster_sample_count or pixel formats).

An existing Objective-C render pipeline descriptor that the pipeline script needs to describe
```
// An existing Obj-C render pipeline descriptor
NSError *error = nil;
id<MTLDevice> device = MTLCreateSystemDefaultDevice();
id<MTLLibrary> library = [device newLibraryWithFile:@"default.metallib" error:&error];
MTLRenderPipelineDescriptor *desc = [MTLRenderPipelineDescriptor new];
desc.vertexFunction = [library newFunctionWithName:@"vert_main"];
desc.fragmentFunction = [library newFunctionWithName:@"frag_main"];
desc.rasterSampleCount = 2;
desc.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
desc.depthAttachmentPixelFormat = MTLPixelFormatDepth32Float;
```

JSON equivalent
```
{
  "//comment": "Its equivalent new JSON script",
  "libraries": {
    "paths": [
      {
        "path": "default.metallib"
      }
    ]
  },
  "pipelines": {
    "render_pipelines": [
      {
        "vertex_function": "vert_main",
        "fragment_function": "frag_main",
        "raster_sample_count": 2,
        "color_attachments": [
          {
            "pixel_format": "BGRA8Unorm"
          }
        ],
        "depth_attachment_pixel_format": "Depth32Float"
      }
    ]
  }
}
```
Further schema details can be found in Metal's developer documentation:
https://developer.apple.com/documentation/metal

**Using Metal Runtime**

Harvesting happens at runtime:
1. Create a pipeline descriptor with its state and functions.
2. Add the descriptor to a binary archive.
3. Serialize the binary archive to the file system so it can be imported by the app.

Harvesting sample
```
// `device`, `library`, and `error` are the objects created in the
// render pipeline descriptor snippet above.

// Create pipeline descriptor
MTLRenderPipelineDescriptor *pipeline_desc = [MTLRenderPipelineDescriptor new];
pipeline_desc.vertexFunction = [library newFunctionWithName:@"vert_main"];
pipeline_desc.fragmentFunction = [library newFunctionWithName:@"frag_main"];
pipeline_desc.rasterSampleCount = 2;
pipeline_desc.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
pipeline_desc.depthAttachmentPixelFormat = MTLPixelFormatDepth32Float;

// Add pipeline descriptor to new archive
MTLBinaryArchiveDescriptor *archive_desc = [MTLBinaryArchiveDescriptor new];
id<MTLBinaryArchive> archive = [device newBinaryArchiveWithDescriptor:archive_desc error:&error];
bool success = [archive addRenderPipelineFunctionsWithDescriptor:pipeline_desc error:&error];

// Serialize archive to file system
NSURL *url = [NSURL fileURLWithPath:@"harvested-binaryArchive.metallib"];
success = [archive serializeToURL:url error:&error];
```

Once an archive has been harvested, its JSON pipeline script can be extracted so that GPU binary generation moves from runtime to build time.

To extract the script, invoke `metal-source` with the flatbuffers option, the harvested archive, and an output path.

```
metal-source -flatbuffers=json harvested-binaryArchive.metallib -o /tmp/descriptors.mtlp-json
```

### Generating Offline GPU Binaries

To generate a GPU binary from source, invoke `metal` with the Metal source file, the pipeline script, and an output file.

```
metal shaders.metal -N descriptors.mtlp-json -o archive.metallib
```

To generate a GPU binary from an existing Metal library, invoke `metal-tt` with the Metal library, the pipeline script, and an output file.

```
metal-tt shaders.metallib descriptors.mtlp-json -o archive.metallib
```

### Loading Offline Binaries
1. Provide the binary archive's URL when creating the archive descriptor.
2. Use that descriptor to instantiate the archive.
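
The two steps above can be sketched roughly as follows. This is a minimal illustration rather than code from the session: the helper names `MakeArchiveDescriptor` and `MakePipelineFromArchive` are hypothetical, and the archive URL is assumed to point at a file such as the `archive.metallib` produced at build time.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Step 1: build an archive descriptor that points at the offline-built archive.
// (Hypothetical helper; the URL is supplied by the caller.)
static MTLBinaryArchiveDescriptor *MakeArchiveDescriptor(NSURL *archiveURL) {
    MTLBinaryArchiveDescriptor *desc = [MTLBinaryArchiveDescriptor new];
    desc.url = archiveURL;
    return desc;
}

// Step 2: instantiate the archive, then attach it to a pipeline descriptor so
// that PSO creation can look up a precompiled GPU binary in it.
// (Hypothetical helper; returns nil and fills `error` on failure.)
static id<MTLRenderPipelineState> MakePipelineFromArchive(id<MTLDevice> device,
                                                          MTLRenderPipelineDescriptor *pipelineDesc,
                                                          NSURL *archiveURL,
                                                          NSError **error) {
    id<MTLBinaryArchive> archive =
        [device newBinaryArchiveWithDescriptor:MakeArchiveDescriptor(archiveURL)
                                         error:error];
    if (archive == nil) {
        return nil; // The archive file is missing or could not be loaded
    }
    pipelineDesc.binaryArchives = @[ archive ];
    return [device newRenderPipelineStateWithDescriptor:pipelineDesc error:error];
}
```

If the archive holds no binary matching the descriptor, Metal compiles the pipeline at runtime as usual, so the archive acts as an accelerator rather than a hard requirement.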

For more information on the binary archive API, see the WWDC21 talk linked below.

https://developer.apple.com/videos/play/wwdc2021/10229

# Optimize for Size
The Metal compiler optimizes aggressively for runtime performance. These optimizations may expand the GPU program size, which can have unexpected costs. Xcode 14 provides a new optimization mode for Metal: optimize for size.
This setting limits optimizations such as inlining and loop unrolling, which should lower application size and compile time. It is useful when compilation times are long. Optimizing for size may hurt runtime performance, but it may also improve it if the program was incurring runtime penalties associated with large size.

## Enabling Optimize for Size
This feature can be enabled in 3 ways:
1. In Xcode build settings, under Metal Compiler - Build Options, by selecting Size [-Os] under Optimization Level
2. In Terminal, by passing the `-Os` option to the `metal` command
3. In the Metal framework, by setting the optimization level of an `MTLCompileOptions` object to `MTLLibraryOptimizationLevelSize` when compiling from source at runtime
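
The third option might look like the sketch below. It assumes the SDK that ships with Xcode 14 or later, where `MTLCompileOptions` exposes an `optimizationLevel` property. The helper names `MakeSizeOptimizedOptions` and `CompileForSize` are hypothetical and used only for illustration.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Build compile options that ask the Metal compiler to optimize for size.
// (Hypothetical helper name.)
static MTLCompileOptions *MakeSizeOptimizedOptions(void) {
    MTLCompileOptions *options = [MTLCompileOptions new];
    options.optimizationLevel = MTLLibraryOptimizationLevelSize;
    return options;
}

// Compile Metal source at runtime using the size-optimized options.
// Returns nil and fills `error` if compilation fails. (Hypothetical helper name.)
static id<MTLLibrary> CompileForSize(id<MTLDevice> device,
                                     NSString *source,
                                     NSError **error) {
    return [device newLibraryWithSource:source
                                options:MakeSizeOptimizedOptions()
                                  error:error];
}
```

Because the effect on runtime performance can go either way, it is worth measuring both compile time and frame time before and after switching the level.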



[runtime]: ../../../images/notes/wwdc22/10102/runtime.JPG
[psodescriptor]: ../../../images/notes/wwdc22/10102/psodescriptor.JPG
Binary file added images/notes/wwdc22/10102/psodescriptor.JPG
Binary file added images/notes/wwdc22/10102/runtime.JPG
