Levels of Detail for 3D NFTs

tags: `m3`

https://zora.co/collect/0xb67ff46dfde55ad2fe05881433e5687fd1000312

Basics of LOD

LOD improves runtime performance by rendering simpler versions of models for distant objects. This helps shorten response time since you need to download less to start seeing objects, and overall save on network usage since unused LODs are never downloaded.

These benefits combine into the ability to create richer content. Good world builders today are hesitant to put many high quality models in the same scene for negative performance impact reasons that these techniques can help mitigate.

https://twitter.com/dankvr/status/1646345888678260736
Source: https://hubs.mozilla.com/labs/hubs-lod-support/ 👈 READ this!

ELIF (Explain like I'm 5)

Imagine you have a toy car. When you hold it up close, you can see all the tiny details like the wheels and the little stickers. But when the car is far away, you can't see those details anymore, and it doesn't matter, because it still looks like a car. LOD (levels of detail) in games is like that! When things are close, they have lots of details, but when they're far away, they have fewer details to make the game run faster.

GPT4 John Carmack explaining LOD

Levels of detail (LOD) is a technique used in computer graphics to optimize the rendering of 3D models. By creating multiple versions of a model with varying polygon counts, we can display the most appropriate version based on the viewer's distance. This saves processing power and maintains performance without sacrificing visual quality. In essence, LOD dynamically adjusts the complexity of 3D objects to balance performance and visual fidelity.

How to Generate LODs

https://gltf-transform.donmccurdy.com/ - glTF2.0 SDK
https://github.com/takahirox/glTF-Transform-lod-script - Script to generate MSFT_lods
- https://github.com/playkostudios/model-splitter - another good tool for LODs

Results when testing 1.glb

LOD Triangles: 20k (original) -> 6.8k -> 6.6k tris
LOD File sizes: 2.6 MB (original) -> 485 KB -> 285 KB
Downside: Original total file size bloated from 2.6 MB → 3.1 MB with combined glTF

Replace with jin in art room + hacker battlestation / create art to go with post

node generate_lod.mjs in.glb out.glb –ratio 0.5,0.1 –error 0.01,0.05 –coverage 0.7,0.3,0.0 –texture 512x512,128x128

Testing and comparing 2 LODs on an optimized glb using MSFT_lod glTF extension with 0.5 + 0.1 ratio and 512x512 + 128x128 texture size

Couple observations:

LOD 1 is 9x smaller than LOD0, but only looks good from a distance
The file size of LOD 0 + 1 < original file size
With more control over layers, file size can be smaller
- Tested by optimizing model manually in blender, decimating body separate from the glasses produced better results

LOD + Draco
Here’s a look the size difference between the original glb files exported out of boom-tools and then draco compressed + draco compressed x2 MSFT_LOD:

The LOD version of boomboxheads is the third column, barely bigger in file size than single compressed models

420 original size glbs = 1.47 GB
420 draco compressed glbs = 863 MB
420 draco x2 MSFT_lod glbs = 883 MB 😯

Wait a second, 20 MB difference for twice as many glbs? I’m not really sure how, because that’s like a 43x difference between the LODs in file size, and would mean on average the LOD1 is ~48 KB (48 * 420 = 20 MB). However, when I exported a few of these out manually, they seemed on average about 260 KB in size.

WUT DA!?

I tested a few more and realized the combined LOD version using MSFT_lod extension is smaller than separate exported glTF files. Here’s a visual:

The combined 1_LOD.glb is 1.96 MB and when split into 2 files they become 2.5 MB total

Downsides

Currently not that many platforms support the MSFT_lod extension
Even if the file size of a combined LOD model is similar to a draco compressed, if the platform doesn't have HTTP range requests it still downloads the whole file
Draw calls doubled, tested with gltf-viewer

When to Generate LODs

Another post with cooking analogies about LOD, metaverse salad bowls, optimization workflows, and the world computer thumbnail generator. Ideas on how to improve virtual world builder workflows

This topic could be worth a whole post in itself. In short, I realize how much upfront work there is to do to generate LODs and attach to metadata. The process outlined here won't scale if it has to be driven by the artists pre-minting.

Then I realized something. Our operating systems automatically generate thumbnails for various media files for our data at rest, so why not too for our 3D models and then by extension for the world computer? Afterall, collectors and platforms are more incentivized to run nodes that can do the type of automated processing for generating LODs from data at rest, and to seed the content that powers a decentralized network.

In my research I found tons of open source photo and video gallery software will automate the process of generating thumbnails, perhaps some of the techniques for generating LOD for 3D models outlined in this post can be applied similarly for 3D galleries?

This way would lift a ton of upfront work to allow artists to just mint the final piece while others do the work of gracefully degrading such various levels of detail. Many of the platforms already do this for images, but don’t directly offer you the files. Instead, usually these files are produced by marketplaces that send them around via CDNs for improving loading times.

Another possibility might be for virtual world platforms to do LOD processing on the backend, and then having the ability to export the results for use across the metaverse.

https://twitter.com/dankvr/status/1645096016650043392

Progressive Loading

One of the main questions I’m wondering is this:

If you only need LOD1, does it make sense to download the entire file (~8x bigger)?

Earlier in the experiment LOD1 was 285 KB and LOD0 was ~2.1 MB. For objects that are in the distance, needing to download the entire file when you only need a small representation is wasteful. Generating LODs is one thing, loading is another that is up to platforms to implement. One method would be to integrate HTTP range requests and progressive loading, as illustrated below (source: https://hubs.mozilla.com/labs/hubs-lod-support/)

Mozilla Hubs already seems to have HTTP range requests implemented to progressively load LOD files!

HTTP range requests enable partial downloading. An HTTP range request asks the server to send back partial data of a file. HTTP range requests enable downloading a specific level of a bundled glb file. It has a 206 Partial Content success response code.

Btw all modern web browsers support range requests, it just isn't fully supported with 3D web graphics libraries like threejs, playcanvas, babylonjs, etc for glTF loaders yet.

If interested in going deeper on this topic I highly recommend reading the Hubs LOD support blog post linked below. The author saw tooling that didn't exist yet for MSFT_lod, so he wrote it for multiple libraries and blender. There's no an open PR for HTTP range requests slated as a milestone for the next major release of Threejs thanks to takahirox 👏

https://hubs.mozilla.com/labs/hubs-lod-support/

Other ideas came from the digital assets group in OMF with discussions like GLTF as an arbitrary-binary container format that has a simple JSON interface, and the Dacti package format which is optimized for sparsely fetching data from CDNs in real-time.

Perhaps it's also worth seeing if this project can become a case study for glXF, a WIP specification from the Khronos Group for creating a new file type in the glTF ecosystem for efficient composition of multiple glTF assets.

What about reading the paths to the files from the NFT metadata? 🤔

LODs in NFT Metadata

Having references to 3D model LODs in NFT metadata can offer some benefits over having LODs combined with glTF files via MSFT_lod extension:

Benefits and Shortcomings

Easier updates

If a new, improved LOD model becomes available, it may be easier to update the reference in metadata rather than updating the entire glTF file.

Reduced file size

Including multiple LODs within a glTF file can result in larger file sizes, which can negatively affect loading times and performance in platforms that don't support MSFT_lod and progressive loading. By referencing LODs in metadata, it can reduce the size of the glTF file and make it easier to manage.

HOWEVER…

It's important to note that having references to LODs in NFT metadata requires additional work to implement, as it involves creating a separate data structure to store the references and handling the loading and swapping of LOD models in the application code. Additionally, not all platforms or applications may support LOD references in metadata, so developers may need to provide alternative solutions in those cases.

Also if the LOD asset is being pulled from separate IPFS hashes, then the speed difference might be negligable and maybe worse off if relying on this method alone. Correct me if I'm wrong, but since you're dealing with disparate files for LODs then it means you'll expect they need to propogate the CIDs individually from the swarm, which can vary by how long it takes to resolve. Perhaps this might be mitigated and worth testing if all the files for the NFT are bundled together in a folder while utilizing IPNS, or via some combination with ENS domains.

Right now we have all the files for boomboxheads v2 organized in a GitHub repo that we can test with after minting is done. If interested, reach out on M3 org discord - we have a decent supply of Filecoin and other tokens that we can put towards open bounties.

Multi-asset NFT Metadata

There’s a few multi-asset NFT metadata schemes that we’ve been looking at for the past year that are being developed by MetaFactory, Nifty Island, Cryptoavatars, and Metamundo. I think it’s great that we are all taking slightly different approaches to the same problem while still early to see which ones work out the best.

Also, I don't think it matters too much in the end because AI can pretty much one-shot translate between any and help us write data transformer programs that can automate converting between, such as this simple bash script.

For additional tooling, we can integrate LOD import / export directly into popular 3D programs such as Blender. Here's a couple programs that would be useful starting places for interested devs:

Extensible Token Metadata (ETM)

Link: https://etm-standard.github.io/ETM_MULTIASSET_v1.0.0

As the use cases for NFTs have expanded, the need for an NFT to represent multiple files has grown and been addressed in several ways. However, the inconsistency in these varied approaches has rendered assets beyond images and videos unable to be ingested by NFT consumers in a streamlined way. This has forced consumers to create custom centralized tooling to access specific assets - a path that is not scalable or maintainable in the long-term.

This standard is an extension of ETM and provides a decentralized approach to representing multiple digital assets with a single NFT in a scalable and maintainable way.

The goal is to provide a streamlined approach to the following:

Associating multiple assets with an NFT such that custom or centralized tooling is not required

Supporting NFTs with heterogeneous media types (e.g. 3D models, animations, etc.)

Providing a clear definition of the file type for files associated with an NFT

I talked about ETM in the last blog post if you want to go back and read that first. Now I just finished up with the script that converts all boomboxheads v2 NFT metadata to ETM metadata spec, the differences between them can be seen below:

We've added additional fields for the glTF LODs, the 4k image, and replaced vrm_url with file_type within the assets node.

ETM is still experimental, but having spoken with several project founders exploring similar problems about how to link multiple assets to the same token and this spec seems to be the right direction. More testing needs to be done by projects that have ready made assets for such.

Additional Ideas

There has been a mini cambrian explosion of experiments about different approaches to various representations of the same digital asset / character linked to a token happening in the web3 gaming scene. I believe we're near the precipice of having a feature rich playground for metaverse builders exporing LODs for NFTs.

Tubby Cats NFT, Blitnauts / Logos, Forgotten Rune Wizards Cult, Cyberbrokers, MetaKey, Cryptoavatars, CloneX, and more

One interesting recent case study would by Cyberbrokers Mechs since they're being constructed by individual NFT parts that collectors can assemble together into a new giant mech robot with various 3D asset types, stats, and preview image.

Sprites Sheets

I think that Doom style sprite avatars are going to be the premiere LOD solution for avatars in the future. Recently VRChat implemented a feature to generate imposters of your avatar to be used as fallback as seen in this developer update video.

https://hackmd.io/@XR/avatarlod

https://twitter.com/Worldwide_WEB3/status/1563600468613693442

Thing is, if you just wanted this code it is currently welded with huge chunks of a bigger machine that takes skill to decouple into a standalone app.

I do have some code here that can be repurposed for creating sprite sheets using screenshot-glb by Shopify. In particular, look at the parts for making a gif and go from there (GPT4 can help if you get stuck :P).

Trading Cards

https://twitter.com/dankvr/status/1507189063694131202

Conclusion

I already batch converted all the boomboxheads V2 to MSFT_lod versions and as separate LODs and the NFT metadata to ETM spec to prepare for as a metaverse interop science lab, but we cannot update the collection until it finishes minting out. That's fine because in the meantime we should do some isolated experiments to see how well the spec performs with practical testing by minting a few NFTs with ETM to test with.

If interested in supporting or joining the conversation on Github discussions. So far we are near 25% of boomboxheads v2 being minted, thank you so much for reading and supporting our work.

See you in the metaverse
- jin

Notes

IMO if we want to reach the goal of having a truly open and decentralized metaverse then it should be possible for virtual worlds to work offline and be self-hostable on your own hardware. This may couple well with services like ClubNFT which offer backup services to download the content for the associated NFTs you own, which you can then generate LODs for on your own machine.

Idea: generate LODs and export with metadata encoded separately or as part of glTF

We can grow from here in establishing a more robust decentralized content network of node operators running programs that can generate and distribute various representations of digital assets.