This is the supplementary material associated with the article Procedural Texture Extrapolation using Point Process Texture Basis Functions. It is intended for developers who would like to integrate the method into a production pipeline, researchers who would like to use it as a starting SDK, etc.
We rely on three types of machines:
Graphics Card | GPU Architecture | Cores | Memory |
---|---|---|---|
NVIDIA GeForce GTX 1060 | Pascal | 1280 | 6 GB |
NVIDIA GeForce GTX 1080 | Pascal | 2560 | 8 GB |
NVIDIA Quadro P5000 | Pascal | 2560 | 16 GB |
We rely on the following cross-platform technologies:
Technology | Description |
---|---|
C++ | programming language |
CMake | cross-platform build system |
OpenGL | 3D graphics library |
glad | OpenGL loader |
glm | 3D maths library |
stb | image load/save |
GLFW | window/graphics context management |
ImGui | graphical user interface |
We use OpenGL compute shaders for all algorithms and traditional graphics shaders for rendering.
We use a single megakernel based on an OpenGL compute shader. The PPTBF is stored in a 1D single-channel floating-point texture (R32F). This approach requires no additional GPU memory allocation, but the complex kernel code consumes many registers, giving rise to register pressure, latency, and heavy branching.
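As a sanity check of the dispatch arithmetic this implies (the kernels below use 8x8 thread blocks), here is a hypothetical Python helper; it is not part of the actual C++ code, which calls the OpenGL API directly:

```python
# Number of compute workgroups needed to cover a W x H output with
# 8x8 thread blocks (the equivalent of glDispatchCompute arguments).
# Hypothetical helper name, for illustration only.

BLOCK = 8  # threads per axis in one workgroup

def dispatch_size(width, height, block=BLOCK):
    """Ceiling division per axis so partial blocks at the border are covered."""
    return (-(-width // block), -(-height // block))

# A 1024 x 1024 PPTBF image needs 128 x 128 workgroups:
assert dispatch_size(1024, 1024) == (128, 128)
# Non-multiple sizes round up, so border pixels are still processed:
assert dispatch_size(1030, 1024) == (129, 128)
```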
For OpenGL compute kernels, we use blocks of 8x8 threads. Memory usage is limited by the graphics card. For each output image size, we test without and with repulsive forces (5 iterations of Lloyd's algorithm). These are the results for the Voronoi window and 1 to 5 Gabor kernels in the feature function.
We show speed according to image size. Target device: NVIDIA GeForce GTX 1060 (6 GB).
Output Size (pixels) | Time (ms) | Time with repulsion forces (Lloyd) (ms) |
---|---|---|
256 x 256 | 12.587 | ... |
512 x 512 | 46.802 | ... |
1024 x 1024 | 177.103 | ... |
2048 x 2048 | 754.369 | ... |
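The repulsive-force option in the table corresponds to a few Lloyd iterations on the point set. A minimal CPU-side Python sketch of such a relaxation follows; it is illustrative only, the real version runs in the compute kernel, and all names here are hypothetical:

```python
import random

# Illustrative sketch of the "repulsive forces" option: a few Lloyd
# iterations push the point process samples apart in the unit square.

def lloyd(points, iterations=5, grid=64):
    """Move each point to the centroid of its Voronoi cell,
    estimated on a grid x grid sampling of the unit square."""
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0] for _ in points]  # x-sum, y-sum, count
        for i in range(grid):
            for j in range(grid):
                x, y = (i + 0.5) / grid, (j + 0.5) / grid
                # nearest point wins the sample (Voronoi assignment)
                k = min(range(len(points)),
                        key=lambda p: (points[p][0] - x) ** 2
                                      + (points[p][1] - y) ** 2)
                sums[k][0] += x
                sums[k][1] += y
                sums[k][2] += 1
        points = [(sx / n, sy / n) if n else p
                  for (sx, sy, n), p in zip(sums, points)]
    return points

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(8)]
relaxed = lloyd(pts)  # 5 iterations, as in the timings above
```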
The wavefront approach splits the algorithm into multiple passes [Laine et al. 2013]. It follows the way of thinking of node-based tools: the user can choose and customize components. The deformation component D can be fBm (fractional Brownian motion), turbulence, etc. The point process component PP can be plain, or can add forces such as repulsion. The wavefront approach requires storing results at each step: we rely on imageStore to write into textures, and on bindless texture extensions to access many textures/images in read or write mode. Bindless textures require at least a Kepler GPU (and some old Kepler devices do not work).
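The pass structure described above can be sketched as a sequence of stages writing into named intermediate buffers (mimicking imageStore into per-stage textures). The stage bodies below are hypothetical stand-ins; only the ordering mirrors the pipeline:

```python
# Sketch of the wavefront decomposition: each pass reads and writes
# named intermediate buffers, like per-stage textures on the GPU.

def run_pipeline(width, height, stages):
    buffers = {"size": (width, height)}
    order = []
    for name, stage in stages:   # one compute dispatch per stage
        stage(buffers)           # writes its result into `buffers`
        order.append(name)
    return buffers, order

stages = [
    ("model_deformation", lambda b: b.update(deformed=("fBm", b["size"]))),
    ("point_process",     lambda b: b.update(points=["p0", "p1"])),  # + optional repulsion
    ("window_function",   lambda b: b.update(window="voronoi")),
    ("feature_function",  lambda b: b.update(feature="gabor")),
    ("compositing",       lambda b: b.update(pptbf="R32F image")),
]

out, order = run_pipeline(256, 256, stages)
```

Each stage only depends on the buffers written by earlier stages, which is what lets the wavefront formulation swap components (e.g. a different deformation D) without touching the rest.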
For OpenGL compute kernels, we use blocks of 8x8 threads. Memory usage is limited by the graphics card. For each output image size, we test without and with repulsive forces (5 iterations of Lloyd's algorithm). These are the results for the Voronoi window and 1 to 5 Gabor kernels in the feature function.
We show speed according to image size. Target device: NVIDIA GeForce GTX 1060 (6 GB).
Output Size (pixels) | Total Time (ms) | Model & Deformation (ms) | Point Process (ms) | Window Function (ms) | Feature Function (ms) | Compositing (ms) |
---|---|---|---|---|---|---|
256 x 256 | 11.486 | 0.122 | 7.663 | 3.680 | 0.021 | ... |
512 x 512 | 46.483 | 0.458 | 31.461 | 14.500 | 0.065 | ... |
1024 x 1024 | 254.953 | 1.798 | 156.460 | 96.465 | 0.230 | ... |
2048 x 2048 | 1067.572 | 7.162 | 684.703 | 410.841 | 0.866 | ... |
Rendering relies on a classical fullscreen triangle with a fragment shader reading the PPTBF texture.
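The fullscreen-triangle trick commonly derives the three clip-space vertices from the vertex index alone, with no vertex buffer; the GLSL typically reads `vec2 p = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2) * 2.0 - 1.0;`. The index-to-position mapping can be checked in Python:

```python
# One oversized triangle whose clipped portion exactly covers the
# [-1,1]^2 viewport; the fragment shader then samples the PPTBF texture.

def fullscreen_triangle_vertex(i):
    x = float((i << 1) & 2) * 2.0 - 1.0
    y = float(i & 2) * 2.0 - 1.0
    return (x, y)

verts = [fullscreen_triangle_vertex(i) for i in range(3)]
assert verts == [(-1.0, -1.0), (3.0, -1.0), (-1.0, 3.0)]
```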
We empirically sampled our PPTBF parameter space using our real-time PPTBF visualizer/designer. Uniform sampling is infeasible because there are too many parameters. We kept the parameters that impact appearance, and separated the data into 98 classes based on point process tiling type, window function type and feature function type. We finally obtained more than 45 million PPTBF images (continuous, real-valued).
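The per-class sampling can be sketched as follows. The type lists here are placeholders; the actual types, their combination into 98 classes, and the per-class parameter ranges are defined by the PPTBF model and are not reproduced in this sketch:

```python
import itertools
import random

# Placeholder component types (NOT the real PPTBF type lists):
TILINGS  = ["regular", "irregular"]
WINDOWS  = ["voronoi", "tapered"]
FEATURES = ["constant", "gabor"]

def sample_database(per_class, seed=0):
    """Enumerate classes as type combinations, then draw `per_class`
    parameter vectors per class for the appearance-impacting parameters."""
    rng = random.Random(seed)
    images = []
    for cls in itertools.product(TILINGS, WINDOWS, FEATURES):
        for _ in range(per_class):
            params = {"jitter": rng.random(), "density": rng.random()}
            images.append((cls, params))
    return images

db = sample_database(per_class=10)  # 8 placeholder classes x 10 samples
```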
Nb Images | Size | Storage | Time | Machines Used |
---|---|---|---|---|
>= 45 million (45,446,112) | 400 x 400 pixels | <= 4 TB | 2 weeks | 12 GTX 1060, 1 GTX 1080, 2 Quadro P5000 |
We used three standard computer vision descriptors plus one coming from deep learning. None of them gave consistently better results than the others.
NOTE: we reduced the size of the PPTBF database using one of the descriptors: we thresholded the images into 3 categories (20%, 50% and 80%), binarized them, computed descriptors, and compared descriptors with the L2 distance against a threshold chosen from our observations (in terms of perceptual appearance, in order to remove similar candidates).
Threshold | Nb Images |
---|---|
20% | 171702 |
50% | 140743 |
80% | 195112 |
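A toy Python sketch of this reduction, assuming a simple block-mean descriptor and a hypothetical distance threshold `eps` (the actual descriptors and threshold used for the database differ):

```python
# Binarize at a threshold, compute a small block-mean descriptor, then
# greedily drop images whose descriptor is within L2 distance `eps`
# of an already-kept one. Toy 4x4 images; illustrative only.

def binarize(img, t):
    return [[1 if v >= t else 0 for v in row] for row in img]

def descriptor(img, block=2):
    """Mean of each block x block cell: a tiny layout descriptor."""
    n = len(img)
    d = []
    for bi in range(0, n, block):
        for bj in range(0, n, block):
            cell = [img[i][j] for i in range(bi, bi + block)
                              for j in range(bj, bj + block)]
            d.append(sum(cell) / len(cell))
    return d

def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def deduplicate(images, t=0.5, eps=0.25):
    kept, descs = [], []
    for img in images:
        d = descriptor(binarize(img, t))
        if all(l2(d, k) > eps for k in descs):
            kept.append(img)
            descs.append(d)
    return kept

imgs = [[[0.9] * 4] * 4, [[0.91] * 4] * 4, [[0.1] * 4] * 4]
survivors = deduplicate(imgs)  # the two near-duplicates collapse to one
```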
We used 4 descriptors:
We rely on the following cross-platform technologies:
Technology | Description |
---|---|
Anaconda | cross-platform package manager |
Python | scientific programming |
NumPy | scientific computing framework |
GPy | Gaussian process framework |
FLANN | nearest neighbor search library |
We rely on the FLANN library for fast nearest neighbor search to determine similar PPTBF candidates. Our target machine has 8 CPU cores.
Nb Feature Vectors | Generation Time | Request Time (150 nearest neighbors) |
---|---|---|
51825 | 24.977 s | 695 ms |
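FLANN accelerates such requests with approximate search structures; as a stand-in, here is the exact brute-force version of a k-nearest-neighbor request over descriptor vectors (hypothetical helper, only the request semantics match):

```python
import heapq

def knn(database, query, k):
    """Return indices of the k feature vectors closest to `query` (L2)."""
    def dist2(v):
        return sum((a - b) ** 2 for a, b in zip(v, query))
    return heapq.nsmallest(k, range(len(database)),
                           key=lambda i: dist2(database[i]))

feats = [[0.0, 0.0], [1.0, 0.0], [0.2, 0.1], [5.0, 5.0]]
nearest = knn(feats, [0.1, 0.0], k=2)
assert nearest == [0, 2]
```

Brute force is O(n) per query, which is why an index structure like FLANN's is needed at the scale of tens of thousands of feature vectors.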
We rely on the GPy Python Gaussian process framework.
Nb Elements (nearest neighbors) | Nb Elements (after MCMC refinement) | GPR (regression) | GPLVM (latent variable model) | Similarity Map |
---|---|---|---|---|
20 | 150 | 3 s | 2 s | 0.3 s |
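The regression step GPy performs can be illustrated with the Gaussian process predictive mean on two training points, where the 2x2 kernel matrix inverse is analytic. This is a pure-Python sketch of the underlying formula m(x) = k*^T (K + s^2 I)^-1 y with an RBF kernel, illustrative only; the actual fits use GPy's GPR and GPLVM models:

```python
import math

def rbf(a, b, ell=1.0):
    """Squared-exponential (RBF) kernel on scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * ell * ell))

def gpr_mean(x, X, y, ell=1.0, noise=1e-6):
    """GP predictive mean at x from two training pairs (X[i], y[i])."""
    k11 = rbf(X[0], X[0], ell) + noise
    k12 = rbf(X[0], X[1], ell)
    k22 = rbf(X[1], X[1], ell) + noise
    det = k11 * k22 - k12 * k12
    # analytic inverse of the 2x2 kernel matrix applied to y
    a = ( k22 * y[0] - k12 * y[1]) / det
    b = (-k12 * y[0] + k11 * y[1]) / det
    return rbf(x, X[0], ell) * a + rbf(x, X[1], ell) * b

X, y = [0.0, 1.0], [0.0, 1.0]
# With negligible noise, the posterior mean interpolates the training data:
assert abs(gpr_mean(0.0, X, y) - 0.0) < 1e-3
assert abs(gpr_mean(1.0, X, y) - 1.0) < 1e-3
```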