Can I make the falling sand game run parallel on the GPU?









up vote
0
down vote

favorite












I've been trying to put together a falling sand type game in Metal, that uses a compute shader to do the work. My initial work looked ok, until I ran into the following problem:



Sand particles



In one update, both particle 1A and 1B could flow into space T.



Sequential



In a sequential loop, I could understand how I would simply move 1A into T, then when it came to evaluate 1B, I could see that the space was already filled.



Parallel



My problem is that with the GPU, The code to check 1A and 1B are both running on different threads, so in one update loop, they can both move to position T, essentially killing one of them.



My question is then, can I make a parallel solution to this, or can it only be solved sequentially?



Here's my calculation code, running a GPU thread for every pixel in a 640 by 480 grid:



struct Particle 
float4 color;
int live;
int particleType;
;


kernel void calculateFrame(
device Particle *src [[buffer(1)]],
device Particle *dst [[buffer(2)]],
uint2 threadgroup_position_in_grid [[ threadgroup_position_in_grid ]],
uint2 thread_position_in_threadgroup [[ thread_position_in_threadgroup ]],
uint2 threads_per_threadgroup [[ threads_per_threadgroup ]])

uint2 gid = threadgroup_position_in_grid;
int index = (gid * 640) + gid;
Particle particle = src[index];

Particle bottom = src[toLinear( gid + uint2( 0, 1))]; //bottom
Particle bottomLeft = src[toLinear( gid + uint2(-1, 1))]; //bottom left
Particle bottomRight = src[toLinear( gid + uint2( 1, 1))]; //bottom right

if (particle.live == 1 && particle.particleType == SAND) // move down

if (bottom.live == 0)

dst[index].live = 0;

dst[toLinear( gid + uint2( 0, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 0, 1))].color = particle.color;
dst[toLinear( gid + uint2( 0, 1))].live = 1;


else if (bottomLeft.live == 0) // move down to the left if clear
dst[index].live = 0;

dst[toLinear( gid + uint2( -1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( -1, 1))].color = red;
dst[toLinear( gid + uint2( -1, 1))].live = 1;



else if (bottomRight.live == 0) // move down to the right if clear

dst[index].live = 0;

dst[toLinear( gid + uint2( 1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 1, 1))].color = red;
dst[toLinear( gid + uint2( 1, 1))].live = 1;

else // cant move
dst[index] = src[index];




if (particle.particleType == WALL)
dst[index] = particle;
dst[index].live = 1;


if (particle.particleType == ERASER)
dst[index].color = blue;
dst[index].live = 0;











share|improve this question























  • Your "code" is confusing because there are unexplained undeclared variables, like reverse and particleRef. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
    – Ken Thomases
    Nov 10 at 20:20










  • Sorry @KenThomases I had some atomic code in there that I was trying, I've cleaned it up now.
    – Chris
    Nov 10 at 20:23










  • I guess I need to investigate how to do an atomic compare and exchange in this scenario.
    – Chris
    Nov 10 at 20:29






  • 1




    Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
    – Ken Thomases
    Nov 10 at 21:25






  • 1




    Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
    – Ken Thomases
    Nov 11 at 19:44














up vote
0
down vote

favorite












I've been trying to put together a falling sand type game in Metal, that uses a compute shader to do the work. My initial work looked ok, until I ran into the following problem:



Sand particles



In one update, both particle 1A and 1B could flow into space T.



Sequential



In a sequential loop, I could understand how I would simply move 1A into T, then when it came to evaluate 1B, I could see that the space was already filled.



Parallel



My problem is that with the GPU, The code to check 1A and 1B are both running on different threads, so in one update loop, they can both move to position T, essentially killing one of them.



My question is then, can I make a parallel solution to this, or can it only be solved sequentially?



Here's my calculation code, running a GPU thread for every pixel in a 640 by 480 grid:



struct Particle 
float4 color;
int live;
int particleType;
;


kernel void calculateFrame(
device Particle *src [[buffer(1)]],
device Particle *dst [[buffer(2)]],
uint2 threadgroup_position_in_grid [[ threadgroup_position_in_grid ]],
uint2 thread_position_in_threadgroup [[ thread_position_in_threadgroup ]],
uint2 threads_per_threadgroup [[ threads_per_threadgroup ]])

uint2 gid = threadgroup_position_in_grid;
int index = (gid * 640) + gid;
Particle particle = src[index];

Particle bottom = src[toLinear( gid + uint2( 0, 1))]; //bottom
Particle bottomLeft = src[toLinear( gid + uint2(-1, 1))]; //bottom left
Particle bottomRight = src[toLinear( gid + uint2( 1, 1))]; //bottom right

if (particle.live == 1 && particle.particleType == SAND) // move down

if (bottom.live == 0)

dst[index].live = 0;

dst[toLinear( gid + uint2( 0, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 0, 1))].color = particle.color;
dst[toLinear( gid + uint2( 0, 1))].live = 1;


else if (bottomLeft.live == 0) // move down to the left if clear
dst[index].live = 0;

dst[toLinear( gid + uint2( -1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( -1, 1))].color = red;
dst[toLinear( gid + uint2( -1, 1))].live = 1;



else if (bottomRight.live == 0) // move down to the right if clear

dst[index].live = 0;

dst[toLinear( gid + uint2( 1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 1, 1))].color = red;
dst[toLinear( gid + uint2( 1, 1))].live = 1;

else // cant move
dst[index] = src[index];




if (particle.particleType == WALL)
dst[index] = particle;
dst[index].live = 1;


if (particle.particleType == ERASER)
dst[index].color = blue;
dst[index].live = 0;











share|improve this question























  • Your "code" is confusing because there are unexplained undeclared variables, like reverse and particleRef. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
    – Ken Thomases
    Nov 10 at 20:20










  • Sorry @KenThomases I had some atomic code in there that I was trying, I've cleaned it up now.
    – Chris
    Nov 10 at 20:23










  • I guess I need to investigate how to do an atomic compare and exchange in this scenario.
    – Chris
    Nov 10 at 20:29






  • 1




    Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
    – Ken Thomases
    Nov 10 at 21:25






  • 1




    Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
    – Ken Thomases
    Nov 11 at 19:44












up vote
0
down vote

favorite









up vote
0
down vote

favorite











I've been trying to put together a falling sand type game in Metal, that uses a compute shader to do the work. My initial work looked ok, until I ran into the following problem:



Sand particles



In one update, both particle 1A and 1B could flow into space T.



Sequential



In a sequential loop, I could understand how I would simply move 1A into T, then when it came to evaluate 1B, I could see that the space was already filled.



Parallel



My problem is that with the GPU, The code to check 1A and 1B are both running on different threads, so in one update loop, they can both move to position T, essentially killing one of them.



My question is then, can I make a parallel solution to this, or can it only be solved sequentially?



Here's my calculation code, running a GPU thread for every pixel in a 640 by 480 grid:



struct Particle 
float4 color;
int live;
int particleType;
;


kernel void calculateFrame(
device Particle *src [[buffer(1)]],
device Particle *dst [[buffer(2)]],
uint2 threadgroup_position_in_grid [[ threadgroup_position_in_grid ]],
uint2 thread_position_in_threadgroup [[ thread_position_in_threadgroup ]],
uint2 threads_per_threadgroup [[ threads_per_threadgroup ]])

uint2 gid = threadgroup_position_in_grid;
int index = (gid * 640) + gid;
Particle particle = src[index];

Particle bottom = src[toLinear( gid + uint2( 0, 1))]; //bottom
Particle bottomLeft = src[toLinear( gid + uint2(-1, 1))]; //bottom left
Particle bottomRight = src[toLinear( gid + uint2( 1, 1))]; //bottom right

if (particle.live == 1 && particle.particleType == SAND) // move down

if (bottom.live == 0)

dst[index].live = 0;

dst[toLinear( gid + uint2( 0, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 0, 1))].color = particle.color;
dst[toLinear( gid + uint2( 0, 1))].live = 1;


else if (bottomLeft.live == 0) // move down to the left if clear
dst[index].live = 0;

dst[toLinear( gid + uint2( -1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( -1, 1))].color = red;
dst[toLinear( gid + uint2( -1, 1))].live = 1;



else if (bottomRight.live == 0) // move down to the right if clear

dst[index].live = 0;

dst[toLinear( gid + uint2( 1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 1, 1))].color = red;
dst[toLinear( gid + uint2( 1, 1))].live = 1;

else // cant move
dst[index] = src[index];




if (particle.particleType == WALL)
dst[index] = particle;
dst[index].live = 1;


if (particle.particleType == ERASER)
dst[index].color = blue;
dst[index].live = 0;











share|improve this question















I've been trying to put together a falling sand type game in Metal, that uses a compute shader to do the work. My initial work looked ok, until I ran into the following problem:



Sand particles



In one update, both particle 1A and 1B could flow into space T.



Sequential



In a sequential loop, I could understand how I would simply move 1A into T, then when it came to evaluate 1B, I could see that the space was already filled.



Parallel



My problem is that with the GPU, The code to check 1A and 1B are both running on different threads, so in one update loop, they can both move to position T, essentially killing one of them.



My question is then, can I make a parallel solution to this, or can it only be solved sequentially?



Here's my calculation code, running a GPU thread for every pixel in a 640 by 480 grid:



struct Particle 
float4 color;
int live;
int particleType;
;


kernel void calculateFrame(
device Particle *src [[buffer(1)]],
device Particle *dst [[buffer(2)]],
uint2 threadgroup_position_in_grid [[ threadgroup_position_in_grid ]],
uint2 thread_position_in_threadgroup [[ thread_position_in_threadgroup ]],
uint2 threads_per_threadgroup [[ threads_per_threadgroup ]])

uint2 gid = threadgroup_position_in_grid;
int index = (gid * 640) + gid;
Particle particle = src[index];

Particle bottom = src[toLinear( gid + uint2( 0, 1))]; //bottom
Particle bottomLeft = src[toLinear( gid + uint2(-1, 1))]; //bottom left
Particle bottomRight = src[toLinear( gid + uint2( 1, 1))]; //bottom right

if (particle.live == 1 && particle.particleType == SAND) // move down

if (bottom.live == 0)

dst[index].live = 0;

dst[toLinear( gid + uint2( 0, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 0, 1))].color = particle.color;
dst[toLinear( gid + uint2( 0, 1))].live = 1;


else if (bottomLeft.live == 0) // move down to the left if clear
dst[index].live = 0;

dst[toLinear( gid + uint2( -1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( -1, 1))].color = red;
dst[toLinear( gid + uint2( -1, 1))].live = 1;



else if (bottomRight.live == 0) // move down to the right if clear

dst[index].live = 0;

dst[toLinear( gid + uint2( 1, 1))].particleType = particle.particleType;
dst[toLinear( gid + uint2( 1, 1))].color = red;
dst[toLinear( gid + uint2( 1, 1))].live = 1;

else // cant move
dst[index] = src[index];




if (particle.particleType == WALL)
dst[index] = particle;
dst[index].live = 1;


if (particle.particleType == ERASER)
dst[index].color = blue;
dst[index].live = 0;








gpu simulation metal






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 10 at 20:27

























asked Nov 10 at 16:54









Chris

62531530




62531530











  • Your "code" is confusing because there are unexplained undeclared variables, like reverse and particleRef. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
    – Ken Thomases
    Nov 10 at 20:20










  • Sorry @KenThomases I had some atomic code in there that I was trying, I've cleaned it up now.
    – Chris
    Nov 10 at 20:23










  • I guess I need to investigate how to do an atomic compare and exchange in this scenario.
    – Chris
    Nov 10 at 20:29






  • 1




    Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
    – Ken Thomases
    Nov 10 at 21:25






  • 1




    Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
    – Ken Thomases
    Nov 11 at 19:44
















  • Your "code" is confusing because there are unexplained undeclared variables, like reverse and particleRef. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
    – Ken Thomases
    Nov 10 at 20:20










  • Sorry @KenThomases I had some atomic code in there that I was trying, I've cleaned it up now.
    – Chris
    Nov 10 at 20:23










  • I guess I need to investigate how to do an atomic compare and exchange in this scenario.
    – Chris
    Nov 10 at 20:29






  • 1




    Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
    – Ken Thomases
    Nov 10 at 21:25






  • 1




    Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
    – Ken Thomases
    Nov 11 at 19:44















Your "code" is confusing because there are unexplained undeclared variables, like reverse and particleRef. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
– Ken Thomases
Nov 10 at 20:20




Your "code" is confusing because there are unexplained undeclared variables, like reverse and particleRef. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
– Ken Thomases
Nov 10 at 20:20












Sorry @KenThomases I had some atomic code in there that I was trying, I've cleaned it up now.
– Chris
Nov 10 at 20:23




Sorry @KenThomases I had some atomic code in there that I was trying, I've cleaned it up now.
– Chris
Nov 10 at 20:23












I guess I need to investigate how to do an atomic compare and exchange in this scenario.
– Chris
Nov 10 at 20:29




I guess I need to investigate how to do an atomic compare and exchange in this scenario.
– Chris
Nov 10 at 20:29




1




1




Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
– Ken Thomases
Nov 10 at 21:25




Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
– Ken Thomases
Nov 10 at 21:25




1




1




Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
– Ken Thomases
Nov 11 at 19:44




Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
– Ken Thomases
Nov 11 at 19:44

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53241223%2fcan-i-make-the-falling-sand-game-run-parallel-on-the-gpu%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown






























active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes















 

draft saved


draft discarded















































 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53241223%2fcan-i-make-the-falling-sand-game-run-parallel-on-the-gpu%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







這個網誌中的熱門文章

How to read a connectionString WITH PROVIDER in .NET Core?

Node.js Script on GitHub Pages or Amazon S3

Museum of Modern and Contemporary Art of Trento and Rovereto