Neural Networks and WebGPU Compute..

Learning from Data.....

 


Slow Compute



Why are single threaded compute shaders bad?


Single-threaded compute shaders are NOT bad. You learn about the shader language, the architecture and the algorithm. Most professional developers probably learned to program by starting with small sequential implementations first, before ramping things up and distributing the workload. As the old saying goes, you should learn to walk before you try to run.

The reason single-threaded compute shaders are frowned upon is that they give people the idea that the GPU is akin to the CPU - that you can run lots of programs in parallel on single threads. The GPU has hundreds or thousands of cores - so surely you can have hundreds or thousands of threads, each running a different program? WRONG! These cores and threads are designed to run the same program in parallel - so if you run a single-threaded program, it's akin to having all of those cores locked and wasted (they can't be used for other tasks).

However, single-threaded tasks are great for learning and testing - being able to drop down to a slower single-threaded model on the GPU makes the learning ride easier.
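As a plain-JavaScript analogy (not WebGPU code - the names here are illustrative only): a single-threaded shader is one invocation looping over all of the work, while a parallel dispatch hands each invocation exactly one element.

```javascript
// Analogy in plain JavaScript (not actual GPU code): the same doubling job
// written as one sequential loop vs. one function call per 'invocation'.
const input = [1, 2, 3, 4];

// single 'thread': one invocation loops over everything itself
function singleThreaded(data) {
    const out = new Array(data.length);
    for (let i = 0; i < data.length; i++) { out[i] = data[i] * 2; }
    return out;
}

// parallel style: each invocation id handles exactly one element -
// this is the shape that @workgroup_size/dispatchWorkgroups give you
function invocation(id, data, out) { out[id] = data[id] * 2; }

const out = new Array(input.length);
for (let id = 0; id < input.length; id++) { invocation(id, input, out); }

console.log(singleThreaded(input)); // [2, 4, 6, 8]
console.log(out);                   // [2, 4, 6, 8]
```

On the GPU the 'invocations' genuinely run at the same time; the sequential loop version leaves all but one of them idle.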


Single Threaded Compute


This section is about transitioning to running the neural network on the GPU (using a compute shader). We want to make sure the compute shader version functions correctly (i.e., the same as the CPU/JavaScript version).

To accomplish this, we'll have the two versions run together, so the computational results can be compared.

This becomes important later on, as you start to experiment and tweak the compute implementation to make it run faster; you can keep comparing results with the client-side version.


• The CPU version runs and trains the weights
• Check the XOR output is correct (using activate)
• The GPU version uses the trained weights - check it produces the same result as the CPU version
• Generate random weights and train using the GPU version (single thread)
• Check the XOR output is correct (using activateGPU)
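Since the CPU path works in JavaScript doubles (f64) and the GPU path in f32, the two won't always agree bit-for-bit, so a tolerance-based check is safer than strict equality. A minimal sketch (the helper name `resultsMatch` is hypothetical, not part of the listing below):

```javascript
// Hypothetical helper: compare CPU and GPU outputs within a small epsilon,
// since f32 GPU results can differ from JavaScript's f64 in the last digits.
function resultsMatch(cpuOut, gpuOut, epsilon = 1e-4) {
    if (cpuOut.length !== gpuOut.length) { return false; }
    // every element must agree to within epsilon
    return cpuOut.every((v, i) => Math.abs(v - gpuOut[i]) <= epsilon);
}

console.log(resultsMatch([0.0327, 0.9612], [0.03271, 0.96119])); // true
console.log(resultsMatch([0.0327], [0.0460]));                   // false
```
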


CPU and GPU Version (Check Each Other)


The output of the implementation below should look something like the following.

log:["epoch 0 mean squared error: 0.2522134475256263"]
log:["epoch 1000 mean squared error: 0.2499800049979511"]
log:["epoch 2000 mean squared error: 0.2496320243720077"]
log:["epoch 3000 mean squared error: 0.2264426964431236"]
log:["epoch 4000 mean squared error: 0.129379074065149"]
log:["epoch 5000 mean squared error: 0.017275423870474226"]
log:["epoch 6000 mean squared error: 0.0061856634670116924"]
log:["epoch 7000 mean squared error: 0.0035252172955462383"]
log:["epoch 8000 mean squared error: 0.002411010265655219"]
log:["epoch 9000 mean squared error: 0.0018131056574663832"]
log:["epoch 10000 mean squared error: 0.001444454045244726"]
log:["for input 0,0 expected 0 predicted 0.0327 which is correct"]
log:["for input 0,1 expected 1 predicted 0.9612 which is correct"]
log:["for input 1,0 expected 1 predicted 0.9670 which is correct"]
log:["for input 1,1 expected 0 predicted 0.0460 which is correct"]
log:["\nGPU VERSION - MATCH CPU \n"]
log:["for input 0,0 expected 0 predicted 0.0327 which is correct"]
log:["for input 0,1 expected 1 predicted 0.9612 which is correct"]
log:["for input 1,0 expected 1 predicted 0.9670 which is correct"]
log:["for input 1,1 expected 0 predicted 0.0460 which is correct"]
log:["\nRESET - TRAIN WITH GPU VERSION\n"]
log:["epoch 0 mean squared error: 0.48168399569426956"]
log:["epoch 100 mean squared error: 0.21958587334806895"]
log:["epoch 200 mean squared error: 0.14994121468198873"]
log:["epoch 300 mean squared error: 0.09874228292508974"]
log:["epoch 400 mean squared error: 0.04802577685587252"]
log:["epoch 500 mean squared error: 0.02940106584560033"]
log:["epoch 600 mean squared error: 0.02055316752912885"]
log:["epoch 700 mean squared error: 0.015557418955860783"]
log:["epoch 800 mean squared error: 0.012417021152197297"]
log:["epoch 900 mean squared error: 0.01028809588098907"]
log:["epoch 1000 mean squared error: 0.008761597662769863"]
log:["for input 0,0 expected 0 predicted 0.1115 which is correct"]
log:["for input 0,1 expected 1 predicted 0.9040 which is correct"]
log:["for input 1,0 expected 1 predicted 0.9053 which is correct"]
log:["for input 1,1 expected 0 predicted 0.0665 which is correct"]


• CPU and GPU neural networks - tested with XOR and backpropagation
• Array 'blocks' for the neural network data (layer outputs, weights, errors, ..)
• Scalable solution - specify the network dimensions, e.g., [2,3,1]
• CPU and GPU versions - compare/check results
• Some asserts scattered around to check basic array size alignment/data
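The flat 'block' layout can be sketched in isolation - each layer is padded to the size of the largest layer, so an element's position is plain arithmetic (the helper names below mirror the indexing used in the listing):

```javascript
// Flat 'block' layout: every layer gets a fixed-size slice padded to the
// largest layer, so no nested arrays are needed (handy for GPU buffers).
const layers = [2, 3, 1];
const MAX_NEURONS_PER_LAYER = Math.max(...layers);

// per-neuron values (outputs, biases, errors): layer*MAX + neuron
const neuronIndex = (layer, neuron) => layer * MAX_NEURONS_PER_LAYER + neuron;

// weights: layer*MAX*MAX + fromNeuron*MAX + toNeuron
const weightIndex = (layer, from, to) =>
    layer * MAX_NEURONS_PER_LAYER * MAX_NEURONS_PER_LAYER
    + from * MAX_NEURONS_PER_LAYER + to;

console.log(neuronIndex(1, 2));    // 5  (layer 1, neuron 2)
console.log(weightIndex(1, 1, 2)); // 14 (layer 1, from neuron 1 to neuron 2)
```

The padding wastes a little memory for uneven layer sizes, but keeps the addressing identical on the CPU and in WGSL.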


const LEARNING_RATE = 0.2;
const VARIANCE_W    = 0.5;

//const randomUniform = (min, max) => { return 0.123; }; // Math.random() * (max - min) + min;
const randomUniform = (min, max) => Math.random() * (max - min) + min;

const ru = () => { return randomUniform(-VARIANCE_W, VARIANCE_W); }

const layers       = [ 2, 3, 1 ];
const maxLayerSize = [...layers].sort( (a,b)=>b-a )[0];

const xordataset = [ { inputs: [0,0], outputs: [0] },
                     { inputs: [0,1], outputs: [1] },
                     { inputs: [1,0], outputs: [1] },
                     { inputs: [1,1], outputs: [0] } ];
console.assert( xordataset[0].outputs.length == layers[ layers.length-1 ] );

const weights  = Array( layers.length * maxLayerSize * maxLayerSize ).fill(0);
const biases   = Array( layers.length * maxLayerSize ).fill(0);
const loutputs = Array( layers.length * maxLayerSize ).fill(0);
const errors   = Array( layers.length * maxLayerSize ).fill(0);

const MAX_NEURONS_PER_LAYER = maxLayerSize;
const NUM_LAYERS            = layers.length;

//--------------------------------------------------------------------------------------

const initializeWeightsandBiases = () => {
    for (let i=0; i<layers.length-1; i++)
    {
       for (let k=0; k<layers[i]; k++)
       {
            for (let g=0; g<layers[i+1]; g++)
            {
                setWeight( i, k, g, ru() );
            }
       }
    }

    for (let i=0; i<layers.length; i++)
    {
       for (let k=0; k<layers[i]; k++)
       {
            setBias( i, k, 0.0 );
       }
    }
}
initializeWeightsandBiases();

//--------------------------------------------------------------------------------------

function getBias(layer, neuron) {
  return biases[layer * MAX_NEURONS_PER_LAYER + neuron];
}

function setBias(layer, neuron, value) {
  biases[layer * MAX_NEURONS_PER_LAYER + neuron] = value;
}

function setOutput(layer, neuron, value) {
  loutputs[layer * MAX_NEURONS_PER_LAYER + neuron] = value;
}

function getOutput(layer, neuron) {
  return loutputs[layer * MAX_NEURONS_PER_LAYER + neuron];
}

function getWeight(layer, fromNeuron, toNeuron) {
  return weights[layer * MAX_NEURONS_PER_LAYER * MAX_NEURONS_PER_LAYER
                + fromNeuron * MAX_NEURONS_PER_LAYER
                + toNeuron];
}

function setWeight(layer, fromNeuron, toNeuron, value) {
  weights[layer * MAX_NEURONS_PER_LAYER * MAX_NEURONS_PER_LAYER
         + fromNeuron * MAX_NEURONS_PER_LAYER
         + toNeuron] = value;
}

function setError(layer, neuron, value) {
  errors[layer * MAX_NEURONS_PER_LAYER + neuron] = value;
}

function getError(layer, neuron) {
  return errors[layer * MAX_NEURONS_PER_LAYER + neuron];
}

//--------------------------------------------------------------------------------------

const sigmoid = (x) => 1.0 / (1.0 + Math.exp(-x));

const sigmoidDerivative = (x) => x * (1 - x);

const relu = (x) => { return Math.max(0.0, x); }

const reluDerivative = (x) => { if (x > 0.0) { return 1.0; } return 0.0; }

const leakyRelu = (x, alpha=0.01) => { return x > 0.0 ? x : alpha * x; }

const leakyReluDerivative = (x, alpha=0.01) => { return x > 0.0 ? 1.0 : alpha; }


const activate = (iin) => {

    for (let i=0; i<NUM_LAYERS; i++)
    {
        if ( i==0 )
        {
            for (let k=0; k<iin.length; k++)
            {
                setOutput( 0, k, iin[ k ] );
            }
        }
        else
        {
            for (let k=0; k<layers[i]; k++)
            {
                var sum = 0.0;
                for (let b=0; b<layers[i-1]; b++)
                {
                    sum += getOutput( i-1, b ) * getWeight( i-1, b, k );
                }
                setOutput( i, k, sigmoid( sum + getBias( i, k ) ) );
            }
        }
    }

    return [ getOutput( NUM_LAYERS-1, 0 ) ];
};

//--------------------------------------------------------------------------------------

const propagate = (target, alpha=0.2) => {

    for (let i=NUM_LAYERS-1; i>0; i--)
    {
        for (let k=0; k<layers[i]; k++)
        {
            if ( i==NUM_LAYERS-1 )
            {
                let error = ( target[k] - getOutput( i, k ) ) * sigmoidDerivative( getOutput( i, k ) );
                setError( i, k, error );
            }
            else
            {
                setError( i, k, 0.0 );
                for (let g=0; g<layers[i+1]; g++)
                {
                    let error = getError( i+1, g ) * getWeight( i, k, g ) * sigmoidDerivative( getOutput( i, k ) );
                    setError( i, k, error );
                }
            }
        }
    }

    for (let i=0; i<NUM_LAYERS; i++)
    {
        for (let k=0; k<layers[i]; k++)
        {
            if ( i < NUM_LAYERS-1 ) {
                for (let g=0; g<layers[i+1]; g++)
                {
                    var weight = getWeight( i, k, g );
                    weight += alpha * getOutput( i, k ) * getError( i+1, g );
                    setWeight( i, k, g, weight );
                }
            }

            let bias = getBias( i, k );
            bias += alpha * getError( i, k );
            setBias( i, k, bias );
        }
    }

};

//--------------------------------------------------------------------------------------



// - test the neural network using iteration loop - xor dataset

console.log( new Date() );

for (let epoch = 0; epoch <= 10000; epoch++) {
    let indexes = Array.from(Array( xordataset.length ).keys());
    indexes.sort(() => Math.random() - 0.5);
    for (let j of indexes) {
        activate( xordataset[j].inputs );
        propagate( xordataset[j].outputs, LEARNING_RATE );
    }

    if (epoch % 1000 === 0) {
        let cost = 0;
        for (let j = 0; j < xordataset.length; j++) {
            let o = activate( xordataset[j].inputs );
            for (let b=0; b<xordataset[j].outputs.length; b++)
            {
                cost += Math.pow( xordataset[j].outputs[b] - o[b], 2 );
            }
        }
        cost /= 4;
        console.log(`epoch ${epoch} mean squared error: ${cost}`);
    }
}


for (let i = 0; i < xordataset.length; i++)
{
    const result = activate( xordataset[i].inputs );

    console.log(`for input ${xordataset[i].inputs} expected ${xordataset[i].outputs} predicted ${result[0].toFixed(4)} which is ${Math.round(result[0]) === xordataset[i].outputs[0] ? "correct" : "incorrect"}`);
}


//---------------------------------------------------------------------------


const adapter = await navigator.gpu.requestAdapter();
const device  = await adapter.requestDevice();

//---------------------------------------------------------------------------



async function runComputePipeline( shaderCode, bindings, workGroupCount, entryFunction='main' ) {
    const shaderModule = device.createShaderModule({ code: shaderCode });

    let entries = [];
    for (let n=0; n<bindings.length; n++) {
       entries.push( { binding: bindings[n].binding, visibility: GPUShaderStage.COMPUTE, buffer: { type: "storage" } } );
    }
    const bindGroupLayout = device.createBindGroupLayout( { "entries": entries } );

    /*
    const bindGroupLayout = device.createBindGroupLayout({
        entries: [ {binding: 0, visibility: GPUShaderStage.COMPUTE, buffer: {type: "storage"}  },
                   {binding: 1, visibility: GPUShaderStage.COMPUTE, buffer: {type: "storage"}  },
                   {binding: 2, visibility: GPUShaderStage.COMPUTE, buffer: {type: "storage"}  },
                   {binding: 3, visibility: GPUShaderStage.COMPUTE, buffer: {type: "storage"}  } 
                 ]
      });
    */

    const pipeline = device.createComputePipeline({
        layout : device.createPipelineLayout({ bindGroupLayouts: [bindGroupLayout] }),
        compute: { module: shaderModule, entryPoint: entryFunction },
    });

    const bindGroup = device.createBindGroup({ layout: bindGroupLayout, entries: bindings });

    const commandEncoder = device.createCommandEncoder();
    const passEncoder = commandEncoder.beginComputePass();
    passEncoder.setPipeline(pipeline);
    passEncoder.setBindGroup(0, bindGroup);
    passEncoder.dispatchWorkgroups(workGroupCount);
    passEncoder.end();

    const commandBuffer = commandEncoder.finish();
    device.queue.submit([commandBuffer]);
    //await device.queue.onSubmittedWorkDone();

    // Return pipeline/group
    return { pipeline: pipeline, bindGroup: bindGroup, workGroupCount: workGroupCount };
}

function createBuffer( data, usage ) {
    const buffer = device.createBuffer({
        size: data.byteLength,
        usage: usage | GPUBufferUsage.COPY_DST,
        mappedAtCreation: true,
    });
    new data.constructor(buffer.getMappedRange()).set(data);
    buffer.unmap();
    return buffer;
}


const layersBuffer   = createBuffer( new Uint32Array( layers ), GPUBufferUsage.STORAGE );
const inputsBuffer   = createBuffer( new Float32Array( layers[0] ), GPUBufferUsage.STORAGE );
const resultsBuffer  = createBuffer( new Float32Array( layers[layers.length-1] ), GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC );
const expectedBuffer = createBuffer( new Float32Array( layers[layers.length-1] ), GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC );
const weightsBuffer  = createBuffer( new Float32Array( weights.flat(4) ), GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC );
const biasesBuffer   = createBuffer( new Float32Array( biases.flat(4) ), GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC );
const outputsBuffer  = createBuffer( new Float32Array( loutputs.flat(4) ), GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC );
const errorsBuffer   = createBuffer( new Float32Array( errors.flat(4) ), GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC );


//---------------------------------------------------------------------------



// Shaders
const forwardShaderCode = `
const MAX_NEURONS_PER_LAYER = 
${maxLayerSize};
const NUM_LAYERS            = 
${NUM_LAYERS};

@group(0) @binding(0) var<storage, read_write> layers       : array<u32>;
@group(0) @binding(1) var<storage, read_write> weights      : array<f32>;
@group(0) @binding(2) var<storage, read_write> biases       : array<f32>;
@group(0) @binding(3) var<storage, read_write> inputs       : array<f32>;
@group(0) @binding(4) var<storage, read_write> results      : array<f32>;
@group(0) @binding(5) var<storage, read_write> outputs      : array<f32>;

// Activation function (Sigmoid)
fn sigmoid(x:f32) -> f32 {
  return 1.0 / (1.0 + exp(-x));
}

fn getBias(layer: u32, neuron: u32) -> f32 {
    return biases[ layer * MAX_NEURONS_PER_LAYER + neuron ];
}

fn setOutput(layer: u32, neuron: u32, value: f32) {
    outputs[ layer * MAX_NEURONS_PER_LAYER + neuron ] = value;
}

fn getOutput(layer: u32, neuron: u32) -> f32 {
    return outputs[ layer * MAX_NEURONS_PER_LAYER + neuron ];
}

fn getWeight(layer: u32, fromNeuron: u32, toNeuron: u32 ) -> f32 {
    return weights[ layer * MAX_NEURONS_PER_LAYER * MAX_NEURONS_PER_LAYER
                   + fromNeuron * MAX_NEURONS_PER_LAYER
                   + toNeuron ];
}


@compute @workgroup_size( 1 ) 
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    // Single-threaded version - basic compute testing
    
    for (var i:u32=0; i<NUM_LAYERS; i++)
    {
          if ( i==0 )
        {
              for (var k:u32=0; k< arrayLength(&inputs); k++)
            {
                setOutput(0, k, inputs[ k ] );
            }
        }
        else
        {
              for (var k:u32=0; k<layers[i]; k++)
            {
                  var sum = 0.0;
                  for (var b:u32=0; b<layers[i-1]; b++)
                {
                      sum += getOutput( i-1, b ) * getWeight( i-1, b, k );
                }
                setOutput( i, k,  sigmoid( sum + getBias( i, k ) ) );
            }
        }
    }
    
    for (var k:u32=0; k< arrayLength(&results); k++)
    {
        results[k] = getOutput( arrayLength(&layers)-1 , k );
    }
}
`;


async function activateGPU( inputs )
{
    device.queue.writeBuffer( inputsBuffer, 0, new Float32Array( inputs ) );
    //await device.queue.onSubmittedWorkDone();

    await runComputePipeline( forwardShaderCode, [
              { binding: 0, resource: { buffer: layersBuffer  } },
              { binding: 1, resource: { buffer: weightsBuffer } },
              { binding: 2, resource: { buffer: biasesBuffer  } },
              { binding: 3, resource: { buffer: inputsBuffer  } },
              { binding: 4, resource: { buffer: resultsBuffer } },
              { binding: 5, resource: { buffer: outputsBuffer } }
          ], 1 );

    // Retrieve results
    const readBuffer = device.createBuffer({ size: resultsBuffer.size, usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ });

    const commandEncoder = device.createCommandEncoder();
    commandEncoder.copyBufferToBuffer( resultsBuffer, 0, readBuffer, 0, resultsBuffer.size );
    device.queue.submit([commandEncoder.finish()]);

    await readBuffer.mapAsync(GPUMapMode.READ);
    const results = Array.from( new Float32Array(readBuffer.getMappedRange()) );
    readBuffer.unmap();
    return results;
}


//---------------------------------------------------------------------------


// Shaders
const backwardShaderCode = `
const MAX_NEURONS_PER_LAYER = 
${maxLayerSize};
const NUM_LAYERS            = 
${NUM_LAYERS};
const LEARNING_RATE:f32     = 
${LEARNING_RATE};


@group(0) @binding(0) var<storage, read_write> layers       : array<u32>;
@group(0) @binding(1) var<storage, read_write> weights      : array<f32>;
@group(0) @binding(2) var<storage, read_write> biases       : array<f32>;
@group(0) @binding(3) var<storage, read_write> inputs       : array<f32>;
@group(0) @binding(4) var<storage, read_write> outputs      : array<f32>;
@group(0) @binding(5) var<storage, read_write> expected     : array<f32>;
@group(0) @binding(6) var<storage, read_write> errors       : array<f32>;

// Derivative of sigmoid function
fn sigmoidDerivative(x:f32) -> f32 {
  return x * (1 - x);
}

fn getBias(layer: u32, neuron: u32) -> f32 {
    return biases[ layer * MAX_NEURONS_PER_LAYER + neuron ];
}

fn setBias(layer: u32, neuron: u32, value:f32 ) {
    biases[ layer * MAX_NEURONS_PER_LAYER + neuron ] = value;
}

fn setOutput(layer: u32, neuron: u32, value: f32) {
    outputs[ layer * MAX_NEURONS_PER_LAYER + neuron ] = value;
}

fn getOutput(layer: u32, neuron: u32) -> f32 {
    return outputs[ layer * MAX_NEURONS_PER_LAYER + neuron ];
}

fn getWeight(layer: u32, fromNeuron: u32, toNeuron: u32 ) -> f32 {
    return weights[ layer * MAX_NEURONS_PER_LAYER * MAX_NEURONS_PER_LAYER
                   + fromNeuron * MAX_NEURONS_PER_LAYER
                   + toNeuron ];
}

fn setWeight(layer: u32, fromNeuron: u32, toNeuron: u32, value:f32 ) {
    weights[ layer * MAX_NEURONS_PER_LAYER * MAX_NEURONS_PER_LAYER
                   + fromNeuron * MAX_NEURONS_PER_LAYER
                   + toNeuron ] = value;
}

fn setError( layer: u32, neuron: u32, value: f32 ){
    errors[ layer * MAX_NEURONS_PER_LAYER + neuron ] = value;
}

fn getError( layer: u32, neuron: u32 ) -> f32 {
    return errors[ layer * MAX_NEURONS_PER_LAYER + neuron ];
}


@compute @workgroup_size( 1 )
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) 
{
    // Single-threaded - test compute version
    
    for (var i:u32=NUM_LAYERS-1; i>0; i--)
    {
          for (var k:u32=0; k<layers[i]; k++)
        {
              if ( i==NUM_LAYERS-1 )
            {
                  let error = ( expected[k] - getOutput( i, k ) ) * sigmoidDerivative( getOutput( i,k ) );
                setError( i, k, error );
            }
            else
            {
                  setError( i, k, 0.0 ); 
                  for (var g:u32=0; g<layers[i+1]; g++)
                {
                      let error = getError( i+1, g ) * getWeight( i,k,g ) * sigmoidDerivative( getOutput(i,k) );
                    setError( i, k, error );
                }
            }
        }
    }

      
    for (var i:u32=0; i<NUM_LAYERS; i++)
    {
          for (var k:u32=0; k<layers[i]; k++)
        {
            if ( i < NUM_LAYERS-1 ) {
              for (var g:u32=0; g<layers[i+1]; g++)
            {
                  var weight = getWeight( i, k, g );
                  weight += LEARNING_RATE * getOutput(i,k) * getError(i+1,g);
                  setWeight( i, k, g, weight ); 
            } }
          
            var bias = getBias( i,k );
            bias += LEARNING_RATE * getError( i, k );
            setBias( i, k, bias );
        }
    }
}    
`;


async function propagateGPU( expected )
{
    device.queue.writeBuffer( expectedBuffer, 0, new Float32Array( expected ) );
    //await device.queue.onSubmittedWorkDone();

    await runComputePipeline( backwardShaderCode, [
          { binding: 0, resource: { buffer: layersBuffer   } },
          { binding: 1, resource: { buffer: weightsBuffer  } },
          { binding: 2, resource: { buffer: biasesBuffer   } },
          { binding: 3, resource: { buffer: inputsBuffer   } },
          { binding: 4, resource: { buffer: outputsBuffer  } },
          { binding: 5, resource: { buffer: expectedBuffer } },
          { binding: 6, resource: { buffer: errorsBuffer   } },
      ], 1 );
}


//---------------------------------------------------------------------------

console.log(`
GPU VERSION - MATCH CPU 
`);

for (let i = 0; i < xordataset.length; i++)
{
    const result = await activateGPU( xordataset[i].inputs );

    console.log(`for input ${xordataset[i].inputs} expected ${xordataset[i].outputs} predicted ${result[0].toFixed(4)} which is ${Math.round(result[0]) === xordataset[i].outputs[0] ? "correct" : "incorrect"}`);
}


//---------------------------------------------------------------------------

console.log(`
RESET - TRAIN WITH GPU VERSION
`);

initializeWeightsandBiases();

device.queue.writeBuffer( weightsBuffer, 0, new Float32Array( weights.flat(4) ) );
device.queue.writeBuffer( biasesBuffer,  0, new Float32Array( biases.flat(4) ) );
//await device.queue.onSubmittedWorkDone();


for (let epoch = 0; epoch <= 1000; epoch++) {
    let indexes = Array.from(Array( xordataset.length ).keys());
    indexes.sort(() => Math.random() - 0.5);
    for (let j of indexes) {
        await activateGPU( xordataset[j].inputs );
        await propagateGPU( xordataset[j].outputs );
    }

    if (epoch % 100 === 0) {
        let cost = 0;
        for (let j = 0; j < xordataset.length; j++) {
            let o = await activateGPU( xordataset[j].inputs );
            for (let b=0; b<xordataset[j].outputs.length; b++)
            {
                cost += Math.pow( xordataset[j].outputs[b] - o[b], 2 );
            }
        }
        cost /= 4;
        console.log(`epoch ${epoch} mean squared error: ${cost}`);
    }
}


for (let i = 0; i < xordataset.length; i++)
{
    const result = await activateGPU( xordataset[i].inputs );

    console.log(`for input ${xordataset[i].inputs} expected ${xordataset[i].outputs} predicted ${result[0].toFixed(4)} which is ${Math.round(result[0]) === xordataset[i].outputs[0] ? "correct" : "incorrect"}`);
}

//---------------------------------------------------------------------------


async function dumpBuffer( buff, str='data:' )
{
  /*
  Note: - make sure you add the 'GPUBufferUsage.COPY_SRC' flag to the buffer creation (copy the buffer data back)
  */
  // Retrieve results
  const readBuffer = device.createBuffer({ size: buff.size, usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ });

  const commandEncoder = device.createCommandEncoder();
  commandEncoder.copyBufferToBuffer( buff, 0, readBuffer, 0, buff.size );
  device.queue.submit([commandEncoder.finish()]);

  await readBuffer.mapAsync(GPUMapMode.READ);
  const results = Array.from( new Float32Array(readBuffer.getMappedRange()) );
  readBuffer.unmap();

  console.log( str, results );
}


Resources and Links


• WebGPU Lab XOR CPU and GPU Version [LINK]











 
Copyright (c) 2002-2025 xbdev.net - All rights reserved.
Designated articles, tutorials and software are the property of their respective owners.