WebGPU Meets LLM: Using ChatGPT-Type Agents 'Free' In Your Browser With WebGPU
To give you an idea of just how complex and big large language models (LLMs) such as ChatGPT are:
• It would take 355 years to train GPT-3 on a single NVIDIA Tesla V100 GPU
• GPT-4 is reportedly based on eight models with 220 billion parameters each - for a total of roughly 1.76 trillion parameters
Typically, these large, complex language models can't run on a CPU (not unless you're willing to wait a very, very long time for the result). You need a GPU - and a powerful, number-crunching one at that.
I know what you're thinking: you could download all of the drivers and tools, install them, and configure your system to run an LLM. But what if I told you that you could run an LLM like ChatGPT in a web browser, locally on your machine, without installing anything? Just open the webpage, it downloads the necessary files, and it runs on your own GPU - no cloud computing involved.
The WebGPU API has made this possible! Amazing, yes?
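Before a page can run a model on your GPU, it has to check whether the browser exposes WebGPU at all (via `navigator.gpu`). A minimal sketch of that check - written as a small function that takes the navigator object as a parameter, so the logic can be exercised outside a browser:

```javascript
// Feature-detect WebGPU. Browsers that support it expose a `gpu`
// property on the navigator object; older browsers do not.
function supportsWebGPU(nav) {
  return Boolean(nav && "gpu" in nav);
}

// In a real page you would call it with the global navigator, e.g.:
//   if (supportsWebGPU(navigator)) {
//     const adapter = await navigator.gpu.requestAdapter();
//     // ... request a device and run compute shaders on it
//   }
```

If the check fails, an in-browser LLM demo typically falls back to an error message rather than trying a (hopelessly slow) CPU path.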
ChatGPT Directly in the Browser with WebGPU?! (No Install, No Server)
There are a few in-browser LLM projects being developed using WebGPU, WebAssembly & Web Workers - if you're interested in trying one, visit this website and give the chat demo a go: Web LLM.
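Web LLM also ships as an npm package with an OpenAI-style chat API, so you can embed a model in your own page. The sketch below is based on the project's published docs; the package, function, and model names are assumptions that may have changed since writing. The request-building part is plain data and works anywhere:

```javascript
// Build an OpenAI-style chat request; this is just a plain object.
function buildChatRequest(userText) {
  return {
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userText },
    ],
  };
}

// In the browser (assumes the @mlc-ai/web-llm package; names may differ):
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine("Llama-3-8B-Instruct-q4f32_1-MLC");
//   const reply = await engine.chat.completions.create(buildChatRequest("hello"));
//   console.log(reply.choices[0].message.content);
```

The first engine creation is where the multi-gigabyte model download happens - subsequent visits load it from the browser cache.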
Be warned: if you've got a slow internet connection, it takes a while to download and initialize. You'll also need an 'okay' graphics card - otherwise it'll run slower than a wet sock (or crash).
To give you an idea, on my i9 machine with an AMD Radeon 5500, it takes about 10-20 seconds per 'word' when writing a reply to a simple question like 'hello'.