Heyplay Blog

Building a multiplayer gaming & gamedev platform with Elixir, Phoenix LiveView, and Rust

The Heyplay stack plays a key role in making something as complex as a multiplayer gaming and game creation platform feasible as an indie effort. Let’s see how it comes together to turn my dream project into reality.

How it started

I could tell multiple stories about how Heyplay was born:

  • Story 1: I was so in love with Elixir that I decided to chase the kind of project that would make the best of its superstar libs (e.g. Phoenix LiveView), of its OTP superpowers (concurrency, scalability, auto-healing) and of its ability to run native code (e.g. via Rustler)
  • Story 2: I was looking for a way to finally bring together my two favorite kinds of programming: web dev (which I’ve been doing professionally for 10 years) and game & 3D dev (which was how my love for coding bloomed and how I earned my master’s in CS)
  • Story 3: I just wanted to implement something fun and/or something that enables others to do something fun in a creative way and/or something that one day I could show to my kids so they also see fun in thinking, inventing their own stuff and in making it happen

Feel free, dear reader, to take your pick as all of the above is true. One could say it’s a crime for a mature engineer to pick a project for a tech stack instead of the reverse (and professional suicide to admit it publicly). Another would complain “where’s your market research” or bluntly “this is not the way to make money”. And some helpful someone else would advise me to just switch to paid gamedev for a year or to leave my kids alone to chase their own dreams.

And they all would be right too. My answer is simple, and no, this time it’s not the battle-proven “because I can”. It’s all about the passion, the secret ingredient for all my engineering work. Heyplay is all about it: I’m really thrilled by how well all the tech used for building it has mixed, and I’m really proud of what it’s doing, especially considering the relatively small effort put into making it.

For me Heyplay = dream project + dream stack. Can it be a dream tool for others too?
(image by pixel2013 from Pixabay)

Today I want to share this passion — for Heyplay as a project, for its stack and for how the two dance together in a perfect product-technology tango.

Problems and solutions

Like every project Heyplay tackles many problems and uses many libs, but I consider some challenges most interesting and some technical solutions so important that IMO it simply wouldn’t be possible without them. Let’s dive into them.

This is an initial technical entry so I’ll mostly go with high-level diagrams instead of code samples this time. It should be more than enough for starters.

Lobby and networking — Elixir + Phoenix Channels

In order to make multiple players participate in the same game, that game must first be instanced and then they must all be gathered in it somehow. Then players have to stay connected and updated in real-time.

While dynamic instance orchestration is obviously a huge challenge (I’ll get back to it in a sec), the problem of gathering players together may sound trivial but in reality it’s full of challenges (which makes it cool). The difficulty mainly comes from the variety of flows that must be handled at the same time, asynchronously and consistently. Here are some examples:

  • join most populated instance nearby
  • join specific instance by link regardless of region
  • create new public or private instance & let player in once ready
  • reconnect player to previously joined instance after connection loss
  • reclaim or stop instance that’s no longer needed
  • let web app visitor know the player count in each game
  • lock instance for joins after player limit is reached

Now mix all these cases together for a permutation of all edge scenarios…

It’d be fairly complex already for a single game kept in one region. It gets even harder when it spans multiple regions. And when you turn that into a platform with multiple games… Can a mere human cover all that, or do we have to wait until ChatGPT finally takes over…?

Well, the short answer is Elixir and the more precise one is OTP. If that’s not clear enough, let me explain further.

Heyplay implements the Lobby GenServer that uses in-memory state for instance statuses and player counts. Using the actor model simplifies reasoning about the concurrency of instances & players that, as mentioned above, evolve asynchronously all the time: instances are spawned, reserved and reclaimed while players connect, leave on purpose or suffer temporary disconnects.

The Lobby also monitors other processes (e.g. to hear first-hand about a player disconnecting or an instance going down) and talks back to them (e.g. about join progress). It’s also supervised so that it’s revived in case things go wrong. And it can always be spawned separately for a specific game or region for further isolation.
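To make the flows above a bit more tangible, here is a minimal sketch of the kind of state such a lobby might keep and of two of the listed flows: "join most populated instance nearby" and "lock instance for joins after player limit is reached". It is written in Rust purely for illustration; in Heyplay this logic lives in an Elixir GenServer, and every name here (`Lobby`, `Instance`, `Status`, `join_most_populated`, `leave`) is hypothetical rather than taken from the actual code base.

```rust
use std::collections::HashMap;

/// Lifecycle of a game instance as seen by the lobby (hypothetical states).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Starting,
    Ready,
}

#[derive(Debug, Clone)]
pub struct Instance {
    pub region: String,
    pub status: Status,
    pub players: usize,
}

/// In-memory lobby state: instance id -> instance info.
#[derive(Debug)]
pub struct Lobby {
    instances: HashMap<u32, Instance>,
    player_limit: usize,
}

impl Lobby {
    pub fn new(player_limit: usize) -> Self {
        Lobby { instances: HashMap::new(), player_limit }
    }

    pub fn add_instance(&mut self, id: u32, region: &str, status: Status) {
        let instance = Instance { region: region.to_string(), status, players: 0 };
        self.instances.insert(id, instance);
    }

    /// Pick the fullest ready instance in `region` that still has room and
    /// reserve a slot in it. Full instances are skipped, which is what
    /// "locks" them for joins. Ties go to the lowest instance id.
    pub fn join_most_populated(&mut self, region: &str) -> Option<u32> {
        let limit = self.player_limit;
        let id = self
            .instances
            .iter()
            .filter(|(_, i)| i.region == region && i.status == Status::Ready && i.players < limit)
            .max_by_key(|(id, i)| (i.players, std::cmp::Reverse(**id)))
            .map(|(id, _)| *id)?;
        self.instances.get_mut(&id)?.players += 1;
        Some(id)
    }

    /// Release a slot, e.g. after the lobby learns that a player went away.
    pub fn leave(&mut self, id: u32) {
        if let Some(instance) = self.instances.get_mut(&id) {
            instance.players = instance.players.saturating_sub(1);
        }
    }

    /// Player count for one instance, e.g. to show it to web app visitors.
    pub fn player_count(&self, id: u32) -> Option<usize> {
        self.instances.get(&id).map(|i| i.players)
    }
}
```

Because all mutations go through `&mut self`, only one change is applied at a time. That is roughly the guarantee a GenServer gives by handling its messages one by one, which is what keeps this kind of bookkeeping consistent without explicit locks.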

When it comes to state, ETS allows it to be shared efficiently (e.g. for player counts), while layering a CRDT on top of it ensures the state is replicated and won’t go away on redeployment or in case of failure.
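The post doesn't say which CRDT Heyplay uses, so the following is only a sketch of one of the simplest kinds, a last-writer-wins map, to show why replicas of such state can be merged in any order and still agree. It is written in Rust for illustration; the `LwwMap` and `Entry` types, the string keys and values, and the (timestamp, node) tag are all assumptions.

```rust
use std::collections::HashMap;

/// A value tagged with a (timestamp, node id) pair; the larger tag wins.
/// Assumes every write carries a unique tag.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub value: String,
    pub timestamp: u64,
    pub node: u32,
}

/// Last-writer-wins map, a simple state-based CRDT.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct LwwMap {
    entries: HashMap<String, Entry>,
}

impl LwwMap {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record a local write made by `node` at logical time `timestamp`.
    pub fn put(&mut self, key: &str, value: &str, timestamp: u64, node: u32) {
        let candidate = Entry { value: value.to_string(), timestamp, node };
        self.apply(key.to_string(), candidate);
    }

    pub fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).map(|e| e.value.as_str())
    }

    /// Merge another replica's state into this one. Merging is commutative,
    /// associative and idempotent, so replicas converge regardless of the
    /// order in which they exchange state.
    pub fn merge(&mut self, other: &LwwMap) {
        for (key, entry) in &other.entries {
            self.apply(key.clone(), entry.clone());
        }
    }

    /// Keep whichever entry has the larger (timestamp, node) tag.
    fn apply(&mut self, key: String, candidate: Entry) {
        let newer = match self.entries.get(&key) {
            Some(current) => {
                (candidate.timestamp, candidate.node) > (current.timestamp, current.node)
            }
            None => true,
        };
        if newer {
            self.entries.insert(key, candidate);
        }
    }
}
```

With a structure like this, a freshly deployed node can start empty, merge the state of any surviving node, and end up with the same view of instances as everyone else.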

In-game networking? Well, it’s what Erlang was built for decades ago and so it’s truly a piece of cake for Phoenix Channels to deliver game updates in a blazing fast, bandwidth-efficient, stable and scalable fashion. As long as TCP networking over WebSocket is acceptable, this is as good as it gets.

I hope to dive deeper into specific solutions, but the bottom line is that OTP and the actor model it implements proved to be a perfect tool for the game lobby, instance orchestration and game networking in Heyplay.

Front-end — Phoenix LiveView + Monaco + Blockly

Heyplay is expected to render & animate games smoothly & efficiently regardless of the device. And since it’s not only about playing games but also creating them, it must provide some form of game editor, perhaps even a full IDE to show errors, stats, versions, docs and so on. Finally, a gaming-oriented platform should look & feel smooth and fancy, definitely not like a regular website that painfully reloads itself on every click.

All of the above could lead to the conclusion that Heyplay is a perfect candidate for a JS-heavy SPA, written in React or another framework. And while I’m sure that wouldn’t be a bad choice for many reasons, it’s not something I could afford to pick given the limited resources. After a few React & misc SPA apps I knew first-hand how much overhead comes from having to design the FE-BE contract, to build APIs, to duplicate models & keep state in sync, to glue together automated browser-based tests…

This applies to any kind of project and even more so to Heyplay. That’s because it must concern itself not “just” (yeah, as if that wouldn’t be enough) with serving pages for the game library, playing the selected game and a game editor with multiple panels, but also with all the platform sidekick features like user auth, leaderboards, chat, help with guides, full editor docs and who knows what else in the future.

This is where Phoenix LiveView comes in. Yes, we won’t get away with game rendered via HTML patched by morphdom (believe me, I tried), but game rendering is just one part of the whole platform — and one that’s particularly well contained. All it takes in Heyplay is a JS hook which does the following:

  • uses Three.js for WebGL hardware accelerated rendering (+ custom shaders that were a perfect excuse to dust off my GLSL skills)
  • relies on binary messaging through Phoenix Channels + JS implementations of Protobuf & Zstd to talk with the server

As for code editing & IDEs, the team behind Livebook has already proven that’s not a problem and LiveView can talk nicely not just to one but even multiple Monaco instances. For Heyplay all it takes is one with a twist of also offering visual programming for beginners via Blockly.

Navigation, game rendering, script & visual programming, platform pages... All nice & smooth.

UX-wise, LiveView makes it possible to cover the entire web app with reload-free routing, dynamic pieces and arbitrary code to handle page transitions, so Heyplay navigation can feel as smooth and modern as it should.

Physics and scripting — Rust and Lua

A multiplayer game that cares about cheat protection must run on the server. Also, regardless of whether the whole app is just for one game or a platform for creating many, you need to express game rules somehow, and the languages typically used for web dev or even game engine dev aren’t a good fit.

Now, decency would dictate at least 30 updates per second — so 30 full game world recalculations and game script reevaluations. Plus of course payload encoding, networking etc.
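To put a number on that: 30 updates per second leaves roughly 33 ms for each update, and all of the work mentioned above has to fit into it. The sketch below computes that budget and shows a common way of driving updates at a fixed rate with a time accumulator. It is a generic fixed-timestep pattern written in Rust for illustration, not Heyplay's actual loop, and `tick_budget` and `FixedStep` are hypothetical names.

```rust
use std::time::Duration;

/// Time budget for one update at the given tick rate.
/// Panics if `ticks_per_second` is zero.
pub fn tick_budget(ticks_per_second: u32) -> Duration {
    Duration::from_secs(1) / ticks_per_second
}

/// Fixed-timestep accumulator: feed it elapsed wall-clock time and it
/// reports how many whole updates are due.
pub struct FixedStep {
    step: Duration,
    accumulated: Duration,
}

impl FixedStep {
    pub fn new(ticks_per_second: u32) -> Self {
        let step = tick_budget(ticks_per_second);
        // A zero-length step would make `advance` loop forever.
        assert!(!step.is_zero(), "tick rate too high");
        FixedStep { step, accumulated: Duration::ZERO }
    }

    /// Add `elapsed` time and return the number of updates to run now.
    /// Leftover time is carried over to the next call.
    pub fn advance(&mut self, elapsed: Duration) -> u32 {
        self.accumulated += elapsed;
        let mut due = 0;
        while self.accumulated >= self.step {
            self.accumulated -= self.step;
            due += 1;
        }
        due
    }
}
```

If a single update regularly takes longer than the budget, the accumulator keeps reporting more than one update per call, which is a cheap signal that the instance is falling behind.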

For starters I did give an Erlang-native Lua implementation a shot, but the performance measurements were as brutal as when trying to render it all using HTML. And any attempt to code basic 2D physics using Elixir’s immutable data (or falling back to ETS) felt wrong, cumbersome and slow.

So, can Elixir — a dynamic scripting language — do that? Well, not on its own. But Elixir can do native code (a.k.a. NIFs) and there’s a new cool kid in town for that: Rust. It plays with Elixir particularly well both thanks to amazing safety guarantees and to Rustler for seamless integration. And there’s a Rust crate that enables full Lua integration — allowing Rust (and so Elixir) to run scripts written in a language that’s a standard in gamedev and that comes scarily close to low-level code in terms of speed.

Building on this idea, Heyplay implements a simple 2D game engine in Rust that holds all the objects in the game, recalculates their movement and determines which ones are in each player’s viewport in order to efficiently assemble and compress minimal patches. Then, through Rust, Heyplay runs the user-provided Lua script, which holds its own runtime state and calls the API provided by the Heyplay engine to manipulate the game world. In the other direction, the engine calls callbacks in the script to handle game events such as game start, player join, input, next frame or collision between defined object groups.
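As a rough illustration of the first half of that description, here is a minimal Rust sketch of a world that moves its objects and then works out which of them fall inside a given player's viewport. It leaves out collisions, patch encoding, compression and the Lua side entirely, and all of the names (`Object`, `Viewport`, `World`, `step`, `visible_ids`) are hypothetical rather than Heyplay's real engine API.

```rust
/// A moving object in the 2D world (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct Object {
    pub id: u32,
    pub x: f32,
    pub y: f32,
    pub vx: f32,
    pub vy: f32,
}

/// Axis-aligned rectangle that a player can currently see.
#[derive(Debug, Clone, Copy)]
pub struct Viewport {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

impl Viewport {
    /// True if the object's position lies inside the rectangle (edges included).
    pub fn contains(&self, o: &Object) -> bool {
        o.x >= self.x
            && o.x <= self.x + self.width
            && o.y >= self.y
            && o.y <= self.y + self.height
    }
}

#[derive(Debug, Default)]
pub struct World {
    pub objects: Vec<Object>,
}

impl World {
    /// Advance every object by `dt` seconds (simple Euler integration).
    pub fn step(&mut self, dt: f32) {
        for o in &mut self.objects {
            o.x += o.vx * dt;
            o.y += o.vy * dt;
        }
    }

    /// Ids of the objects inside a player's viewport, i.e. the only ones
    /// worth including in that player's update.
    pub fn visible_ids(&self, view: &Viewport) -> Vec<u32> {
        self.objects
            .iter()
            .filter(|o| view.contains(o))
            .map(|o| o.id)
            .collect()
    }
}
```

Filtering per viewport is what keeps the patches minimal: a player only ever receives data about objects they could actually see, no matter how large the world is.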

And did I mention that Blockly can turn visual blocks into Lua code out of the box? Oh, yeah, that too. What a lucky coincidence!

Global instancing and ops — Fly.io + Fly Machines

OK, so games are played on specific instances and we have the Lobby GenServer that creates these instances and routes players to them. Everyone can develop games using arbitrary scripts and we have this fancy Rust/Lua engine to run them. We are missing one very important part: a place to run that arbitrary code. We need to meet the following criteria:

  • security: arbitrary code cannot be executed within the main app that contains source code, database access, 3rd party credentials etc., otherwise a malicious user could take it over
  • fault tolerance: arbitrary code may and will consume unpredictable amounts of a machine’s resources or even bring it down, which could starve other parts of the platform or cause downtime
  • scalability: we want to enable running many games with many instances each, so we need dynamic machines for each game even if every game author were well-behaved and an expert in optimization
  • global availability: online gaming is all about network latency so we want instances to be spawned as close to players as possible — otherwise we’d have to rule out users from e.g. US, Europe or China
  • internal connectivity: once physically separated, each instance still needs to communicate with the lobby — in a real-time, scalable and secured way (secured both against public access and internal breach)

Wow, that calls for a serious devops team with months of work around Google/AWS/Azure clouds, Kubernetes clusters, multi-region provisioning, app integration code… Have you looked recently at devops salaries or GKE pricing? Yup, we’re talking about an enormous amount of time and money.

But there’s another way — Fly.io. Without further ado, here’s a truly astonishing list of what it’s doing in case of Heyplay:

By relying on Fly.io, Heyplay may run game instances in locations all over the world.

And guess what: all of the above is provided for free as long as the traffic stays low, all thanks to Fly.io pricing which is very welcoming for small projects and, considering how much it does, still very fair when scaling.


Of course there’s a lot of interesting stuff going on in the app code to get all of that working, including:

  • dynamic cluster formation using libcluster
  • instance provider behaviour with selection of Fly and other options
  • gateway channel & client for secured app-game instance connection
  • asynchronous instance provisioning & prewarming
  • player reconnection upon temporary connection loss
  • instance survival across deployments based on lobby state replication
  • app and game instance code organization within one project

Let’s leave these for another day. For now the point is that I’m really happy with how much Fly.io empowers Elixir developers, and we can only expect more in the future now that Chris McCord has joined them. It truly makes Heyplay fly.

Summary

Every indie project or side-gig can only happen if it’s doable within a very limited timeframe and if it won’t cost too much time and money to maintain without full-time engagement or funding. This is especially true for an app like Heyplay, which not “only” goes through all the amazing challenges of multiplayer gaming but also executes arbitrary code in a scalable way and therefore has serious needs when it comes to ops. At the same time it plays the role of an online IDE and a cross-platform rendering engine, so it’s non-trivial on the front-end too.

The beauty of the Heyplay stack that I’ve presented above is how it maximizes development agility and cost efficiency:

  • Elixir makes it possible to build scalable apps that can handle complex, self-healing asynchronous flows with code understandable by mere humans and without having to call dibs on hundreds of data centers
  • Phoenix offers well-organized, pleasant web app development and top-notch real-time networking performance
  • LiveView makes it possible to build a complex and well-tested front-end in record time while easily tapping into the huge JS/NPM ecosystem when looking for complex libs like the Monaco or Blockly code editors
  • Rustler supercharges it with the native superpowers of Rust and its thriving ecosystem of native libs, including entire runtimes such as the one for Lua that Heyplay is using for game scripting
  • Fly.io provides incredible operational capabilities in an affordable way

That’s it for today! I feel I’ve barely scratched the surface of all the challenges that I’ve faced in Heyplay, so I hope to get into more detail in subsequent entries.

Until then, you can give Heyplay a shot & support it in the following ways: