Unist TV
Under Night In-Birth EXE:Late[st] (most commonly referred to as Unist) is a fighting game I used to play back in early 2019. In any fighting game, competitive players need replays of high-level players to analyze and improve.
The best source of these replays for Unist was a bunch of Japanese YouTube channels that had VODs of streams from specific arcades where the best players played.
Although it was a very valuable resource, it was hard to exploit, as it consisted of hours-long videos that didn’t necessarily feature the character you played. To address this issue, I created a program that could analyze these videos and extract the timestamps of matches as well as the characters played.
I then created a basic frontend to show that data. Although it somewhat worked, I lost interest in the game before reaching the point where the program would run on its own, so I dropped it after a while.
Still, building it was an interesting challenge that I’m gonna describe now.
Video analysis
The code for this part can be found here.
The first step to process videos was to actually download them. This step was pretty simple, as youtube-dl is very powerful and can be used from Python. To reduce download times and disk space, I settled on downloading the videos in 480p MP4 format, which was good enough for the processing that came later.
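As a rough idea of what that download step can look like, here is a minimal sketch using youtube-dl's Python API. The format selector, the output folder and the function names are my own assumptions for illustration, not necessarily what the original script used. Merging separate video and audio streams also requires ffmpeg to be installed.

```python
import os


def build_download_options(output_dir="videos"):
    """Build youtube-dl options asking for an MP4 no taller than 480p.

    The selector and the output template are assumptions for this sketch.
    """
    return {
        # Prefer separate MP4 video + M4A audio capped at 480p,
        # otherwise fall back to a single pre-merged MP4 file.
        "format": "bestvideo[ext=mp4][height<=480]+bestaudio[ext=m4a]"
                  "/best[ext=mp4][height<=480]",
        # Name files after the video id so they can be mapped back to a URL.
        "outtmpl": os.path.join(output_dir, "%(id)s.%(ext)s"),
        "quiet": True,
    }


def download_video(url, output_dir="videos"):
    """Download one video with the options above."""
    # Imported here so the options helper works without youtube-dl installed.
    import youtube_dl

    with youtube_dl.YoutubeDL(build_download_options(output_dir)) as downloader:
        downloader.download([url])
```

Calling `download_video` with the URL of a VOD would then leave a file named after the video id in the `videos` folder.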
The actual processing of the videos was more complex. I mainly used OpenCV to look into the frames, searching for specific markers. I had three things to identify: the timestamp of the start of a match (not a round), whether both players were human (if nobody challenges you on an arcade cab, you can play against CPUs) and the characters played.
Luckily, Unist has some animations and a UI that made the first two points easy to look for. At the start of a round, a thin white horizontal line is displayed across the very middle of the screen. I only had to look for that line every few frames, and skip a chunk of frames once I found it (a match has a minimum duration, based on the amount of damage you can do, the transitions between rounds and other factors). To be sure it was the first round, I also checked the color of the round indicators whenever I found that line.
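The line check and the frame skipping described above can be sketched as follows. This is only an approximation under my own assumptions: the brightness threshold, the coverage ratio, the band of rows around the middle, the checking interval and the skip duration are all made-up values, and the round indicator check is left out.

```python
import numpy as np


def has_round_start_line(frame, min_brightness=230, min_coverage=0.9, band=2):
    """Return True if a near-white horizontal line crosses the middle of a BGR frame."""
    center = frame.shape[0] // 2
    rows = frame[center - band:center + band + 1]      # a few rows around the middle
    white = np.all(rows >= min_brightness, axis=2)     # True where all channels are bright
    coverage = white.mean(axis=1)                      # share of white pixels per row
    return bool(coverage.max() >= min_coverage)


def find_line_timestamps(video_path, check_every=5, skip_seconds=20.0):
    """Return the timestamps (in seconds) where the white line is detected."""
    # Imported here so the detection helper works without OpenCV installed.
    import cv2

    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0        # fall back if the FPS is unknown
    skip_frames = int(skip_seconds * fps)
    timestamps, index, next_check = [], 0, 0
    while capture.grab():                              # advance without decoding the frame
        if index >= next_check:
            ok, frame = capture.retrieve()             # decode only the frames we inspect
            if ok and has_round_start_line(frame):
                timestamps.append(index / fps)
                next_check = index + skip_frames       # a match cannot end this quickly
            else:
                next_check = index + check_every
        index += 1
    capture.release()
    return timestamps
```

Note that a fully white frame would also pass this check, which is one reason a second signal such as the round indicators is useful.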
Figuring out whether players were human was also easy: a specific text is displayed below the portrait of human players, so I only had to check whether that text was there.
The most complicated part was finding out which characters the players used. There are a few ways to achieve this, and I believe some similar programs rely on matching the character portraits in the HUD against a set of images. I considered that strategy and I believe it could have worked, but I ended up with another method: using Optical Character Recognition (OCR) to read the characters’ names from the HUD.
To achieve this, I used Tesseract OCR and its Python bindings. I just had to point it at the part of the screen where the character names are, and voilà!
Except not really. Doing that gave me strings that weren’t exactly the character names (and were sometimes very wrong), probably because my input wasn’t very clean and the HUD is a bit hard to read. To work around that, I used a Python function (difflib.get_close_matches) that gave me the best match for the string found by the OCR among the names of the characters. This worked in most cases, although there was one character where it just wouldn’t work (damn you Orie).
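Here is a minimal sketch of that OCR plus fuzzy matching combination. The name list is only a partial roster for illustration, and the crop box, the `--psm 7` option (single line of text), the cutoff and the function names are assumptions of mine rather than the values from the original code.

```python
import difflib

# A few names for illustration; the real list would cover the whole roster.
CHARACTER_NAMES = ["Hyde", "Linne", "Waldstein", "Carmine", "Orie", "Gordeau"]


def read_name(frame, box):
    """Run Tesseract on the part of the frame holding a character name.

    `box` is a hypothetical (x, y, width, height) region in pixels.
    """
    # Imported here so the matching helper works without these packages.
    import cv2
    import pytesseract

    x, y, w, h = box
    region = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(region, config="--psm 7")


def match_character(ocr_text, names=CHARACTER_NAMES, cutoff=0.6):
    """Return the known name closest to the OCR output, or None if nothing is close."""
    lookup = {name.lower(): name for name in names}    # compare case-insensitively
    matches = difflib.get_close_matches(
        ocr_text.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None
```

A `None` result is the case where a manual fallback is needed, since no name is similar enough to the OCR output.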
As the program was launched manually, I also added a popup, shown whenever the program couldn’t figure out a character, where I could enter it myself.
In the end, the result was written to a .sql file containing an insert query with one row of values for every match found (the timestamped URL, the date of the match and the characters).
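To give an idea of what generating that file could look like, here is a small sketch. The table name, the column names and the dictionary keys are hypothetical, as the original schema isn't described here; the `&t=…s` suffix is the usual way to link to a specific time in a YouTube video.

```python
from datetime import date


def sql_literal(value):
    """Quote a value as a SQL string literal, doubling single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def timestamped_url(video_id, seconds):
    """Build a YouTube URL that starts playback at the given second."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"


def build_insert(matches, table="matches"):
    """Build one INSERT statement with a row of values per match found."""
    if not matches:
        return ""  # nothing found, so there is no query to write
    rows = []
    for match in matches:
        values = (match["url"], match["played_on"].isoformat(),
                  match["p1"], match["p2"])
        rows.append("(" + ", ".join(sql_literal(v) for v in values) + ")")
    columns = "(url, played_on, p1_character, p2_character)"  # hypothetical columns
    return f"INSERT INTO {table} {columns} VALUES\n  " + ",\n  ".join(rows) + ";\n"


def write_sql_file(matches, path="matches.sql"):
    """Write the generated query to a .sql file."""
    with open(path, "w", encoding="utf-8") as sql_file:
        sql_file.write(build_insert(matches))
```

The resulting file can then be loaded with any PostgreSQL client, for instance `psql -f matches.sql`.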
The actual site
The code for this part can be found here.
Now that I had this data, I had to store it and show it. To do that, I used PostgreSQL, loading the .sql file I wrote previously into a database, and PostgREST to expose it as a web API. For those who are not familiar with it, PostgREST is a web server that turns a PostgreSQL database into a full REST API. Security is handled by the database’s own permissions, you can filter and sort the data SQL-style directly in the request, and it even has pagination out of the box. Although I’m not sure how it would scale in a big, serious commercial application, it is incredibly easy to use for small projects like this one, and it allowed me to have a running backend without writing any kind of code.
The frontend was written in Preact, which is a very small (3 kB) JS framework with an API that’s very similar to React (but with a smaller scope). As I was already familiar with React, I could write a basic frontend pretty quickly, using fetch requests to get the data from PostgREST, with pagination and filters in the request to select matches by character.
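The kind of request sent to PostgREST can be sketched like this, in Python rather than JavaScript to keep it short. The base URL, the `matches` table and its column names are hypothetical; the `eq.` filters, the `or` parameter, `order`, `limit` and `offset` are standard PostgREST query parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_matches_url(base_url, character=None, page=0, page_size=20):
    """Build a PostgREST URL listing matches, newest first, one page at a time."""
    params = {
        "order": "played_on.desc",
        "limit": page_size,
        "offset": page * page_size,
    }
    if character:
        # Keep matches where either player uses the requested character.
        params["or"] = f"(p1_character.eq.{character},p2_character.eq.{character})"
    return f"{base_url}/matches?{urlencode(params)}"


def fetch_matches(base_url, character=None, page=0):
    """Fetch one page of matches as a list of dictionaries."""
    with urlopen(build_matches_url(base_url, character, page)) as response:
        return json.load(response)
```

For example, `build_matches_url("http://localhost:3000", character="Orie", page=2)` asks for the third page of matches featuring Orie. No backend code is involved, as PostgREST translates these parameters into the corresponding SQL query.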
This website was hosted on a small free GCP VPS, with Nginx in front to serve the frontend and act as a reverse proxy.