I was discussing ASCII art with a group of friends recently, and was reminded of towel.blinkenlights.nl. For the uninitiated, this is an old telnet server that, upon connection, would play Star Wars Episode IV in its entirety in ASCII. With this as inspiration, I looked into video players that support ASCII rendering, such as mplayer; however, I quickly discovered that these players have no support for subtitles or audio in ASCII mode. As such, I decided to take it upon myself to add subtitle, and possibly in future audio, support to an ASCII player of my own.
- FFmpeg and its development libraries
- ASCIIart by pixlab.io
This project began with some research into languages and libraries for the tasks ahead of me. First, I had to determine what I would write ASCIIPlay in; this decision wound up being driven mostly by my wanting to practice and learn new things in C. In many ways, choosing C was a mistake due to poor library support and API documentation, but I persevered and wound up learning a lot along the way.
Of course, nothing ever works out as easily as it could, and this project was no exception. This next stage of development was one of the worst, and showed me that not all open source projects are created equal. I had to find an ASCII renderer I could build my project on; at first this seemed a straightforward problem, but it did not stay that way. I had originally intended to use libcaca for ASCII rendering and had downloaded its source code for this purpose. However, upon installing the requisite libraries, it became apparent that something was wrong. I could not get the example program to run correctly when compiled from source, so, thinking that perhaps I had pulled too recent a version, I tried the version packaged for apt, but hit the same issue. What followed were a few days of debugging and looking through barely existent documentation until I gave up and moved on to another library, one with better documentation that was more actively maintained.
To this end I came across ASCIIart, a simple library by PixLab: an ASCII renderer that uses decision trees to quickly and efficiently generate representations of bitmap images. Unfortunately, this library is not free to use, and requires a one-time purchase of their trained model for it to function; however, at this point I wanted to make some progress and decided to purchase access for the development of a proof of concept. This library only supports grayscale images, so in future I intend to build my own ASCII renderer based on the same principles, adding colour support and removing the licensing requirement.
With a renderer found, and one that worked to boot, I was ready to tie the frame fetcher into a simple rendering loop. First, I had to convert the output data into a representation that the ASCII renderer could interpret; the data coming from FFmpeg was in PPM format, with colour data stored as three bytes per pixel and all pixels written sequentially in one long byte stream. By parsing the video file's codec info, I was able to determine the playback resolution, which, combined with the number of colour channels, allowed me to convert the incoming colour sequence to an equivalent grayscale sequence by averaging the three colour values of each pixel. This resultant grayscale image array was then fed into the ASCII renderer and displayed with simple printf calls. With this step complete it seemed that I was almost finished; however, this was not the case in the slightest.
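The averaging step can be sketched as follows; the function name is my own, not ASCIIPlay's, and this assumes the packed RGB layout of a raw PPM body as described above:

```c
#include <stdint.h>
#include <stdlib.h>

/* Convert a packed RGB byte stream (3 bytes per pixel, as in a raw
 * PPM body) to grayscale by averaging the three channel values.
 * Returns a newly allocated buffer of width*height bytes, or NULL. */
uint8_t *rgb_to_gray(const uint8_t *rgb, int width, int height)
{
    uint8_t *gray = malloc((size_t)width * height);
    if (!gray)
        return NULL;
    for (int i = 0; i < width * height; i++) {
        int r = rgb[3 * i];
        int g = rgb[3 * i + 1];
        int b = rgb[3 * i + 2];
        gray[i] = (uint8_t)((r + g + b) / 3);  /* simple mean of channels */
    }
    return gray;
}
```

A perceptually weighted conversion (favouring green) would look better, but the plain mean matches what the post describes and is cheap per frame.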
A primary issue with this implementation was that playback speed was limited only by how fast the renderer could produce frames. This meant the video played far too fast, and it would need to be rate limited to the appropriate playback speed. Originally I had intended to simply leave these two processes in lockstep and use timing to limit the rate of playback, but decided instead to build a frame buffer that would maintain the next 100 frames at all times. This wound up being a very important feature, as will be discussed later, but it caused quite a few headaches while being developed. The first stage was converting the frame fetcher to be threaded, as I needed to decouple its execution from that of the renderer. This was fairly simple, and revealed to me that threading is actually quite an approachable feature of C, although special care is needed so that the two processes do not attempt to modify data at the same time. To this end, I constructed a structure to track the playback of the video, which maintained a pointer to my frame buffer, its length, and a semaphore to prevent simultaneous data access.
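Spawning the fetcher on its own thread is a few lines of pthreads. This is a minimal sketch, not ASCIIPlay's actual code: the entry point here is a stub that stands in for the decode-and-buffer loop, and the function names are hypothetical:

```c
#include <pthread.h>

/* Hypothetical fetcher entry point: in the real player this would
 * decode frames and append them to the shared buffer; here it just
 * records how many frames it would have buffered. */
static void *frame_fetcher(void *arg)
{
    int *decoded = arg;
    *decoded = 100;              /* pretend we filled the 100-frame buffer */
    return NULL;
}

/* Spawn the fetcher thread and wait for it; returns 0 on success.
 * Decoupling decoding this way lets rendering run at its own pace. */
int run_fetcher(int *decoded)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, frame_fetcher, decoded) != 0)
        return -1;
    return pthread_join(tid, NULL);
}
```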
With the groundwork laid, next came the task of building the actual buffer system. Since, unlike C++, C has no built-in libraries for data structures, I had to build my own queue. This was accomplished with a simple singly linked list node structure, with functions to append to the end of the list and pop from the front. I went with a linked list since it wouldn't require maintaining an index into a backing array, although an array would likely have been more performant than walking the entire list each time I added a new frame. In fact, I later implemented exactly such a structure, but at this point I am happy with the current implementation. The accessor functions start by waiting on the buffer access semaphore, so as to prevent accidental double accessing of frame data, and will only proceed when the semaphore is free. Each node holds a pointer to a buffer containing the grayscale representation of the associated frame, and on read this pointer is returned. Following a read, the node is freed and the head is updated.
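A minimal sketch of such a queue, assuming POSIX semaphores and keeping the walk-to-the-end append the post describes (names are illustrative, not ASCIIPlay's):

```c
#include <semaphore.h>
#include <stdint.h>
#include <stdlib.h>

/* Singly linked queue node holding one grayscale frame. */
typedef struct node {
    uint8_t *frame;
    struct node *next;
} node;

typedef struct {
    node *head;
    sem_t lock;                  /* serializes fetcher/renderer access */
} frame_queue;

void queue_init(frame_queue *q)
{
    q->head = NULL;
    sem_init(&q->lock, 0, 1);    /* binary semaphore, initially free */
}

/* Append a frame by walking to the end of the list, as described
 * above. Returns 0, or -1 on allocation failure. */
int queue_push(frame_queue *q, uint8_t *frame)
{
    node *n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->frame = frame;
    n->next = NULL;
    sem_wait(&q->lock);          /* block until the buffer is free */
    if (!q->head) {
        q->head = n;
    } else {
        node *cur = q->head;
        while (cur->next)        /* O(n) walk; a tail pointer would avoid it */
            cur = cur->next;
        cur->next = n;
    }
    sem_post(&q->lock);
    return 0;
}

/* Pop the oldest frame: the node is freed, its pixel buffer is handed
 * to the caller. Returns NULL when the queue is empty. */
uint8_t *queue_pop(frame_queue *q)
{
    sem_wait(&q->lock);
    node *n = q->head;
    uint8_t *frame = NULL;
    if (n) {
        frame = n->frame;
        q->head = n->next;
        free(n);
    }
    sem_post(&q->lock);
    return frame;
}
```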
Once the frame buffer was built, next came the task of a thread dedicated to pulling frames from the buffer and rendering them at the correct time. The initial design for this rendering engine used a simple call to sleep, having the program wait the inverse of the frame rate in seconds before rendering the next frame. With this, a simple version of the program was functional that didn't rely on lockstep execution. It rendered at roughly the correct speed, but a major issue became apparent: it was leaking memory at a rate of about 30MB/s. This was of course unacceptable, and next came a day of debugging to locate the source of the leak. Eventually I found that the frame buffer nodes were actually being passed by value rather than by reference, which led to their old pointer locations being lost. This was a major learning moment, and allowed me to not only practice locating memory leaks with valgrind, but also taught me more about how pointers are handled in C.
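The bug class is easy to reproduce. In this contrived sketch (not the actual ASCIIPlay code), the by-value version updates only a copy of the struct, so the allocation's address is lost on return, which is exactly the kind of leak valgrind points at:

```c
#include <stdlib.h>

typedef struct {
    unsigned char *frame;
} frame_holder;

/* BUG: h is a copy of the caller's struct, so the newly allocated
 * buffer is stored only in the copy; its address is lost on return
 * and the caller's pointer never changes -- one leak per call. */
void load_frame_by_value(frame_holder h, size_t size)
{
    h.frame = malloc(size);
}

/* FIX: pass by reference so the caller's struct is actually updated
 * and the previous buffer can be released first. */
void load_frame_by_ref(frame_holder *h, size_t size)
{
    free(h->frame);
    h->frame = malloc(size);
}
```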
With the memory leak under control, next came improving the playback rate. This was accomplished by removing the sleep call and instead tracking the system time of the last rendered frame against the current time. To minimize variability, time tracking is done before or after any lengthy calls as appropriate, ensuring that should a step take longer than normal, the rendering engine accounts for it. This allowed for more consistent playback, but exposed another issue in the renderer: using printf caused the output to blink rapidly as I wrote to the screen. This was due to having to clear the screen before each frame was printed, causing brief moments with no text on screen, as well as leaving the cursor visible. For this reason, I moved to ncurses as my output renderer, since it lets you write to a buffer and only updates and refreshes the screen as necessary, preventing blinking.
Having implemented the rendering system in more or less its final state, next came the task of finally adding subtitle rendering. This was accomplished in stages, the first being the loading of subtitle data from an SRT file. By matching a regular expression against the timecode syntax of the file, the location of each subtitle can be retrieved, along with its start and end times. This information is stored in an array, allowing for easier lookup when it comes time to render the film. The next stage was the subtitle renderer itself, which began with deciding where on the rendered frame the text should appear. I selected a region of fixed size, centred horizontally and two rows above the bottom of the frame, with a single row/column of buffer around the two internal rows of the box. Then, between the frame being converted to text and being drawn, a subtitle injector is called; when the current playback time falls between the start and end times of the next subtitle, it is drawn into the region. After the end time is exceeded, the subtitle index is incremented and the renderer waits for the next subtitle's start time. This system worked reasonably well, although an issue occurred with the indexer that was temporarily fixed by storing and restoring its value.
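The timecode-matching step can be sketched with POSIX regexes. This is an illustrative parser for one SRT timecode line ("HH:MM:SS,mmm --> HH:MM:SS,mmm"), not ASCIIPlay's actual code:

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Parse one SRT timecode line into start/end times in milliseconds.
 * Returns true only when the line matches the timecode syntax. */
bool parse_srt_timecode(const char *line, long *start_ms, long *end_ms)
{
    regex_t re;
    const char *pat =
        "^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> "
        "[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}";
    if (regcomp(&re, pat, REG_EXTENDED) != 0)
        return false;
    bool match = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    if (!match)
        return false;

    /* The regex guarantees the shape, so sscanf can pull the fields. */
    int h1, m1, s1, ms1, h2, m2, s2, ms2;
    if (sscanf(line, "%d:%d:%d,%d --> %d:%d:%d,%d",
               &h1, &m1, &s1, &ms1, &h2, &m2, &s2, &ms2) != 8)
        return false;
    *start_ms = ((h1 * 60L + m1) * 60 + s1) * 1000 + ms1;
    *end_ms   = ((h2 * 60L + m2) * 60 + s2) * 1000 + ms2;
    return true;
}
```

Scanning the file for lines where this returns true yields the array of start/end times, with the subtitle text on the lines that follow each match.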
A few other additions came after this point, such as frame-rate limiting and debug value printing, but for the most part this was the first usable version of the program. It served as a great learning experience, forcing me to try new things and learn more about using open source API documentation. More features are planned for the future, and I will update as they are introduced, but as it stands the project is in a state I am happy with, and a demonstration can be observed with the following command in a full screen terminal
More info can be found at telnetflix.com