Emote Usage Tracker: Database Setup

Introduction

Emote Usage Tracker is a project I created to help broadcasters on Twitch gain insights into how their emotes are performing, and to increase viewer engagement by publishing rankings on the web.

Among my main goals in developing this project were to increase my facility with C#/.NET, and to gain experience writing SQL using Npgsql directly rather than reaching for Entity Framework to abstract it away.

In a previous post, I described the process of designing the database. In this post, I’d like to explain how I initially set up the database using a .NET console application, how I handled schema migrations, deployment concerns in a production environment, and the challenges I overcame along the way.

Connecting to the Database in .NET

Accessing the database from the .NET project using Npgsql requires a connection string. This is stored as an environment variable in production, and in the user secrets file in development.

The connection string itself is a series of key-value pairs separated by semicolons. At the very least it should define Host, Port, Username, Password, and Database.

In development, the secrets.json file ought to have something like the following:

JSON
{
	"ConnectionStrings:postgresql": "Host=localhost;Port=5432;Username=myusername;Password=mypassword;Database=emote_usage_tracker"
}

In production, the connection string is assigned to an environment variable with the name ConnectionStrings__postgresql. Since this project is deployed to a Docker container network managed by CapRover, the Host is not localhost, but rather the address of the database container in that network; e.g., srv-captain--emote-usage-tracker-database.
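The double underscore in that variable name is meaningful: .NET's environment-variable configuration provider translates __ into the : hierarchy separator, so ConnectionStrings__postgresql resolves to the same configuration key as the ConnectionStrings:postgresql entry in secrets.json. A minimal illustration of the mapping rule:

```csharp
// .NET's environment-variable configuration provider maps "__" in
// variable names to the ":" key delimiter used by IConfiguration.
string envVarName = "ConnectionStrings__postgresql";
string configKey = envVarName.Replace("__", ":");
Console.WriteLine(configKey); // ConnectionStrings:postgresql
```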

Since the connection string is the only configuration this relatively simple console application needs, it’s sufficient to use a ConfigurationBuilder, determine the environment via the DOTNET_ENVIRONMENT environment variable, and conditionally add the source of the connection string.

C#
// Program.cs

using Microsoft.Extensions.Configuration;

var builder = new ConfigurationBuilder();
string? dotnetEnvironment = Environment.GetEnvironmentVariable("DOTNET_ENVIRONMENT");
if (dotnetEnvironment != null && dotnetEnvironment.Equals("PRODUCTION"))
{
    builder.AddEnvironmentVariables();
}
else
{
    builder.AddUserSecrets<Program>();
}
var configuration = builder.Build();
string connectionString = configuration.GetConnectionString("postgresql")
    ?? throw new Exception("Missing ConnectionStrings__postgresql in configuration!");

Setting up the Database with Npgsql

Handling the creation of tables, indexes, triggers, functions, and types is the job of a new static class, DatabaseSetup. Within it, a method takes the connectionString as an argument, and that’s where the necessary commands are executed using Npgsql.

An open connection is needed, and then a command to which this connection is assigned:

C#
using var conn = new NpgsqlConnection(connectionString);
conn.Open();

using var cmd = new NpgsqlCommand();
cmd.Connection = conn;

First, the custom enum type provider_name is created:

C#
cmd.CommandText = @"
	DO $$
	BEGIN
		IF NOT EXISTS (SELECT 1 FROM pg_type WHERE typname = 'provider_name') THEN
			CREATE TYPE provider_name AS ENUM ('Twitch', 'FFZ', 'BTTV', 'SevenTV');
		END IF;
	END $$;
";
cmd.ExecuteNonQuery();

The emotes table is created like so:

C#
cmd.CommandText = @"
	CREATE TABLE IF NOT EXISTS emotes (
		emote_id SERIAL PRIMARY KEY,
		provider_name provider_name NOT NULL,
		provider_emote_id VARCHAR(255) NOT NULL,
		provider_emote_name VARCHAR(255) NOT NULL,
		UNIQUE (provider_name, provider_emote_id)
	);
";
cmd.ExecuteNonQuery();
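One thing worth noting about the custom provider_name enum: Npgsql doesn’t know about it by default. You can register it on the data source (NpgsqlDataSourceBuilder.MapEnum in recent Npgsql versions, with a name translator since these labels aren’t snake_case), or simply pass text parameters and cast them server-side in SQL. A sketch of the cast approach (the parameter usage is illustrative, not the project’s actual code):

```csharp
// Pass the enum value as a text parameter and cast it in SQL, so no
// client-side enum registration is required. Names are illustrative.
string upsertEmoteSql = @"
	INSERT INTO emotes (provider_name, provider_emote_id, provider_emote_name)
	VALUES (@provider::provider_name, @id, @name)
	ON CONFLICT (provider_name, provider_emote_id) DO NOTHING;
";
// cmd.CommandText = upsertEmoteSql;
// cmd.Parameters.AddWithValue("provider", "Twitch");
// cmd.Parameters.AddWithValue("id", "25");
// cmd.Parameters.AddWithValue("name", "Kappa");
// cmd.ExecuteNonQuery();
```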

Creating the table emote_usage and its indexes:

C#
cmd.CommandText = @"
	CREATE TABLE IF NOT EXISTS emote_usage (
		channel_id VARCHAR(255) NOT NULL,
		user_id VARCHAR(255) NOT NULL,
		emote_id INT NOT NULL,
		usage_count INT NOT NULL,
		last_modified TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
		FOREIGN KEY (channel_id) REFERENCES channels(channel_id),
		FOREIGN KEY (user_id) REFERENCES users(user_id),
		FOREIGN KEY (emote_id) REFERENCES emotes(emote_id),
		UNIQUE (channel_id, user_id, emote_id)
	);
";
cmd.ExecuteNonQuery();

cmd.CommandText = @"
	CREATE INDEX IF NOT EXISTS idx_channel_id ON emote_usage(channel_id);
";
cmd.ExecuteNonQuery();

cmd.CommandText = @"
	CREATE INDEX IF NOT EXISTS idx_user_id ON emote_usage(user_id);
";
cmd.ExecuteNonQuery();

cmd.CommandText = @"
	CREATE INDEX IF NOT EXISTS idx_emote_id ON emote_usage(emote_id);
";
cmd.ExecuteNonQuery();

Creating the table table_row_counts, along with the functions and triggers that keep it updated:

C#
cmd.CommandText = @"
	CREATE TABLE IF NOT EXISTS table_row_counts (
		table_name text PRIMARY KEY,
		row_count int NOT NULL
	);
";
cmd.ExecuteNonQuery();

cmd.CommandText = @"
	INSERT INTO table_row_counts (table_name, row_count)
	VALUES
		('users', (SELECT COUNT(*) FROM users)),
		('channels', (SELECT COUNT(*) FROM channels)),
		('emotes', (SELECT COUNT(*) FROM emotes))
	ON CONFLICT (table_name)
	DO UPDATE SET row_count = EXCLUDED.row_count;
";
cmd.ExecuteNonQuery();

cmd.CommandText = @"
	CREATE OR REPLACE FUNCTION update_table_row_counts_users() RETURNS TRIGGER AS $$
	BEGIN
		IF TG_OP = 'INSERT' THEN
			UPDATE table_row_counts SET row_count = row_count + 1 WHERE table_name = 'users';
			RETURN NEW;
		ELSIF TG_OP = 'DELETE' THEN
			UPDATE table_row_counts SET row_count = row_count - 1 WHERE table_name = 'users';
			RETURN OLD;
		END IF;
	END;
	$$ LANGUAGE plpgsql;
";
cmd.ExecuteNonQuery();

// ... with similar functions defined for channels and emotes tables

cmd.CommandText = @"
	CREATE OR REPLACE TRIGGER trigger_table_row_counts_users
	BEFORE INSERT OR DELETE ON users
	FOR EACH ROW EXECUTE PROCEDURE update_table_row_counts_users();
";
cmd.ExecuteNonQuery();

// ... with similar triggers defined for channels and emotes tables

// Finally, close the connection
conn.Close();

Implementing Migrations

Given that I’m the sole developer working on this project, there’s nothing particularly fancy about the way I’m handling migrations. I simply have a Migrations.cs file, and within that define each migration as a separate method, using sequential numbers in the method names.

C#
/// <summary>
/// Performs a migration on the 'channels' table in the PostgreSQL database.
/// Adds new columns to the 'channels' table.
/// </summary>
/// <remarks>
/// This method adds columns channel_display_name, channel_description, and profile_image_url to the 'channels' table.
/// </remarks>
public static void M0002_AddColumnsToChannelTable(string connectionString)
{
	using var conn = new NpgsqlConnection(connectionString);
	conn.Open();

	using var cmd = new NpgsqlCommand();
	cmd.Connection = conn;

	cmd.CommandText = @"
		ALTER TABLE channels
		ADD COLUMN channel_display_name VARCHAR(255),
		ADD COLUMN channel_description TEXT,
		ADD COLUMN profile_image_url VARCHAR(255);
	";

	cmd.ExecuteNonQuery();

	conn.Close();
}

These migrations are then called in order from Program.cs. Once a migration has been applied in production, its call is commented out and it’s marked as done. As I said, nothing fancy, but it gets the job done for now.

Handling Database Access in a Docker Environment

This DatabaseConfig console application, like others in the Emote Usage Tracker project, is deployed as a Docker container to a CapRover instance. If left to its own devices, CapRover will restart the container once the application exits, and will continue to do so unless the instance count is reduced to zero. Re-running the application in that way is certainly not desirable, so there needs to be some way to delay the program’s exit.

In development, this project is run from within Visual Studio, where I can simply wait for keyboard input before exiting. In production, however, there is no such input method, so instead the program is instructed to wait indefinitely to allow for the instance count within CapRover to be set to zero.

C#
if (dotnetEnvironment != null && dotnetEnvironment.Equals("PRODUCTION"))
{
	Console.WriteLine("Program finished. Set instance count back to 0 to avoid re-running.");
	while (true)
	{
		await Task.Delay(TimeSpan.FromHours(1));
	}
}
else
{
	Console.WriteLine("Press any key to exit...");
	Console.ReadKey();
}

Overcoming Challenges with PostgreSQL Functions & Triggers

Not long after I started running the Emote Usage Tracker in production, I noticed that the counts of emotes, channels, and users were wildly inflated. According to the tables themselves, there were 9 channels, 5,177 emotes, and 495 users; according to table_row_counts, there were 54 channels, 109,144 emotes, and 7,340 users!

These counts were being displayed on the front page of the project’s website, so resolving the issue was a top priority.

It turned out that PostgreSQL row-level BEFORE INSERT triggers fire for every row proposed for insertion, even rows that an ON CONFLICT clause resolves into an update rather than an actual insert, and there is no direct way to make a trigger conditional on the outcome of that conflict resolution. Since I was upserting records in bulk when periodically updating channel metadata, emote data from providers, and users when handling chat messages, every upsert of an existing row still incremented the counts.
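For context, the bulk upserts followed this general shape (the users table’s columns are illustrative here; the actual schema was covered in the previous post):

```csharp
// Representative upsert; the BEFORE INSERT counter trigger fired for
// this statement even when the row already existed and was merely
// updated by the ON CONFLICT clause.
string upsertUserSql = @"
	INSERT INTO users (user_id, user_name)
	VALUES (@userId, @userName)
	ON CONFLICT (user_id) DO UPDATE SET user_name = EXCLUDED.user_name;
";
```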

I decided that, since these counts did not need to be up-to-date at all times, the simplest solution was to switch to a pgAgent job that would periodically count the table rows and update the values in table_row_counts.

First, I applied a migration to remove the relevant functions and triggers, as well as update table_row_counts with the correct values:

C#
public static void M0004_DropTableRowCountsRelatedTriggersAndFunctionsThenUpdate(string connectionString)
{
	using var conn = new NpgsqlConnection(connectionString);
	conn.Open();

	using var cmd = new NpgsqlCommand();
	cmd.Connection = conn;

	cmd.CommandText = @"
		DROP TRIGGER IF EXISTS trigger_table_row_counts_users ON users;
		DROP TRIGGER IF EXISTS trigger_table_row_counts_channels ON channels;
		DROP TRIGGER IF EXISTS trigger_table_row_counts_emotes ON emotes;

		DROP FUNCTION IF EXISTS update_table_row_counts_users();
		DROP FUNCTION IF EXISTS update_table_row_counts_channels();
		DROP FUNCTION IF EXISTS update_table_row_counts_emotes();

		UPDATE table_row_counts SET row_count = (SELECT COUNT(*) FROM users) WHERE table_name = 'users';
		UPDATE table_row_counts SET row_count = (SELECT COUNT(*) FROM channels) WHERE table_name = 'channels';
		UPDATE table_row_counts SET row_count = (SELECT COUNT(*) FROM emotes) WHERE table_name = 'emotes';
	";

	cmd.ExecuteNonQuery();

	conn.Close();
}

Then, I scheduled the following to run hourly using pgAgent:

SQL
UPDATE table_row_counts SET row_count = (SELECT COUNT(*) FROM users) WHERE table_name = 'users';
UPDATE table_row_counts SET row_count = (SELECT COUNT(*) FROM channels) WHERE table_name = 'channels';
UPDATE table_row_counts SET row_count = (SELECT COUNT(*) FROM emotes) WHERE table_name = 'emotes';

Reflections and Lessons Learned

I’m satisfied with how this has performed in production. As in so many things, the KISS principle applies. It didn’t seem desirable to prematurely add complexity when one of my overall goals with this project was to interact with the PostgreSQL database with as few external dependencies as possible.

Having the initial database setup and migrations handled in this manner made it relatively easy to create the application for setting up the testing database used by tests of the data layer.

I’d like to explore other approaches to handling migrations in future projects, but at least for now I’m content with how this project manages it.

Conclusion

Using Npgsql and writing the SQL myself has been a fantastic learning experience. While this application may be the least refined among those in the Emote Usage Tracker project, I can’t deny that it’s performed effectively, and writing it has felt like an appropriate initial step into database interactions with .NET.