Creatorland is a California-based startup aiming to become the go-to platform for creatives, creators, and influencers, helping them showcase their work, find brand deals, and connect within their niche. In early 2023, they reached out to me to lead their backend development and systems engineering. Their tech stack was heavily TypeScript-based, with Firebase storing standard user and post data, and Neo4j, a specialized graph database, managing user connections. Everything was hosted on Google Cloud Platform.
When I joined, the backend was essentially one monolithic NestJS application deployed on Cloud Run through a GitHub workflow that built and containerized it. The part-time contractors who originally built the application had left serious tech debt: documentation of endpoints was inconsistent or missing; TypeScript best practices had been consistently ignored, with the "any" keyword used heavily, which often led to runtime errors caused by bad or missing data; and there was no queueing or caching of any kind to shift heavy workloads off the main request-response cycle or reduce the number of queries hitting the databases. Request latency was high, and so were the error rates.
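To illustrate the kind of failure that heavy "any" usage hid, here is a minimal sketch (the function and field names are hypothetical, not from the actual codebase):

```typescript
// With "any", the compiler accepts whatever you write; a missing field
// only surfaces as a runtime TypeError.
function getDisplayName(user: any): string {
  return user.profile.displayName.toUpperCase(); // throws if profile is missing
}

// Typed interfaces surface the same gap at compile time and force
// explicit handling of missing data.
interface Profile {
  displayName?: string;
}
interface User {
  profile?: Profile;
}

function getDisplayNameSafe(user: User): string {
  return user.profile?.displayName?.toUpperCase() ?? "Unknown";
}
```

The typed version costs a few extra lines but turns an entire class of "bad or missing data" errors into compiler errors.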
However, being a pre-seed startup that had just received angel funding meant there wasn't much time to dedicate to paying down this tech debt; feature development had to go hand-in-hand with incremental improvements until MVP status was achieved. A big feature that had to be developed was an analytics service, designed to manage OAuth connections for users across various social media platforms, fetch analytics data via their APIs, and store this data to build out the analytics page for users on Creatorland.
Since this was going to be a big service with separate concerns from the main application, I set up a separate NestJS application for it, deployed independently on Cloud Run. Starting with a clean slate avoided carrying tech debt over to this new part of the application, and instead meant I could establish best practices that our developers could then carry back to the existing monolith. A Redis instance cached the responses for users' analytics data, which shortened response times greatly, and Pub/Sub, GCP's real-time messaging pipeline tool similar to Kafka, was used to orchestrate the fetching of users' analytics data in the background after they established an OAuth connection, since this could take up to 30 seconds due to the high volume of API calls going out to third-party APIs. Using Pub/Sub also made this process fault-tolerant: it would automatically retry upon encountering failures due to rate limits or other issues with the third-party API, which was a common occurrence, especially with the TikTok API.
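The handoff from the OAuth callback to the background worker can be sketched as follows. This is a simplified illustration, not the production code: the topic name, payload shape, and function names are assumptions, and the publisher is abstracted behind an interface so it can be swapped for the real `@google-cloud/pubsub` client.

```typescript
// Hypothetical job payload describing one analytics fetch.
interface FetchJob {
  userId: string;
  platform: "instagram" | "tiktok" | "youtube";
  connectedAt: string; // ISO timestamp of the OAuth connection
}

// Minimal publisher interface matching the shape of a Pub/Sub Topic,
// so the real client (or a fake in tests) can be injected.
interface Publisher {
  publishMessage(msg: { data: Buffer }): Promise<string>;
}

// After the OAuth callback succeeds, enqueue a job instead of fetching
// inline. The request returns immediately; the worker subscribed to the
// topic performs the slow third-party API calls, and Pub/Sub redelivers
// the message if the worker fails (e.g. on a rate limit).
async function enqueueAnalyticsFetch(
  publisher: Publisher,
  userId: string,
  platform: FetchJob["platform"]
): Promise<string> {
  const job: FetchJob = {
    userId,
    platform,
    connectedAt: new Date().toISOString(),
  };
  return publisher.publishMessage({ data: Buffer.from(JSON.stringify(job)) });
}
```

With the real client, `publisher` would be something like `new PubSub().topic("analytics-fetch")` (the topic name is illustrative).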
After the service went live and showed noticeably higher performance and reliability than the other parts of the application, I started applying the same patterns there and enforcing them across the org: decoupling tasks that could be done in the background through queueing tools, and caching common queries and responses with Redis. With the tech debt melting away, I could also dedicate more and more time to new feature development, notably introducing Likes, Notifications, a Discover Page with a dynamic algorithm, and more.
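The caching pattern applied across the org was a standard cache-aside lookup. The sketch below is illustrative rather than taken from the codebase: the key scheme, TTL, and `fetchFromDb` callback are assumptions, and the cache client is a minimal interface matching the `get`/`set` shape of clients like ioredis.

```typescript
// Minimal Redis-like client interface (matches ioredis's get/set shape).
interface CacheClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, mode: "EX", ttlSeconds: number): Promise<unknown>;
}

const TTL_SECONDS = 300; // assumed TTL; in practice, tuned per endpoint

// Cache-aside: try Redis first, fall back to the database on a miss,
// then populate the cache so subsequent requests skip the database.
async function getCachedAnalytics<T>(
  cache: CacheClient,
  userId: string,
  fetchFromDb: (userId: string) => Promise<T>
): Promise<T> {
  const key = `analytics:${userId}`;
  const hit = await cache.get(key);
  if (hit !== null) return JSON.parse(hit) as T; // hit: no database query

  const fresh = await fetchFromDb(userId); // miss: query, then populate
  await cache.set(key, JSON.stringify(fresh), "EX", TTL_SECONDS);
  return fresh;
}
```

The TTL bounds staleness; writes that must be reflected immediately would also need to invalidate or overwrite the key.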
A separate issue was CI/CD and the developer experience. When I joined the company, the backend's build process was based on a convoluted GitHub Action nobody on the team fully understood. There were no preview deployments for open PRs, no build logs when a build failed, and build times were extremely long. To remedy this, I revamped the entire build process with Google Cloud Build and Docker, writing YAML files that took care of the build and deploy steps. A separate YAML file made sure a new backend revision was built and deployed every time a new PR was opened or a new commit was pushed to an open PR, while ensuring old revisions were automatically cleaned up when a PR was closed or merged.
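A preview-deploy pipeline along those lines might look like the following Cloud Build sketch. The service name, region, image path, and the `_PR_NUMBER` substitution are illustrative, not the actual configuration:

```yaml
# cloudbuild.yaml (sketch): build, push, and deploy a tagged preview revision.
steps:
  # Build and push the container image, tagged with the commit SHA.
  - name: gcr.io/cloud-builders/docker
    args: ["build", "-t", "gcr.io/$PROJECT_ID/backend:$SHORT_SHA", "."]
  - name: gcr.io/cloud-builders/docker
    args: ["push", "gcr.io/$PROJECT_ID/backend:$SHORT_SHA"]
  # Deploy a tagged, zero-traffic Cloud Run revision; the tag gives each
  # PR its own stable preview URL without affecting production traffic.
  - name: gcr.io/google.com/cloudsdktool/cloud-sdk
    entrypoint: gcloud
    args:
      - run
      - deploy
      - backend
      - --image=gcr.io/$PROJECT_ID/backend:$SHORT_SHA
      - --region=us-central1
      - --tag=pr-$_PR_NUMBER
      - --no-traffic
```

Cloud Run's revision tags plus `--no-traffic` are what make per-PR preview URLs possible; cleanup on merge amounts to removing the tag and deleting the revision.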
These changes not only enhanced the developer experience but also sped up the time to production for new features and bug fixes. As a startup, we keep iterating, adding new features and expanding the system. Cycle times are very quick, and the experience of architecting and developing backend functionality in such an environment is immensely valuable, especially given the trust and wide responsibility I've been given when it comes to cloud infrastructure design, system architecture, and feature engineering.
