SQLAlchemy reflection and PostgreSQL for data schema flexibility

Felt's mission is to be the best place to make a map on the internet. Maps are a unique medium for storytelling. They are unique because they provide a vast amount of implicit context. Looking at a square section of the world triggers the brain to start searching for places, demographics, economics, history…searching for data. And data is connected: climate change? Give me the temperature delta. Temperature delta? Give me droughts. droughts? Give me wildfires.

But turning the blank canvas of a map into a series of geospatial visualizations is hard today. It requires skilled 'data plumbers'–people who are trained at finding accurate sources, struggling with APIs, and turning the data into something understood at a glance. Even to these professionals, it’s a painful and lengthy process. To achieve our mission, we must offer an alternative way to visualize lots of different types of data and to reduce this painful complexity–not just for the end user, but for our internal team.

We have spent a lot of time prototyping and playing with tools that will provide our team the flexibility, maintainability and scalability to be successful at achieving this mission. Some of the most recent tooling decisions we’ve made may be helpful, and surprising, to any of you skilled 'data plumbers’ out there. For example, it’s fairly standard to choose to use NoSQL, and code that doesn’t make assumptions about the schema when you’re seeking maximum flexibility. Here at Felt, we choose to use PostgreSQL and SQLAlchemy–a relational database and Object-Relational Mapping (ORM) tool. These aren’t often mentioned for their flexibility, but we have found some real advantages to using them.

Here are a couple of ways we have applied SQLAlchemy and PostgreSQL to achieve the flexibility and maintainability we need, and how we use Retool to reduce time spent on building and maintaining a user interface.

SQLAlchemy’s reflect API for maximum schema flexibility

SQLAlchemy is an ORM. Its main usage is defining the mapping between your database tables and your classes, easing the coding of create, read, update and delete interactions. But SQLAlchemy also ships with a reflect API that you can use to get information about schemas that you don’t control. You normally don’t want to do this, because your schema is defined by you and thus known in advance. But for applications where you enable the user to create their own tables, it’s perfect.

For example, our Data Library API allows our team to import a data file, and after some processing on our end, the user will be able to see tiles for their maps based on that data, with extra features, such as centroids (for labeling). To be able to do this, we need to know the schema of the data. We get this from the user’s data table, where they select which columns should be delivered (they are also able to perform transformations on them). We save some metadata in the main database related to the user uploaded data table (meta metadata, in fact, given that we store information regarding columns ;-)). This is a simplified snippet from a migration script when that feature was added:

‍
As portrayed above, reflect tells SQLAlchemy to read the database dictionary and discover the current schema. Once we have the schema, we can get the column definition, generate the initial metadata for the user in our internal format, with our conventions and defaults, and store it with a normal ORM operation on the related entity (Snapshot).

PostgreSQL views for flexible transformations and storage optimization

We import data into PostgreSQL, projecting, filtering, and transforming–without losing data. To achieve this, we use views created from the metadata that we saved from the previous reflect example. This is a quick diagram of what takes place between file importing and MVT delivery (arrows represent dependencies):

From the <p-inline>data_table<p-inline>, the centroids_table is created. This table contains a point for labeling generated with <p-inline>ST_PointOnSurface<p-inline>, the polygon area computed with <p-inline>ST_Area<p-inline>, and other attributes that are useful for styling or display. None of those tables is directly delivered as an MVT, though. A transformed view is created (with the standard <p-inline>CREATE VIEW<p-inline>) by picking only the columns selected by the user, with transformations such as initcase applied. In addition, a centroids view is also created by joining the centroids table with the transformed view. Those two views are merged as layers into the delivered MVTs, either from pg_tileserv with <p-inline>ST_AsMVT<p-inline> or with static generation of MBTiles. This way, storage is optimized: there’s no single duplicated column storage in the schema. In addition, the definition of the transformations is only in one view, easing maintenance, with the possibility of changing them with a single <p-inline>CREATE OR REPLACE VIEW<p-inline> command if needed.

Retool for pain-free building and maintenance of internal tooling

Being a good plumber means hiding your pipes behind a vanity when you’re all done. Maybe I’m stretching the metaphor, but all of this infrastructure needs an approachable gate that enables anyone on our team to achieve their data visualization goals. We’re using Retool, to build a new, unified application that hides the internals of our infrastructure, understands the timing and the internal workflow. You can create a UI by dragging and dropping components and wiring them to your systems. The Retool UI we keep iterating on frees up the rest of the team to focus on sourcing and styling the data, rather than getting lost in the complexity of building and maintaining a tailored solution.

‍

Join Us‍

If you’re a skilled 'data plumber’ and want to solve old problems in new ways, join us at Felt. We are always looking for skilled engineers!

...for applications where you enable the user to create their own tables, SQLAlchemy's reflect API is perfect.

SQLAlchemy’s reflect API for maximum schema flexibility

PostgreSQL views for flexible transformations and storage optimization

Retool for pain-free building and maintenance of internal tooling

Join Us‍

Choosing and Setting Up a Cloud PostGIS Solution for Felt: Navigating the Modern GIS Stack

Upload Anything: how we revolutionized data upload

Visualizing weather data using the Felt API

Felt Style Language: a powerful new mapping grammar

When “one size fits all” fails: PostGIS bounding boxes for maps

Rate limiting algorithms for client-facing web apps