Connecting the universe of external data to your world
The biggest barrier to data use is not a talent shortage – the data science community is stronger, smarter, and bigger than ever. The bottleneck for most companies is that finding, connecting to, and preparing data from outside sources has a massive resource cost.
We've created a marketplace full of data from thousands of sources worldwide to simplify the most tedious parts of data science. We've also made our platform compatible with dozens of data types, all normalized to a common format so it's easier to blend and enrich data.
Our data catalog solution puts every data point you need on a level playing field. That way, your organization can find, share, govern, and use quality data from any source with less friction.
Our data ingest pipeline has been trained on around 300,000 datasets of almost every size, format, and state of cleanliness you can imagine. The result is an ETL that's incredibly performant, even at large scale; capable of handling data in a wide variety of formats; intelligent and automated, to save you time; and fully configurable for handling messy, unruly data. Our pipeline doesn't depend on an entire team at your company, either: once your catalog is deployed, you're ready to start flowing data through it. For static data, a drag-and-drop interface makes importing a two-click operation.
When connecting to cloud warehouses, we make it easy to connect to the source and set schedules to gather data at any interval – daily, weekly, or even on a custom cron schedule.
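To make the interval options above concrete, here is a minimal sketch of what named refresh schedules might look like expressed as standard five-field cron strings (minute, hour, day of month, month, day of week). The schedule names and the helper function are invented for illustration; the exact scheduling syntax a given catalog accepts may differ.

```python
# Illustrative refresh schedules expressed as standard cron expressions.
# Field order: minute hour day-of-month month day-of-week.
SCHEDULES = {
    "daily_2am": "0 2 * * *",          # every day at 02:00
    "weekly_monday": "0 6 * * 1",      # Mondays at 06:00
    "quarter_hourly": "*/15 * * * *",  # every 15 minutes (a custom interval)
}

def cron_for(name: str) -> str:
    """Look up the cron expression for a named schedule (hypothetical helper)."""
    return SCHEDULES[name]

print(cron_for("weekly_monday"))  # prints "0 6 * * 1"
```

The same five-field format covers everything from daily pulls to arbitrary custom intervals, which is why cron is the usual escape hatch when the preset daily/weekly options aren't enough.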
Using data that updates often usually means a compromise: do you set up a complicated connector to refresh it automatically, or download a static file and replace it as needed? How many unique sources can you manage in either case? How many could you use if you didn't have to worry about it?
ThinkData Works offers the best of both worlds. From a ThinkData Catalog, you can efficiently deliver live, updating data anywhere. Our API is built on our own NiQL query language, which has SQL-like syntax, so there's virtually no learning curve. We've also created custom connectors that plug data from your catalog directly into tools like Tableau and Power BI, and a Python library that supports typical data science workflows in applications like Jupyter Notebooks.
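Since NiQL is SQL-like, a query against a catalog dataset reads much like a familiar SELECT statement. The sketch below shows only that general shape; NiQL's actual grammar isn't documented here, and the dataset ID, column names, and helper function are all invented for illustration.

```python
# Hypothetical sketch of assembling a SQL-like query against a catalog
# dataset. NiQL's real syntax may differ; this shows the familiar SELECT
# shape the text describes. All identifiers below are made up.

def build_query(dataset_id: str, columns: list[str], limit: int = 100) -> str:
    """Assemble a SQL-like SELECT over a catalog dataset (illustrative only)."""
    cols = ", ".join(columns)
    return f"SELECT {cols} FROM {dataset_id} LIMIT {limit}"

query = build_query("business_registry", ["name", "city"], limit=50)
print(query)  # prints "SELECT name, city FROM business_registry LIMIT 50"
```

Because the query language mirrors SQL, an analyst who already writes SELECT statements can pull catalog data into a Jupyter Notebook without learning a new paradigm.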