I read recently that data is now considered “the new currency”. No kidding, I thought! Organisations that are seeking a competitive advantage are finding their edge in the data they capture and process.
We no longer talk about just a simple data warehouse. Instead, we’re talking about data lakes, which act as a repository for structured, semi-structured, and unstructured data from numerous sources.
Let’s think about a travel company and the type of data it might store. It will capture data from customer bookings, social media activity, website logs, and industry trends. The data stored in its data lake may include:
Transactional data, such as the date and time of a booking, the destination, and the price.
Social media data, such as how many times the company has been mentioned on Twitter, and how many ‘likes’ its most recent Instagram post attracted.
Website logs, such as the number of visits, most popular times, user activity, and errors.
But what is the best way to store this data? You need to choose the right type of database, and that choice depends on the type of data and how you need to process it.
As an example, social media platforms such as LinkedIn query huge amounts of connected data in order to make recommendations about people or pages that you may be interested in. Traditional relational databases, which store data in rows and columns within related tables, are not designed for these types of queries: each step across a relationship typically needs another join, so they can become slow and cumbersome to use.
Increased demand for availability, flexibility, and performance, together with a shift towards cloud providers such as AWS and Microsoft Azure, has created an ecosystem of many different types of database, each optimised for specific types of workload.
As a result, we now have a seemingly unlimited number of options and services when it comes to storing our data. Here’s a brief rundown of some of these options and their common use cases.
Relational
Key Features | Referential integrity, ACID transactions, fixed schema |
---|---|
Examples | Amazon Aurora, Amazon RDS, Azure SQL Database, MySQL, Microsoft SQL Server |
Common Use Cases | ERP systems, CRM systems, finance systems, lift-and-shift migrations of applications to the cloud |
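
To make referential integrity and ACID transactions a little more concrete, here is a minimal sketch using Python's built-in `sqlite3` module and the travel company example. The table names, columns, and the helpers `make_booking_db` and `add_customer_with_booking` are made up for illustration, and SQLite is only standing in for the managed services listed above, which you would reach through their own drivers.

```python
import sqlite3


def make_booking_db() -> sqlite3.Connection:
    """Create an in-memory database with a fixed schema (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE bookings ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " destination TEXT NOT NULL,"
        " price REAL NOT NULL)"
    )
    conn.commit()
    return conn


def add_customer_with_booking(conn, name, destination, price):
    """Insert a customer and their first booking as one atomic unit."""
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so we never end up with a customer row but no booking row.
    with conn:
        cur = conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO bookings (customer_id, destination, price)"
            " VALUES (?, ?, ?)",
            (cur.lastrowid, destination, price),
        )
```

If the booking insert fails (for example, because the price is missing), the customer insert is rolled back too. Likewise, a booking that points at a customer who doesn't exist is rejected with `sqlite3.IntegrityError`, which is referential integrity at work.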
Key-value / Document
Key Features | Flexible schema, high-throughput, low latency read/write, highly scalable |
---|---|
Examples | Amazon DynamoDB, Azure Cosmos DB |
Common Use Cases | Shopping cart, social, product catalogue, customer preferences, user profiles |
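
As a rough illustration of what "flexible schema" means in practice, the sketch below uses a plain Python dictionary as a toy key-value store. The `KeyValueStore` class and the key naming (`user#1`, `cart#1`) are assumptions made for this example; it is not the API of DynamoDB, Cosmos DB, or any other real service.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory stand-in for a key-value / document store."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, document: Dict[str, Any]) -> None:
        # No schema check: each document can carry whatever attributes it needs.
        self._items[key] = dict(document)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        # A lookup is a single access by key, with no joins involved.
        item = self._items.get(key)
        return dict(item) if item is not None else None


if __name__ == "__main__":
    store = KeyValueStore()
    # Two documents with completely different shapes live side by side.
    store.put("user#1", {"name": "Ada", "newsletter": True})
    store.put("cart#1", {"items": [{"sku": "FLIGHT-LIS", "qty": 2}]})
    print(store.get("user#1"))
    print(store.get("cart#1"))
```

The point of the sketch is the access pattern: everything is read and written by key, and a user profile and a shopping cart can sit in the same store without a shared table definition.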
In-memory
Key Features | Microsecond latency |
---|---|
Examples | Amazon ElastiCache, Azure Cache for Redis |
Common Use Cases | Gaming leaderboards, real-time analytics, caching |
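
The caching use case is often implemented with a "cache-aside" pattern: check the in-memory cache first, and only go to the slower database on a miss. Here is a minimal sketch of that idea, assuming a simple time-to-live (TTL) expiry. The `TTLCache` class and its `get_or_load` method are hypothetical names for this example, and a local dictionary is standing in for a real in-memory service such as Redis.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Toy cache-aside helper with per-entry expiry (illustrative only)."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: served from memory
        # Miss or expired entry: fall back to the slower source, then remember it.
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

In use, `loader` would be whatever function queries the underlying database, so repeated requests for the same key within the TTL never reach it.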
Graph
Key Features | Quick and easy navigation of complex relationships between data |
---|---|
Examples | Amazon Neptune, Azure Cosmos DB |
Common Use Cases | Fraud detection, social networking, recommendation engines |
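
To show the kind of query a graph model makes easy, here is a small sketch of a "people you may know" recommendation: it walks two hops through a network and ranks strangers by how many connections they share with you. The adjacency dictionary and the `recommend_connections` function are assumptions for this example; a real graph database would express the same traversal in its own query language.

```python
from collections import Counter
from typing import Dict, List, Set


def recommend_connections(
    graph: Dict[str, Set[str]], person: str, limit: int = 3
) -> List[str]:
    """Suggest friends-of-friends, ranked by number of mutual connections."""
    direct = graph.get(person, set())
    mutual_counts: Counter = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the person themselves and anyone they already know.
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual connections first; ties broken alphabetically.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:limit]]


if __name__ == "__main__":
    network = {
        "ana": {"ben", "cat"},
        "ben": {"ana", "dan", "eve"},
        "cat": {"ana", "dan"},
        "dan": {"ben", "cat"},
        "eve": {"ben"},
    }
    print(recommend_connections(network, "ana"))  # ['dan', 'eve']
```

Here "dan" ranks first because both of ana's connections know him, while "eve" is known to only one of them.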
Ledger
Key Features | A complete, immutable, and verifiable history of all changes over time |
---|---|
Examples | Amazon QLDB, the ledger feature in Azure SQL Database and Microsoft SQL Server |
Common Use Cases | Supply chain, health care, financial transactions |
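
One common way to get a "verifiable history" is to chain entries together with cryptographic hashes, so that changing any past entry breaks every hash after it. The sketch below shows that general idea with Python's standard `hashlib`. The function names and the entry layout are assumptions for illustration, and it is not meant to describe how any particular ledger product works internally.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: Dict[str, Any]) -> str:
    # Serialise deterministically so the same payload always hashes the same.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(ledger: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append a change; each entry's hash covers the previous entry's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append(
        {
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": _entry_hash(prev_hash, payload),
        }
    )


def verify(ledger: List[Dict[str, Any]]) -> bool:
    """Recompute the chain and report whether any entry has been altered."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Appending two price changes for a booking and then calling `verify` returns `True`; quietly editing the first entry's payload afterwards makes `verify` return `False`, which is the tamper evidence a ledger database is designed to provide.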
I hope this has given you a good introduction to the different types of database to consider for your data. If you’re interested in discussing this further, please get in touch with us at Koderly for a coffee and a chat!