Database Selection & Design (Part IV)

— 5 V Properties of Big Data —

Volume

Treat volume as a key design requirement: analyze how much data you hold today and review the forecast for data growth over the next few years. This influences the type of database you will need to run your application. It also influences the way you store, partition and replicate the data.
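As a rough illustration of reviewing a growth forecast, here is a minimal sketch that projects data size under compound annual growth. The function name, the 500 GB starting size, the 40% growth rate and the three-year horizon are all hypothetical values chosen for the example; substitute figures from your own measurements.

```python
def project_storage_gb(current_gb: float, annual_growth_rate: float, years: int) -> list:
    """Return the projected data size for year 0 through `years`,
    assuming the same compound growth rate every year (a simplification)."""
    return [current_gb * (1 + annual_growth_rate) ** year for year in range(years + 1)]


# Hypothetical inputs: 500 GB today, growing 40% per year, 3-year horizon.
projection = project_storage_gb(500.0, 0.40, 3)
for year, size_gb in enumerate(projection):
    print(f"year {year}: about {size_gb:,.0f} GB")
```

Even a simple projection like this shows whether a single node is likely to remain sufficient or whether partitioning and replication need to be planned from the start.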

Velocity

Variety

  • Structured data has a clearly defined format, length and volume.
  • Semi-structured data only partially conforms to a specific data format.
  • Unstructured data is unorganized and doesn’t conform to traditional data formats.

A company can obtain data from many different sources: from in-house devices to smartphone GPS technology or what people are saying on social networks. The data formats range from plain text to videos, images, PDFs, reports and so on. The structure of your data dictates how you need to store and retrieve it, so understanding that structure is key to selecting a database. Not all databases in the industry support all types of data structure.
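To make the three categories concrete, here is a small sketch that sorts incoming records by how closely they match a fixed schema. The schema (EXPECTED_FIELDS), the function name and the sample records are hypothetical, and the rule itself is a deliberate simplification rather than a formal definition.

```python
# Hypothetical fixed schema for an "order" record.
EXPECTED_FIELDS = {"order_id": int, "amount": float, "currency": str}


def describe(record) -> str:
    """Classify a record by how closely it matches EXPECTED_FIELDS."""
    if isinstance(record, dict):
        same_keys = set(record) == set(EXPECTED_FIELDS)
        if same_keys and all(
            isinstance(record[name], expected_type)
            for name, expected_type in EXPECTED_FIELDS.items()
        ):
            return "structured"       # every field present with the expected type
        return "semi-structured"      # key/value shaped, but only partly conforming
    return "unstructured"             # free text, image bytes, and so on


samples = [
    {"order_id": 1, "amount": 9.5, "currency": "USD"},
    {"order_id": 2, "note": "gift wrap requested"},
    "Loved the product, delivery was slow.",
    b"\x89PNG\r\n",
]
for sample in samples:
    print(describe(sample))
```

Records in the first group fit naturally into fixed tables, the second group suits a store that tolerates varying fields, and the third usually needs a store that can hold opaque text or binary content.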

Veracity

  • Can you trust the data that you have collected?
  • Is this data credible enough to glean insights from?
  • Should we be basing our business decisions on the insights garnered from this data?

All these questions, and more, are answered once the veracity of the data is known. Because big data is vast and draws on so many sources, not all of the collected data will necessarily be accurate or of good quality. Hence, when working with big data sets, it is important to check the validity of the data before processing it.
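One way to check validity before processing is to run every record through a set of rules and set aside anything that fails. The sketch below assumes a hypothetical sensor reading with device_id, temperature_c and recorded_at fields; the field names and the plausible temperature range are illustrative assumptions, not rules from any particular system.

```python
from datetime import datetime


def validate_reading(reading: dict) -> list:
    """Return a list of problems found in one reading (an empty list means valid)."""
    problems = []
    if not reading.get("device_id"):
        problems.append("missing device_id")
    temperature = reading.get("temperature_c")
    # Hypothetical plausibility range for an outdoor sensor.
    if not isinstance(temperature, (int, float)) or not -90 <= temperature <= 60:
        problems.append("temperature_c missing or out of range")
    try:
        datetime.fromisoformat(str(reading.get("recorded_at")))
    except ValueError:
        problems.append("recorded_at is not an ISO 8601 timestamp")
    return problems


readings = [
    {"device_id": "d1", "temperature_c": 21.5, "recorded_at": "2024-01-01T10:00:00"},
    {"device_id": "", "temperature_c": 500, "recorded_at": "yesterday"},
]
# Only readings with no problems go on to processing; the rest are kept for review.
clean = [r for r in readings if not validate_reading(r)]
rejected = [r for r in readings if validate_reading(r)]
print(len(clean), "clean,", len(rejected), "rejected")
```

Keeping the rejected records, rather than silently dropping them, makes it possible to trace quality problems back to the source that produced them.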

Value

Conclusion:

Information derived from high-volume, high-velocity, validated data collected from varied sources can add value to the company's overall decision-making. While most organizations today intend to use data, many struggle to capture, store, process or harness it effectively. Your design should make this seamless for the business, even as user behavior keeps changing.

Link to the next part in this series:

https://medium.com/@f5sal/database-selection-design-part-v-f93cc9e5efc9


Engineering Director, People Leader, Offroader, Handyman, Movie Buff, Photographer, Gardener

Faisal Mohamed
