The V’s of big data describe its characteristics. More than 10 V’s have been proposed, but five of them are considered the most important: Volume, Velocity, Variety, Veracity, and Value.
Big data refers to a huge amount of data of various kinds. This data grows constantly and is so large that it can’t be processed and analyzed with traditional methods. It is generated on a very large scale and is used mostly by large companies and tech giants like Apple, Google, and Facebook to analyze and improve their products and services.
What are the 5 V’s of big data?
The V’s of big data define the characteristics of the data. It all started in 2001, when the analyst firm META Group (later acquired by Gartner) described three characteristics of big data: Volume, Velocity, and Variety, known as the 3 V’s of big data. Two more V’s, Veracity and Value, were added later. Today there are more than 10 V’s, but these five are the most important.
So, let’s look at them one by one.
Volume in Big Data
Volume is simply the amount or size of the data we deal with. It is the most important characteristic of big data: the word “Big” in Big Data refers to the volume of the data. Data is generated every moment by every action, such as booking a flight, making a transaction, or surfing the internet. These huge amounts of data are used by big companies and tech giants like Amazon, Google, and Facebook for near real-time analysis of their products and services.
Velocity in Big Data
Velocity refers to the speed at which data arrives, that is, the amount of incoming data in a given interval of time. For example, a commonly cited estimate is that about 1.7 MB of data is generated every second per person, and Facebook alone has roughly 2.7 billion active users. Also, according to a 2019 Forbes report, every minute users watch 4.15 million YouTube videos, send 456,000 tweets on Twitter, post 467,400 photos on Instagram, post 500,000 comments, and update about 293,000 statuses on Facebook.
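To get a feel for what these numbers mean as a rate, here is a minimal Python sketch that converts the per-minute figures quoted above into average per-second rates. The dictionary keys and the per_second helper are names made up for this illustration, and the counts are simply the ones cited in the text, not fresh measurements.

```python
# Per-minute counts as quoted in the article (Forbes 2019 figures, as cited).
# The key names are hypothetical labels chosen for this example.
EVENTS_PER_MINUTE = {
    "youtube_videos_watched": 4_150_000,
    "tweets_sent": 456_000,
    "instagram_photos_posted": 467_400,
    "comments_posted": 500_000,
    "facebook_statuses_updated": 293_000,
}


def per_second(events_per_minute: int) -> float:
    """Convert a per-minute count into an average per-second rate."""
    return events_per_minute / 60


rates = {name: per_second(count) for name, count in EVENTS_PER_MINUTE.items()}

for name, rate in rates.items():
    print(f"{name}: about {rate:,.0f} events per second")
```

Even the smallest figure here works out to thousands of events every second, which is why velocity is treated as a characteristic of its own and not just a side effect of volume.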
Variety in Big Data
Before social media, data was mostly structured and stored in Excel files or databases. With the recent rise in digitalization, we now also get an extreme amount of other types of data, such as images, videos, and GIFs, which are categorized as semi-structured or unstructured data.
Structured data → Data in an organized format with specified labels, like tabular data
Semi-structured data → Partially organized data, like emails, log files, etc.
Unstructured data → Unorganized data with no format or labels, e.g. text, images, videos, GIFs, etc.
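The three categories can be made concrete with a small Python sketch. The sample values below (an order table, a log record, and a review text) are invented purely for illustration, and a JSON record is just one assumed shape that a log entry might take.

```python
import csv
import io
import json

# Structured: tabular data with a fixed set of labeled columns.
structured_csv = "order_id,item,quantity\n1,eggs,12\n2,milk,2\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: a log record with some labeled fields and a free-form message.
semi_structured = '{"level": "INFO", "user": "alice", "message": "booked a flight"}'
record = json.loads(semi_structured)

# Unstructured: raw text with no labels at all (images and videos are similar,
# just bytes instead of characters).
unstructured = "Loved the trip, the flight was on time and the food was great!"

print(rows[0]["item"])            # columns can be addressed by name
print(record.get("level"))        # fields exist, but the set of fields may vary
print(len(unstructured.split()))  # only generic operations like word counts apply
```

Notice that the further we move from structured data, the less we can ask of the data directly, and the more processing is needed before analysis.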
Veracity in Big Data
Veracity is another characteristic of big data; it represents how trustworthy or messy the data is. In the real world, it is hard to know whether a piece of information in the data is correct. Our data can contain many mistakes, uncertainties, and gaps.
For example, if a shopkeeper has purchased 400 eggs and sold 200, his inventory management system will show that 200 eggs are left. But some eggs may have broken, or some egg purchases may not have been recorded in the database.
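The egg example can be sketched in a few lines of Python. The InventoryRecord class, the stock_discrepancy function, and the physical count of 188 are all hypothetical; they only illustrate how recorded data can be checked against reality.

```python
from dataclasses import dataclass


@dataclass
class InventoryRecord:
    """What the inventory system believes, based only on recorded events."""
    purchased: int
    sold: int

    @property
    def expected_stock(self) -> int:
        return self.purchased - self.sold


def stock_discrepancy(record: InventoryRecord, counted_stock: int) -> int:
    """Difference between a physical count and the recorded stock.

    Negative: items are missing (e.g. broken eggs).
    Positive: there are more items than recorded (e.g. unrecorded purchases).
    """
    return counted_stock - record.expected_stock


eggs = InventoryRecord(purchased=400, sold=200)
physical_count = 188  # hypothetical number found by counting the shelf

gap = stock_discrepancy(eggs, physical_count)
print(f"System says {eggs.expected_stock} eggs, shelf has {physical_count}, gap = {gap}")
```

A non-zero gap does not tell us which record is wrong, only that the data cannot be fully trusted, and that is exactly what veracity is about.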
Value in Big Data
Value is one of the most important characteristics of big data: if a business collects lots and lots of data but can’t extract anything valuable from it, the effort is a complete waste of money and power.
For example, when you search for something on YouTube, it collects that data and starts showing you similar videos on your home feed. This helps YouTube retain users by giving them personalized videos. If YouTube had not extracted value from the collected search data, it would not be able to retain those users for long.
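As a toy illustration of turning collected searches into something useful, the Python sketch below ranks video titles by how many words they share with past searches. This is not how YouTube actually works; the recommend function, the search history, and the video titles are all assumptions made for this example.

```python
from collections import Counter


def recommend(search_history, catalog, top_n=2):
    """Rank catalog titles by word overlap with past searches (toy example)."""
    # Count how often each word appears across all past searches.
    interest = Counter(
        word for query in search_history for word in query.lower().split()
    )

    def score(title):
        # A title scores higher when it contains frequently searched words.
        return sum(interest[word] for word in title.lower().split())

    ranked = sorted(catalog, key=score, reverse=True)
    # Drop titles that share no words with the search history.
    return [title for title in ranked if score(title) > 0][:top_n]


searches = ["python tutorial", "python pandas basics"]
videos = [
    "Python pandas crash course",
    "Cooking pasta at home",
    "Advanced python tricks",
]

print(recommend(searches, videos))
```

The raw search logs have no value by themselves; the value appears only once they are processed into a decision, here a ranked list of videos to show.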
So, that was all about the 5 V’s of big data. I hope you liked the article.
Data Scientist with 3+ years of experience in building data-intensive applications in diverse industries. Proficient in predictive modeling, computer vision, natural language processing, data visualization etc. Aside from being a data scientist, I am also a blogger and photographer.