3V’s of Big Data


Volume, Variety & Velocity

In the big data world, data is usually described by a few defining properties. These are known as the 3V’s of big data: Volume, Variety and Velocity. Let’s take a look at each in detail.


Volume 

The price of storage per megabyte or gigabyte has decreased drastically over the last decade. Storing an amount of data that costs, say, $1 now might cost $0.01 in a matter of ten years. Storage reliability has also improved, helped by network storage architectures. This has resulted in a dramatic rise in capturing and storing data; in other words, the ‘Volume’ of data has increased. Volume refers to the amount of data stored.

Previously, when data storage was not so cheap, only the critical components of the data were stored, e.g., sales data, transactional data, etc.

Once abundant storage became available, data that was not so critical to the business, and had previously been generated and thrown away, also began to be stored, e.g., server logs, user behavior, etc.

An example of a server log entry:

64.242.88.10 - - [07/Mar/2004:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
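
To show why even a single log line like this is worth keeping, here is a minimal sketch of how it could be broken into fields. It assumes the entry follows the Common Log Format used by many web servers; the `LOG_PATTERN` and `parse_log_line` names are hypothetical and chosen only for this illustration.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident authuser [timestamp] "method path protocol" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the text fields into typed values.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


line = ('64.242.88.10 - - [07/Mar/2004:16:10:02 -0800] '
        '"GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291')
entry = parse_log_line(line)
print(entry["host"], entry["path"], entry["status"], entry["size"])
```

Once the fields are structured this way, logs that were once thrown away can be queried for things such as the most requested paths or the error rate per client.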

With the IT revolution across all fields, data storage demands in medicine, IoT, manufacturing, social media, etc., have also increased, giving rise to increased data volume.


Variety 

Data is available today in a very wide range of formats: text, image, audio, video or a combination of any of these. ‘Variety’ refers to this diversity in the representation of data.

One might argue that we do not need to store data in its original format. It can always be transformed before it is stored, which gets rid of the variety, but in doing so we lose information. For example, let’s say there is a 5-minute audio conversation. We can convert it to text and store that, but the text will not be as good as the original conversation. The speaker’s emotions, tone and other signals that carry meaning in vocal communication are lost. And since storage keeps getting cheaper, it is beneficial to store varied data.

Velocity 

Velocity is the rate at which data is generated. Big internet companies generate data on the order of terabytes per day or petabytes per week. We need to account for those needs and have systems in place that can store data at that velocity. Some data is generated quickly and some slowly, so the data velocity differs depending on the application in question.
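
To get a feel for what these rates mean for a storage system, the following rough sketch converts a daily or weekly volume into the sustained write throughput needed to keep up. The 1 TB/day and 1 PB/week figures are illustrative assumptions rather than measurements from any particular company, decimal units (1 TB = 10^12 bytes) are assumed, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Decimal (SI) units are assumed here.
BYTES_PER_MB = 10**6
BYTES_PER_TB = 10**12
BYTES_PER_PB = 10**15


def sustained_mb_per_second(total_bytes, period_seconds):
    """Average throughput in MB/s needed to absorb total_bytes within period_seconds."""
    return total_bytes / period_seconds / BYTES_PER_MB


one_tb_per_day = sustained_mb_per_second(1 * BYTES_PER_TB, SECONDS_PER_DAY)
one_pb_per_week = sustained_mb_per_second(1 * BYTES_PER_PB, 7 * SECONDS_PER_DAY)

print(f"1 TB/day  is about {one_tb_per_day:.1f} MB/s")
print(f"1 PB/week is about {one_pb_per_week:.1f} MB/s")
```

Under these assumptions, 1 TB/day works out to roughly 11.6 MB/s, while 1 PB/week needs about 1.65 GB/s sustained. These are averages, so a real system would also have to absorb peaks above them.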


The 3 V’s govern data, and it is good practice to keep them in mind while devising a solution.

For any questions and inquiries, visit us at https://thinkitive.com/

Kaustubh

I look after Technology at Thinkitive. Interested in Machine Learning, Deep Learning, IoT, TinyML and many more areas of application of machine learning.
