Uploaded by Zhenning Xu

Explain Big Data to a Six-Year-Old

Daddy, what is Big Data?
Nam Nguyen
May 16 · 4 min read
Photo by Remy_Loz on Unsplash
As an expecting parent, I have an unspeakable fear that one day my child will voice their
first serious question: “Daddy, what is Big Data?”
That may never happen, but when my children start to wonder about their father’s profession,
how will I explain the notion of Big Data to them?
Fast forward to my baby’s sixth birthday. My greatest fear will come true after all. The
innocent mind will grow curious about what Daddy does at work. Coming home after a
long day, I will eventually have to tell my kid what one of my typical days looks like. The final
moment will arrive when I proudly offer my child a hand-picked present.
I never had a LEGO set when I was a child, so I have promised to make up for it with my
children. As I break my child’s favorite LEGO set into pieces, I say, “My child, listen and
listen carefully. I will explain Big Data once and for all using these LEGO bricks.”
Imagine you have a LEGO pile with 100 pieces of various colors and shapes, and Daddy asks you
to gather all the red ones. Being a six-year-old, you can do it in a matter of minutes. The easiest
way is to check every piece and take out the red ones. Sooner or later, you’ll have them all in your
tiny little hand.
Photo by Fran Jacquier on Unsplash
However, Daddy would like to make things a little more challenging. While you’re still going
through the original pile, Daddy will add 200, 300, or even 1000 more bricks. A six-year-old
might not know how big a pile of 1000 LEGO pieces is, but no matter how long you keep picking,
you will never be finished.
Photo by Kelly Sikkema on Unsplash
Photo by Omar Flores on Unsplash
When Daddy’s at work, people also ask him to pick out LEGO bricks. But they have much bigger
LEGO piles than we do. The pieces can stack up to the size of a house, a building, or even a
mountain.
If Daddy keeps picking the way you do, he will be exhausted and won’t have any time to play
with you after work. But Daddy’s smart, and he has some secret weapons up his sleeve. He has two
friends to help him out: RangO and StackO.
What RangO does is sort things out. When you have a small LEGO pile where every piece is
in sight, choosing a specific color seems simple. But when things get much larger, Daddy
cannot handle all the searching and picking by himself. He asks RangO to organize the
LEGO pieces by color. Instead of one colorful LEGO pile, Daddy then has several piles,
each of a single color. Do you see how easy it is for Daddy to pick up the red bricks now?
The cool thing is that RangO can prepare the pieces by shape, color, or anything Daddy can think of.
How about StackO? When Daddy has to pick up a house-sized load of red LEGO pieces from the red
pile prepared by RangO, he cannot bundle all of them up by hand. StackO helps Daddy
attach all the pieces together to form a single LEGO set. StackO is best at putting many
small fragments together into a single form. StackO can also connect pieces of different kinds, so
Daddy can have a building-sized LEGO set made from red, blue, and green pieces.
With the help of RangO and StackO, Daddy is no longer afraid of picking LEGO bricks, and
that’s why Daddy is always happy coming home to you.
The LEGO pile represents a body of data made up of hundreds of information fragments.
Searching and picking piece by piece works without a doubt on a small volume, but it doesn’t
scale. Data grows exponentially, while our capacity to process it piece by piece grows only
linearly. RangO and StackO stand for nothing more than the technical tools used in
Data Preparation, Data Cleaning, and Data Storage.
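For readers who prefer code to bricks, here is a minimal Python sketch of the same story. It is only an illustration under my own assumptions: the sample bricks and the function names pick_by_scanning, sort_into_piles, and stack_together are hypothetical, and mapping RangO to a group-by-color step and StackO to a simple concatenation is one possible reading of the metaphor, not a description of any real tool.

```python
from collections import defaultdict

# Hypothetical sample data: each brick is a (color, shape) tuple.
bricks = [
    ("red", "2x4"),
    ("blue", "2x2"),
    ("red", "1x2"),
    ("green", "2x4"),
    ("blue", "1x1"),
    ("red", "2x2"),
]


def pick_by_scanning(pile, color):
    """Check every brick in the pile, the way the six-year-old does."""
    return [brick for brick in pile if brick[0] == color]


def sort_into_piles(pile):
    """RangO-style step (assumed): organize bricks into one pile per color."""
    piles = defaultdict(list)
    for brick in pile:
        piles[brick[0]].append(brick)
    return piles


def stack_together(*piles):
    """StackO-style step (assumed): join several piles into a single set."""
    combined = []
    for pile in piles:
        combined.extend(pile)
    return combined


piles = sort_into_piles(bricks)   # one pass over the pile to organize it
red_bricks = piles["red"]         # later requests for red skip the full scan
red_and_blue = stack_together(piles["red"], piles["blue"])
```

Scanning has to touch every brick each time someone asks for a color. Sorting into piles pays that cost once, after which each request is a direct lookup, which is the point of preparing the data before picking from it.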
Big Data in real life isn’t limited to these simple processes, but to shed light on Big
Data land for its inquisitive visitors, it’s necessary to demonstrate one of its core concepts. We
fear what we cannot understand. Big Data is not about complex operations, but rather
simple manipulations on a big scale.
Photo by Caleb Woods on Unsplash
My name’s Nam Nguyen, and I write about Big Data. Enjoyed the read? Follow me on
Medium and Twitter for more updates.