Explain Big Data to a Six-Year-Old
Daddy, what is Big Data?

Nam Nguyen, May 16

As an expectant parent, I have an unspeakable fear that one day my child will voice their first serious question: "Daddy, what is Big Data?" That may never happen, but when my children start to wonder about their father's profession, how will I explain the notion of Big Data to them?

Fast forward to my baby's sixth birthday. My greatest fear comes true after all. The innocent mind starts to get curious about what Daddy does at work. Coming home after a long day, I will eventually have to tell my kid what one of my typical days looks like.

The moment arrives when I proudly offer my child a hand-picked present. I never had a LEGO set when I was a child, so I have promised to make it up to my children. As I break my child's favorite LEGO set into pieces, I say: "My child, listen, and listen carefully. I will explain Big Data once and for all using these LEGO bricks."

Imagine you have a LEGO pile with 100 pieces of various colors and shapes, and Daddy asks you to gather all the red ones. Being a six-year-old, you can do it in a matter of minutes. The easiest way is to check every piece and take out the red ones. Sooner or later, you'll have them all in your tiny little hands.

However, Daddy would like to make things a little more challenging. While you're still sorting through the original pile, Daddy will add 200, 300, or even 1000 more bricks. A six-year-old might not know how big a pile of 1000 LEGO pieces is, but as long as bricks keep arriving while you check them one by one, you will never finish.

When Daddy's at work, people also ask him to pick out LEGO bricks. But they have much bigger LEGO piles than we do. The pieces can stack up to the size of a house, a building, or even a mountain.
If Daddy keeps picking bricks out the way you do, he will be exhausted and won't have any time to play with you after work. But Daddy's smart: he has some secret weapons up his sleeve. He has two friends to help him out, RangO and StackO.

What RangO does is sort things out. When you have a small LEGO pile where every piece is in sight, choosing a specific color seems simple. But when the pile gets much larger, Daddy cannot do all the searching and picking by himself. He asks RangO to organize the LEGO pieces by color. Instead of one colorful LEGO pile, Daddy now has several piles, each of a single color. Do you see how easy it is for Daddy to pick up the red bricks now? The cool thing is that RangO can arrange the pieces by shape, color, or anything else Daddy can think of.

How about StackO? When Daddy has to pick up a house-sized load of red LEGO pieces from the red pile prepared by RangO, he cannot bundle them all up by hand. StackO helps Daddy attach all the pieces together to form a single LEGO set. StackO is best at putting many small fragments together into a single form. StackO can also connect pieces of different kinds, so Daddy can have a building-sized LEGO set made of red, blue, and green pieces.

With the help of RangO and StackO, Daddy is no longer afraid of picking LEGO bricks, and that's why Daddy is always happy coming home to you.

The LEGO pile represents a body of data made up of hundreds of fragments of information. Searching and picking piece by piece works fine on a small volume, but it doesn't scale. Data grows exponentially, while the capacity to process it by hand grows only linearly. RangO and StackO stand for nothing more than the technical tools used in Data Preparation, Data Cleaning, and Data Storage. Real-life Big Data isn't limited to these simple processes, but to shed some light on Big Data land for its inquisitive visitors, it helps to demonstrate one of its core concepts.

We fear what we cannot understand.
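For the grown-ups in the room, the two friends can be sketched in a few lines of Python. This is only an illustration under assumptions: the brick data and the function names `pick_by_scanning`, `rango`, and `stacko` are made up here to mirror the story, and real tools for data preparation and storage are far more elaborate.

```python
from collections import defaultdict

# Hypothetical data: each brick is a (color, shape) pair.
bricks = [
    ("red", "2x4"), ("blue", "2x2"), ("red", "1x2"),
    ("green", "2x4"), ("blue", "1x1"), ("red", "2x2"),
]

def pick_by_scanning(pile, color):
    """The six-year-old's way: check every piece, keep the matching ones."""
    return [brick for brick in pile if brick[0] == color]

def rango(pile, key=lambda brick: brick[0]):
    """RangO: split one pile into smaller piles sharing a key (color by default)."""
    piles = defaultdict(list)
    for brick in pile:
        piles[key(brick)].append(brick)
    return dict(piles)

def stacko(*piles):
    """StackO: attach several piles together into one single set."""
    combined = []
    for pile in piles:
        combined.extend(pile)
    return combined

by_color = rango(bricks)
red_pile = by_color["red"]                            # no need to check every piece again
one_set = stacko(by_color["red"], by_color["blue"])   # pieces of different kinds, one set
by_shape = rango(bricks, key=lambda brick: brick[1])  # RangO can also organize by shape
```

Once RangO has done its work, asking for the red pile is a single lookup, whereas the scanning approach has to touch every brick each time someone asks.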
Big Data is not about complex operations, but rather simple manipulations on a big scale.

My name's Nam Nguyen, and I write about Big Data. Enjoyed reading? Follow me on Medium and Twitter for more updates.