Note

Musings may contain typos, grammatical errors, rough ideas and weak arguments

I was hosting a couple of high school kids at the lab this summer, and the conversation naturally turned toward what I do every day. To talk about that, I first had to explain the concept of datasets and related ideas like data mining, data engineering, and data analysis.

I wanted to use an example they could all relate to, so I decided to talk about websites and the information they contain. My idea was to use this as a way to explain the process of gathering this information (data mining) to form datasets that would eventually be analyzed or used to build predictive or generative models.

I thought this would be a simple way to introduce what I do, so I asked them to tell me some of their favorite websites. They all chorused either “Instagram” or “TikTok”. I was a bit surprised. Technically these are still websites, but that was certainly not the response I was expecting. My idea of “websites” wasn’t these two social apps. I think Reddit might even have been more along the lines of what I was expecting.

I didn’t show my surprise. Instead, I started to explain using a hypothetical scenario: “Say you wanted to get the profile names of all the people that liked your last five pictures. How would you go about this?” I used this to break down the concept of gathering related information into datasets, and we eventually talked about questions we could answer with that data, like “Which friend has interacted with all my recent posts?”
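For anyone who wants to see that exercise written down, here is a minimal sketch in Python. It assumes the likes have already been collected somehow; the post labels, profile names, and both helper functions are made up for illustration, and nothing here talks to Instagram or TikTok. It only shows the two steps we discussed: turning raw likes into a small dataset, then asking a question of it.

```python
# Hypothetical data: the profiles that liked each of my last five pictures.
# In practice this would come from whatever collection step you have access to.
likes_by_post = {
    "post_1": ["ada", "bayo", "chen"],
    "post_2": ["ada", "chen", "dara"],
    "post_3": ["ada", "bayo", "chen", "dara"],
    "post_4": ["chen", "ada"],
    "post_5": ["ada", "chen", "efe"],
}


def build_dataset(likes_by_post):
    """Flatten the likes into (post, profile) rows, one row per like."""
    return [
        (post, profile)
        for post, likers in likes_by_post.items()
        for profile in likers
    ]


def liked_every_post(dataset):
    """Return the set of profiles that liked every post in the dataset."""
    likers_per_post = {}
    for post, profile in dataset:
        likers_per_post.setdefault(post, set()).add(profile)
    if not likers_per_post:
        return set()
    # A profile interacted with all posts if it is in every post's set of likers.
    return set.intersection(*likers_per_post.values())


dataset = build_dataset(likes_by_post)
print(sorted(liked_every_post(dataset)))
```

The flat list of rows is the "dataset" in the sense I was explaining: one observation per line, which makes it easy to ask other questions later, such as which picture got the most likes.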

After this exercise, I began to think about the concept of websites and what they mean these days.