Big data and how to get them

It's said that nowadays data is the 'new oil'. I don't like this quote too much, but there is a certain amount of truth in it. Data is becoming even more valuable than it used to be. However, the question is - how do you get it?

The lives of others

Big data sources

You can get some data from free (usually government) databases or from commercial ones:

  • free databases:
    • local government databases
    • institutional databases
    • World Bank data
  • commercial databases:
    • Trading Economics (some data is available for free)
    • Statista
    • XE forex
    • Stooq


Due to legal restrictions, you can't legally collect all user data without the users' consent. It's no longer the 'wild west' in this respect. So is the only option the painstaking one - slowly obtaining consent from each user, one by one, to get a fraction of the data that used to be collected freely? No, there is one exception - your own employees. Your employees are obliged to follow your 'code of conduct', security guidelines, etc. Even though you have to inform them that the data are being collected, they have no power to refuse.

Data brokers

Getting enough data is sometimes quite a burden. That's why there are many companies that will sell you whatever data you want. It's only a matter of money. They are sometimes called data providers or data brokers.

Gray zone

Some 'wise' companies come up with the idea of getting the data from their clients. I will explain it with a well-known example.

There are plenty of applications like 'Truecaller' that force people to hand over their contacts in order to register. They build their database from the contacts of users who want to use their services, of course 'for free'. In other words, they do it using their clients' data. Surprising? Legal? Maybe. Moral? Not very, but it's becoming standard.

Employees - perfect 'lab rats'

Lab rats

It all changes when you deal with your employees. You can monitor or collect:

  • their e-mail usage metrics and content,
  • their communication behavior,
  • audio and video recordings of their meetings,
  • AI-based recognition of who they are
  • and many others.

It's no surprise. However, I have noticed an even stranger trend: companies are starting to offer 'wellbeing' programs, especially in the current COVID-19 situation. They want to collect all kinds of 'fitness' and semi-health data, which may reveal more to the company than you would like. They collect:

  • step counts,
  • activity statistics,
  • basic health parameters like glucose and cholesterol levels,
  • psychological data like your happiness level and so on.

They tell you that they care about your health - they compare your statistics to 'gamify' the program and entice you to follow healthy habits. However, I am not so convinced. More likely they just want a free source of big data troves.


As you can see, getting enough "big data" is not an easy task, especially if you want to get it ethically. Data is a valuable thing, but in the longer term it also matters how you get it, as your clients or employees may realize that you abused their trust.

