Judging From Experience: Things You NEVER Do To Big Data

Maks Marketing Manager
May 15, 2015

We have been through many projects as a company that provides outsourcing solutions. In the worst cases we had to fix what other teams before us had failed at. It seems odd that so many businesses choose price over quality and outsource to the cheapest Indian service providers without even a glance at their portfolio or qualifications.

No, I do understand that we are not the only fish in the sea, but seriously, why choose a partner that will almost certainly fail the project? You can usually tell unqualified businesses from the rest by their experience, website, pricing policy, ways of communicating, and so on. However, that’s not the point right now. What are the worst things we have seen people do to something as vital as Big Data?

  • Using MongoDB as the platform of choice for everything is wrong in many ways. It’s not that Mongo is terrible, no. It is really good at numerous things when it serves as your operational database. It is still a poor analytical system. In simple words, you don’t analyze with Mongo, but you can collect data with it for further analysis elsewhere.
  • Data ponds are a bad decision. Divide and conquer does not work with Big Data the way you might expect. If every business group builds its own data pond on the way to a data lake, the results will disappoint. Data gets changed, shifted and manipulated in each pond, so once everything is collected together you end up with multiple answers to the same question. Dividing data is not bad in itself, but making too many separate ponds is terrible. Plan ahead, but don’t try to structure every single detail. Design for the most general queries.
  • SQL is not the only possible solution, and not everything related to Big Data can be achieved with SQL alone. Hive, MapReduce, Pig, Oozie: each of them was created for a purpose, so refusing to use them is plain stubbornness.
  • Did you know that HDFS is not an ordinary file system where you dump some files and you are done? Sure, tools such as Hive or Pig, mentioned earlier, help with many things, but that is no excuse to mindlessly dump everything in without a second thought. Big Data simply does not work that way. You have to plan what data you are putting in and why. Security matters too, which means you must know what you need to protect.
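To make the data pond problem concrete, here is a minimal Python sketch of how two teams can get different answers to the same question. Everything in it is an assumption made for illustration: the `RAW_ORDERS` records, the `sales_pond` and `finance_pond` functions and their cleaning rules are hypothetical and not taken from any real project.

```python
# Hypothetical raw order events shared by the whole company (illustrative data only).
RAW_ORDERS = [
    {"order_id": 1, "region": "EU", "amount": 100.0, "status": "paid"},
    {"order_id": 2, "region": "EU", "amount": 40.0, "status": "refunded"},
    {"order_id": 3, "region": "US", "amount": 75.0, "status": "paid"},
    {"order_id": 3, "region": "US", "amount": 75.0, "status": "paid"},  # duplicate event
]


def sales_pond(orders):
    """Assumed sales-team pond: keeps every event, refunds and duplicates included."""
    return list(orders)


def finance_pond(orders):
    """Assumed finance-team pond: drops refunds and duplicate order ids."""
    seen = set()
    cleaned = []
    for order in orders:
        if order["order_id"] in seen or order["status"] == "refunded":
            continue
        seen.add(order["order_id"])
        cleaned.append(order)
    return cleaned


def total_revenue(orders):
    """The 'same question' both teams ask of their own copy of the data."""
    return sum(order["amount"] for order in orders)


sales_total = total_revenue(sales_pond(RAW_ORDERS))
finance_total = total_revenue(finance_pond(RAW_ORDERS))

# Same source, same question, two different answers.
print(f"sales pond:   {sales_total}")    # 290.0
print(f"finance pond: {finance_total}")  # 175.0
```

Neither team is wrong by its own rules, which is exactly the trouble: the disagreement only surfaces when the ponds are merged. Agreeing on one shared cleaning step before the data is split would be one way to avoid it.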

Surely this is in no way a complete list of the horrible things that can be done to Big Data, but these are the ones I personally hate the most, and my advice is: don’t allow them to happen. They never lead to any good.
