
Jan 2, 2014

Hiring data scientists

Friends and colleagues often ask me who data scientists are. Many want the answer to a very practical question: “Who should I hire as a data scientist?”

In my practical experience in building data science teams, I have come to appreciate the following qualities:
  1. A fundamental understanding of machine learning. Ultimately, data mining cannot exist without machine learning, which provides its core techniques. Thus a researcher in machine learning or a related field (such as natural language processing, computer vision, artificial intelligence, or bioinformatics) is an ideal candidate. Such researchers have studied different machine learning methods and know the newest and best techniques to apply to a problem.
  2. A sophisticated understanding of statistics and advanced mathematics. Such understanding requires years of training. Thus a Ph.D. degree is typically required for data scientists.
  3. Training in computer science. Ultimately, mining data is a form of computing. It requires designing algorithms that are efficient in both memory (space) and time. People trained in computer science understand the tradeoff between space and time in computing, and they understand the basic concepts of computational complexity. Someone who has majored in computer science has this training ingrained in their DNA.
  4. Good coding skills. We live in a big data era. To work with data, we write code to process, clean, and transform it. Then we need to create programs on a big data platform, and test and improve those programs constantly. All of this requires good coding skills. Data mining is about implementation and testing, so programming skill is a core requirement.
In hiring a data scientist, a few other qualifications are desirable but not required:
  1. Experience with big data. This lets someone work in environments such as Hadoop and become productive with the tools quickly. But such knowledge can be easily learned.
  2. Knowledge of a specific programming language. A good programmer can learn a new language quickly. In addition, there are many options for running big data programs, from Python to Java to Scala. A person who has mastered any one of these languages can be very productive.
A good data scientist who satisfies the four basic-skill requirements is hard to find today. Even though our universities train tens of thousands of them each year, market demand is far higher than that. Many people have read the report by McKinsey, which states that there will be a gap of 140,000 jobs (demand exceeding talent supply) for data scientists by 2018.

Even today, in early 2014, companies are struggling to bring in data scientists. Those who are on the job market are immediately snatched up by large, well-known companies. Today, every company is trying to implement a “data strategy” (or, in its fancier form, a “big data strategy”). This is a golden age for data scientists but a challenging time for employers.

34 comments:

  1. I also agree that data scientists need the qualifications you have mentioned above. I would add that a data scientist also needs innovative ideas. I hope my suggestion works for you. Thanks!!

  2. "Data scientist?" I'm also interested, because my research deals with data. Thanks for the information.
