Vincent Granville’s 66 job interview questions for data scientists

 

  1. What is the biggest data set that you have processed? How did you process it, and what were the results?
  2. Tell me two success stories about your analytic or computer science projects. How was lift (or success) measured?
  3. What is: lift, KPI, robustness, model fitting, design of experiments, 80/20 rule?
  4. What is: collaborative filtering, n-grams, map reduce, cosine distance?
  5. How to optimize a web crawler to run much faster, extract better information, and better summarize data to produce cleaner databases?
  6. How would you come up with a solution to identify plagiarism?
  7. How to detect individual paid accounts shared by multiple users?
  8. Should click data be handled in real time? Why? In which contexts?
  9. What is better: good data or good models? And how do you define “good”? Is there a universal good model? Are there any models that are definitely not so good?
  10. What is probabilistic merging (AKA fuzzy merging)? Is it easier to handle with SQL or other languages? Which languages would you choose for semi-structured text data reconciliation?

To see the other 56 questions assessing “the technical horizontal knowledge of a senior candidate at a high level,” go here.

About GilPress

I'm Managing Partner at gPress, a marketing, publishing, research and education consultancy. I'm also a Senior Contributor at forbes.com/sites/gilpress/. Previously, I held senior marketing and research management positions at NORC, DEC and EMC. Most recently, I was Senior Director, Thought Leadership Marketing at EMC, where I launched the Big Data conversation with the “How Much Information?” study (2000 with UC Berkeley) and the Digital Universe study (2007 with IDC). Twitter: @GilPress
