The Data Science Interview: Edwin Chen, Twitter

I don’t do pure research—my analysis enables real-world functionality

Currently mining terabytes of tweets as a data scientist with Twitter, Edwin Chen studied math and linguistics at MIT and then crunched numbers at Peter Thiel’s hedge fund, Clarium Capital Management. He blogs on topics of interest to data scientists, such as ggplot2, a data visualization tool, and crowdsourcing text analysis with Amazon’s Mechanical Turk. The following is an edited transcript of our recent phone conversation.

When you went to MIT, what were your future plans? Continue reading

Posted in Data Science | Leave a comment

Vincent Granville’s 66 job interview questions for data scientists

 

  1. What is the biggest data set that you have processed? How did you process it, and what were the results?
  2. Tell me two success stories about your analytic or computer science projects. How was lift (or success) measured?
  3. What is: lift, KPI, robustness, model fitting, design of experiments, 80/20 rule?
  4. What is: collaborative filtering, n-grams, MapReduce, cosine distance?
  5. How would you optimize a web crawler to run much faster, extract better information, and better summarize data to produce cleaner databases?
  6. How would you come up with a solution to identify plagiarism?
  7. How would you detect individual paid accounts shared by multiple users?
  8. Should click data be handled in real time? Why? In which contexts?
  9. What is better: good data or good models? And how do you define “good”? Is there a universal good model? Are there any models that are definitely not so good?
  10. What is probabilistic merging (AKA fuzzy merging)? Is it easier to handle with SQL or other languages? Which languages would you choose for semi-structured text data reconciliation?

To see the other 56 questions assessing “the technical horizontal knowledge of a senior candidate at a high level” go here 


Data Science: Ranking Online Influencers

Data science is the defining specialty of the business of big data and an emerging career path for those who love to find new insights in the gazillion bytes of data created each day. It’s where you find fierce competition for talent, the jobs of the future, new training programs and courses, new ventures, and new products. But where to find the data science-relevant online conversations with the most impact?  Continue reading


DataKind’s Jake Porway on Data Science

[youtube=http://www.youtube.com/watch?v=Mm1RplOU0cQ&w=560&h=315]

“If you leave an excited data scientist on his own to solve a problem, he’s going to solve his own problem – which is usually parking his car, or finding a bar to drink at. The trick that we worked on was actually less about data and more about translation, about finding a way for data scientists to speak the language of the people who were trying to solve the big problems… the biggest [challenge] is actually the framing of the problem: really finding the question. As any good data scientist will tell you, it’s not so much about the data, it’s the question you start with” – Jake Porway, DataKind

More here


The $250K Median Salary of Data Scientist Managers Is Why Google and Salesforce Invested $20B in Self-Service Data Science

2019 salaries of data scientists–managers (Burtch Works)

Earlier this month, Salesforce announced the acquisition of data visualization and analytics leader Tableau for $15.7 billion and Google announced the acquisition of data discovery and analytics platform Looker for $2.6 billion. Both acquired companies will beef up the acquiring companies’ Data Science as a Service (DSaaS) capabilities, providing their enterprise customers with a wide range of easy (or easier) to use tools that “democratize” data preparation, integration, analysis, and presentation.

With self-service data science, business users who have no statistical analysis background and don’t know how to code can make data-driven decisions themselves, instead of relying on expensive and hard-to-find data scientists.

How expensive? The average annual base salary for an experienced data scientist in a management position is currently $257,443, according to Burtch Works.

Read more here


Data Science at Zillow (Slideshare)

[slideshare id=45132578&doc=pythondatascienceatzillow-150225104833-conversion-gate02]


The Data on Data Scientists (Infographic)

Data-Scientists-Infographic

Source: Bob Hayes


Being a Data Scientist in 2015 (Infographic)

CrowdFlower_Infographic_Survey_72dpi


Most In-Demand Data Science Skills

Data-Science-Skills2016

Source: CrowdFlower, based on “3500 relevant job openings from LinkedIn.”

The folks at CrowdFlower excluded Excel from their list but noted that “that’s still something you see in myriad job listings. Old habits die hard.” Of course, data scientists don’t want to associate the “sexiest job of the 21st century” with old habits. Employers, however, want to cover all bases, sexy or not.

Posted in Data Science, Misc | Leave a comment

What Data Scientists Do

Data-scientist-what-I-do

 

Source: siliconrepublic
