Big Data, Small World: Kirk Borne at TEDxGeorgeMasonU (Video)

Kirk Borne is Professor of Astrophysics and Computational Science in the George Mason University School of Physics, Astronomy, and Computational Sciences (SPACS). It turns out he is the father of the term “unknown unknowns” (things we do not know we don’t know), popularized by former Secretary of Defense Donald Rumsfeld and later popularized by Avinash Kaushik as “the unique space in which big data analysts should actually play.”

 

Posted in Big Data Analytics

Big Data Friday

“I can’t express how infuriated I am that my credit history, phone activity, and online browsing habits are being systematically collected and archived without my knowledge by undisclosed organizations that aren’t trying to sell me products”–Area Man

“PRISM 1.0 was a little glitchy, and now that we’ve smoothed out the bugs, well, your privacy, especially inside your own home, will be a thing of the past. The technology is so good that it will basically be as if a member of the NSA is standing right behind you at all times”– NSA director General Keith B. Alexander announcing PRISM 2.0

“NSA email me with job offer. Offer say ‘To accept, nod once. To decline, nod twice’”–BigDataBorat

Posted in Misc

Big Data Friday: Borasky’s Law

  • Murphy’s Law: Anything that can go wrong, will go wrong.
  • O’Toole’s Corollary: Murphy was an optimist.
  • Sturgeon’s Law: 90 percent of everything is crap.
  • Mencken’s Law: Nobody ever went broke underestimating the intelligence of the American public.

Borasky’s Law: Sturgeon and Mencken were optimists, too.

Source: What Hath Von Neumann Wrought?

Posted in Misc

LinkedIn’s Daniel Tunkelang on How to Interview a Data Scientist

Tunkelang: The O’Reilly Strata Conference brings together an incredible community of people working on big data. This year, I decided to do something different for my presentation. Rather than talk about science or technology, I addressed the practical problem of interviewing the candidates to build teams of data scientists.

[slideshare id=16798687&w=427&h=356&sc=no]

Posted in Big Data Analytics, Big Data Jobs, Data Science, Data Science Careers, Data Scientists

Vincent Granville’s 66 job interview questions for data scientists

 

  1. What is the biggest data set that you have processed? How did you process it, and what were the results?
  2. Tell me two success stories about your analytic or computer science projects. How was lift (or success) measured?
  3. What is: lift, KPI, robustness, model fitting, design of experiments, 80/20 rule?
  4. What is: collaborative filtering, n-grams, MapReduce, cosine distance?
  5. How would you optimize a web crawler to run much faster, extract better information, and better summarize data to produce cleaner databases?
  6. How would you come up with a solution to identify plagiarism?
  7. How would you detect individual paid accounts shared by multiple users?
  8. Should click data be handled in real time? Why? In which contexts?
  9. What is better: good data or good models? And how do you define “good”? Is there a universal good model? Are there any models that are definitely not so good?
  10. What is probabilistic merging (AKA fuzzy merging)? Is it easier to handle with SQL or other languages? Which languages would you choose for semi-structured text data reconciliation?

To see the other 56 questions assessing “the technical horizontal knowledge of a senior candidate at a high level” go here 

Posted in Data Science, Data Science Careers, Data Scientists

Data Science at Netflix with Elastic MapReduce

[youtube http://www.youtube.com/watch?v=oGcZ7WVx6EI]

Posted in Data Science

DJ Patil at LeWeb, December 2012

[youtube http://www.youtube.com/watch?v=J_CYKk8q1Ao]

Summary of the presentation by Ben Rooney here

Update: Ben Rooney interviews DJ Patil

[youtube http://www.youtube.com/watch?v=0LtzMhr0ZCM]

Posted in Data Science, Data Scientists

Past Courses in Big Data Analytics and Data Science: Content Online

Analyzing Big Data with Twitter (UC Berkeley, School of Information) (Fall 2012)

Introduction to Data Science (Columbia University, Statistics Department) (Fall 2012)

Introduction to Data Science (UC Berkeley, Computer Science) (Spring 2011)

Posted in Big Data Analytics, Big Data Education, Data Science, Data Science Education

Social Media & Web Analytics Innovation Summit

Join me in Boston this September 13 & 14 for the exclusive Social Media & Web Analytics Innovation Summit – bringing together the industry’s most innovative leaders and professionals for two days in an open and interactive environment.

The event will combine keynote presentations from over 35 industry experts, with interactive breakout sessions and open discussion. There will also be networking opportunities and workshops to share industry insights and innovation with your peers.

Confirmed Speakers include:

– Vice President, Digital Marketing & Analytics, Discovery
– Vice President, Web Analytics, Amazon
– Head, Digital Marketing, Siemens
– Senior Vice President, Research, NBCUniversal
– Director, Product Intelligence, Salesforce
– Senior Director, Personalization & Targeting, CBS Interactive
– Director, Global Social Media, Ancestry
– Director, Business Intelligence, KPMG
– And many more…

Register online at http://analytics.theiegroup.com/social-boston/registration

Posted in Misc

Big Data Quotes of the Week: August 10, 2012

“With big data, you have only two concerns, but they are, naturally, big ones: where the data will come from and what your company will do with it. Solve these and you have big data licked… IT projects have to be fully buzzword-compliant or they’ll fail. For a big data project, this means Hadoop. If you don’t want to invest staff time and energy learning this technology, do what my client did: Build a virtual server, install MySQL on it, and assign the name ‘Hadoop’ to the server. When your BDSC (big data steering committee) asks if you’ve installed Hadoop, you can answer in the affirmative with a clear conscience”–Bob Lewis

Posted in Big Data Analytics, Data Scientists, Quotes