Intro to Data Science

What is Data Science?

  • Relatively new field intersecting CS, Statistics and Business
  • Determining consumer trends, business opportunities, and user behavior
  • Drawing valuable insight from data - data is often treated as "unbiased", though collection and sampling can introduce bias
  • Some areas that overlap:
    • Computational Statistics
    • Big Data and Data Mining
    • Machine Learning

How is Data Science used?

  • A/B testing - web applications with different features
  • Consumer recommendations - Amazon, Netflix, etc.
  • Consumer trends - How people use certain products
    • FastCompany - The Best (and Worst) Times to Post on Social Media
  • Revenue opportunities - ad targeting, investor/stock forecasting
  • Accessibility
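
To make the A/B testing use case concrete, here is a minimal sketch of one common way to compare two versions of a page: a two-proportion z-test on conversion rates. The function name, the visitor counts, and the conversion counts are all made up for illustration, and a real analysis would also need to consider sample size planning and how users were assigned to each version.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b: number of conversions seen in versions A and B.
    n_a / n_b: number of visitors shown versions A and B.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF written with erf, so no third-party library is needed.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical experiment: 2400 visitors per version (numbers are invented).
z, p = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two versions is unlikely to be chance alone; it does not say by itself whether the difference is large enough to matter.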

Technologies to "do" Data Science

  • Programming Languages - Python, R (statistical computing)
  • Big Data Frameworks - Hadoop, Spark, Pig
  • Concepts - MapReduce, Aggregation, Regression, Distributions
  • Visualization Frameworks - Matplotlib, D3, Processing
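
As a rough illustration of the MapReduce and aggregation concepts listed above, the sketch below counts words in plain Python using only the standard library. The function names and the sample documents are hypothetical, and this runs on one machine; frameworks such as Hadoop or Spark apply the same map, group, and reduce idea across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: aggregate each key's values into a single count."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in groups.items()}


# Made-up input documents for illustration.
documents = ["data science uses data", "big data frameworks"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

The map step can run on each document independently, which is what makes this pattern easy to spread across many workers.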

Tutorial

Brainstorm

  • What would you like to build?
  • What features? Sketches.
  • What technologies do you think you'll need?
  • What is your plan for developing?

Potential Projects

  • Your Own Personal Website
  • Data Analysis on a Dataset
  • Web or Mobile App
  • WCS Display for our TV
  • Chrome Extension
  • Slackbot

Submit your Projects!

http://tinyurl.com/ttproject