Data Acquisition for Analytics and Data Science

This certificate is intended for data analysts and data scientists who have completed at least one programming course in Python or have equivalent Python experience.

Analytics and machine learning offer plenty of fun and interesting tasks, such as selecting features, training a model, and interpreting results. But all of that presupposes a tidy data set that is suitable for analysis or for training models. Practitioners commonly estimate that data collection and cleaning account for roughly three-quarters of any analysis effort. This certificate teaches you how to collect, coalesce, and clean data from multiple sources in preparation for your analysis work. Participants will learn about data formats, how information flows through the Internet and the web, how advertising companies track your web activity, how to use REST API services (from Zillow, YouTube, IMDB, etc.), how to scrape data from HTML, and finally how to simulate a human using a web browser in order to scrape data from JavaScript-based websites. Classes will be lab-driven, with lectures, explanations, and demos from the instructor.