Course Detail

Data Analysis with R and Python (32100)

Course Description by Faculty

  • Lu, Haihao
  • Content
    Digital technologies and data analytics are radically changing the business models of today's organizations. The new areas for exceptional growth now orbit around artificial intelligence, machine learning, and deep learning. Lacking an understanding of how to work with data using analytics puts you at a competitive disadvantage.

    For those seeking to learn the basics of programming and gain working knowledge in analytics, we offer a challenge: Data Analysis with R and Python, which combines two of the most popular programming languages in one ten-week course. The best way to learn business analytics is to analyze data yourself, using modern languages and tools.

    This class is based on Data Analysis with R (41205) and Data Analysis with Python (36108). It can be taken as preparation for several downstream classes: Business Statistics (41000), Applied Regression (41100), and Business Applications of Natural Language Processing (42118).

    This course requires no prior programming experience. It is a "Zero to One" class designed for absolute beginners looking to become productive quickly. If you already have some skills in one or both of the languages, you should consider taking one of the more advanced courses listed above.

    Although you will be learning two languages, our goal will be to give you minimal proficiency that you can build upon in future coursework and other opportunities. The higher-level goals are to:

    1. Understand the process of data analytics on a personal, emotive and experiential level.

    2. Learn “developer thinking” and communicate more effectively and credibly with programmers on your teams.

    3. Manage analytics projects better with accurate assessments of costs and benefits for different analytics processes.

    4. Be able to make changes and code a few snippets yourself.

    Course Format:
    We will learn through a series of case studies and data sets, starting with cleaner, more structured examples to build the fundamental skills. We will then work through more complex, real-world applications as you become more knowledgeable and confident in analytics.

    Once you have a grasp of the data analysis process in R, we will practice it again in Python. This way you solidify what you previously learned while picking up specific skills in a new language. We will focus on essential, applicable knowledge instead of theoretical background.

    The process will generally consist of the instructor demonstrating first and then giving the class a challenge to practice it and think up questions. You are strongly encouraged to team up and work together, but you will submit assignments individually.

    Expectations:
    We require a working laptop with an up-to-date operating system. The first class assignment will step you through the necessary preparations.

    We will provide detailed instructions, textbooks, video tutorials, and coding cheat sheets for each programming language. Support also includes one-on-one appointments, review sessions, and e-mail inquiries.

    Every student has his or her own learning style and joins us to embrace a challenge and to learn through experience. We encourage you to gather additional information and share it on the class discussion board.
    Format
    • Lectures

    • Discussion

    • Case Studies

    • Group Projects

  • Prerequisites
    Restrictions
    • No non-Booth Students

  • Materials
    You will receive a detailed guide to setting up a development environment on your Mac or Windows laptop. All required software is free.

    Problem set data are drawn from Quantitative Social Science: An Introduction by Kosuke Imai and An Introduction to Statistical Learning with Applications in R. Python for Data Analysis by William McKinney will also be used.
    Resources
    • Canvas Site Available

  • Grades
    ATTENDANCE: Class will be mostly exercises/labs, so ATTENDANCE IS MANDATORY.

    HOMEWORK: There will be in-class group assignments and at-home individual assignments.

    MIDTERM: The individual take-home midterm exam is assigned in week 5 and due in week 6.

    FINAL PROJECT: The final project is due during exam week. There is no class that week.

    CLASS GRADE CALCULATION: Grades will be based on the following breakdown: 40% group and individual assignments, 20% midterm, 20% final project, 20% participation.
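
    As a rough illustration of how the breakdown above combines into a single score, here is a minimal Python sketch. The weights come from the breakdown; the function name, the 0-100 scale, and the example scores are hypothetical and are not part of the official grading policy.

```python
# Weights taken from the stated grade breakdown.
WEIGHTS = {
    "assignments": 0.40,    # group and individual assignments
    "midterm": 0.20,
    "final_project": 0.20,
    "participation": 0.20,
}


def weighted_grade(scores):
    """Combine component scores (assumed to be on a 0-100 scale) into one weighted score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "assignments": 90,
    "midterm": 80,
    "final_project": 85,
    "participation": 100,
}
print(round(weighted_grade(example_scores), 2))
```

    Each component contributes in proportion to its weight, so assignments count twice as much as any other single component.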

    No auditors. No pass/fail grades. No non-Booth students.
    Grades
    • Graded homework assignments

    • Graded attendance/participation

    Assessment & Testing
    • Midterm

    • Final Project

    Restrictions
    • No auditors

    • No pass/fail grades

  • Syllabus
  • Spring 2022, Section: 32100-01, TH 8:30AM-11:30AM, Harper Center C08
  • Spring 2022, Section: 32100-81, TH 6:00PM-9:00PM, Booth 455 (NBC Tower) 130
Description and/or course criteria last updated: June 24 2021