Laptop computer showing various charts and graphs of data visualization.
Do you work with data? Would your students benefit from being able to use advanced statistics to analyze real-world datasets? If so, ChatGPT Plus might be able to help. In short, ChatGPT’s Advanced Data Analysis mode accepts file uploads and runs Python to manipulate data. This means that, on top of everything else that ChatGPT Plus can already do, it can clean data, merge files, describe and interpret data, suggest and run statistical tests, make inferences, explain the methods it uses in detail, visualize data, and provide links to download files it creates.

To demo a few of the possibilities, let’s look at some health insurance claims data. The file contains 15,000 rows with each row containing a claims amount, the individual’s age, sex, weight, BMI, number of dependents, blood pressure, job title, hereditary diseases, city of residence, and whether the person is a smoker, has diabetes, or regularly exercises. First, I’ll upload the file and ask, “Using this dataset, will you please help me determine how different variables affect claims costs?” Then, we’ll see where things go. (This is a long video. I’ll put times of key events below. Also, here’s a link to the conversation.)

0:00 – 1:22 – Exploratory data analysis

1:23 – 1:56 – Handling missing values and cleaning data

1:56 – 3:00 – Visualizing distributions

3:00 – 4:51 – Visualizing categorical variables (Boxplots)

4:51 – 5:44 – Visualizing continuous variables (Scatterplots)

5:44 – 6:20 – Suggesting statistical tests

6:20 – 7:35 – Multiple linear regression analysis and key findings

7:35 – 9:14 – Creating a table and providing a download link

9:14 – 10:24 – Drafting a methods section

10:24 – 11:58 – Explaining the test

11:58 – 13:20 – Generating conclusions
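To give a sense of the kind of Python that can sit behind the cleaning and regression steps, here is a minimal sketch. It is not the code ChatGPT produced in the video. The column names (`age`, `bmi`, `smoker`, `claim`), the synthetic rows, and the choice to drop incomplete rows are all my assumptions for illustration, and the sketch uses only the standard library so it runs anywhere; a real session would more likely lean on a data analysis library.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, predictors, target):
    """Multiple linear regression via the normal equations (X'X) b = X'y."""
    X = [[1.0] + [float(r[p]) for p in predictors] for r in rows]
    y = [float(r[target]) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return dict(zip(["intercept"] + predictors, beta))


# Synthetic stand-in for the claims file (hypothetical columns and values).
rng = random.Random(0)
rows = []
for _ in range(200):
    age = rng.randint(18, 64)
    bmi = round(rng.uniform(18, 40), 1)
    smoker = rng.randint(0, 1)
    # Noise-free by construction, so the fit should recover these weights.
    claim = 2000 + 120 * age + 80 * bmi + 9000 * smoker
    rows.append({"age": age, "bmi": bmi, "smoker": smoker, "claim": claim})

# Simulate missing values in a handful of rows.
for i in range(0, 200, 25):
    rows[i]["bmi"] = None

# Cleaning step (an assumption): drop any row with a missing value.
clean = [r for r in rows if all(v is not None for v in r.values())]

coefs = fit_ols(clean, ["age", "bmi", "smoker"], "claim")
for name, value in coefs.items():
    print(f"{name}: {value:.2f}")
```

Each coefficient estimates how much the claim amount changes when that variable increases by one unit and the others are held constant. Real claims data is noisy, so the estimates would come with standard errors and p-values, which this sketch does not compute.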

In addition to quantitative data analysis, there are all sorts of other things that this mode of ChatGPT can do. My dabbling has only scratched the surface, but ChatGPT’s Advanced Data Analysis has also helped me to do the following:

    • Categorize, summarize, and draw conclusions from survey data.
    • Create a PowerPoint slideshow from a zipped folder of images.
    • Create a “game” out of a set of course design recommendations and then show me how to add the HTML, CSS, and JavaScript files it created to GitHub to preview.
    • Write 10 sample papers, score the papers based on a rubric, offer overall feedback as well as specific feedback on each criterion, create new files with the added comments, and produce a spreadsheet with each paper’s score on each criterion, total score for each paper, and overall descriptive statistics.
    • Create an interactive map showing changes in CO2 emissions over time. As part of this process, ChatGPT walked me through how to install Python, add the Plotly graphing library, and use Terminal to create the map. (The climate data I used to create the map came from this demo of ChatGPT’s Advanced Data Analysis from MIT.)
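As an illustration of the rubric-scoring item above, here is a rough sketch of how a score spreadsheet with totals and descriptive statistics could be assembled in Python. The paper names, criteria, and scores are made up, and `build_score_sheet` is a hypothetical helper of my own, not something ChatGPT generated.

```python
import csv
import io
import statistics

# Hypothetical rubric scores: one dict of criterion -> score per paper.
criteria = ["thesis", "evidence", "organization", "mechanics"]
scores = {
    "paper_01": {"thesis": 4, "evidence": 3, "organization": 4, "mechanics": 5},
    "paper_02": {"thesis": 2, "evidence": 3, "organization": 3, "mechanics": 4},
    "paper_03": {"thesis": 5, "evidence": 4, "organization": 4, "mechanics": 4},
}


def build_score_sheet(scores, criteria):
    """Return CSV text (one row per paper) and summary stats of the totals."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["paper"] + criteria + ["total"])
    totals = []
    for paper, marks in sorted(scores.items()):
        total = sum(marks[c] for c in criteria)
        totals.append(total)
        writer.writerow([paper] + [marks[c] for c in criteria] + [total])
    summary = {
        "mean": statistics.mean(totals),
        "median": statistics.median(totals),
        "stdev": statistics.stdev(totals),
        "min": min(totals),
        "max": max(totals),
    }
    return buffer.getvalue(), summary


sheet, summary = build_score_sheet(scores, criteria)
print(sheet)
print(summary)
```

Writing to an in-memory buffer keeps the example self-contained; swapping `io.StringIO()` for an opened file would produce a downloadable `.csv` instead.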

I hope that this overview has inspired you to see what you can do with your own data or to integrate real-world data analysis into your teaching. If you do choose to use this tool, just be sure that you do not upload data containing sensitive information. Also, in another post, I discuss how advanced prompting techniques can help shape interactions with generative AI chatbots. A lot of the techniques shared there are also applicable when working with ChatGPT to do data analysis. Finally, ChatGPT Plus does cost $20/mo. However, in addition to the advanced data analysis functionality, a subscription also provides access to plug-ins that extend ChatGPT’s functionality, a means to connect ChatGPT to the Internet, and DALL-E 3, OpenAI’s amazing text-to-image generator. It’s worth checking out.

NOTE: Different modes of ChatGPT are merging. By the time you read this, you might not have to select the Advanced Data Analysis mode to access this functionality. It’s also possible that you will no longer see Advanced Data Analysis as an option but the functionality will still be available within a regular ChatGPT Plus chat. This stuff is changing so fast.