Final Project: ChatGPT UX Case Study

Introduction and Project Focus:

Perhaps it’s a fool’s errand to attempt to improve the design and user experience of the fastest-growing app in history. ChatGPT reached 1 million users within 5 days of its release and 100 million users in just two months. Some if-it-ain’t-broke,-don’t-fix-it wisdom might seem obvious to apply to this wildly successful product, but UX experts Gavin Lew and Robert Schumacher issue an important warning in their 2020 book AI and UX. According to them, users who encounter problems early on in their use of any product are likely to abandon the product, often never to return. Therefore, if OpenAI (the developer of ChatGPT 3.5) wants to hold on to its massive user base and attract and keep new users (and, equally importantly, transition users to its upgraded paid version), it is imperative that its ChatGPT 3.5 generative AI model deliver an accessible design and experience without any frustrating flaws. But to create the most desirable experience, UX designers must first figure out what their users are seeking and what exactly they need and desire from the product.

For this study of ChatGPT, the issue of use cases is the first area of focus. Beyond addressing the UX problems related to serving the most appropriate and popular use cases, there are two other aspects of the ChatGPT experience that merited attention: interface design/functionality and response accuracy. Informed by industry research, academic research, and focus group testing, this case study presents analysis and recommendations for ChatGPT’s improvement related to use cases, interface design/functionality, and response accuracy.

Competitive Analysis:

A recent (February 2024) competitive analysis study from Copenhagen Economics looks at the current state of generative AI:

https://copenhageneconomics.com/wp-content/uploads/2024/03/Copenhagen-Economics-Generative-Artificial-Intelligence-The-Competitive-Landscape.pdf

Included in the report’s executive summary is the following:

“This white paper provides a preliminary review of current developments in the GenAI space and the emerging implications for competition. Further research is needed to shed additional light on the evolving competitive situation in GenAI, as well as any case-specific matters, particularly as this is a fast-evolving sector.”

Also included is this analysis of the most notable generative AI models:

“First, most foundation model developers (e.g., OpenAI, Anthropic, Mistral) already provide their own user-facing applications. At the same time, many foundation model developers offer rival developers the ability to build applications on top of their existing models via open models or closed APIs (e.g., Jasper.ai writing assistant built on OpenAI GPT models).

Second, players in different levels of the value chain often specialise in specific domains (e.g., BloombergGPT in finance, or Isomorphic Labs in the pharmaceutical drugs discovery process) or tasks (GitHub coding copilot, or Perplexity.ai in search). The success of these models suggests that specialisation is a viable business model.”

The second observation is especially important for this ChatGPT case study because it reveals a difference between models that serve specific domain purposes and those (like ChatGPT, claude.ai, Llama 2, and Gemini) that function more generally or without strongly defined purposes.

There are also a number of comprehensive comparison guides for generative AI models, and this February 2024 guide does an effective job of assessing a handful of the most common use cases:

https://www.zdnet.com/article/best-ai-chatbot/

This comparison guide offers valuable information for the UX researcher by pointing out some of the pros and cons of each model. Its analysis suggests that ChatGPT 3.5 lacks two distinct features that some other free generative AI models offer: a live connection to the internet and the ability to upload files for analysis. While the lack of these features may be hurting ChatGPT 3.5’s ability to compete with other free generative AI models, the paid subscription upgrade (ChatGPT 4) does offer both internet connectivity and the ability to upload files for the system to read. From a profit standpoint, it may make sense for OpenAI not to offer all its available features in the free version, but if other companies are offering them at no charge, then those other companies are positioning themselves to deliver a superior overall user experience among free generative AI models. To remain competitive, ChatGPT 3.5 should incorporate internet connectivity and file uploading. However, OpenAI could limit file sizes and the number of responses that draw on the live internet in order to maintain an incentive for users to upgrade. It seems important to at least show users what the upgraded version of the product is capable of.

Issue #1: Identifying Use Cases

Understanding a product’s primary use is key to designing the best experience for the user. Designers are used to having an end goal or ultimate purpose in mind when they seek to improve how a particular product works. If the product is a shovel, the designers know to work on creating the best digging experience for the target users. Understanding that products could have secondary or multiple uses is important as well, and sometimes those uses must be factored into design. In the case of a Swiss Army Knife, the designers considered a number of different uses for their product and then chose to accommodate all of those uses within a compact device.

ChatGPT and other similar generative AI tools are unusual among human-created technologies: unlike most tools, which are created to solve defined problems, generative AI is more like a solution looking for problems to solve. This is evident in ChatGPT’s user interface, which currently provides users with four different suggestions for how it can “help you today.”

 
 

Google’s generative AI model Gemini presents users with a very similar set of use options. While these options may change daily or upon new access of the tool, this example from Gemini shows two of four options that are related to cooking. This makes it appear as if Gemini is limited in what it can do. Even the first and fourth suggestions about brainstorming for presentation ideas and outlining a logical sales pitch are similar.

Many first-time users of ChatGPT or other models are likely accessing their chosen tool with no defined purpose, so it may be helpful to include some very specific options for how the product can be used. However, the options provided on the webpage may not be the best strategy for engaging curious explorers. The current suggestions give some sense of what ChatGPT might be able to do, but they are also overly specific. A better option might be to highlight the broader tasks that ChatGPT can help with, along with multiple examples of each task (instead of just one), and then have the tool initiate the chat and help the user narrow down to something specific they want or need help with. Here is a mockup:

Issue #2: Interface Design and Functionality

Method for testing interface and functionality:

A group of 12 first-year college students used ChatGPT for a prewriting activity designed to help them explore a topic or tentative thesis statement. The use case was thus pre-determined and defined for them, although they had a choice between which initial prompt to use.

The demographics of the student test group are reflective of the college they attend: racially diverse, majority female, and mostly traditional college students aged 18-24. While different groups of users will have different needs and expectations of ChatGPT, this group is a likely target audience for generative AI models, as they may learn to use the technology in college and then bring their knowledge and use of generative AI with them as they enter their career fields.

Prompt options for product test:

Prompt for exploring a topic:
Please help me prepare for writing my persuasive essay about [insert topic]. Ask me questions–one at a time–that will help me come to a thesis, understand the strengths and weaknesses of my position and any opposing view(s).

 

Prompt for finding and understanding potential credible sources:
Can you point me to some prominent researchers who have published on [insert topic] and very briefly summarize their positions?


Prompt for testing a tentative thesis when you have a guess of what your position might be:
Let’s engage in a dialectic exercise. I will present my tentative thesis, and then you will take the role of Socrates and ask me yes or no questions (one at a time) to expose any potential flaws or contradictions in my position.    

Data Collected: Students submitted their ChatGPT pre-writing activity conversations in the form of sharable links provided by ChatGPT. The user interface of ChatGPT does include a helpful “share” button at the top right of the screen (desktop version).

This share button creates a special link that captures an entire chat session that others can view without having to access the original user’s account. The user can also choose whether to remain named or anonymous in the shared version of the chat. Users in this case study generally found the share feature offered by ChatGPT to be easy to locate and use.

The researcher also conducted a focus group to gather opinions and insights regarding the design and experience of the students.

The full ChatGPT conversations collected from students are of limited value for UX purposes, but from a qualitative perspective they do reveal something important. The students prompted and replied to ChatGPT in very different ways. Some used more formal, properly punctuated language while others were far more informal, using abbreviations and texting language. Despite this, the responses from ChatGPT did not show variation in tone and style. It might make sense to have ChatGPT adjust its formality according to user inputs, or to make the level of formality an adjustable option for the user.

Focus Group Discussion Summary and Interface Recommendations:

  • A majority of the students reported using some type of generative AI before, typically for academic reasons, but none of the students were regular users except for one who said she used Snapchat’s AI feature. This particular student remarked that she liked the Snapchat feature better than ChatGPT because the Snapchat AI did not decline to answer questions like ChatGPT does and Snapchat’s AI was not as “neutral.”
  • Students found the website for ChatGPT easily and reported no problems with creating an account. Some reported that creating the account/signing in using their existing Google account made things easier.
    • Recommendation: Keep the simplified sign-in process for Google users.
  • Initial impressions of the interface were mixed, with some liking the “clean” layout and others calling it “simple in a bad way.” Some users felt like they did not know exactly what to do first within the site. Based on some users’ confusion, the layout would not score extremely well on a five-second test.
    • Recommendation: Reduce confusion about what users should do first by having ChatGPT start the conversation with a letter-by-letter appearance of text instead of the static “How can I help you?” message. This text would automatically open the chat conversation instead of waiting for the user to initiate the first response.
  • Some students appreciated the ability to change their personal settings to show the site in dark mode, but most students did not seek or find that option. Some felt that the dark mode option from their browser was sufficient.
    • Recommendation: Add a simple dark mode toggle switch at the top of the interface.
  • All students accessed the desktop version of the tool, but many said they would consider using the mobile app.
  • About half of the students did notice the accuracy disclaimer at the bottom of the screen, and some students agreed that a pop-up notification at the start of a chat would be better because the disclaimer is not highly visible.
    • Recommendation: Increase font size of disclaimer and add disclaimer as a pop-up message at the beginning of a chat session. Include an option on the pop-up that says, “Do not show this message again.”
  • Students commented that the letter-by-letter appearance of the text is “cool” and makes it feel like “someone is typing to you.”
  • Students generally agreed that the ability to verbally enter prompts and responses would be a good feature to add. Most did not realize that they could use certain internet browsers for that same feature.
  • Almost all students found it easy to get responses from ChatGPT, except for one student who had issues when using a Safari browser. When she switched to Chrome, that resolved her issues.
    • Recommendation: Improve the tool’s functionality in Safari or stop offering the tool through Safari altogether. No experience may be better than a frustrating experience in this case.
  • Students generally had no issues with the speed of ChatGPT responses, but some thought the responses were too long.
    • Recommendation: Have ChatGPT ask users if its responses are sufficient in length during the chat.

 

Issue #3: Accuracy of Responses

There are a handful of reasons why potential users of generative AI choose instead to steer clear of the technology. Concerns about both intellectual property rights and bias related to the training of these models are enough to keep some away completely. A more pressing concern for UX designers is the accuracy of responses generated by the different models. For any users on the fence about trying ChatGPT, one inaccurate response may be enough to cement their opinion of the tool as unhelpful or even dangerous. As many users are using ChatGPT as a source of information on a range of topics and as an alternative to Google search, it is essential that responses include accurate information and avoid the so-called “hallucinations” and misinformation that harm the tool’s credibility and lead to the frustrating experiences that drive users to alternatives. Furthermore, since ChatGPT 3.5’s training data cuts off in January 2022 and it lacks an internet connection, it will not usually provide information about events after January 2022. If it does, then it is most likely providing misinformation because it has no way of accessing new information. Other free models (like Perplexity.ai, for example) do include internet connectivity and are able to pull up-to-date information, giving them a major edge over ChatGPT 3.5.

Industry studies and scholarly research are pointing toward improvements in accuracy for generative AI. One 2023 study by Walters and Wilder found that GPT4 included significantly fewer false citations than ChatGPT 3.5. In the study, 55% of sources created by ChatGPT 3.5 were found to be fictitious whereas only 18% of those created by GPT4 were. It is likely that GPT4’s access to the internet contributed to this improved accuracy, as the model was able to better verify the existence of sources. This is further reason to make internet connectivity a priority change for ChatGPT.

Other industry studies also suggest rapid improvement in generative AI accuracy.

Originality.ai released data in February of 2024 showing GPT4 outperforming ChatGPT 3.5 in a fact-checking task (Gillham, 2024), and Anyscale released data indicating GPT4 outperforms both ChatGPT 3.5 and humans in identifying accurate summaries of news reports (Kadous, 2023). This is good news for the user experience, but generative AI already has a significant stigma when it comes to accuracy. Some misinformation that makes its way into generative AI responses is a result of the program simply trying to execute on a prompt without taking into account any need for accuracy, but other inaccuracies can be the direct result of biases that influence outputs.

Since the optimal user experience of ChatGPT includes program outputs with accurate information, designers need to ensure the model’s training data is more representative of marginalized online communities and cultures. This means finding better quality data than the primarily English-language swaths of the internet collected by Common Crawl (Baak, 2024).

Another source of bias that generates less accurate information is the custom response feature of ChatGPT. This feature allows users to tailor the responses they receive based on their own preferences. For example:

A response from ChatGPT based on these instructions thus becomes biased and will ignore certain information, making the responses potentially less accurate.

 

Although these kinds of tailored responses do create a preferred user experience, in the long run, the presentation of biased information to users (even if they ask for it) is going to hurt the overall reputation of ChatGPT and other generative AI models. This is an ethical problem for UX designers that deserves careful consideration. There may be ways to give users some of their preferences in terms of length and formality of responses without reflecting a user’s own bias back at them.

Conclusion

ChatGPT brought generative AI to the world’s attention in 2022, but today it faces strong competition from similar free (and freemium) models that offer important features that ChatGPT only offers in its subscription-based GPT4 model. There are a handful of minor changes that ChatGPT could make to its interface to enhance users’ overall experience, but if OpenAI chooses to keep ChatGPT 3.5 offline, it is taking a major risk. The level of misinformation in the responses of generative AI systems is a major concern to many, and users concerned with accuracy may very well choose to experiment with more accurate free internet-connected models like Perplexity.ai and Gemini before they choose to pay $20 per month for that feature in GPT4.

 

References

Baak, S. (2024, February 6). Training data for the price of a sandwich. Mozilla Insights. https://foundation.mozilla.org/en/research/library/generative-ai-training-data/common-crawl/

Gillham, J. (2024, February 2). AI fact checking accuracy study. Originality.ai. https://originality.ai/blog/ai-fact-checking-accuracy

Kadous, W. (2023, August 23). Llama 2 is about as factually accurate as GPT-4 for summaries and is 30X cheaper. Anyscale. https://www.anyscale.com/blog/llama-2-is-about-as-factually-accurate-as-gpt-4-for-summaries-and-is-30x-cheaper

Lew, G., & Schumacher, R. M., Jr. (2020). AI and UX: Why artificial intelligence needs user experience. Apress.

Walters, W. H., & Wilder, E. I. (2023). Fabrication and errors in the bibliographic citations generated by ChatGPT. Scientific Reports, 13(1), 1–8. https://doi.org/10.1038/s41598-023-41032-5

Project Update: Reflecting on Data Collection Through a Focus Group Session

I would be willing to bet that I have participated in far more focus groups than the average person. When I was hustling to pay off my student loans in the 2010s, I discovered that the city of Atlanta had a slew of market research companies that would pay me for my time and opinions on a number of different products: restaurant foods, scratch-off lottery tickets, pilot episodes of television shows and commercials, deep sea fishing excursions, NASCAR race events, household appliances. Those are just some of the focus groups I remember participating in.

There is a bit of a dirty secret to my former success in selling my opinions to market research companies though, and I wouldn’t say that I am proud of it now. I had to lie…rather, I chose to lie in order to qualify for most of the studies that I participated in. If a market research company’s screener was recruiting subjects to taste a new menu item for a fast-food chain that sold seafood, then they needed to find people who ate at these establishments. They would call all the people in their database of potential subjects and work to fill quotas that typically required some diversity in the group of subjects (age, race, gender, income, etc.), but the subjects all had to have one thing in common: they had to belong to the target audience for the product, as likely users or consumers of it. The market research companies didn’t want vegans in their Chick-fil-A focus groups, unless they were testing out a new non-meat option or something.

But there is a critical flaw in the way these focus groups operate. It is a problem that opens up an avenue for unscrupulous people like me (old-me, anyway) to easily exploit their system. The company that wants the focus group data typically outsources this task to third parties. Company X hires a market research firm to collect data, and that market research firm often relies on another market research company to assemble the groups of participants. (At least this is how I saw it working in the 2010s.) The real breakdown seemed to occur in relying on the initial screeners to produce a group of people who fit the specifics that Chick-fil-A or their hired market research firm decided they wanted data from. The call screeners’ job was to find participants, and back then, I quickly realized that doing so was not always easy for them. Sometimes I would get through a 10- to 20-minute screening process (answering questions about demographics and purchasing behavior) only to have the screener say, “Sorry, you don’t qualify” or “You do qualify, but I already have enough males in their 30s.”

Experienced screeners worked more efficiently and could figure out if I qualified much faster by asking the right questions in the right order (obviously deviating from their scripted questionnaire). And the occasional desperate-for-subjects screener would be intentionally leading in their questioning: “Are you sure you haven’t bought a scratch-off lottery ticket in the past week?” I think the screeners were even incentivized financially to fill groups, earning bonuses tied to numbers that encouraged them to ignore quality. Either way, once I caught on to the different motivations of the parties involved, I started to play the game. I would be intentionally vague in my screener call responses, allowing “good” screeners to lead me to the desirable answers. Then I would show up to the focus group, participate as much as I could, offer legitimate and genuine feedback whenever possible, and then collect my check or gift card. Rinse, repeat.

My dishonesty mostly involved lying about my consumer behavior. It wouldn’t have been easy to lie about demographic information because much of that would be obvious when I showed up to the focus group and they verified my date of birth and address. My race would have been relatively obvious too, although one time my wife (accomplice and fellow liar) did get herself qualified for a self-tanner product and showed up as the only non-orange person at the group! I guess she didn’t think that one completely through, but she still got her honorarium.

After participating in a focus group, my wife and I would often discuss the performance of the focus group leader. As educators, we both knew what leading a group discussion entailed and we marveled at just how incompetent some of these market research professionals were at their jobs. There were also the rare few who were excellent, but the bad ones were always worth gossiping about. In my hubris, I almost always thought I could do better. Considering my level of experience as a participant toward the middle and end of my multi-year run as a frequent focus group imposter, I was probably somewhat right.

When it came time this semester for me to collect data for my current UX project, I thought about my experience with focus groups and decided that doing a dozen interviews at once would be an efficient way to go about this step. It was time for me to see if I really could run a focus group better than the professionals I critiqued years ago.

I was prepared. I had two pages of questions all related to the generative AI system that my students-turned-informal-research-participants had used just days prior. I also had 15 years of experience leading discussions on all types of subjects with diverse groups of students, including high schoolers, college students, and even incarcerated men in a state prison. I was ready to prove myself as a market research genius.

Ultimately, I failed at achieving true focus group leader glory for two reasons:

#1. It is difficult to keep the participants on topic. My list of questions was well thought out and logical. One question or set of questions logically led to another, but my research subjects were not privy to this planning and therefore jumped ahead and backwards many times. This made it harder to elicit the feedback I was seeking. Fortunately, I recorded the session so I could sift out the data, but I believe there were lost opportunities and lost data because of the way I failed to control and redirect the respondents at times. It was a balancing act because more feedback is generally preferable to no feedback, and I didn’t want to shut the respondents down.

#2. I found it extremely difficult to avoid leading my respondents toward my own opinions and observations regarding the user experience of the product (ChatGPT). Hubris strikes again. I could not remain an impartial enough recorder of responses. I saw opportunities to seek additional feedback from respondents that would confirm my own opinions, and I couldn’t help myself.

Despite these shortcomings, I did collect data that will allow me to better assess the user experience of the target product. If I could do the focus group again, I think I would start with more open-ended questions and allow the discussion to go where the respondents lead it, rather than trying to rigidly stick to the list of questions I developed.

Book Review: AI and UX: Why Artificial Intelligence Needs User Experience

AI and UX: Why Artificial Intelligence Needs User Experience, by Gavin Lew and Robert M. Schumacher Jr. 1st edition 2020, Apress. EBSCOhost, search.ebscohost.com/login.aspx?direct=true&AuthType=ip,shib&db=cat06559a&AN=ggc.996791144102945&site=eds-live&scope=site.

Why did I choose this book?

My current research interests revolve around all things generative AI, especially its applications for writing and teaching. I am trying to learn as much as I can, and I think an understanding of the UX aspects related to AI is beneficial. This led me to seek a UX book that might provide insight into how AI systems and products are designed with end users in mind.

Because the release of ChatGPT in November of 2022 drastically altered the entire AI landscape, I was hoping to find a recently released book on UX and AI, but I was unable to find one that looked like legitimate scholarship. This is not entirely surprising since it takes a while to research, write, and publish a book. So instead, I settled for the most recent book on the topic I could find that looked reliable. There were not many options to choose from, but this book, despite its 2020 copyright date, proved a good read for someone like me who is an expert in neither UX nor AI (yet).

Who are the authors?

The authors for this book are Gavin Lew and Robert Schumacher Jr. Instead of trying to summarize their already relatively short biographies in the book, I will include them in full:

Gavin Lew has over 25 years of experience in the corporate and academic environment. He founded User Centric and grew the company to be the largest private UX consultancy in the United States. After selling the company, he continued to lead a North American UX team to become one of the most profitable business units of the parent organization. He is a frequent presenter at national and international conferences and the inventor of several patents. He is an adjunct professor at DePaul and Northwestern universities. Gavin has a Masters in Experimental Psychology from Loyola University and is currently the Managing Partner of Bold Insight, part of ReSight Global, a globally funded UX consulting practice across North America, Europe, and Asia.

 Robert M. Schumacher Jr. has more than 30 years of experience in academic, agency, and corporate worlds. He co-owned User Centric with Gavin from its early stages until it was sold to GfK in 2012. While at User Centric, Bob helped found the User Experience Alliance, a global alliance of UX agencies. Also, he founded User Experience Ltd, a UX agency in Beijing. He is co-founder, co-owner, and Managing Partner of Bold Insight, part of ReSight Global, a global UX company. Bob was the editor of and contributor to The Handbook of Global User Research (2009). He has several patents and dozens of technical publications, including user interface standards for health records for the US government. He also is an Adjunct Professor at Northwestern University. Bob has a Ph.D. in Cognitive and Experimental Psychology from the University of Illinois at Urbana-Champaign.

Basically, we have two writers with plenty of UX experience (related to technology and other fields) and backgrounds in psychology. It might have been nice to have an author with more of a computer-science background paired up with someone who knows the psychology behind UX, but these two authors also have a long-established working relationship which enhances their ability to communicate throughout the book.

Summarizing the Chapters and Some Highlights:

In the preface, the authors state the following:

“Our perspective on how AI can be more successful is admittedly and unashamedly from a UX point of view. AI needs a focus on UX to be successful.”

This is a central theme in the book. The authors recognize the role UX must play in the development of AI systems, tools, and interfaces. Having now had some experience myself with a few of the generative AI platforms, I think the authors are correct. An emphasis on UX for AI tools won’t just make those tools easier and more pleasant to use; a better user experience can actually save these tools from being written off by the general public as novelties or passing fads. The failure of AI to live up to hype in past decades did lead to these kinds of dismissals, but the latest wave of advancements may have reached a tipping point that insulates AI from another major cultural setback or lengthy pause.

Chapter 1: Introduction to AI and UX

This chapter does a respectable job of making the important connections between UX and AI. The authors prove that they know enough about these connections to be credible voices from which the reader can learn.

Drawing from their significant UX work, Lew and Schumacher tell us that “For any product, whether it has AI or not, the bare minimum should be that it be usable and useful. It needs to be easy to operate, perform the tasks that users ask of it accurately, and not perform tasks it isn’t asked to do. That is setting the bar really low, but there are many products in the marketplace that are so poorly designed where this minimum bar is not met” (16).

Throughout the book, the authors make a good case for the application of pretty much all general UX principles to AI products. Chapter 1 just lays out the landscape and major connections.

Chapter 2: AI and UX: Parallel Journeys

As the title implies, Chapter 2 provides a nice historical walk through AI and UX development. Particularly interesting is the focus on “AI winters” that followed periods of overhyped AI performance in the 1960s and again in the 1980s. The authors also mention the “domain-specific AI winter” for AI personal assistants, which followed the overhyping of Siri in the early 2010s. Part of the reason for these AI winters is that the developers of the systems were not focused enough on user experience.

I appreciate the differentiation the authors try to make between HCI (human-computer interaction) and UX in chapter 2:

“Where HCI was originally focused heavily on the psychology of cognitive, motor, and perceptual functions, UX is defined at a higher level—the experiences that people have with things in their world, not just computers. HCI seemed too confining for a domain that now included toasters and door handles. Moreover, Norman, among others, championed the role of beauty and emotion and their impact on the user experience. Socio-technical factors also play a big part. So UX casts a broader net over people’s interactions with stuff. That’s not to say that HCI is/was irrelevant; it was just too limiting for the ways in which we experience our world” (50).

The way I interpret this is that HCI is akin to a substratum of UX.

Chapter 3: AI Enabled Products are Emerging All Around Us

And

Chapter 4: Garbage In, Garbage Out

These two chapters are where the book shows its age a bit as a pre-ChatGPT publication. Although there are some interesting examples of AI systems discussed in Chapter 3, the next chapter disconnects enough from the user experience that I did not find it valuable as a UX text. The focus of Chapter 4 is the data that AI runs on. The authors are correct that without quality data, the user experience of any AI product will suffer, but since current AI systems are such black boxes when it comes to their training data, this is somewhat of a moot point for me right now.  

I will say that I perked up a bit reading about voice assistants and Grice’s four maxims for communication (67). Anyone studying generative AI could benefit from using those maxims as a starting point for evaluating what our machines are capable of. Current LLMs and systems based on LLMs seem to handle three of the maxims with relative ease much of the time (quantity, relevance, and clarity), but the truthfulness of LLMs’ communication is where many people are finding the most problems. One could argue that truthfulness is the most important of the four, but it is obvious that advances in the other three areas have come quickly and impressively. I think it is entirely possible that AI systems will make progress on that fourth maxim in the near future. And if things in the AI world are not interesting enough for someone yet, they will be once the programs are more reliably accurate purveyors of information.

Chapter 5: Applying a UX Framework

This final chapter is still relevant in the post-ChatGPT world. It ties the idea of UX and AI back together (whereas they diverged a bit in the previous two chapters). This quote at the beginning of the chapter seems especially relevant:

“For many people, there’s still a hesitance, a resistance, to adopt AI. Perhaps it is because of the influence of sci-fi movies that have planted images of Skynet and the Terminator in our minds, or simply fear of those things that we don’t understand. AI has an image problem. Risks remain that people will get disillusioned with AI again” (109).

I think the authors are correct that people could become disillusioned with AI again, but this will probably be less about the UX dimension and more about the existential threats, security concerns, and intellectual property issues that accompany 21st-century AI. Either way, since AI is becoming so ubiquitous, I would not predict another AI winter like the ones the authors detail in Chapter 2.

One of the most interesting points in Chapter 5 regards the purpose of a product. As they lay out the case for applying a UX framework to AI, the authors pose the following questions:

“Probably the most important thing that defines any application is what it does—we call this “utility” or “functionality” or “usefulness.” Basically, is there a perceived functional benefit? In more formal terms, does the application (tool) fit for the purpose it was designed for?”

This is interesting because I am not sure the creators of ChatGPT and the other generative AI systems (or any of the precursors dating back to the 1960s) really had specific end-user functions in mind, at least not as the driving motivation for their creations. It seems like the systems have all been designed just to see if the creators could make a machine that could communicate like a human and display some level of “intelligence.” Along the way, clever people have figured out how to leverage this technology for different purposes, and profit-driven people have too, but I really don’t think that thoughts of the usefulness of LLMs weighed heavily on the creators’ minds. Evidence for this exists within the current user experience of ChatGPT. When users first access this application, they see an interface with suggestions for how the app could be used. That is weird.

When we buy tools or access technologies, we typically already have the function in mind; that’s why we sought the tool to begin with. Generative AI companies are almost saying to the user, “Here it is. Figure out for yourself what purpose it has for you.” For the time being, that is the user experience for many users of generative AI.

As for the user experience of Lew and Schumacher’s book, I think they did a decent job of connecting two fields that need to be connected. A reader with a good grasp of AI could probably skip Chapters 3 and 4, but there is plenty of helpful information and background in Chapters 1, 2, and 5 that still holds up well in this four-year-old title from Springer/Apress.

Final Project Outline and Schedule: ChatGPT Student Users Case Study

Project Goal: This project seeks to investigate and analyze how students in a first-year college writing course interact with the ChatGPT website/app in the context of an educational assignment. From a UX perspective, the researcher wants to understand how the generative AI tool is used, whether students show variations in how they use the tool, and whether there might be changes that could improve the experience for student users.

 

Schedule:

(Completed 3/4/2024) Create and post project outline/schedule

(Completed 3/4/2024) Create questions for focus group discussion

By 3/12/2024: Update assignment instructions

3/12/2024: Provide instructions for student writing assignment that will incorporate ChatGPT usage

3/12/2024:  Students create OpenAI accounts

3/12 and 3/14: Students complete the pre-writing/invention assignment

3/14/2024: Students submit links to their ChatGPT conversations

3/19/2024: Conduct focus group discussion

3/20-4/15/2024: Synthesize data and findings; write report (Organizational Structure of Report: Explore, Define, Ideate)

By 4/22/2024: Post final report

 

Target Research Sample: ~12 ENGL 1101 students at GGC. 

GGC student demographic profile:

Application of student demographic information: The researcher believes the institutional data for the college is representative of the student sample he is working with. The researcher has deemed that the benefits of having more accurate demographic data for the research sample are not worth the costs (time and effort) of collecting the data. An attempt to collect this data would be an additional burden and privacy encroachment for the research participants. The risks of not collecting the data are minimal but could result in some problems with the interpretation of certain data that is collected and with the accurate representation of the student population of the institution. Generally, the researcher should take into account the diversity of the student population (specifically racial diversity and students with disabilities that impact learning) when designing the case study. Information on student socioeconomic status would be helpful as well because of correlations with access to technology. To that end, 2021 data shows that 62% of GGC students received Pell grants (compared to 47% at comparison group institutions). In sum, GGC serves a student body that is far more diverse and financially in need than the student bodies of the similar institutions in its comparison group.

https://nces.ed.gov/ipeds/use-the-data

Methods: Students will use ChatGPT for a prewriting activity that will help them explore their own tentative thesis statement or topic.

Prewriting Activity Prompts:

Please help me prepare for writing my persuasive essay about a significant problem for Generation Z. Ask me questions–one at a time–that will help me come to a thesis, understand the strengths and weaknesses of my position and any opposing view(s). The problem I want to focus on is…

–or–

Please help me prepare for writing my persuasive essay about a controversial issue in my field of study. Ask me questions–one at a time–that will help me narrow down my topic to a specific debatable question, come to a thesis, understand the strengths and weaknesses of my position, and understand any opposing view(s).

–or–

Please help me prepare for writing my persuasive essay about a social justice issue. Ask me questions–one at a time–that will help me come to a thesis, understand the strengths and weaknesses of my position and any opposing view(s). The problem I want to focus on is…

–or–

Let’s engage in a dialectic exercise. I will present my tentative thesis, and then you will take the role of Socrates and ask me yes or no questions (one at a time) to expose any potential flaws or contradictions in my position.

Data Collected: Students will be asked to submit their ChatGPT conversations in the form of sharable links provided by ChatGPT. The researcher can use this data for qualitative and quantitative analysis (number of replies and length of replies). Qualitative data will also be collected from a focus group discussion.
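The quantitative measures named above (number of replies and length of replies) could be tallied with a short script. The following is only a minimal sketch in Python, not the method used in this study. It assumes each shared conversation has first been transcribed by hand into a list of (speaker, text) turns; that format, the function name summarize_replies, and the sample data are all hypothetical.

```python
from statistics import mean


def summarize_replies(conversation, speaker="student"):
    """Return the reply count and mean reply length (in words) for one speaker.

    `conversation` is assumed to be a list of (speaker, text) tuples,
    a hypothetical format transcribed from a shared ChatGPT link.
    """
    lengths = [len(text.split()) for who, text in conversation if who == speaker]
    return {
        "replies": len(lengths),
        "mean_words": mean(lengths) if lengths else 0,
    }


# Made-up sample data for illustration only; not real participant data.
sample = [
    ("student", "The problem I want to focus on is social media use."),
    ("chatgpt", "What makes this a significant problem for Generation Z?"),
    ("student", "It affects sleep and attention."),
]

summary = summarize_replies(sample)
print(summary)
```

Counting words rather than characters is a design choice here; either would serve as a rough proxy for how much effort a student put into each reply.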

Focus Group Discussion Questions:

(Researcher will start with background info on research purpose, discussion of consent, and info about groupthink)

How would you describe your prior use of generative AI (like ChatGPT, claude.ai, etc.)?

              Never used it/Used once/Used a few times/Use regularly

              For what purposes?

 

Did you use the mobile app or desktop website?     Mobile / Desktop

What were your initial impressions of the appearance of the tool? Design? Functionality? Features?

              Notes:

 

How easy was it to…

find and access ChatGPT and create your account?  

Very Easy / Somewhat Easy / Somewhat Difficult / Very Difficult

Comments:

use ChatGPT to…

get responses (any responses—helpful or unhelpful)

              Very Easy / Somewhat Easy / Somewhat Difficult / Very Difficult

                             Comments:

share the link to your ChatGPT conversation?

              Very Easy / Somewhat Easy / Somewhat Difficult / Very Difficult

                             Comments:

 

What about…

Speed of ChatGPT responses?  

Scroll speed of the screen as ChatGPT responds?

Appearance of the interface? Colors, font size? Layout?

              Comments:

 

Additional features

              Did you notice the thumbs down button? Did you use it?

                             Did not notice /Noticed but didn’t use / Noticed and used

              Did you notice the message that says, “ChatGPT can make mistakes. Consider checking important information”?

              Noticed / Did not notice

              Comments:

Other features you wished the interface had?

              Comments:

Features that you wished the interface would change or remove?

              Comments:

Accessibility questions

How could the tool be better designed to improve the experience for people with disabilities?

              Comments:

Using ChatGPT for UX Persona Creation and Canva for Matching Persona Images

 
 
I used ChatGPT 3.5 (free version) to create some personas that could theoretically apply to my UX project. After that, I asked ChatGPT to help me think about ways I could apply the different characteristics from the personas to improve the UX design of ChatGPT’s own interface and functionality. Overall, I think the feedback that I got from the machine was helpful for UX research. Some answers were a bit vague or obvious, but even the vague answers were often decent starting points. The full conversation is included below, so feel free to judge for yourself how helpful AI can be as a UX research assistant. Of course, any shortcomings of the machine responses may only be reflecting the shortcomings of the prompts and prompt writer.
 
I also used Canva to create matching images (posted above) for the personas. Using image generation was a new experience for me, and I found the results impressive for a free version. I originally tried the Bing image generator without any luck at all. That one runs on DALL-E 3 and is supposed to be excellent, but I couldn’t get it to generate anything. With Canva, it did take a bit of work and multiple “re-generation” attempts for each image to come out looking realistic. The prompts that ChatGPT gave me to feed to Canva were too long, so I had to write my own. Here is what the one for the third persona looked like:
Realistic image of 35 year old male military vet in plain clothes sitting at a laptop.
That was a revised prompt. Since the first attempt did not include “in plain clothes,” all the images were of a man in military uniform. Of course, I could have just left out the “military vet” altogether, but I wanted to see if this extra detail would have an impact on the image. Maybe the physical stature of this persona is reflective of the military vet detail, just as the glasses on the last persona are probably reflective of the “visual impairment” detail that I included. When creating these images, issues with eyes and hands seem to be especially prominent (see hands on lower third image), but it is so easy to re-generate that a user can get as picky as they want–at least within the usage limits of the free version anyway. I suppose there is value in giving a persona a visual image for the researcher to connect with, yet I don’t see this serving my own research very much. It did give me decent visuals to help spice up this post though!
 
Here is a link to my ChatGPT session (better formatting), or you can read the basic full text below:
 

UX Persona Creation and Application

Jason Palmer, February 23, 2024
 
Me:
Please help me by creating four different personas for user experience/user design research. The product they are using is ChatGPT. They are all college students. Two of them are traditional and two of them are non-traditional students. Two of them have disabilities that impact how they learn and use technology. They are all students at a diverse four-year college in suburban Atlanta. Provide other useful characteristics, details, and insights for each of these four different personas: things like what they might use the product for, how they might go about using it, prior experience, etc.
 
ChatGPT:

Sure, I can provide some hypothetical personas for ChatGPT users. However, please note that the following personas are fictional and are based on general assumptions and typical user behaviors, not on real individuals.

  1. Traditional Student – Natalie:
  • Age: 21
  • Gender: Female
  • Major: Psychology
  • Background: Grew up in a suburban area, comes from a middle-class family.
  • Disability: Dyslexia
  • Tech Comfort: Somewhat comfortable, uses a laptop and smartphone regularly.
  • ChatGPT Usage: Uses ChatGPT for research and writing assistance, especially for tasks that involve extensive reading or writing. Finds ChatGPT’s summarization and paraphrasing features helpful.
  • Experience: Natalie has used language assistance tools in the past and is looking for a more versatile tool that can help her with her specific learning needs. She appreciates AI tools that can understand natural language and provide accurate and concise responses.
  2. Non-Traditional Student – Jason:
  • Age: 35
  • Gender: Male
  • Major: Computer Science
  • Background: Military veteran, came back to college after serving.
  • Disability: PTSD
  • Tech Comfort: Very comfortable with technology, has experience with coding and software development.
  • ChatGPT Usage: Uses ChatGPT to assist with coding tasks, such as generating code snippets or checking for errors in his code. Also uses ChatGPT for brainstorming and project planning.
  • Experience: Jason is familiar with AI and machine learning technologies. He values efficiency and accuracy in the tools he uses. He appreciates ChatGPT’s ability to understand technical language and provide relevant information.
  3. Traditional Student – Sarah:
  • Age: 20
  • Gender: Female
  • Major: Journalism
  • Background: Grew up in a rural area, comes from a working-class family.
  • Disability: ADHD
  • Tech Comfort: Comfortable with technology, uses a laptop and smartphone regularly. Struggles with focus and organization.
  • ChatGPT Usage: Uses ChatGPT for brainstorming and organizing ideas for articles and assignments. Finds the tool’s ability to generate ideas and provide structure helpful.
  • Experience: Sarah has used other productivity tools in the past but has found them to be overwhelming or too rigid. She appreciates ChatGPT’s flexibility and ease of use.
  4. Non-Traditional Student – Michael:
  • Age: 29
  • Gender: Male
  • Major: Business Administration
  • Background: Grew up in a suburban area, comes from an upper-middle-class family.
  • Disability: Visual Impairment (Partially Blind)
  • Tech Comfort: Comfortable with technology, uses screen readers and other assistive technologies.
  • ChatGPT Usage: Uses ChatGPT for research and writing assistance, especially for tasks that involve extensive reading or writing. Finds ChatGPT’s summarization and paraphrasing features helpful.
  • Experience: Michael has used other language assistance tools in the past but has found them to be less accurate or accessible. He appreciates ChatGPT’s ability to understand natural language and provide accurate responses.
 
Me:
Now provide a summary for persona #1 (Natalie) that explains how the UX designer of an updated version of ChatGPT could apply the characteristics of the personas (especially the disability information) to an updated version of the ChatGPT interfaces (both web and mobile).
 
ChatGPT:

Based on the persona of Natalie, a UX designer for an updated version of ChatGPT could consider several key characteristics to enhance the accessibility and usability of the product for users with dyslexia:

  1. Readable Text: Ensure that the text on the interface is clear, easy to read, and uses dyslexia-friendly fonts such as OpenDyslexic or Dyslexie. The designer can also consider options for adjustable font size and contrast settings to accommodate individual preferences.
  2. Intuitive Navigation: Simplify the navigation of the interface by using clear icons and labeling, and avoid using complex or ambiguous language that might be difficult for users with dyslexia to understand.
  3. Assistive Technology Integration: Make sure that the interface is compatible with screen readers and other assistive technologies commonly used by individuals with dyslexia.
  4. Language Processing Features: Enhance the language processing capabilities of the ChatGPT to provide accurate and concise responses that are easier for users with dyslexia to understand. This might include features like summarization and paraphrasing, as well as the ability to break down complex information into simpler terms.
  5. Customizable Settings: Allow users to customize the interface according to their individual needs, such as adjusting the speed of text presentation or enabling visual cues to aid in comprehension.
  6. User Education and Support: Provide clear instructions and guidance on how to use the interface effectively, and offer accessible support resources such as tutorials or FAQs.

By taking these characteristics into account, the UX designer can create an updated version of ChatGPT that is more accessible and user-friendly for individuals with dyslexia like Natalie.

 
Me:
Now, taking into account all of the users’ usage characteristics, what specific changes to the ChatGPT interface and functionality should the UX designer consider? Please be as specific as possible, providing examples and explanations of the possible changes.
 
ChatGPT:

Certainly, let’s consider the specific changes the UX designer of ChatGPT might want to consider for each of the personas mentioned earlier:

  1. Natalie (Traditional Student with Dyslexia):
    • Interface Changes: Increase font size options, provide a dyslexia-friendly font option, offer adjustable contrast settings, and minimize visual clutter.
    • Functionality Changes: Introduce a summarization feature that condenses lengthy text responses into key points, offer an option to paraphrase complex or jargon-filled responses into simpler language, and provide a spell-check feature that highlights potential spelling errors.
    • Example: Instead of displaying a long paragraph in a single block of text, the interface could provide a summarized version with bullet points, using simpler language and shorter sentences.
  2. Jason (Non-Traditional Student with PTSD):
    • Interface Changes: Provide a dark mode option for users who are sensitive to bright screens, offer customizable notification settings, and implement a progress tracker for long tasks to help manage focus and anxiety.
    • Functionality Changes: Introduce a distraction-free mode that hides non-essential elements of the interface, incorporate a timer feature to encourage breaks during long sessions, and add a feature that allows users to save and resume tasks.
    • Example: The interface could include a “Focus Mode” that hides the chatbox and displays only the current task, with a timer and progress bar visible to help Jason track his work.
  3. Sarah (Traditional Student with ADHD):
    • Interface Changes: Implement a clear and hierarchical layout, offer color-coded categories for tasks, and provide a task list feature with the option to set reminders.
    • Functionality Changes: Introduce an auto-save feature to prevent data loss due to distraction, include a focus mode that temporarily disables notifications, and add a “Brainstorming Mode” that helps organize ideas into a structured outline.
    • Example: The interface could have a “Task Manager” feature that displays Sarah’s assignments and allows her to set deadlines and receive reminders, helping her stay organized and on track.
  4. Michael (Non-Traditional Student with Visual Impairment):
    • Interface Changes: Ensure the interface is compatible with screen readers and supports keyboard navigation, provide a high-contrast mode, and offer adjustable magnification options.
    • Functionality Changes: Implement a voice command feature that allows users to navigate the interface hands-free, offer an audio summary option for lengthy responses, and include keyboard shortcuts for common actions.
    • Example: The interface could have a “Voice Control” feature that lets Michael navigate the interface and access functions using voice commands, reducing the need for manual interaction.

By considering these specific changes tailored to the needs of each persona, the UX designer can create an updated version of ChatGPT that offers a more accessible and user-friendly experience for everyone, taking into account their usage characteristics and disabilities.

 
Me:
Let’s now assume that our different personas all have different cultural backgrounds. What considerations does this place on the UX designer?
 
ChatGPT:

When considering the cultural backgrounds of the personas using ChatGPT, the UX designer should take into account the following considerations:

  1. Language and Communication Styles: Different cultures have unique language and communication styles. For example, some cultures may prefer indirect communication, while others may value directness. The UX designer should ensure that ChatGPT’s language models are trained to understand and respond appropriately to these different styles.
  2. Visual Design and Color Schemes: Colors and visual design elements can have cultural significance. For example, certain colors may have positive or negative connotations depending on the culture. The UX designer should consider cultural sensitivities when choosing color schemes and visual design elements for the interface.
  3. User Preferences and Expectations: Different cultures may have different preferences and expectations when it comes to technology and user interfaces. For example, some cultures may prefer minimalist designs, while others may prefer more visually rich interfaces. The UX designer should conduct user research to understand the preferences and expectations of the personas from different cultural backgrounds.
  4. Content and Content Localization: Content on the ChatGPT interface should be culturally sensitive and inclusive. The UX designer should consider the cultural context of the content and ensure that it is appropriate and relevant to all users, regardless of their cultural background.
  5. Accessibility and Inclusivity: Cultural inclusivity goes hand-in-hand with accessibility and inclusivity. The UX designer should ensure that ChatGPT’s interface is accessible to users from different cultural backgrounds, including those with disabilities, by implementing features such as multilingual support, culturally diverse avatars, and user-friendly interfaces.

Overall, the UX designer should take a culturally sensitive and inclusive approach when designing ChatGPT’s interface, taking into account the unique needs, preferences, and expectations of the personas from different cultural backgrounds. This will help create a more inclusive and user-friendly experience for all users.

 
Me:
Let’s assign different cultural backgrounds to our personas.
 
ChatGPT:

Sure, let’s assign different cultural backgrounds to our personas:

  1. Traditional Student with Dyslexia – Natalie:
    • Cultural Background: Natalie comes from a culturally diverse family with Hispanic roots. She speaks both English and Spanish fluently.
  2. Non-Traditional Student with PTSD – Jason:
    • Cultural Background: Jason grew up in a small town in the Southern United States and has a strong sense of Southern culture and traditions.
  3. Traditional Student with ADHD – Sarah:
    • Cultural Background: Sarah comes from a multicultural background. Her mother is Chinese, and her father is African American. She is fluent in English and conversational in Mandarin.
  4. Non-Traditional Student with Visual Impairment – Michael:
    • Cultural Background: Michael is originally from the Philippines and moved to the United States for college. He is fluent in English and Tagalog, the national language of the Philippines.

With these diverse cultural backgrounds in mind, the UX designer should consider the specific cultural sensitivities, communication styles, and preferences of each persona when designing ChatGPT’s interface. This will help create a more inclusive and culturally sensitive experience for all users.

 
Me:
Now let’s make some UX improvement suggestions for each of the personas based on cultural background.
 
ChatGPT:

Certainly! Here are some UX improvement suggestions for each of the personas based on their cultural backgrounds:

  1. Natalie (Hispanic Background):
    • UX Improvement: Offer a language option for Spanish-speaking users like Natalie. ChatGPT could have an option to switch between English and Spanish language modes, allowing users like Natalie to communicate more comfortably in their preferred language.
  2. Jason (Southern United States Background):
    • UX Improvement: Incorporate elements of Southern culture into the interface design, such as using warm, earthy tones and friendly, informal language. This could help Jason feel more at home while using the product.
  3. Sarah (Multicultural Background):
    • UX Improvement: Provide a culturally diverse set of avatars and user interface elements that reflect Sarah’s multicultural background. This could include avatars with different ethnicities and customizable design elements that allow users to express their unique cultural identities.
  4. Michael (Filipino Background):
    • UX Improvement: Integrate multilingual support for Tagalog-speaking users like Michael. ChatGPT could include an option to switch between English and Tagalog language modes, allowing users to communicate more comfortably in their preferred language.

These UX improvements take into account the unique cultural backgrounds of each persona and aim to create a more inclusive and culturally sensitive experience for all users.

 
Me:
Do you have any other suggestions for how a UX designer could use these personas to improve ChatGPT? Is there anything we may have missed in the creation of the personas that could lead to a better product and experience for the diverse range of ChatGPT users?
 
ChatGPT:

Certainly! Here are some additional suggestions for how a UX designer could use these personas to improve ChatGPT:

  1. Cultural Sensitivity Training for AI: To ensure that ChatGPT’s responses are culturally sensitive and appropriate, the UX designer could work with AI trainers to provide cultural sensitivity training to the language models. This could involve incorporating diverse cultural perspectives and values into the training data and ensuring that the language models understand and respect cultural differences.
  2. User Research and Feedback: The UX designer could conduct user research and gather feedback from users from diverse cultural backgrounds to understand their specific needs, preferences, and pain points. This could involve conducting focus groups, interviews, surveys, and usability testing with users from different cultural backgrounds.
  3. Localization and Translation: To make ChatGPT more accessible to users from different cultural backgrounds, the UX designer could work on localizing the interface and content for different languages and regions. This could involve translating the interface and content into multiple languages and adapting the design to suit the cultural norms and preferences of different regions.
  4. Cultural Contextualization: The UX designer could work on contextualizing ChatGPT’s responses to better reflect the cultural context of the user. For example, if a user asks a question about a cultural tradition or practice, ChatGPT could provide a response that takes into account the cultural context and significance of the tradition or practice.
  5. Inclusive Design and Accessibility: To ensure that ChatGPT is accessible to users from diverse cultural backgrounds, the UX designer could work on making the interface and content more inclusive and accessible. This could involve incorporating features such as text-to-speech, screen readers, and voice commands to make ChatGPT more accessible to users with disabilities.

By taking these suggestions into account, the UX designer can create a more inclusive and culturally sensitive experience for all users of ChatGPT, regardless of their cultural background.

 
Me:
Can you give me a good prompt to use for an image generator that could provide me with images of each of these personas? Include the relevant details the image generator would need.
 
ChatGPT:

Certainly! Here’s a good prompt for an image generator that could provide you with images of each of these personas:

Prompt: Generate a set of diverse avatar images representing the following personas:

  1. Natalie – A 21-year-old female college student majoring in Psychology. She is of Hispanic descent, has dyslexia, and speaks both English and Spanish fluently.
  2. Jason – A 35-year-old male college student majoring in Computer Science. He is a military veteran with PTSD and has a strong sense of Southern culture and traditions.
  3. Sarah – A 20-year-old female college student majoring in Journalism. She is of mixed Chinese and African American heritage, has ADHD, and is conversational in Mandarin.
  4. Michael – A 29-year-old male college student majoring in Business Administration. He is originally from the Philippines, has visual impairment, and is fluent in English and Tagalog.

Details Needed:

  • Gender: Male or Female
  • Age: 20-35
  • Ethnicity/Cultural Background: Hispanic (Natalie), Southern United States (Jason), Mixed Chinese and African American (Sarah), Filipino (Michael)
  • Disabilities: Dyslexia (Natalie), PTSD (Jason), ADHD (Sarah), Visual Impairment (Michael)
  • Major/Field of Study: Psychology (Natalie), Computer Science (Jason), Journalism (Sarah), Business Administration (Michael)
  • Languages Spoken: English, Spanish (Natalie), English (Jason), English, Mandarin (Sarah), English, Tagalog (Michael)

The image generator could use these details to create a diverse set of avatar images that accurately represent each of the personas.

 
Me:
Please abbreviate that significantly and ask for just one image showing them with handwritten name tags on their shirts.
 
ChatGPT:

Certainly, here’s a more concise version of the prompt:

Prompt: Generate one image of four avatars, each representing a college student with a handwritten name tag on their shirt. Each student should be visibly different, representing the following personas:

  1. Natalie: 21-year-old female with Hispanic background, dyslexia, Psychology major, speaks English and Spanish.
  2. Jason: 35-year-old male with Southern United States background, PTSD, Computer Science major, English speaker.
  3. Sarah: 20-year-old female with mixed Chinese and African American heritage, ADHD, Journalism major, speaks English and Mandarin.
  4. Michael: 29-year-old male from the Philippines, visual impairment, Business Administration major, speaks English and Tagalog.

Details Needed:

  • Gender: Male or Female
  • Age: 20-35
  • Ethnicity/Cultural Background: Hispanic (Natalie), Southern United States (Jason), Mixed Chinese and African American (Sarah), Filipino (Michael)
  • Disabilities: Dyslexia (Natalie), PTSD (Jason), ADHD (Sarah), Visual Impairment (Michael)
  • Major/Field of Study: Psychology (Natalie), Computer Science (Jason), Journalism (Sarah), Business Administration (Michael)
  • Languages Spoken: English, Spanish (Natalie), English (Jason), English, Mandarin (Sarah), English, Tagalog (Michael)

Please provide a single image showing all four avatars, each with a handwritten name tag on their shirt.

Thoughts on the Ethics of Offering Incentives for Surveys and Questionnaires

At the end of each semester, my students are provided with a link to a survey seeking their feedback regarding my performance as an instructor for their course. This is standard practice at many institutions and typically organized and/or administered at the direction of an institutional effectiveness office, not the instructors themselves. Although the students at my institution receive an email notification that their participation is needed/requested (and a reminder email or two), the emails include only a vague (meaningless?) incentive of institutional improvement connected to the request.

Because an educational institution should theoretically value higher levels of participation in end of course surveys, and because the level of my own students’ participation can be viewed (unfairly) as an indirect reflection of my professional capabilities, I have often wondered whether I should provide some type of incentive for my own students to complete their evaluation surveys. I am aware that some of my colleagues do provide extra credit for end of course survey completion, but this strikes me as somewhat unethical because a student (i.e. research participant) who is aware of a reward attached to their participation could then have a different outlook on the subject of the research or the research itself. In other words, rewards make people happy. Rewards change people’s moods and perceptions. Rewards, then, can introduce a type of subtle bias. This is why I have typically eschewed rewards to encourage participation in my end of course surveys. But this doesn’t necessarily mean that I have eliminated bias. Not at all.

When people (like students) come to expect rewards for completing surveys, researchers could potentially create a bias simply by not offering some kind of expected incentive for a person’s time and thoughts. If a person feels like they are being compelled to complete a survey without any promise of a reward, they may be unhappy/annoyed/frustrated about doing so, and those kinds of feelings could potentially filter into their survey responses—perhaps even unconsciously.

By not offering my own students rewards for their participation in my end of course surveys, I do fear that I am impacting their responses in an unintended way connected to their emotional state when completing the survey. This is made worse because of my colleagues who do offer incentives and thereby create a stronger expectation on the part of students, an expectation that is naturally followed with disappointment when not met.

Instead of being “damned if I do, damned if I don’t,” I am more “favored if I do, damned if I don’t” when it comes to providing incentives for survey participation. But either way, I would argue that the data is damned.

If the principles of offering (and refraining from) survey incentives which I have laid out for my own course surveys hold generally true for all research efforts that include voluntary data collection, then it seems the researcher concerned with participant bias is faced with a no-win situation: provide a reward and introduce a favorable response bias -or- refrain from providing a reward and introduce a potential unfavorable response bias.

This matter is further complicated by factors such as socioeconomic status since those who are more in need of incentives may be more susceptible to bias (in my case, students with lower grades and more in need of extra credit would be impacted differently by rewards than those who do not need them). A 2019 study supports the idea that “There is also a risk that incentives may introduce bias, by being more appealing to those with lower socioeconomic status.” Although that particular article is concerned with bias in terms of the sample generated, the connection between incentives and participants’ attitudes/responses should be a matter of concern as well.

So what is an ethical researcher to do?

At Georgia State University, where I am a graduate student, the course evaluation process doesn’t exactly provide an incentive for students who complete instructor evaluations. Instead, the university withholds a student’s ability to view their final grades until they cooperate and submit their reviews.

This doesn’t seem like the most ethical way of solving the participation problem, but I do imagine it is effective at generating high response rates without costing the institution a nickel. A better approach, I think, is working harder to educate students (research participants) about the risks involved in providing incentives and punishments when trying to encourage participation in research. This at least gives students some answers as to why an incentive is not being offered and hopefully garners more credibility for the study in the eyes of the people who are being asked to participate. I have taken to this approach with my course evaluations, and I think it may be somewhat effective at preventing the bias I am seeking to eliminate. It is not, however, the most effective way to increase response rates.

 

Implications for my UX research

There is probably no perfect system for achieving optimal survey participation while also eliminating bias, but researchers can at least be thoughtful and careful about their efforts. Although I have not designed my final UX course project yet, I can imagine the need for survey responses to inform my efforts. Perhaps I will simply try to encourage participation by explaining the virtue of my educational field work and also explaining the problems associated with offering incentives. Perhaps I will just offer incentives to participants and carefully explain this choice within my report. Another option might be to try to defuse the potential bias created by an incentive by adding a question at the beginning of the questionnaire like the following:

“Could providing incentives to participate in surveys make respondents less honest (i.e. more positive) in their answers?”

Such a question would at least force the participants to consider the issue right before they answer other questions where their unbiased feedback is sought. I can see how creating that kind of basic awareness could encourage less biased, more honest responses.

 

Other readings and notes

Most sources I found related to this issue deal more with introducing bias in the sample. I didn’t find any that specifically deal with the quality of responses being impacted by incentives.

This article found that incentives provided upfront work better than incentives provided after participation. I would have expected the opposite, and I suppose that is me not believing the best in people.

“Prepaid incentives (unconditional incentives) are more effective in increasing response rates in comparison to payment after survey completion (conditional incentives) [64, 65]. This could be explained by social exchange theory as providing participants monetary incentives in advance encourages them because they feel they should reciprocate for the reward they receive by completing the survey.”

This is a decent write-up of different types of bias in surveys and surveying.

ChatGPT and claude.ai reflect mostly what I found (or didn’t find) when looking into this issue, but when pressed a bit, ChatGPT gave me something closer to what I was looking for in terms of response bias. Of course, just because it gave me what I was after, I wouldn’t trust it without further verification from some better human sources. (These generative AI systems are no solution for confirmation bias, and I do fear they might make that problem even worse.)

UX Research Notes: Using Screen Recording to Capture ChatGPT Usage

On a Friday morning I recruited two family members to participate in a brief research activity using ChatGPT. I used screen recording through Microsoft Teams to capture the participants’ interactions with the free version of OpenAI’s ChatGPT tool (ChatGPT 3.5). My goal as the researcher was exploratory in nature, seeking some feedback on the interface and functionality of the ChatGPT desktop web application.

Instead of simply providing the participants access to the ChatGPT website without any defined purpose for interacting with it, I designed an activity to encourage them to make use of the tool. The instructions for the activity were as follows:

Instructions: Use the generative-AI tool (ChatGPT or claude.ai) to engage in a virtual debate. You can ask the AI tool to be your debate partner and for suggestions on a topic, or you may suggest a topic. The goal is to “win” the debate.

*Note: topics related to recent current events don’t work well because the generative AI tools do not have access to information after 2021 (ChatGPT) and 2022 (claude.ai). 

Both users were given this same information on a sheet of paper and verbally informed of the purpose for the activity. They were not given time parameters for completing the activity, and were instead told to take as much time as they needed to complete the activity, or they could stop when they were bored or felt like they could no longer make progress toward their goal.

The two participants were significantly different in terms of demographics. This seemed desirable because younger and older tech users sometimes have different approaches and attitudes toward interacting with technology. Here are the participant profiles:

Participant 1 (P1): 10-13 years old, white male, from a diverse suburb of Atlanta, public school educated, never used ChatGPT or other generative-AI platforms.

Participant 2 (P2): 35-40 years old, white female, from a diverse suburb of Atlanta, college educated, very limited past use of ChatGPT—maybe 2 or 3 times total for work and curiosity reasons.

Results Summary

P1 engaged with ChatGPT for over 20 minutes, exploring different topics he proposed, then asking for a suggestion for a topic from ChatGPT which he debated, and finally proposing one last topic himself which he also debated.

  • Interactions by the numbers: 14 typed prompts/replies, average length 20 words.

P2 engaged with ChatGPT for 10 minutes, focusing on only one topic which she proposed.

  • Interactions by the numbers: 9 typed prompts/replies, average length 13 words.
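
Counts like these can be reproduced from a session transcript with a few lines of Python. This is a minimal sketch, assuming the participant’s typed prompts have already been pulled out of the Teams transcript into a list of strings. The function name `prompt_stats` and the sample prompts are hypothetical and are not the participants’ actual words.

```python
def prompt_stats(prompts):
    """Return (number of prompts, average words per prompt, rounded to a whole word)."""
    if not prompts:
        return 0, 0
    # Count whitespace-separated words in each typed prompt
    word_counts = [len(p.split()) for p in prompts]
    return len(prompts), round(sum(word_counts) / len(word_counts))


# Hypothetical sample prompts, for illustration only
sample = [
    "Will you be my debate partner?",
    "I think dogs are better pets than cats.",
]
count, avg_words = prompt_stats(sample)
print(count, avg_words)
```

Splitting on whitespace is a rough definition of a “word,” but it is close enough for comparing how much two participants typed.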

UX Observations

Both participants found the system to be intuitive and easy to use.

Neither participant made use of the small icons below each ChatGPT response (thumbs down, regenerate, copy). When asked why, Participant 2 said she never uses those features (especially the “like/dislike” buttons) because she doesn’t care to inform the algorithms used for determining the types of content she is fed on social media sites.

Both participants commented that the speed at which ChatGPT responses displayed and auto-scrolled was too fast. From Participant 1:

“I would have the I would have chat GPT pull up its arguments slower because it feels like a lot to read it. The pace felt really, really fast. It was almost like stressful to read them. And I’m really fast reader, and I still couldn’t read it all.”

Although I did not want to interrupt the session, I was curious to know if Participant 1 was reading the full responses generated by ChatGPT, so I asked him in the middle of his session. He replied that he was just “skimming” them. The fact that he chose to skim was likely because the responses were auto-scrolling quickly and somewhat lengthy compared to the short prompts/replies he was typing.

Both participants also said they liked the disclaimer at the bottom of the screen that says “ChatGPT can make mistakes. Consider checking important information.” But neither participant noticed it until the end of their session. Neither was sure if it was there at the beginning, and Participant 2 suggested it ought to be a pop-up at the beginning that must be X’ed out by the user.

Participant 1 suggested custom font options, especially in terms of size for readability.

Participant 1 also suggested the ability to input prompts/replies with speech and an option to have ChatGPT’s responses read out loud to the user.

Final Thoughts

Using the Teams screen recording option was easy, and the transcript proved useful despite its inaccuracies. I would use Teams in the future for similar data collection and the recording of post-use interviews.

In the future, I would encourage participants to verbalize more of their thoughts during the use of the product. I was able to get some good feedback after my participants completed their tasks, but more in-the-moment feedback would give me more material to work with in a post-use interview.

Using MS Teams Screen Recording for UX Research

When conducting UX research on digital artifacts, like websites or mobile apps, UX designers can benefit greatly from seeing how a user interacts with the product in real time. Researchers/designers rely on data to make decisions about how to improve a product, and a complete look at a user’s interactions can provide insightful data points. In the past, designers who wanted this kind of data may have been limited to looking over a user’s shoulder as they interacted with a product, or they may have relied on post-use reports in text formats which, although helpful, would have been lacking any kind of “soft” data that can be gleaned from observing a user in action.

Today, there is no shortage of tools for UX researchers to choose from when it comes to screen recording. The choices are overwhelming and even finding reliable comparisons is tricky as so many of the reviews are sponsored by the companies who are selling them—sometimes in misleading ways that are hard to detect. This one, for example, has nothing negative at all to say about Camtasia. This comparison here seems a bit more reliable and points to Camtasia’s high upfront costs as a drawback.

Since evaluating and testing new screen recording tools is such a time-consuming process, it might make sense for a UX researcher to simply rely on the platforms they already know and use for other personal and professional reasons. Ubiquitous tools like Zoom and Apple’s FaceTime allow for screen sharing, and Zoom has recording options as well. (FaceTime recording is possible by adding an app to the device.)

One productivity platform used by over 300 million people around the world is Microsoft’s Teams. It too has screen sharing and recording capabilities. Since this is a platform I have been required to use in the past and have come to know a bit better than other options, I decided to review it through the lens of a UX researcher/designer. My opinion is that Teams is an effective and reasonably easy-to-operate option for screen sharing and recording, especially for people who have access through institutions and organizations that are already paying the licensing fees to use the product.

Below are some more detailed thoughts and a demo of using Teams for UX research purposes.

 

There are at least two ways to start a new session: “Meet Now” and “Add New Meeting”, but if the purpose is to get more than one person together with some advanced notice, the “Add New Meeting” option provides the means to invite others and directly syncs with their Outlook calendars (assuming they have and use the related MS Outlook for email). The other “Meet Now” option works quickly and allows the meeting creator to invite users to a meeting in progress. Either way, it is relatively easy to invite others to a meeting session. If the guests are also part of the organization, they can be invited with a search feature. If the invited guests are not part of the organization, they can still join, but the meeting creator must send them a link.

For a UX researcher, an important goal is surely to make the process of joining a session as easy as possible. As Teams continues to gain market share as a productivity platform for organizations, I think it could overtake Zoom and others as the platform of choice for people who want to record screen sharing sessions. Using the best-known and most used platform is a good move for a UX researcher, even if doing so sacrifices some functions available from other tools. Research participants should not have to download new apps and learn new interfaces. Placing those barriers in the way will only make it harder for a UX researcher to get willing subjects.

Because Teams is designed as a meeting-facilitation tool, a meeting session allows multiple people to join and view the meeting. I have been in meetings with around 50 people before. I don’t know how many one meeting can handle, but I would imagine other platforms could be more limited than Teams in this regard.

It is important to note that Teams meetings, like Zoom calls, can be run from a browser or an installed app. I have found this feature especially helpful when one or the other gives me issues. It’s nice to have the option to switch, usually from the app to the browser in my case. Of course, for invited guests who do not have the Teams app, the browser option should still allow them to join and participate.

I did notice as I created a demo meeting using “Add New Meeting” for this review that there was an option to “record automatically.” That seems helpful, because I have had sessions before that I meant to record and forgot. A pop-up once the meeting is started that asks if the user would like to record would be even better, but I do not think I have seen this happen before. Of course, the downside to the pre-meeting record setting is that one might get caught unaware of being recorded after the meeting has already started. 

Teams meetings offer users dozens of background filters that range from a simple blurred background to professional looking office backgrounds to distracting animations. These are helpful for privacy purposes and add an element of fun as well, but the animated backgrounds could pose problems for researchers and participants who need to focus on a specific artifact.

For a look at some of the features Teams screen recording offers during a session, I created the following demo:

UX Demo 3 with Jason Palmer-20240202_152729-Meeting Recording.mp4

I get the feeling that most screen recording tools are about the same. It might make sense for UX researchers to just pick one they are already somewhat familiar with (like Teams, or Zoom) and learn how to maximize its utility–as opposed to spending their time shopping around and trying to learn new interfaces and functions.

For me, Teams seems like a competent and effective tool with enough versatility for most UX research purposes.

Week 3 Field Notes

The Five Second Test (for sample To-Do List)

  1. What was the app for? Looks like it could be a search feature of some sort.
  2. Did you want to use it? Yes, even though I did not know exactly what it was, the simplicity of the design was inviting—probably because it looked to me like I wouldn’t screw anything up by messing around with it.
  3. If so, what would you do first? I would probably type “test” in the text field and click the button to see what happens.

 

  1. Note anything relevant that occurred to you during our discussion.

Students seemed to have some similar thoughts and reactions to the tool. As we discussed the artifact, I was reminded of the many focus group studies I participated in in the 2010s as I sought to supplement my income and pay off student loans. I learned so much about market research and facilitating focus groups from these.

  2. Note your thoughts about 5 second testing as a Usability tool.

The 5 second test is useful because we know that many prospective users/buyers are going to make decisions based on very initial impressions of a product. The book gets judged by its cover whether we like it or not. Therefore, seeking the first thoughts one has provides the designer the chance to better evaluate the product and make necessary changes. With enough data from a strong enough sample of 5 second users, the designer should be able to figure out which elements are most appealing to the largest number of potential users.

Heuristic analysis

  1. visibility of system status: keep users informed about behind-the-screen processing (loading, successfully uploaded, searching, please wait)

The importance of status visibility depends on the artifact, but the spinning wheel of death, as it is sometimes called, is one way software/web tools let us know something is going on and we (as users) do not need to do anything else (yet). I do think the visuals that show percentages of status progress are far more helpful than those that do not.

  2. use familiar, real-world language: no jargon, no site-specific lingo

Again, this will depend on the artifact and audience, but in the case of a basic to-do list meant for the general public, this is solid advice. They can’t use it easily if they don’t understand it.

  3. users should be in control: nothing relevant to the experience should be happening behind the screen

This hints at transparency and data security too. We can have users forfeit control (of their data and usage behavior) by including a long terms and conditions page (that they won’t read) and have them mindlessly click “agree” so the artifact can take control of whatever benefits the designer. I do not think this kind of unethical strategy should be employed, but it does feel more like the norm today.

  4. follow industry standards: Ctrl+S means save regardless of platform (or Cmd+S on Apple devices)

It is also important, then, to keep up with changing standards. People (young ones, especially) are always tweaking the way technology is used, and if a designer only sticks to the standards of their time, they will be missing opportunities to improve their product’s usability.

  5. don’t let users make mistakes: multiple levels of undo, pop-up in-place warnings about required form fields, greyed-out representation of features unavailable in the current context

This means that the designer needs as much usage data as possible to see what mistakes can be made. This is hard to predict. Just like how people can find creative ways to use a product that were completely unexpected, they can also find an infinite number of ways to misuse a product so that it doesn’t work at all.

  6. recognition over recall: don’t make users remember or have to think

The whole point of technology is to make tasks easier. The harder the user has to work or think, the less need they are going to have for that particular tech.

  7. flexible designs: experts should have shortcuts and other tools that aren’t visible to novices who will be distracted or confused by them

Interesting concept. Not sure I agree completely, but it does make sense to at least let users know in some logical way that advanced features are available to more expert users. We see this many times indicated with an “advanced” button that reveals those features. That seems like a good application of this design principle.

  8. minimalist design: don’t clutter the screen, don’t add images as decoration

Yes! Yes! Yes! I can’t imagine anyone loves their screens cluttered with ads and things they don’t need. This should be #1 on the list. Only add what is needed. As for not adding images for decoration, I might argue that some images that appear to be decoration are really serving some important other purpose as well, like appealing to emotion. A picture of a tropical island might have the effect of calming the user, and paired with music (that some would initially think is superfluous) could actually have a positive effect on the user’s experience.

  9. no error should be fatal: offer clear signposts and ways to start over efficiently, auto-populate form data when possible

I have not put much thought into this yet, but I can see how one fatal error would lose a user for good. As for auto-populating, that is helpful most of the time. I always appreciate when “United States” is listed at the top of a dropdown that is otherwise alphabetical. I hope for users in other countries, the designers make their country rise to the top based on IP address or something, especially for our friends in Zambia.

  10. provide help, but design so no one needs it (and assume no one will read it)

But what help will the user need? How many avenues of help are enough? Chatbot features can sometimes be helpful, but they have a history of being frustrating too. Not to mention they can clutter a screen. I still appreciate a live chat with a human assistant, but those are not always practical. FAQ pages are hit or miss; the most thorough can be too long to work through, and the more basic will miss too many common issues. User input is essential to determine where the help should be provided.
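
The country dropdown idea from the “no error should be fatal” note above can be sketched in a few lines of Python. This is only an illustration, assuming the site has already guessed the user’s country by some means (for example an IP lookup, which is not shown here). The function name `order_countries` and the short country list are hypothetical.

```python
def order_countries(countries, detected=None):
    """Return countries alphabetically, with the detected country moved to the top."""
    ordered = sorted(countries)
    # Only promote the detected country if it is actually in the list
    if detected in ordered:
        ordered.remove(detected)
        ordered.insert(0, detected)
    return ordered


# Hypothetical short list, for illustration only
countries = ["United States", "Canada", "Zambia", "Brazil"]
print(order_countries(countries, detected="Zambia"))
```

If no country is detected, the list simply stays alphabetical, so a wrong or missing guess never hides an option from the user.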

UX Project: Fully Functional Demo Website Coded by ChatGPT

Demo website created with ChatGPT; click image to view working demo.

I have never written a line of computer code in my life. I don’t know the first thing about the differences between HTML and Java, or any other software languages. But harnessing the power of ChatGPT today, I created the website pictured above in less than an hour. Mind you, this was not some drag and drop, customizable website template process like many website builders offer (Wix and WordPress, for example). I have used those without breaking a sweat before (this WordPress site you are reading is an example), but those are not coding efforts. What the free version of ChatGPT was able to create for me was functional HTML code generated completely from scratch (or rather, translated and generated from my written English language prompts). I then pasted the code into the free web “space” offered by W3Schools and voila! Operational website.

The UX class assignment I was working on gave me the option to create a mockup of a to-do list application/website (think basic non-operational image) or go a step further and play around with GenAI to see if I could create a working model. Despite my fascination with ChatGPT, and my frequent use and exploration of it over the past year as an educational tool, I was hesitant to use it for coding purposes. I was intimidated. A quick demo at the end of our online class meeting changed my posture though (thanks Franco and Dr. Pullman!).

Here is a summary of how I created my online to-do list website.

  1. Set up account at W3Schools (<5 minutes).
  2. Went to ChatGPT and prompted it as follows: “Please create a web page in html that includes a to-do list feature that allows the user to add items to one of three catagories, each of which has a column on the page. Name the three categories home, teaching, and scholarship. Provide me the html code.” (<5 minutes)  *I still have no clue what HTML code is or how it works. I only asked it to use HTML because I thought I heard someone say it is the easiest or most basic code to work with.
  3. Copied the code generated by ChatGPT, then went back to W3Schools, created a “New Root File,” and pasted the code into the text box. (2 minutes) *This was the only tricky step for me, and it was the step I was scared of before seeing my classmate do it. The user interface of W3Schools might make sense to people who know a little bit about coding and web design, but to me, a complete coding novice, it was not intuitive.
  4. Clicked “Run.”
  5. Previewed the newly born website.
  6. Went back to ChatGPT, asked it to make specific changes and rewrite the entire code. Then went to W3Schools, deleted the root file, and replaced it with the new code. (40 minutes) *I did this probably six or seven times, requesting one or two changes each time, and then testing it out.

          Here is a link to the entire conversation I had with ChatGPT if you want to see all of the prompts I gave it. Note: I do communicate with the machine as if it is a human assistant/tutor/student, so I typically use pleases, thank yous, etc. I do this mostly because I don’t want to develop less polite communication habits, not because I think of the machine as a person, but I don’t claim to be completely immune to the ELIZA effect either.
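
For readers curious about the logic behind a page like this, the core idea (three named categories, each holding its own list of items) can be sketched in a few lines. This is a hedged illustration in Python rather than the HTML that ChatGPT actually produced, which is not reproduced here. The class name `TodoBoard` and its `add` method are hypothetical.

```python
class TodoBoard:
    """A to-do list with one column of items per category."""

    def __init__(self, categories):
        # Each category name maps to its own (initially empty) list of items
        self.columns = {name: [] for name in categories}

    def add(self, category, item):
        """Add an item to the named column, rejecting unknown categories."""
        if category not in self.columns:
            raise ValueError(f"Unknown category: {category}")
        self.columns[category].append(item)


# The three category names come from the prompt above; the items are made up
board = TodoBoard(["home", "teaching", "scholarship"])
board.add("teaching", "Grade essays")
board.add("home", "Buy groceries")
print(board.columns)
```

A real web page would also need to draw the columns on screen and respond to button clicks, which is the part the generated HTML handled.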

Key Takeaway

When I found out last year that ChatGPT could write functional code when prompted by English text, I had a hunch that English (or any common spoken/written human language) had just become the new computer language. But I then read somewhere (wish I could remember where I saw it) that English could never really replace the learned skill of computer coding because the English language is too ambiguous, vague, and complex to work on the logic level of a computer. That sounded plausible, so I shelved my hunch and left it for another day. Well, today is that day. I am now convinced that the “skill” of learning, understanding, and writing computer code can be almost completely replaced by the combination of GenAI and effective English prompt engineering. I do understand that there are still critical reasons why some people need to understand and practice coding, but the number of those people will drop precipitously now as non-techies like me partner up with GenAI to do what most of us never dreamed we could do in a digital space, at least not without hours upon hours of instruction in a host of computer languages. The new computer language is indeed English: specifically, concise English delivered repeatedly and patiently to a non-human audience with superhuman coding abilities. Whoever said English wouldn’t replace coding languages because it is too ambiguous, vague, etc. failed to realize that users of English can be insightful, creative, and persistent enough to overcome those inherent problems with the language. They also failed to realize that generative AI is (often, but not always) incredibly competent at making sense out of human language inputs.

UX Takeaways

ChatGPT and other GenAI tools make the designer’s job way easier, but only if that designer can communicate properly with the machine, has some patience, and recognizes that designing with AI as co-pilot is an iterative process.

A designer still needs a solid plan and then decent feedback to make appropriate improvements. AI can provide both if prompted correctly, but human feedback should always be sought to help the designer catch their own blind spots.