Prompt Writing vs. Prompt Speaking

When I was in high school in the late 1990s, cell phones were just becoming ubiquitous among my Generation X peer group. However, the way we used those early-generation phones to communicate did not look much like how most people use them now. Text messaging, in particular, only became realistically functional around 1999, when the feature started working across different networks, allowing one cell phone user to text any other regardless of whether their service provider was AT&T, Sprint, or another carrier. Yet even with its lower-than-calling cost, the new medium of texting did not catch on right away. I can distinctly remember sharing a sentiment with many others at the time: why would I text when I could just call? Speaking live to the other person just seemed so much easier, so much more efficient and timely, so much more…familiar. Phones, whether in a glass booth on a city street corner, attached to the wall of your house, or in your pocket, were for calling and talking.

For a while anyway, neither the early cell phone user nor the hardware was quite ready to usher in a significant replacement of live verbal over-the-phone communication with finger-pecked messages. The users of the time, still getting used to digital communication in the form of electronic mail (then e-mail, and finally email), had spent their entire lives understanding landline phones as talking and listening devices. Why tap out on a phone what could be much more easily typed and emailed from a computer? And the cell phones of the 1990s and early 2000s, without full alphabetical keypads, made texting a very clunky process.

Despite these challenges, the utility of the text message via cell phone ensured its survival, and as BlackBerry phones introduced full keypads (2002) and the iPhone introduced a full touchscreen keyboard (2007), the user experience for texting improved dramatically. By 2011, Pew reported that almost a third of cell phone users who texted preferred receiving and sending texts over calls. A little over a decade later, texting had become the preferred means of communication over calls, in-person conversations, video calls, and email. Writing (of a sort) actually defeated speaking.

The matter seems to be settled then, for now at least: texting is generally preferred as the means of communication between people today. But what about how humans interact with computers?

Virtual assistants in the form of Alexa speakers and Apple’s Siri have been around for over a decade, and they are technologies that work primarily via voice activation and verbal inputs from the user. Speech-to-text programs have also been around for quite some time, and their advancement has made it easier for many people to dictate text messages, “write” emails, and work in word processing programs like Google Docs and Word. For many people, typing textual inputs into their computers and devices is no longer preferable to speaking to those devices. The path to this point has not been without challenges, however.

The latest generative AI models like ChatGPT can also respond to text or speech inputs from users. Considering the rapid integration of generative AI into existing technologies like phones and personal computers, how people choose to interact with it will largely drive both how educators train people to use it and how developers adapt the user experience of their products.

As an instructor of writing, I decided a couple of years ago that generative AI might not be such a terrible disruption to my line of work –IF– the interface of ChatGPT and others like it actually forced users to write their prompts for the machines. In fact, like others, I saw prompt writing as a new and exciting way to teach both basic writing skills and critical thinking. After all, getting a generative AI model to do what the user wants relies heavily on clear written communication (along with persistence and revision and creativity). However, as these chatbots incorporate better and better speech-to-text capabilities, I wonder if the idea of prompt writing will soon lose out to prompt speaking, a reversal of how verbal cell phone conversations lost the battle against written text messages. If science fiction is any indication of what the future of human-computer interaction will look like (and sci-fi has quite the impressive track record), then keyboards could be on their way out. HAL and JARVIS and KITT have superior listening comprehension.

Already, some computer coders have turned away from typing natural language prompts into generative AI models in favor of what they are calling “vibe coding,” speaking their instructions rather than typing them. This does not seem like a good sign for my plans to teach writing through the lens of AI prompting, but we will just have to wait and see if speech prompting catches on and influences the way the tech industry updates AI experiences and interfaces for users.

Voodoo, Alchemy, and Incantations: The Enigmatic Nature of Generative AI

When it comes to AI (and particularly the generative AI of the 21st century), there are two disclaimers that should precede any discussion about how this computer technology functions:

  1. Not even the creators and top researchers of generative AI fully understand how the technology does everything it does.

and

  2. The field is advancing so quickly and drastically that what was true about AI one minute is quite often untrue the next.

Now, that second disclaimer means that at some point the first disclaimer may be rendered false and the experts will have it all figured out—maybe even with assistance from AI itself.

But even without a full understanding today, there are numerous resources which attempt to explain the basics of how generative AI (like chatbots and LLMs) can take a user’s natural language input (i.e. prompt) and produce a natural language output as a response. However, I think these resources (articles, videos, etc.) mostly fall short of boiling this process down into simple terms that non-computer scientists can understand. Depending on the particular explanation and the analogies used to describe the process, a learner might have to become familiar with a host of technical terms and advanced computing concepts if they are to reach a meaningful level of understanding.

Interestingly, generative AI itself can do a remarkable job of explaining complicated concepts in plain language suited for whatever audience a user specifies:

Note that even at the high-school level, ChatGPT has already introduced the ideas of neural networks and LLMs. These are not simple ideas in computer science, and understanding them requires grasping further underlying concepts. According to ChatGPT, understanding a neural network requires a basic knowledge of algorithms, machine learning, and mathematics like statistics and calculus.

One possible conclusion, then, is that a sufficient explanation of how generative AI works does not exist for the non-mathematician/computer scientist. It is just too complicated a technology. Asking a question like “how does ChatGPT work?” isn’t all that different from asking how a human being is able to verbally answer a question. There is just so much to it, and one lengthy article or video or even video series isn’t enough to cover all the necessary ground.

My own incomplete understanding of AI has been largely shaped by three books: Brian Christian’s The Alignment Problem (2020), Joseph Weizenbaum’s Computer Power and Human Reason (1976) and Yuval Noah Harari’s Nexus (2024). Combined, these books provide a decent primer on the history of AI and what its capabilities and limitations are, but I would not say that they sufficiently explain exactly how it works.

One source that I think does do an admirable job of explaining ChatGPT’s inner workings is “What Is ChatGPT Doing … and Why Does It Work?” by Stephen Wolfram. I came across it in early 2024, and while Wolfram explains much of the mathematics behind ChatGPT/LLMs without completely overwhelming the mathematically challenged reader, I find something even more intriguing about how he approaches the areas of ChatGPT’s functionality that experts still cannot explain.

Here are the key excerpts from Wolfram’s article:

“…when ChatGPT does something like write an essay what it’s essentially doing is just asking over and over again ‘given the text so far, what should the next word be?’—and each time adding a word.

One might think it should be the “highest-ranked” word (i.e. the one to which the highest “probability” was assigned). But this is where a bit of voodoo begins to creep in. Because for some reason—that maybe one day we’ll have a scientific-style understanding of—if we always pick the highest-ranked word, we’ll typically get a very “flat” essay, that never seems to “show any creativity” (and even sometimes repeats word for word). But if sometimes (at random) we pick lower-ranked words, we get a “more interesting” essay.”

Wolfram gives an example of what text looks like when GPT-2 (the 2019 model) acts as a basic predictor of the next most probable word:

The text quickly devolves into nonsensical rambling. However, when the randomness of the next word is increased slightly, the text becomes more coherent. Turn up the randomness (i.e., the temperature) a little higher and the text appears coherent and even creative, but turn the randomness up too high and it will again become nonsensical.
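For readers who want to see the idea rather than just read about it, here is a minimal, illustrative sketch of temperature-based next-word sampling. The toy vocabulary and probabilities are invented for the example; this is not how ChatGPT is literally implemented, only a small model of the "sometimes pick lower-ranked words" behavior Wolfram describes.

```python
import math
import random

# Toy "model" probabilities for the next word (invented numbers for illustration;
# a real LLM scores tens of thousands of tokens at every step).
next_word_probs = {"the": 0.40, "a": 0.25, "cat": 0.15, "quantum": 0.12, "banana": 0.08}

def sample_next_word(probs, temperature):
    """Pick the next word.

    temperature = 0   -> always take the highest-ranked word (Wolfram's "flat" essays)
    temperature ~ 0.8 -> mostly high-ranked words, with occasional lower-ranked picks
    temperature >> 1  -> choices approach uniform randomness (nonsensical rambling)
    """
    if temperature == 0:
        return max(probs, key=probs.get)
    # Rescale each probability by the temperature, then renormalize and sample.
    scaled = {w: math.exp(math.log(p) / temperature) for w, p in probs.items()}
    total = sum(scaled.values())
    words = list(scaled)
    weights = [scaled[w] / total for w in words]
    return random.choices(words, weights=weights)[0]

for t in (0, 0.8, 5.0):
    print(t, [sample_next_word(next_word_probs, t) for _ in range(8)])
```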

The fact that Wolfram calls this randomness factor “voodoo” is striking—and not just for its cultural insensitivity. We expect more concrete answers and explanations from the experts. Wolfram does a fine job of breaking down what can be said of the operations of ChatGPT, but it’s what the experts cannot explain that interests me the most.

A March 2024 article from MIT Technology Review digs into the mystery of LLMs further. Here are some relevant quotations:

“a remarkable fact about deep learning, the fundamental technology behind today’s AI boom: for all its runaway success, nobody knows exactly how—or why—it works.”

Mikhail Belkin, a computer scientist at the University of California, San Diego: “But our theoretical analysis is so far off what these models can do. Like, why can they learn language? I think this is very mysterious.”

“rapid advances in deep learning over the last 10-plus years came more from trial and error than from understanding.”

“It works, which is amazing. Our minds are blown by how powerful these things are”… And yet for all their success, the recipes are more alchemy than chemistry: “We figured out certain incantations at midnight after mixing up some ingredients,”

The major unknowns and mysterious randomness in generative AI operations ought to change the way some people (mis)understand the technology. If LLMs and chatbots were simply operating on basic probability, they would be easier to understand, easier to explain, and consequently, easier to dismiss as fancy calculators. The fact that they are not easy to understand or explain tells me that the people who do dismiss them as stochastic parrots may be underestimating an alien/artificial intelligence that now powers, distills, and summarizes an ever-increasing share of our daily internet searches for information.

If generative AI poses legitimate risks and threats (it does), if it creates problems (it has), then the best solutions and mitigation strategies ought to be based on the most complete understanding of the technology. That is far less likely to happen as long as prominent critics mistakenly believe they are dealing with a parlor trick instead of the complex enigma that is increasing in its reasoning capabilities with each passing day.

Thoughts on the Commonplace Book

This week’s ideas about commonplace books lead me to reflect on a book that I read recently. William Poundstone’s Head in the Cloud (2016) examines the value of knowing facts in a world where we can quickly Google or search for (and find) answers to everything.

To summarize one of the book’s main points, Poundstone thinks that even though technology does and should reduce our need to memorize facts, we still need some minimum level of memorized facts in order to make sense of the world around us and function well in society. This makes sense to me, and I feel like the base level of factual knowledge that I see from my college freshman students has declined over the past 15 or 16 years. Of course, there is also the cognitive bias of declinism, which I know could be impacting my perspective, but Poundstone’s research does provide data that affirms my anecdotal observations.

Maybe today’s technology affords everyone the choice to use their brain space for something other than fact storage. And maybe this helps to explain why the commonplace seems less common and why we no longer share a set of cultural touchstones or a cultural canon the way we once did. However, I think there is something else we should consider as responsible for the disappearance of the commonplace.

Social media and the nature of new entertainment distribution give people such a wide variety of content to choose from that it feels more difficult to connect with others the way we (Americans) used to. Influential algorithms certainly play a part in customizing our media and entertainment consumption in the 21st century, but individual choice within the menus that the algorithms provide each consumer still seems to allow far greater variation in a single person’s reading/viewing/listening profile than was the case a few decades ago.

It is worth accounting for the “viral” nature of the most widely shared internet content like memes, gifs, and TikTok videos. Could these be forming a 21st-century commonplace? I think that is possible, but there is something so short-lived and ephemeral about this kind of content that I wonder if a commonplace based on such a corpus is distinctly different from what previous generations shared.

I presented this idea to ChatGPT (after I wrote the above reflection) and its response is below. I like the observation that one’s social media account and corresponding history of shared and liked posts could be considered their commonplace book. Since I do not participate much in social media, this did not occur to me, but I do keep my own version of a digital commonplace book on YouTube, saving and organizing content that I intend to come back to. I suppose my email folders serve the same purpose at times.

ChatGPT:

The concept of the commonplace has a long history rooted in rhetorical tradition, particularly in ancient Greek and Roman education. A commonplace (locus communis in Latin) traditionally refers to a shared idea, theme, or argument that is widely recognized and can be used as a foundation for further reasoning, debate, or persuasion. In classical rhetoric, students maintained commonplace books, which were collections of quotations, maxims, or arguments that they could draw upon when composing speeches or writing.

The Traditional Commonplace

Historically, the commonplace served multiple functions:

  • Memory and Invention: It allowed speakers and writers to store useful ideas for future use.
  • Shared Cultural Knowledge: It provided a repository of widely accepted wisdom, strengthening arguments by appealing to common values.
  • Pedagogical Tool: It was used in rhetorical education to help students develop their ability to analyze and deploy persuasive ideas.

During the Renaissance and Early Modern period, commonplace books became highly personalized and were often filled with excerpts from literature, philosophy, and even scientific texts, reflecting the individual’s intellectual pursuits.

The 21st-Century Commonplace

While the fundamental idea of collecting, organizing, and drawing from shared knowledge remains, the digital age has transformed the commonplace in several key ways:

  1. Digital Curation Over Personal Notebooks
    • Instead of handwritten journals, people now use digital platforms like blogs, Twitter, Pinterest, or Notion to collect and organize knowledge.
    • Algorithms curate information, shifting from personal selection to automated suggestion.
  2. Fragmentation and Speed
    • Traditional commonplaces were structured collections of ideas meant for deep reflection; today, social media and digital reading encourage more ephemeral and rapid engagement with ideas.
    • Memes and viral trends serve as modern commonplaces, often reducing complex ideas into highly shareable but simplified forms.
  3. Community-Driven Knowledge
    • While historical commonplaces were curated individually, modern knowledge repositories (Wikipedia, Reddit, online forums) are collectively built, making the process of knowledge-gathering more communal.
    • Hashtags and online movements function as dynamic, evolving commonplaces.
  4. Multimedia Integration
    • Unlike the text-based commonplaces of the past, today’s versions incorporate video, audio, and interactive media, expanding the ways in which knowledge is stored and transmitted.

Fundamental Differences?

Despite these changes, the core purpose of the commonplace—to collect, store, and reuse knowledge—remains intact. However, some fundamental shifts have occurred:

  • Authority and Personalization: In the past, commonplace books often relied on established authorities, whereas today, the sources of knowledge are more democratized (but also more prone to misinformation).
  • Ephemerality vs. Permanence: Digital platforms prioritize trending ideas that can disappear quickly, whereas historical commonplaces aimed for long-term intellectual reflection.
  • Interactivity: The modern commonplace is more participatory, with collective engagement and commentary shaping the content in real-time.

A Literacy Narrative, Interview Style

The following questions and answers detail some of my thoughts and experience with reading, writing, and AI (which I have mostly interpreted as generative AI). I present this in a rather informal Q&A format, but to see how ChatGPT attempted to weave the same information into a more cohesive narrative, here is the link:

https://chatgpt.com/share/678a72be-1d2c-8004-8539-5a6595b495f8

I think the language of my responses below better captures the essence of my personality and experience, but the ChatGPT-constructed narrative does do more to create cohesion, as my prompt specifically instructed.

For what it’s worth, ChatGPT identified the central theme of the narrative as “the transformative power of language—how writing, reading, and thinking shape personal identity, intellectual growth, and teaching practices. It explores how these elements intertwine to create a lifelong journey of discovery, reflection, and connection, both with oneself and the broader world.”

Writing

What is your earliest memory of writing?

I remember writing “themes” in high school, probably 10th grade, late 90s. These were short essays on a given topic or with a given form. I don’t remember completing specific steps of the process, but they were handwritten. At least some of the process was done in class. Computers weren’t really an option from what I recall—not in the classroom. I don’t remember much writing on computers until college.

What is your earliest memory of being taught to write?

I remember learning to write in cursive when I was in third grade. It was difficult for me, and being left-handed seemed to make it worse. The pencil or ink would smear across the page and get all over my hand, which had nowhere to go but across the freshly scribbled letters.

*Note to self: It takes a bit of mental energy to dig around the closet of my mind’s memories to find this kind of stuff. I don’t poke around in this closet much—probably because the memory of learning to write might reside on top of some poorly hidden metaphorical skeleton.

Did your parents read to you?

I remember my mother reading to me. My grandmother and aunt more so. I cannot remember my father ever reading to me.

Did they have books lying around the house?

A few here and there, self-help type books maybe, but I rarely saw them read.

Did you have an influential school teacher and if so, what do you remember of them?

Since I have reading and writing on the brain, I do remember that my fourth-grade teacher, Mrs. Reif, was a fantastic out-loud reader of fiction. She helped stories come alive and probably helped influence my relationship with reading/literature.

What part did reading play in your development as a writer?

Dr. Seuss and Shel Silverstein books gave me an appreciation for verbal gymnastics and rhyme and meter when I was young. I have written a few short pieces of poetry in my life, and in some distant way, those children’s books influenced them all. Later in life, I was drawn to The Divine Comedy for both its story and style—same with Homer’s epics. If I write creatively (which is rare), then those works are serving as some kind of inspiration and maybe even archetypes.

Did you ever read a precis or watch a movie instead of reading the book?

Hell yeah, but I would prefer the gift of uninterrupted time and quiet brain space for reading the book.

Did you have an influential professor and if so, what do you remember of them?

Undergraduate: I had a film professor who I liked, maybe because we just got to watch movies most of the time and talk about them instead of the harder work of writing papers. He may have planted a seed in my mind that the job of a teacher/professor might be a decent way to make a living. He seemed pleased enough to be getting paid for what he was doing. I didn’t really understand the other responsibilities of a higher-ed professor though.

Before I committed to a concentration at GSU, Dr. Pullman convinced me to choose rhet-comp instead of literature and that I would have an avenue to study AI. He was right, and I am thankful for his influence.

Describe your writing processes and behaviors.

Part of my process is to internally think, think, and think some more. This often resembles what I would call an internal dialogue with myself. I would like to think of it as a dialectic where I am desperate to uncover the information I need to write and to make appropriate/logical/original connections between that information and prior knowledge. It’s getting harder and harder to compartmentalize my writing project(s), and they tend to dominate my brain space. Honestly, I have had dreams recently where the topic of my writing/thinking has been the subject matter, and when I wake up, I even feel like I have worked something out while I slept. I really wish I could do more pre-writing in my sleep for no other reason than efficiency, but I am impossibly far from realizing that. Other than that, my processes vary and can be either relatively linear or more scattered. I sometimes feel like I get trapped writing a never-ending introduction as my thinking faucet remains open and active as I write.

How do you teach writing?

Frequently. That is the short answer. It depends on the course though. For co-req/remedial English, I start by focusing on writing at the sentence level and then expand to the paragraph and essay, and then to the research and documentation components of academic and/or persuasive writing. That’s the sequence, anyway.

For second-semester freshmen, I am taking a different approach this semester. I am having them write one short, first-drafty piece every week that responds to an idea or quotation from the assigned course text (a self-help type book). I hope to teach them something about writing as we look at their work together as a class.

Reading

What is your earliest memory of reading?

Jurassic Park when I was in 5th grade—at the community pool in the summer. What a book.

What part does reading play in your writing?

Reading the way I do currently is probably way too analytical and inefficient, but the thinking patterns I develop doing that might be beneficial as a writer because I am used to mentally and linguistically trying to, for lack of better terminology, figure stuff out.

Have you ever read a summary or watched a movie instead of reading an assigned book?

I wouldn’t have my degree in English if I didn’t. I don’t read efficiently enough, and I didn’t love most of the assigned literature enough to give it the time and attention it probably deserved.

Does listening to someone read a book provide you with the same experience as reading it yourself? Elaborate.

No, not at all. Two very different experiences. When I read, I am more in control of the pacing, which allows for more internal dialogue with myself about what I am reading. I also take notes when I read. I do not do that when I am listening, although I probably could if I really wanted to. Listening would be more efficient for me, but it would not allow for the same depth of connections as reading does.

Opining

Do you have an opinion about the present state of the economy?

I am trying not to. I only have so many effs to give right now, and I am actively trying to give up politics. The state of the economy is too close to politics for me. I recognize that I am very privileged to be able to tune out like this. I am mid-way through a stable career that pays the bills. I choose to teach partly as insulation from concerns about the economy. Since I am studying AI, I feel like I might be further insulating myself (at least a little bit) from the potential negative impact AI might have on employment in higher ed. That’s what I have to tell myself anyway.

Do you have an opinion about NATO?

It seems to have served an important purpose of thwarting expansionist dictatorial powers. That seems worth preserving to me, even at a significant financial cost to the US.

Do you have an opinion about the relationship between sacrifice and faith?

I have plenty of both satisfying and unsatisfying opinions about faith. I suppose if you choose to have faith in one thing (particularly one conception of God), you could be sacrificing the chance to have faith in another–depends on how all-encompassing your conception of God is though. The faith I wrestle with for myself requires an acknowledgement of sacrifice made on my behalf and the duty to love sacrificially, so I see a strong relationship between the two ideas.

Do you ever use words you can’t define?

Yuval Noah Harari talks about how difficult it is to define fundamental concepts. I use words like love, faith, AI, writing, literature, American, and learning all the time, but I cannot define them as clearly as I wish I could.

How comfortable are you with saying, “I don’t know.”?

It depends on the day, but I hope that I am getting more comfortable with it as time passes and I learn just how much is out there that I don’t know.

Thinking

Do you have an internal monologue?

I do, and I want to learn more about it. The theory of linguistic determinism says that the internal dialogue I have with myself in English could completely shape and constrain my thoughts. I know alternative theories say otherwise, but they still indicate a relationship between thoughts and language worth exploring and understanding.

Do you ever edit as you narrate to yourself?

Sort of. I think there is constant editing going on. It might be very similar to how LLMs predict the next word. I have to shuffle through my vocabulary to choose the right word in real time, so there could be considerations taking place that one could call edits.

What is it like to think wordlessly?

Probably some kind of combination of internal visual and emotional processing would be my guess. I am pretty sure my dreams are devoid of actual language, so there may be some wordless thinking going on there.

Is there a difference between rationalizing and thinking?

Rationalizing seems like a specific application or direction for thinking.

AIing

Have you used AI?

Yes. Specifically ChatGPT, claude.ai, and perplexity.ai, but AI is so embedded within other technologies that I probably use other forms without even knowing it.

How many times?

Intentionally, at least a few times every week.

Toward what ends?

I am using it (perplexity.ai especially) much more often as a search/research tool. I used to use it more experimentally out of curiosity.

Current thoughts about it?

It is terrifying and fascinating. Knowing what many of the experts say about its destructive potential and its harmful present impacts on the environment, labor forces, and artists, I wonder if using it (even for educational reasons) could be contributing to a world that doesn’t align with my own values.

Final Project: ChatGPT UX Case Study

Introduction and Project Focus:

Perhaps it’s a fool’s errand to attempt to improve the design and user experience of the fastest-growing app in history. ChatGPT reached 1 million users within 5 days of its release and 100 million users in just two months. Some “if it ain’t broke, don’t fix it” wisdom might seem to apply to this wildly successful product, but UX experts Gavin Lew and Robert Schumacher issue an important warning in their 2020 book AI and UX. According to them, users who encounter problems early in their use of any product are likely to abandon it, often never to return. Therefore, if OpenAI (the company behind ChatGPT) wants to hold on to its massive user base, attract and keep new users, and (equally importantly) transition users to its upgraded paid version, it is imperative that the ChatGPT 3.5 generative AI model deliver an accessible design and experience without any frustrating flaws. But to create the most desirable experience, UX designers must first figure out what their users are seeking and what exactly they need and desire from the product.

For this study of ChatGPT, the issue of use cases is the first area of focus. Beyond addressing the UX problems related to serving the most appropriate and popular use cases, two other aspects of the ChatGPT experience merit attention: interface design/functionality and response accuracy. Informed by industry research, academic research, and focus group testing, this case study presents analysis and recommendations for ChatGPT’s improvement related to use cases, interface design/functionality, and response accuracy.

Competitive Analysis:

A recent (February 2024) competitive analysis study from Copenhagen Economics looks at the current state of generative AI:

https://copenhageneconomics.com/wp-content/uploads/2024/03/Copenhagen-Economics-Generative-Artificial-Intelligence-The-Competitive-Landscape.pdf

Included in the report’s executive summary is the following:

“This white paper provides a preliminary review of current developments in the GenAI space and the emerging implications for competition. Further research is needed to shed additional light on the evolving competitive situation in GenAI, as well as any case-specific matters, particularly as this is a fast-evolving sector.”

Also included is this analysis of the most notable generative AI models:

“First, most foundation model developers (e.g., OpenAI, Anthropic, Mistral) already provide their own user-facing applications. At the same time, many foundation model developers offer rival developers the ability to build applications on top of their existing models via open models or closed APIs (e.g., Jasper.ai writing assistant built on OpenAI GPT models).

 Second, players in different levels of the value chain often specialise in specific domains (e.g., BloombergGPT in finance, or Isomorphic Labs in the pharmaceutical drugs discovery process) or tasks (GitHub coding copilot, or Perplexity.ai in search). The success of these models suggests that specialisation is a viable business model.”

The second observation is especially important for this ChatGPT case study because it reveals a difference between models that serve specific domain purposes and those (like ChatGPT, claude.ai, Llama 2, and Gemini) which function more generally or without strongly defined purposes.

There are also a number of comprehensive comparison guides for generative AI models, and this February 2024 guide does an effective job of assessing a handful of the most common use cases:

https://www.zdnet.com/article/best-ai-chatbot/

This comparison guide offers valuable information for the UX researcher by pointing out some of the pros and cons of each model. Its analysis suggests that ChatGPT 3.5 lacks two distinct features that some other free generative AI models offer: a live connection to the internet and the ability to upload files for analysis. While the lack of these features may be hurting ChatGPT 3.5’s ability to compete with other free generative AI models, the paid subscription upgrade (GPT-4) does offer both internet connectivity and the ability to upload files for the system to read. From a profit standpoint, it may make sense for OpenAI not to offer all of its available features in the free version, but if other companies are offering them at no charge, then those companies are positioning themselves to deliver a superior overall user experience among free generative AI models. To remain competitive, ChatGPT 3.5 should incorporate internet connectivity and file uploading. However, OpenAI could limit file sizes and the number of responses that draw on live internet access in order to maintain an incentive for users to upgrade. It seems important to at least demonstrate for users all that the upgraded version of a product is capable of.

Issue #1: Identifying Use Cases

Understanding a product’s primary use is key to designing the best experience for the user. Designers are used to having an end goal or ultimate purpose in mind when they seek to improve how a particular product works. If the product is a shovel, the designers know to work on creating the best digging experience for the target users. Understanding that products could have secondary or multiple uses is important as well, and sometimes those uses must be factored into design. In the case of a Swiss Army Knife, the designers considered a number of different uses for their product and then chose to accommodate all of those uses within a compact device.

ChatGPT and similar generative AI tools are somewhat unique as human-created technologies because, unlike most tools, which are created to solve defined problems, generative AI is more like a solution looking for problems to solve. This is evident in the user interface, which currently provides users with four different suggestions for how it can “help you today.”

 
 

Google’s generative AI model Gemini presents users with a very similar set of use options. While these options may change daily or each time the tool is accessed, this example from Gemini shows that two of its four options are related to cooking. This makes it appear as if Gemini is limited in what it can do. Even the first and fourth suggestions, about brainstorming presentation ideas and outlining a logical sales pitch, are similar.

Many first-time users of ChatGPT or other models are likely accessing their chosen tool with no defined purpose, so it may be helpful to include some very specific options for how the product can be used. However, the options provided on the webpage may not be the best strategy for engaging curious explorers. The current suggestions give some sense of what ChatGPT might be able to do, but they are also overly specific. A better option might be to highlight the broader tasks that ChatGPT can help with, along with multiple examples of each task (instead of just one), and then have the tool initiate the chat and help the user narrow down to something specific they want or need help with. Here is a mockup:

Issue #2: Interface Design and Functionality

Method for testing interface and functionality:

A group of 12 first-year college students used ChatGPT for a prewriting activity designed to help them explore a topic or tentative thesis statement. The use case was thus pre-determined and defined for them, although they had a choice of which initial prompt to use.

The demographics of the student test group are reflective of the college they attend: racially diverse, majority female, and mostly traditional college students aged 18-24. While different groups of users will have different needs and expectations of ChatGPT, this group is a likely target audience for generative AI models, as they may learn to use the technology in college and then bring their knowledge and use of generative AI with them as they enter their career fields.

Prompt options for the product test:

Prompt for exploring a topic:
Please help me prepare for writing my persuasive essay about [insert topic]. Ask me questions–one at a time–that will help me come to a thesis, understand the strengths and weaknesses of my position and any opposing view(s).

 

Prompt for finding and understanding potential credible sources:
Can you point me to some prominent researchers who have published on [insert topic] and very briefly summarize their positions?


Prompt for testing a tentative thesis when you have a guess of what your position might be:
Let’s engage in a dialectic exercise. I will present my tentative thesis, and then you will take the role of Socrates and ask me yes or no questions (one at a time) to expose any potential flaws or contradictions in my position.    

Data Collected: Students submitted their ChatGPT pre-writing activity conversations in the form of sharable links provided by ChatGPT. The user interface of ChatGPT does include a helpful “share” button at the top right of the screen (desktop version).

This share button creates a special link that captures an entire chat session that others can view without having to access the original user’s account. The user can also choose whether to remain named or anonymous in the shared version of the chat. Users in this case study generally found the share feature offered by ChatGPT to be easy to locate and use.

The researcher also conducted a focus group to gather the students’ opinions and insights regarding the design and user experience.

The full ChatGPT conversations collected from students are of limited value for UX purposes, but from a qualitative perspective they do reveal something important. Each student prompted and replied to ChatGPT in very different ways. Some used more formal, properly punctuated language while others were far more informal, using abbreviations and texting language. Despite this, the responses from ChatGPT did not show variation in tone and style. It might make sense to have ChatGPT adjust its formality according to user inputs—or make the level of formality an adjustable option for the user.
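As a rough illustration of how such a formality option could work behind the scenes, here is a minimal sketch using the OpenAI Python API. The formality levels, their wording, and the helper function are hypothetical additions for illustration only; they are not an existing ChatGPT setting.

```python
# A minimal sketch of a user-adjustable formality option, assuming the openai
# Python package (v1+) and an OPENAI_API_KEY in the environment. The formality
# levels and instructions below are hypothetical, not an actual ChatGPT feature.
from openai import OpenAI

client = OpenAI()

FORMALITY_STYLES = {
    "formal": "Respond in a formal, academic register with complete sentences.",
    "casual": "Respond conversationally; contractions and informal phrasing are fine.",
}

def ask(prompt: str, formality: str = "formal") -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": FORMALITY_STYLES[formality]},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

# The same question could then be answered in either register.
print(ask("can u explain thesis statements rly quick", formality="casual"))
```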

Focus Group Discussion Summary and Interface Recommendations:

  • A majority of the students reported using some type of generative AI before, typically for academic reasons, but none of the students were regular users except for one who said she used Snapchat’s AI feature. This particular student remarked that she liked the Snapchat feature better than ChatGPT because the Snapchat AI did not decline to answer questions like ChatGPT does and Snapchat’s AI was not as “neutral.”
  • Students found the website for ChatGPT easily and reported no problems with creating an account. Some reported that creating the account/signing in using their existing Google account made things easier.
    • Recommendation: Keep the simplified sign-in process for Google users.
  • Initial impressions of the interface were mixed, with some liking the “clean” layout and others calling it “simple in a bad way.” Some users felt like they did not know exactly what to do first within the site. Based on some users’ confusion, the layout would not score extremely well on a five-second test.
    • Recommendation: Reduce confusion about what users should do first by having ChatGPT start the conversation with a letter-by-letter appearance of text instead of the static “How can I help you?” message. This text would automatically open the chat conversation instead of waiting for the user to initiate the first response.
  • Some students appreciated the ability to change their personal settings to show the site in dark mode, but most students did not seek or find that option. Some felt that the dark mode option from their browser was sufficient.
    • Recommendation: Add a simple dark mode toggle switch at the top of the interface:
  • All students accessed the desktop version of the tool, but many said they would consider using the mobile app.
  • About half of the students did notice the accuracy disclaimer at the bottom of the screen, and some students agreed that a pop-up notification at the start of a chat would be better because the disclaimer is not highly visible.
    • Recommendation: Increase font size of disclaimer and add disclaimer as a pop-up message at the beginning of a chat session. Include an option on the pop-up that says, “Do not show this message again.”
  • Students commented that the letter-by-letter appearance of the text is “cool” and makes it feel like “someone is typing to you.”
  • Students generally agreed that the ability to verbally enter prompts and responses would be a good feature to add. Most did not realize that they could use certain internet browsers for that same feature.
  • Almost all students found it easy to get responses from ChatGPT, except for one student who had issues when using a Safari browser. When she switched to Chrome, that resolved her issues.
    • Recommendation: Improve the tool’s functionality in Safari or stop offering the tool through Safari altogether. No experience may be better than a frustrating experience in this case.
  • Students generally had no issues with the speed of ChatGPT responses, but some thought the responses were too long.
    • Recommendation: Have ChatGPT ask users if its responses are sufficient in length during the chat.

 

Issue #3: Accuracy of Responses

There are a handful of reasons why potential users of generative AI choose to steer clear of the technology. Concerns about both intellectual property rights and bias related to the training of these models are enough to keep some away completely. A more pressing concern for UX designers is the accuracy of responses generated by the different models. For any users on the fence about trying ChatGPT, one inaccurate response may be enough to cement their opinion of the tool as unhelpful or even dangerous. Because many users treat ChatGPT as a source of information on a range of topics and as an alternative to Google search, it is essential that responses include accurate information and avoid the so-called “hallucinations” and misinformation that harm the tool’s credibility and lead to the frustrating experiences that drive users to alternatives. Furthermore, since ChatGPT 3.5’s training data cuts off at January 2022 and it lacks an internet connection, it will not usually provide information about events after that date. If it does, then it is most likely providing misinformation because it has no way of accessing new information. Other free models (like perplexity.ai, for example) do include internet connectivity and are able to pull up-to-date information, giving them a major edge over ChatGPT 3.5.

Industry studies and scholarly research point toward improvements in accuracy for generative AI. One 2023 study by Walters and Wilder found that GPT-4 produced significantly fewer false citations than ChatGPT 3.5: 55% of the sources created by ChatGPT 3.5 were found to be fictitious, whereas only 18% of those created by GPT-4 were. It is likely that GPT-4’s access to the internet contributed to this improved accuracy, as the model was better able to verify the existence of sources. This is further reason to make internet connectivity a priority change for ChatGPT.

Other industry studies also point to rapid improvement in generative AI accuracy.

Originality.ai released data in February 2024 showing GPT-4 outperforming ChatGPT 3.5 in a fact-checking task (Gillham, 2024), and Anyscale released data indicating that GPT-4 outperforms both ChatGPT 3.5 and humans in identifying accurate summaries of news reports (Kadous, 2023). This is good news for the user experience, but generative AI already carries a significant stigma when it comes to accuracy. Some misinformation that makes its way into generative AI responses is a result of the program simply trying to execute a prompt without taking into account any need for accuracy, but other inaccuracies can be the direct result of biases that influence outputs.

Since the optimal user experience of ChatGPT includes outputs with accurate information, designers need to ensure the model’s training data is more representative of marginalized online communities and cultures. This means finding better-quality data than the primarily English-language swaths of the internet collected by Common Crawl (Baak, 2024).

Another source of bias that generates less accurate information is ChatGPT’s custom instructions feature, which allows users to tailor the responses they receive based on their own preferences. For example:

A response from ChatGPT based on these instructions thus becomes biased and will ignore certain information, making the responses potentially less accurate.

 

Although these kinds of tailored responses do create a preferred user experience, in the long run the presentation of biased information to users (even if they ask for it) is going to hurt the overall reputation of ChatGPT and other generative AI models. This is an ethical problem for UX designers that deserves careful consideration. There may be ways to give users some of their preferences in terms of length and formality of responses without reflecting a user’s own bias back at them.

Conclusion

ChatGPT brought generative AI to worldwide attention in 2022, but today it faces strong competition from similar free (and freemium) models that offer important features ChatGPT provides only in its subscription-based GPT-4 model. There are a handful of minor changes that ChatGPT could make to its interface to enhance users’ overall experience, but if OpenAI chooses to keep ChatGPT 3.5 offline, it is taking a major risk. The level of misinformation in the responses of generative AI systems is a major concern to many, and users concerned with accuracy may well choose to experiment with more accurate, free, internet-connected models like Perplexity.ai and Gemini before they choose to pay $20 per month for that feature in GPT-4.

 

References

Baak, S. (2024, February 6). Training data for the price of a sandwich. Mozilla Insights. https://foundation.mozilla.org/en/research/library/generative-ai-training-data/common-crawl/

Gillham, J. (2024, February 2). AI fact checking accuracy study. Originality.ai. https://originality.ai/blog/ai-fact-checking-accuracy

Kadous, W. (2023, August 23). Llama 2 is about as factually accurate as GPT-4 for summaries and is 30X cheaper. Anyscale. https://www.anyscale.com/blog/llama-2-is-about-as-factually-accurate-as-gpt-4-for-summaries-and-is-30x-cheaper

Lew, G., & Schumacher, R. Jr. (2020). AI and UX: Why artificial intelligence needs user experience. O’Reilly.

Walters, W. H., & Wilder, E. I. (2023). Fabrication and errors in the bibliographic citations generated by ChatGPT. Scientific Reports, 13(1), 1–8. https://doi.org/10.1038/s41598-023-41032-5

Project Update: Reflecting on Data Collection Through a Focus Group Session

I would be willing to bet that I have participated in far more focus groups than the average person. When I was hustling to pay off my student loans in the 2010s, I discovered that the city of Atlanta had a slew of market research companies that would pay me for my time and opinions on a number of different products: restaurant foods, scratch-off lottery tickets, pilot episodes of television shows and commercials, deep sea fishing excursions, NASCAR race events, household appliances. Those are just some of the focus groups I remember participating in.

There is a bit of a dirty secret to my former success in selling my opinions to market research companies though, and I wouldn’t say that I am proud of it now. I had to lie…rather, I chose to lie in order to qualify for most of the studies that I participated in. If a market research company’s screener was recruiting subjects to taste a new menu item for a fast-food chain that sold seafood, then they needed to find people who ate at those establishments. They would call all the people on their database list of potential subjects and work to fill quotas that typically required some diversity in the group of subjects (age, race, gender, income, etc.), but the subjects all had to have one thing in common: they had to be a target audience for the product, likely users or consumers of the product. The market research companies didn’t want vegans in their Chick-fil-A focus groups, unless they were testing out a new non-meat option or something.

But there is a critical flaw in the way these focus groups operate. It is a problem that opens up an avenue for unscrupulous people like me (old me, anyway) to easily exploit the system. The company that wants the focus group data typically outsources the task to third parties. Company X hires a market research firm to collect data, and that market research firm often relies on another market research company to assemble the groups of participants. (At least this is how I saw it working in the 2010s.) The real breakdown seemed to occur in relying on the initial screeners to produce a group of people who fit the specifics that Chick-fil-A or its hired market research firm decided it wanted data from. The call screeners’ job was to find participants, and back then, I quickly realized that doing so was not always easy for them. Sometimes I would get through a 10-20 minute screening process (answering questions about demographics and purchasing behavior) only to have the screener say, “Sorry, you don’t qualify” or “You do qualify, but I already have enough males in their 30s.”

Experienced screeners worked more efficiently and could figure out if I qualified much faster by asking the right questions in the right order (obviously diverting from their scripted questionnaire). And the occasional desperate-for-subjects screener would be intentionally leading in their questioning: “Are you sure you haven’t bought a scratch-off lottery ticket in the past week?” I think the screeners were even financially incentivized to fill groups, earning bonuses for quantifiables that encouraged them to ignore quality. Either way, once I caught on to the different motivations of the parties involved, I started to play the game. I would be intentionally vague in my screener call responses, allowing “good” screeners to lead me to the desirable answers. Then I would show up to the focus group, participate as much as I could, offer legitimate and genuine feedback whenever possible, and then collect my check or gift card. Rinse, repeat.

My dishonesty mostly involved lying about my consumer behavior. It wouldn’t have been easy to lie about demographic information because much of that would be obvious when I showed up to the focus group and they verified my date of birth and address. My race would have been relatively obvious too, although one time my wife (accomplice and fellow liar) did get herself qualified for a self-tanner product and showed up as the only non-orange person in the group! I guess she didn’t think that one completely through, but she still got her honorarium.

After participating in a focus group, my wife and I would often discuss the performance of the focus group leader. As educators, we both knew what leading a group discussion entailed and we marveled at just how incompetent some of these market research professionals were at their jobs. There were also the rare few who were excellent, but the bad ones were always worth gossiping about. In my hubris, I almost always thought I could do better. Considering my level of experience as a participant toward the middle and end of my multi-year run as a frequent focus group imposter, I was probably somewhat right.

When it came time this semester for me to collect data for my current UX project, I thought about my experience with focus groups and decided that doing a dozen interviews at once would be an efficient way to go about this step. It was time for me to see if I really could run a focus group better than the professionals I critiqued years ago.

I was prepared. I had two pages of questions all related to the generative AI system that my students-turned-informal-research-participants had used just days prior. I also had 15 years of experience leading discussions on all types of subjects with diverse groups of students, including high schoolers, college students, and even incarcerated men in a state prison. I was ready to prove myself as a market research genius.

Ultimately, I failed at achieving true focus group leader glory for two reasons:

#1. It is difficult to keep the participants on topic. My list of questions was well thought out and logical. One question or set of questions logically led to another, but my research subjects were not privy to this planning and therefore jumped ahead and backwards many times. This made it harder to elicit the feedback I was seeking. Fortunately, I recorded the session so I could sift out the data, but I believe there were lost opportunities and lost data because of the way I failed to control and redirect the respondents at times. It was a balancing act because more feedback is generally preferable to no feedback, and I didn’t want to shut the respondents down.

#2. I found it extremely difficult to avoid leading my respondents toward my own opinions and observations regarding the user experience of the product (ChatGPT). Hubris strikes again. I could not remain an impartial enough recorder of responses. I saw opportunities to seek additional feedback from respondents that would confirm my own opinions, and I couldn’t help myself.

Despite these shortcomings, I did collect data that will allow me to better assess the user experience of the target product. If I could do the focus group again, I think I would start with more open-ended questions and allow the discussion to go where the respondents lead it, rather than trying to rigidly stick to the list of questions I developed.

Final Project Outline and Schedule: ChatGPT Student Users Case Study

Project Goal: This project seeks to investigate and analyze how students in a first-year college writing course interact with the ChatGPT website/app in the context of an educational assignment. From a UX perspective, the researcher wants to understand how the generative AI tool is used, whether students show variations in how they use the tool, and whether there might be changes that could improve the experience for student users.

 

Schedule:

(Completed 3/4/2024) Create and post project outline/schedule

(Completed 3/4/2024) Create questions for focus group discussion

By 3/12/2024: Update assignment instructions

3/12/2024: Provide instructions for student writing assignment that will incorporate ChatGPT usage

3/12/2024:  Students create OpenAI accounts

3/12 and 3/14: Students complete the pre-writing/invention assignment

3/14/2024: Students submit links to their ChatGPT conversations

3/19/2024: Conduct focus group discussion

3/20-4/15/2024: Synthesize data and findings; write report (Organizational Structure of Report: Explore, Define, Ideate)

By 4/22/2024: Post final report

 

Target Research Sample: ~12 ENGL 1101 students at GGC. 

GGC student demographic profile:

Application of student demographic information: The researcher believes the institutional data for the college is representative of the student sample he is working with. The researcher has deemed that the benefits of having more accurate demographic data for the research sample are not worth the costs (time and effort) of collecting that data. An attempt to collect it would be an additional burden and a privacy encroachment for the research participants. The risks of not collecting the data are minimal but could result in some problems with the interpretation of certain data that is collected and with the accurate representation of the student population of the institution. Generally, the researcher should take into account the diversity of the student population (specifically racial diversity and students with disabilities that impact learning) when designing the case study. Information on student socioeconomic status would be helpful as well because of correlations with access to technology. To that end, 2021 data shows that 62% of GGC students received Pell grants (compared to 47% at comparison group institutions). In sum, GGC serves a student body that is far more diverse and financially in need than the institutions in its comparison group.

https://nces.ed.gov/ipeds/use-the-data

Methods: Students will use ChatGPT for a prewriting activity that will help them explore their own tentative thesis statement or topic.

Prewriting Activity Prompts:

Please help me prepare for writing my persuasive essay about a significant problem for Generation Z. Ask me questions–one at a time–that will help me come to a thesis, understand the strengths and weaknesses of my position and any opposing view(s). The problem I want to focus on is…

–or–

Please help me prepare for writing my persuasive essay about a controversial issue in my field of study. Ask me questions–one at a time–that will help me narrow down my topic to a specific debatable question, come to a thesis, understand the strengths and weaknesses of my position, and understand any opposing view(s).

–or–

Please help me prepare for writing my persuasive essay about a social justice issue. Ask me questions–one at a time–that will help me come to a thesis, understand the strengths and weaknesses of my position and any opposing view(s). The problem I want to focus on is…

–or–

Let’s engage in a dialectic exercise. I will present my tentative thesis, and then you will take the role of Socrates and ask me yes or no questions (one at a time) to expose any potential flaws or contradictions in my position.

Data Collected: Students will be asked to submit their ChatGPT conversations in the form of sharable links provided by ChatGPT. The researcher can use this data for qualitative and quantitative analysis (number of replies and length of replies). Qualitative data will also be collected from a focus group discussion.
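
To make the quantitative side of that analysis concrete, here is a minimal sketch of how reply counts and average reply lengths might be tallied once a shared conversation has been pasted into plain text. It assumes a hypothetical format in which each turn begins with “Me:” or “ChatGPT:” (the same layout as the transcript later in this post); it is an illustration, not the actual analysis pipeline used in the study.

```python
# Minimal sketch: tally turn counts and average word lengths from a pasted
# ChatGPT conversation. Assumes a hypothetical plain-text layout in which
# each turn starts on its own line with "Me:" or "ChatGPT:".

def conversation_metrics(text: str) -> dict:
    turns = {"Me": [], "ChatGPT": []}
    speaker, lines = None, []

    def close_turn():
        # Store the finished turn (if any) under the current speaker.
        if speaker and lines:
            turns[speaker].append(" ".join(lines))

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Me:") or line.startswith("ChatGPT:"):
            close_turn()
            speaker, _, rest = line.partition(":")
            lines = [rest.strip()] if rest.strip() else []
        elif speaker and line:
            lines.append(line)
    close_turn()

    metrics = {}
    for who, texts in turns.items():
        counts = [len(t.split()) for t in texts]
        metrics[who] = {
            "replies": len(counts),
            "avg_words": round(sum(counts) / len(counts), 1) if counts else 0,
        }
    return metrics


if __name__ == "__main__":
    sample = (
        "Me:\nPlease help me prepare for my persuasive essay.\n"
        "ChatGPT:\nSure, what problem do you want to explore?\n"
    )
    print(conversation_metrics(sample))  # per-speaker reply counts and averages
```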

Focus Group Discussion Questions:

(Researcher will start with background info on research purpose, discussion of consent, and info about groupthink)

How would you describe your prior use of generative AI (like ChatGPT, claude.ai, etc.)?

              Never used it/Used once/Used a few times/Use regularly

              For what purposes?

 

Did you use the mobile app or desktop website?     Mobile / Desktop

What were your initial impressions of the appearance of the tool? Design? Functionality? Features?

              Notes:

 

How easy was it to…

find and access ChatGPT and create your account?  

Very Easy / Somewhat Easy / Somewhat Difficult / Very Difficult

Comments:

use ChatGPT to…

get responses (any responses—helpful or unhelpful)

              Very Easy / Somewhat Easy / Somewhat Difficult / Very Difficult

                             Comments:

share the link to your ChatGPT conversation?

              Very Easy / Somewhat Easy / Somewhat Difficult / Very Difficult

                             Comments:

 

What about…

Speed of ChatGPT responses?  

Scroll speed of the screen as ChatGPT responds?

Appearance of the interface? Colors, font size? Layout?

              Comments:

 

Additional features

              Did you notice the thumbs down button? Did you use it?

                             Did not notice /Noticed but didn’t use / Noticed and used

              Did you notice the message that says, “ChatGPT can make mistakes. Consider checking important information”?

              Noticed / Did not notice

              Comments:

Other features you wished the interface had?

              Comments:

Features that you wished the interface would change or remove?

              Comments:

Accessibility questions

How could the tool be better designed to improve the experience for people with disabilities?

              Comments:

Using ChatGPT for UX Persona Creation and Canva for Matching Persona Images

 
 
I used ChatGPT 3.5 (free version) to create some personas that could theoretically apply to my UX project. After that, I asked ChatGPT to help me think about ways I could apply the different characteristics from the personas to improve the UX design of ChatGPT’s own interface and functionality. Overall, I think the feedback I got from the machine was helpful for UX research. Some answers were a bit vague or obvious, but even the vague answers were often decent starting points. The full conversation is included below, so feel free to judge for yourself how helpful AI can be as a UX research assistant. Of course, any shortcomings of the machine’s responses may only reflect the shortcomings of the prompts and the prompt writer.
 
I also used Canva to create matching images (posted above) for the personas. Using image generation was a new experience for me, and I found the results impressive for a free version. I originally tried the Bing image generator without any luck at all. That one runs on DALL-E 3 and is supposed to be excellent, but I couldn’t get it to generate anything. With Canva, it did take a bit of work and multiple “re-generation” attempts for each image to come out looking realistic. The prompts that ChatGPT gave me to feed to Canva were too long, so I had to write my own. Here is what the one for the third persona looked like:
Realistic image of 35 year old male military vet in plain clothes sitting at a laptop.
That was a revised prompt. Since the first attempt did not include “in plain clothes,” all the images were of a man in military uniform. Of course, I could have just left out the “military vet” altogether, but I wanted to see if this extra detail would have an impact on the image. Maybe the physical stature of this persona is reflective of the military vet detail, just as the glasses on the last persona are probably reflective of the “visual impairment” detail that I included. When creating these images, issues with eyes and hands seem to be especially prominent (see hands on lower third image), but it is so easy to re-generate that a user can get as picky as they want–at least within the usage limits of the free version anyway. I suppose there is value in giving a persona a visual image for the researcher to connect with, yet I don’t see this serving my own research very much. It did give me decent visuals to help spice up this post though!
 
Here is a link to my ChatGPT session (better formatting), or you can read the basic full text below:
 

UX Persona Creation and Application

Jason Palmer, February 23, 2024
 
Me:
Please help me by creating four different personas for user experience/user design research. The product they are using is ChatGPT. They are all college students. Two of them are traditional and two of them are non-traditional students. Two of them have disabilities that impact how they learn and use technology. They are all students at a diverse four-year college in suburban Atlanta. Provide other useful characteristics, details, and insights for each of these four different personas: things like what they might use the product for, how they might go about using it, prior experience, etc.
 
ChatGPT:

Sure, I can provide some hypothetical personas for ChatGPT users. However, please note that the following personas are fictional and are based on general assumptions and typical user behaviors, not on real individuals.

  1. Traditional Student – Natalie:
  • Age: 21
  • Gender: Female
  • Major: Psychology
  • Background: Grew up in a suburban area, comes from a middle-class family.
  • Disability: Dyslexia
  • Tech Comfort: Somewhat comfortable, uses a laptop and smartphone regularly.
  • ChatGPT Usage: Uses ChatGPT for research and writing assistance, especially for tasks that involve extensive reading or writing. Finds ChatGPT’s summarization and paraphrasing features helpful.
  • Experience: Natalie has used language assistance tools in the past and is looking for a more versatile tool that can help her with her specific learning needs. She appreciates AI tools that can understand natural language and provide accurate and concise responses.
  2. Non-Traditional Student – Jason:
  • Age: 35
  • Gender: Male
  • Major: Computer Science
  • Background: Military veteran, came back to college after serving.
  • Disability: PTSD
  • Tech Comfort: Very comfortable with technology, has experience with coding and software development.
  • ChatGPT Usage: Uses ChatGPT to assist with coding tasks, such as generating code snippets or checking for errors in his code. Also uses ChatGPT for brainstorming and project planning.
  • Experience: Jason is familiar with AI and machine learning technologies. He values efficiency and accuracy in the tools he uses. He appreciates ChatGPT’s ability to understand technical language and provide relevant information.
  3. Traditional Student – Sarah:
  • Age: 20
  • Gender: Female
  • Major: Journalism
  • Background: Grew up in a rural area, comes from a working-class family.
  • Disability: ADHD
  • Tech Comfort: Comfortable with technology, uses a laptop and smartphone regularly. Struggles with focus and organization.
  • ChatGPT Usage: Uses ChatGPT for brainstorming and organizing ideas for articles and assignments. Finds the tool’s ability to generate ideas and provide structure helpful.
  • Experience: Sarah has used other productivity tools in the past but has found them to be overwhelming or too rigid. She appreciates ChatGPT’s flexibility and ease of use.
  4. Non-Traditional Student – Michael:
  • Age: 29
  • Gender: Male
  • Major: Business Administration
  • Background: Grew up in a suburban area, comes from an upper-middle-class family.
  • Disability: Visual Impairment (Partially Blind)
  • Tech Comfort: Comfortable with technology, uses screen readers and other assistive technologies.
  • ChatGPT Usage: Uses ChatGPT for research and writing assistance, especially for tasks that involve extensive reading or writing. Finds ChatGPT’s summarization and paraphrasing features helpful.
  • Experience: Michael has used other language assistance tools in the past but has found them to be less accurate or accessible. He appreciates ChatGPT’s ability to understand natural language and provide accurate responses.
 
Me:
Now provide a summary for persona #1 (Natalie) that explains how the UX designer of an updated version of ChatGPT could apply the characteristics of the personas (especially the disability information) to an updated version of the ChatGPT interfaces (both web and mobile).
 
ChatGPT:

Based on the persona of Natalie, a UX designer for an updated version of ChatGPT could consider several key characteristics to enhance the accessibility and usability of the product for users with dyslexia:

  1. Readable Text: Ensure that the text on the interface is clear, easy to read, and uses dyslexia-friendly fonts such as OpenDyslexic or Dyslexie. The designer can also consider options for adjustable font size and contrast settings to accommodate individual preferences.
  2. Intuitive Navigation: Simplify the navigation of the interface by using clear icons and labeling, and avoid using complex or ambiguous language that might be difficult for users with dyslexia to understand.
  3. Assistive Technology Integration: Make sure that the interface is compatible with screen readers and other assistive technologies commonly used by individuals with dyslexia.
  4. Language Processing Features: Enhance the language processing capabilities of the ChatGPT to provide accurate and concise responses that are easier for users with dyslexia to understand. This might include features like summarization and paraphrasing, as well as the ability to break down complex information into simpler terms.
  5. Customizable Settings: Allow users to customize the interface according to their individual needs, such as adjusting the speed of text presentation or enabling visual cues to aid in comprehension.
  6. User Education and Support: Provide clear instructions and guidance on how to use the interface effectively, and offer accessible support resources such as tutorials or FAQs.

By taking these characteristics into account, the UX designer can create an updated version of ChatGPT that is more accessible and user-friendly for individuals with dyslexia like Natalie.

 
Me:
Now, taking into account all of the users’ usage characteristics, what specific changes to the ChatGPT interface and functionality should the UX designer consider? Please be as specific as possible, providing examples and explanations of the possible changes.
 
ChatGPT:

Certainly, let’s consider the specific changes the UX designer of ChatGPT might want to consider for each of the personas mentioned earlier:

  1. Natalie (Traditional Student with Dyslexia):
    • Interface Changes: Increase font size options, provide a dyslexia-friendly font option, offer adjustable contrast settings, and minimize visual clutter.
    • Functionality Changes: Introduce a summarization feature that condenses lengthy text responses into key points, offer an option to paraphrase complex or jargon-filled responses into simpler language, and provide a spell-check feature that highlights potential spelling errors.
    • Example: Instead of displaying a long paragraph in a single block of text, the interface could provide a summarized version with bullet points, using simpler language and shorter sentences.
  2. Jason (Non-Traditional Student with PTSD):
    • Interface Changes: Provide a dark mode option for users who are sensitive to bright screens, offer customizable notification settings, and implement a progress tracker for long tasks to help manage focus and anxiety.
    • Functionality Changes: Introduce a distraction-free mode that hides non-essential elements of the interface, incorporate a timer feature to encourage breaks during long sessions, and add a feature that allows users to save and resume tasks.
    • Example: The interface could include a “Focus Mode” that hides the chatbox and displays only the current task, with a timer and progress bar visible to help Jason track his work.
  3. Sarah (Traditional Student with ADHD):
    • Interface Changes: Implement a clear and hierarchical layout, offer color-coded categories for tasks, and provide a task list feature with the option to set reminders.
    • Functionality Changes: Introduce an auto-save feature to prevent data loss due to distraction, include a focus mode that temporarily disables notifications, and add a “Brainstorming Mode” that helps organize ideas into a structured outline.
    • Example: The interface could have a “Task Manager” feature that displays Sarah’s assignments and allows her to set deadlines and receive reminders, helping her stay organized and on track.
  4. Michael (Non-Traditional Student with Visual Impairment):
    • Interface Changes: Ensure the interface is compatible with screen readers and supports keyboard navigation, provide a high-contrast mode, and offer adjustable magnification options.
    • Functionality Changes: Implement a voice command feature that allows users to navigate the interface hands-free, offer an audio summary option for lengthy responses, and include keyboard shortcuts for common actions.
    • Example: The interface could have a “Voice Control” feature that lets Michael navigate the interface and access functions using voice commands, reducing the need for manual interaction.

By considering these specific changes tailored to the needs of each persona, the UX designer can create an updated version of ChatGPT that offers a more accessible and user-friendly experience for everyone, taking into account their usage characteristics and disabilities.

 
Me:
Let’s now assume that our different personas all have different cultural backgrounds. What considerations does this place on the UX designer?
 
ChatGPT:

When considering the cultural backgrounds of the personas using ChatGPT, the UX designer should take into account the following considerations:

  1. Language and Communication Styles: Different cultures have unique language and communication styles. For example, some cultures may prefer indirect communication, while others may value directness. The UX designer should ensure that ChatGPT’s language models are trained to understand and respond appropriately to these different styles.
  2. Visual Design and Color Schemes: Colors and visual design elements can have cultural significance. For example, certain colors may have positive or negative connotations depending on the culture. The UX designer should consider cultural sensitivities when choosing color schemes and visual design elements for the interface.
  3. User Preferences and Expectations: Different cultures may have different preferences and expectations when it comes to technology and user interfaces. For example, some cultures may prefer minimalist designs, while others may prefer more visually rich interfaces. The UX designer should conduct user research to understand the preferences and expectations of the personas from different cultural backgrounds.
  4. Content and Content Localization: Content on the ChatGPT interface should be culturally sensitive and inclusive. The UX designer should consider the cultural context of the content and ensure that it is appropriate and relevant to all users, regardless of their cultural background.
  5. Accessibility and Inclusivity: Cultural inclusivity goes hand-in-hand with accessibility and inclusivity. The UX designer should ensure that ChatGPT’s interface is accessible to users from different cultural backgrounds, including those with disabilities, by implementing features such as multilingual support, culturally diverse avatars, and user-friendly interfaces.

Overall, the UX designer should take a culturally sensitive and inclusive approach when designing ChatGPT’s interface, taking into account the unique needs, preferences, and expectations of the personas from different cultural backgrounds. This will help create a more inclusive and user-friendly experience for all users.

 
Me:
Let’s assign different cultural backgrounds to our personas.
 
ChatGPT:

Sure, let’s assign different cultural backgrounds to our personas:

  1. Traditional Student with Dyslexia – Natalie:
    • Cultural Background: Natalie comes from a culturally diverse family with Hispanic roots. She speaks both English and Spanish fluently.
  2. Non-Traditional Student with PTSD – Jason:
    • Cultural Background: Jason grew up in a small town in the Southern United States and has a strong sense of Southern culture and traditions.
  3. Traditional Student with ADHD – Sarah:
    • Cultural Background: Sarah comes from a multicultural background. Her mother is Chinese, and her father is African American. She is fluent in English and conversational in Mandarin.
  4. Non-Traditional Student with Visual Impairment – Michael:
    • Cultural Background: Michael is originally from the Philippines and moved to the United States for college. He is fluent in English and Tagalog, the national language of the Philippines.

With these diverse cultural backgrounds in mind, the UX designer should consider the specific cultural sensitivities, communication styles, and preferences of each persona when designing ChatGPT’s interface. This will help create a more inclusive and culturally sensitive experience for all users.

 
Me:
Now let’s make some UX improvement suggestions for each of the personas based on cultural background.
 
ChatGPT:

Certainly! Here are some UX improvement suggestions for each of the personas based on their cultural backgrounds:

  1. Natalie (Hispanic Background):
    • UX Improvement: Offer a language option for Spanish-speaking users like Natalie. ChatGPT could have an option to switch between English and Spanish language modes, allowing users like Natalie to communicate more comfortably in their preferred language.
  2. Jason (Southern United States Background):
    • UX Improvement: Incorporate elements of Southern culture into the interface design, such as using warm, earthy tones and friendly, informal language. This could help Jason feel more at home while using the product.
  3. Sarah (Multicultural Background):
    • UX Improvement: Provide a culturally diverse set of avatars and user interface elements that reflect Sarah’s multicultural background. This could include avatars with different ethnicities and customizable design elements that allow users to express their unique cultural identities.
  4. Michael (Filipino Background):
    • UX Improvement: Integrate multilingual support for Tagalog-speaking users like Michael. ChatGPT could include an option to switch between English and Tagalog language modes, allowing users to communicate more comfortably in their preferred language.

These UX improvements take into account the unique cultural backgrounds of each persona and aim to create a more inclusive and culturally sensitive experience for all users.

 
Me:
Do you have any other suggestions for how a UX designer could use these personas to improve ChatGPT? Is there anything we may have missed in the creation of the personas that could lead to a better product and experience for the diverse range of ChatGPT users?
 
ChatGPT:

Certainly! Here are some additional suggestions for how a UX designer could use these personas to improve ChatGPT:

  1. Cultural Sensitivity Training for AI: To ensure that ChatGPT’s responses are culturally sensitive and appropriate, the UX designer could work with AI trainers to provide cultural sensitivity training to the language models. This could involve incorporating diverse cultural perspectives and values into the training data and ensuring that the language models understand and respect cultural differences.
  2. User Research and Feedback: The UX designer could conduct user research and gather feedback from users from diverse cultural backgrounds to understand their specific needs, preferences, and pain points. This could involve conducting focus groups, interviews, surveys, and usability testing with users from different cultural backgrounds.
  3. Localization and Translation: To make ChatGPT more accessible to users from different cultural backgrounds, the UX designer could work on localizing the interface and content for different languages and regions. This could involve translating the interface and content into multiple languages and adapting the design to suit the cultural norms and preferences of different regions.
  4. Cultural Contextualization: The UX designer could work on contextualizing ChatGPT’s responses to better reflect the cultural context of the user. For example, if a user asks a question about a cultural tradition or practice, ChatGPT could provide a response that takes into account the cultural context and significance of the tradition or practice.
  5. Inclusive Design and Accessibility: To ensure that ChatGPT is accessible to users from diverse cultural backgrounds, the UX designer could work on making the interface and content more inclusive and accessible. This could involve incorporating features such as text-to-speech, screen readers, and voice commands to make ChatGPT more accessible to users with disabilities.

By taking these suggestions into account, the UX designer can create a more inclusive and culturally sensitive experience for all users of ChatGPT, regardless of their cultural background.

 
Me:
Can you give me a good prompt to use for an image generator that could provide me with images of each of these personas? Include the relevant details the image generator would need.
 
ChatGPT:

Certainly! Here’s a good prompt for an image generator that could provide you with images of each of these personas:

Prompt: Generate a set of diverse avatar images representing the following personas:

  1. Natalie – A 21-year-old female college student majoring in Psychology. She is of Hispanic descent, has dyslexia, and speaks both English and Spanish fluently.
  2. Jason – A 35-year-old male college student majoring in Computer Science. He is a military veteran with PTSD and has a strong sense of Southern culture and traditions.
  3. Sarah – A 20-year-old female college student majoring in Journalism. She is of mixed Chinese and African American heritage, has ADHD, and is conversational in Mandarin.
  4. Michael – A 29-year-old male college student majoring in Business Administration. He is originally from the Philippines, has visual impairment, and is fluent in English and Tagalog.

Details Needed:

  • Gender: Male or Female
  • Age: 20-35
  • Ethnicity/Cultural Background: Hispanic (Natalie), Southern United States (Jason), Mixed Chinese and African American (Sarah), Filipino (Michael)
  • Disabilities: Dyslexia (Natalie), PTSD (Jason), ADHD (Sarah), Visual Impairment (Michael)
  • Major/Field of Study: Psychology (Natalie), Computer Science (Jason), Journalism (Sarah), Business Administration (Michael)
  • Languages Spoken: English, Spanish (Natalie), English (Jason), English, Mandarin (Sarah), English, Tagalog (Michael)

The image generator could use these details to create a diverse set of avatar images that accurately represent each of the personas.

 
Me:
Please abbreviate that significantly and ask for just one image showing them with handwritten name tags on their shirts.
 
ChatGPT:

Certainly, here’s a more concise version of the prompt:

Prompt: Generate one image of four avatars, each representing a college student with a handwritten name tag on their shirt. Each student should be visibly different, representing the following personas:

  1. Natalie: 21-year-old female with Hispanic background, dyslexia, Psychology major, speaks English and Spanish.
  2. Jason: 35-year-old male with Southern United States background, PTSD, Computer Science major, English speaker.
  3. Sarah: 20-year-old female with mixed Chinese and African American heritage, ADHD, Journalism major, speaks English and Mandarin.
  4. Michael: 29-year-old male from the Philippines, visual impairment, Business Administration major, speaks English and Tagalog.

Details Needed:

  • Gender: Male or Female
  • Age: 20-35
  • Ethnicity/Cultural Background: Hispanic (Natalie), Southern United States (Jason), Mixed Chinese and African American (Sarah), Filipino (Michael)
  • Disabilities: Dyslexia (Natalie), PTSD (Jason), ADHD (Sarah), Visual Impairment (Michael)
  • Major/Field of Study: Psychology (Natalie), Computer Science (Jason), Journalism (Sarah), Business Administration (Michael)
  • Languages Spoken: English, Spanish (Natalie), English (Jason), English, Mandarin (Sarah), English, Tagalog (Michael)

Please provide a single image showing all four avatars, each with a handwritten name tag on their shirt.

 

 

 

Thoughts on the Ethics of Offering Incentives for Surveys and Questionnaires

At the end of each semester, my students are provided with a link to a survey seeking their feedback regarding my performance as an instructor for their course. This is standard practice at many institutions and typically organized and/or administered at the direction of an institutional effectiveness office, not the instructors themselves. Although the students at my institution receive an email notification that their participation is needed/requested (and a reminder email or two), the emails include only a vague (meaningless?) incentive of institutional improvement connected to the request.

Because an educational institution should theoretically value higher levels of participation in end of course surveys, and because the level of my own students’ participation can be viewed (unfairly) as an indirect reflection of my professional capabilities, I have often wondered whether I should provide some type of incentive for my own students to complete their evaluation surveys. I am aware that some of my colleagues do provide extra credit for end of course survey completion, but this strikes me as somewhat unethical because a student (i.e. research participant) who is aware of a reward attached to their participation could then have a different outlook on the subject of the research or the research itself. In other words, rewards make people happy. Rewards change people’s moods and perceptions. Rewards, then, can introduce a type of subtle bias. This is why I have typically eschewed rewards to encourage participation in my end of course surveys. But this doesn’t necessarily mean that I have eliminated bias. Not at all.

When people (like students) come to expect rewards for completing surveys, researchers could potentially create a bias simply by not offering some kind of expected incentive for a person’s time and thoughts. If a person feels like they are being compelled to complete a survey without any promise of a reward, they may be unhappy/annoyed/frustrated about doing so, and those kinds of feelings could potentially filter into their survey responses—perhaps even unconsciously.

By not offering my own students rewards for their participation in my end of course surveys, I do fear that I am impacting their responses in an unintended way connected to their emotional state when completing the survey. This is made worse because of my colleagues who do offer incentives and thereby create a stronger expectation on the part of students, an expectation that is naturally followed with disappointment when not met.

Instead of being “damned if I do, damned if I don’t,” I am more “favored if I do, damned if I don’t” when it comes to providing incentives for survey participation. But either way, I would argue that the data is damned.

If the principles of offering (and refraining from) survey incentives which I have laid out for my own course surveys hold generally true for all research efforts that include voluntary data collection, then it seems the researcher concerned with participant bias is faced with a no-win situation: provide a reward and introduce a favorable response bias -or- refrain from providing a reward and introduce a potential unfavorable response bias.

This matter is further complicated by factors such as socioeconomic status since those who are more in need of incentives may be more susceptible to bias (in my case, students with lower grades and more in need of extra credit would be impacted differently by rewards than those who do not need them). A 2019 study supports the idea that “There is also a risk that incentives may introduce bias, by being more appealing to those with lower socioeconomic status.” Although that particular article is concerned with bias in terms of the sample generated, the connection between incentives and participants’ attitudes/responses should be a matter of concern as well.

So what is an ethical researcher to do?

At Georgia State University, where I am a graduate student, the course evaluation process doesn’t exactly provide an incentive for students who complete instructor evaluations. Instead, the university withholds a student’s ability to view final grades until the student cooperates and submits their reviews.

This doesn’t seem like the most ethical way of solving the participation problem, but I do imagine it is effective at generating high response rates without costing the institution a nickel. A better approach, I think, is working harder to educate students (research participants) about the risks involved in providing incentives and punishments when trying to encourage participation in research. This at least gives students some answers as to why an incentive is not being offered and hopefully garners more credibility for the study in the eyes of the people who are being asked to participate. I have taken to this approach with my course evaluations, and I think it may be somewhat effective at preventing the bias I am seeking to eliminate. It is not, however, the most effective way to increase response rates.

 

Implications for my UX research

There is probably no perfect system for achieving optimal survey participation while also eliminating bias, but researchers can at least be thoughtful and careful about their efforts. Although I have not designed my final UX course project yet, I can imagine the need for survey responses to inform my efforts. Perhaps I will simply try to encourage participation by explaining the value of my educational fieldwork and the problems associated with offering incentives. Perhaps I will offer incentives to participants and carefully explain this choice within my report. Another option might be to try to defuse the potential bias created by an incentive by adding a question at the beginning of the questionnaire like the following:

“Could providing incentives to participate in surveys make respondents less honest (i.e. more positive) in their answers?”

Such a question would at least force the participants to consider the issue right before they answer other questions where their unbiased feedback is sought. I can see how creating that kind of basic awareness could encourage less biased, more honest responses.

 

Other readings and notes

Most sources I found related to this issue deal more with introducing bias in the sample. I didn’t find any that specifically deal with the quality of responses being impacted by incentives.

This article found that incentives provided upfront work better than incentives provided after participation. I would have expected the opposite, and I suppose that is me not believing the best in people.

“Prepaid incentives (unconditional incentives) are more effective in increasing response rates in comparison to payment after survey completion (conditional incentives) [64, 65]. This could be explained by social exchange theory as providing participants monetary incentives in advance encourages them because they feel they should reciprocate for the reward they receive by completing the survey.”

This is a decent write-up of different types of bias in surveys and surveying.

ChatGPT and claude.ai mostly reflect what I found (or didn’t find) when looking into this issue, but when pressed a bit, ChatGPT gave me something closer to what I was looking for in terms of response bias. Of course, even though it gave me what I was after, I wouldn’t trust it without further verification from better human sources. (These generative AI systems are no solution for confirmation bias, and I do fear they might make that problem even worse.)

UX Research Notes: Using Screen Recording to Capture ChatGPT Usage

On a Friday morning I recruited two family members to participate in a brief research activity using ChatGPT. I used screen recording through Microsoft Teams to capture the participants’ interactions with the free version of OpenAI’s ChatGPT tool (ChatGPT 3.5). My goal as the researcher was exploratory in nature, seeking some feedback on the interface and functionality of the ChatGPT desktop web application.

Instead of simply providing the participants access to the ChatGPT website without any defined purpose for interacting with it, I designed an activity to encourage them to make use of the tool. The instructions for the activity were as follows:

Instructions: Use the generative-AI tool (ChatGPT or claude.ai) to engage in a virtual debate. You can ask the AI tool to be your debate partner and for suggestions on a topic, or you may suggest a topic. The goal is to “win” the debate.

*Note: topics related to recent current events don’t work well because the generative AI tools do not have access to information after 2021 (ChatGPT) and 2022 (claude.ai). 

Both users were given this same information on a sheet of paper and verbally informed of the purpose of the activity. They were not given time parameters; instead, they were told to take as much time as they needed and to stop whenever they were bored or felt they could no longer make progress toward their goal.

The two participants were significantly different in terms of demographics. This seemed desirable because younger and older tech users sometimes have different approaches and attitudes toward interacting with technology. Here are the participant profiles:

Participant 1 (P1): 10-13 years old, white male, from a diverse suburb of Atlanta, public school educated, never used ChatGPT or other generative-AI platforms.

Participant 2 (P2): 35-40 years old, white female, from a diverse suburb of Atlanta, college educated, very limited past use of ChatGPT—maybe 2 or 3 times total for work and curiosity reasons.

Results Summary

P1 engaged with ChatGPT for over 20 minutes: he first explored topics he proposed himself, then asked ChatGPT to suggest a topic, which he debated, and finally proposed and debated one last topic of his own.

  • Interactions by the numbers: 14 typed prompts/replies, average length 20 words.

P2 engaged with ChatGPT for 10 minutes, focusing on only one topic which she proposed.

  • Interactions by the numbers: 9 typed prompts/replies, average length 13 words (a short sketch of how such figures can be tallied follows below).
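
As referenced above, here is a minimal sketch of how these “by the numbers” figures can be tallied. The prompts in the snippet are hypothetical placeholders, not the participants’ actual text, and word length is simply counted as whitespace-delimited tokens.

```python
# Minimal sketch: average words per typed prompt/reply for one participant.
# These prompts are hypothetical placeholders, not the actual session data.
prompts = [
    "Let's debate whether homework should be banned.",
    "I disagree because practice helps students learn.",
    "Can you suggest another topic for us to debate?",
]

word_counts = [len(p.split()) for p in prompts]
avg_words = sum(word_counts) / len(word_counts)

print(f"{len(prompts)} typed prompts/replies, average length {avg_words:.0f} words")
```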

UX Observations

Both participants found the system to be intuitive and easy to use.

Neither participant made use of the small icons below each ChatGPT response (thumbs down, regenerate, copy). When asked why, Participant 2 said she never uses those features (especially the “like/dislike” buttons) because she doesn’t care to inform the algorithms used for determining the types of content she is fed on social media sites.

Both participants commented that the speed at which ChatGPT responses displayed and auto-scrolled was too fast. From Participant 1:

“I would have the I would have chat GPT pull up its arguments slower because it feels like a lot to read it. The pace felt really, really fast. It was almost like stressful to read them. And I’m really fast reader, and I still couldn’t read it all.”

Although I did not want to interrupt the session, I was curious to know if Participant 1 was reading the full responses generated by ChatGPT, so I asked him in the middle of his session. He replied that he was just “skimming” them. He likely chose to skim because the responses were auto-scrolling quickly and were somewhat lengthy compared to the short prompts/replies he was typing.

Both participants also said they liked the disclaimer at the bottom of the screen that says “ChatGPT can make mistakes. Consider checking important information.” However, neither participant noticed it until the end of the session. Neither was sure whether it had been there from the beginning, and Participant 2 suggested it ought to be a pop-up at the start that must be X’ed out by the user.

Participant 1 suggested custom font options, especially in terms of size for readability.

Participant 1 also suggested the ability to input prompts/replies with speech and an option to have ChatGPT’s responses read out loud to the user.

Final Thoughts

Using the Teams screen recording option was easy, and the transcript proved useful despite its inaccuracies. I would use Teams in the future for similar data collection and the recording of post-use interviews.

In the future, I would encourage participants to verbalize more of their thoughts during the use of the product. I was able to get some good feedback after my participants completed their tasks, but more in-the-moment feedback would give me more material to work with in a post-use interview.