Human - GenAI System

Before OpenAI's rise, inline next-phrase AI suggestions were still emerging. Recognizing this, we aimed to understand writers' cognitive processes when interacting with AI suggestions. Through qualitative research, we analyzed user interactions, identified design opportunities, and developed a model of Human-AI interaction. I also led the design of a user-centered next-phrase AI suggestion system. Our work was published at ACM IUI 2023, earning a 'Best Honorable Mention'.

DURATION

12 months

MY ROLE

Researcher & Designer at IIT Bombay

METHODS

User Interviews, Concurrent & Retrospective Think-Aloud Protocols, Thematic Analysis, Interaction Design

Context

With advancements in language models, text generation has become sophisticated enough to offer synchronous next-phrase suggestions within writing interfaces.

While NLP and large language models are rapidly progressing to enable these interfaces, research into how writers interact with these technologies—and the effects on their cognitive processes—remains in its early stages.

We worked on this project before the rise of OpenAI.

Goal

Solution

We proposed a model of Human-AI interaction grounded in our findings and designed a user-centered next-phrase AI suggestion system.

Results

Our work was accepted at ACM IUI 2023 and awarded a "Best Honorable Mention".

Getting into the research

Research Questions

While extensive research has been conducted on this topic from a behavioral perspective, there is limited exploration from a cognitive viewpoint, leading us to our Research Question 1 (RQ1).

We were also interested in understanding how the presence and the content of suggestions affect the writing process, leading us to our Research Question 2 (RQ2).

Language models often inherit biases and dominant views from their training text corpus. This issue is particularly relevant as generative AI becomes integral to our lives. Understanding how these biases affect human interaction with technology, especially in the writing process, guided us to our Research Question 3 (RQ3).

Research Methodology

For this research, it was crucial to observe, collect, and analyze participants' interactions while they wrote. Consequently, we brainstormed extensively on what we should ask participants to write, and how.

Why movie reviews?

The following reasons contributed to the decision to finalize movie reviews as the core task for users in this research:

Writing movie reviews involves expressing one’s opinion and arguing for it.

Movie reviews have a star rating, which gives us a linear scale of sentiment that can be used to create a variety of writer-suggestion misalignments.

Countless movie reviews exist on the internet, along with available datasets, giving us a rich resource to train a language model.
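To make the idea of a writer-suggestion misalignment concrete, here is a minimal sketch of how it could be quantified on a star scale. The function name `misalignment`, the 1 to 5 scale, and the signed normalization are illustrative assumptions, not the measure used in the study.

```python
def misalignment(writer_stars, suggestion_stars, scale=(1, 5)):
    """Signed gap between the writer's rating and the sentiment of a suggestion.

    Hypothetical helper: both values are assumed to lie on the same star scale.
    Returns a value in [-1, 1]; 0 means the suggestion matches the writer's
    opinion, negative means the suggestion is more negative than the writer.
    """
    low, high = scale
    for value in (writer_stars, suggestion_stars):
        if not low <= value <= high:
            raise ValueError(f"rating {value} is outside the {low}-{high} scale")
    return (suggestion_stars - writer_stars) / (high - low)


# A writer who rated the movie 5 stars sees a suggestion typical of 1-star reviews.
print(misalignment(5, 1))  # -1.0
```

Varying the two ratings independently is one way a study could cover aligned, mildly misaligned, and strongly misaligned conditions.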

How should we ask them to write?

We asked participants to complete two tasks: First, they watched a movie we selected and wrote a review without any suggestions. In the second task, they watched a different movie, this time writing their review with suggestions provided.

We conducted our research with 14 L2 participants, that is, writers for whom English is a second language rather than their native tongue.

How should we collect data?

We asked participants to write movie reviews while sharing their screen in a COVID-safe setup with two researchers and one participant. Using a concurrent think-aloud protocol, participants verbalized their thoughts during writing. Later, we conducted a retrospective think-aloud by replaying the session, asking follow-up questions. The process was recorded, and we took detailed memos.

Data Analysis

We conducted thematic analysis on the qualitative data using the Atlas.ti tool.

Findings

The following are some of our findings, each backed by what participants were shown as suggestions and what they finally wrote:

Contribution of suggestions to the processes of composition

Abstracted themes from suggestions to inspire new sentences

Adapted sentence structures from suggestions in their own writing

Copied suggestions that matched their working memory

Evaluating suggestions

Evaluation influenced by the beliefs about the suggestion system

Evaluation with respect to text written so far

Evaluation with respect to the Working Memory State

Effects on the writing process

Alignment between suggested text and the writer's intent influences their writing process

Relevant suggested text shapes their next writing steps

Suggestions can distract users based on their current writing stage

Model

Based on our findings, we proposed a model that builds on the categories and concepts proposed by Hayes and articulates the findings of our study. The proposed model is shown in the following image:

Design Opportunities

We believe future systems can build on the research findings above. The following design implications could help in building effective Human-AI suggestion systems:

Strategic Sampling

Strategically controlling sampling in the language model to suggest phrases that give writers more ideas and language to draw on.
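As an illustration of what controlling sampling can mean in practice, the sketch below combines two common decoding controls, top-k filtering and temperature scaling. The function `sample_next_phrase`, the candidate phrases, and their scores are hypothetical; this is not the system we built, only one plausible way to trade predictable suggestions for more varied ones.

```python
import math
import random


def sample_next_phrase(scored_phrases, temperature=1.0, top_k=3, rng=None):
    """Pick one phrase from (phrase, log_score) pairs.

    Hypothetical helper: scores stand in for a language model's output.
    A lower temperature favors the most likely phrase; a higher temperature
    spreads choices across the top_k candidates, surfacing more varied ideas.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if rng is None:
        rng = random.Random()

    # Keep only the top_k highest-scoring candidates.
    candidates = sorted(scored_phrases, key=lambda pair: pair[1], reverse=True)[:top_k]

    # Temperature-scaled softmax weights (shifted by the max for stability).
    scaled = [score / temperature for _, score in candidates]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]

    phrases = [phrase for phrase, _ in candidates]
    return rng.choices(phrases, weights=weights, k=1)[0]


candidates = [
    ("a gripping thriller", 2.0),
    ("a slow first half", 1.0),
    ("the soundtrack stands out", 0.5),
    ("a weak plot", -1.0),
]
print(sample_next_phrase(candidates, temperature=1.5, top_k=3))
```

With `top_k=1` the function always returns the most likely phrase, while raising `temperature` and `top_k` makes less obvious phrases appear more often.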

Personalizing Models

Personalizing the suggestion models to avoid generic suggestions and reduce evaluation time.

More customization

Giving users the freedom to choose which cognitive process (proposer, translator, or transcriber) they need suggestions for, and when.

Explainable AI

Making the AI suggestion model more explainable so that users can collaborate with it better and get more relevant suggestions.

Seamless selection

Making it seamless to extract only some phrases from the suggested text.

Final Designs

Building on the identified design opportunities, I created the following screens to illustrate a human-centered approach for AI-powered next-phrase suggestion systems:

During onboarding

When the user is about to start writing

While writing

Reflections

During my undergraduate studies, I had the opportunity to work as a researcher and designer for the first time on an academic project. This experience sharpened my ability to identify appropriate research methods and execute them effectively.

One of the most valuable lessons I learned was developing hypotheses and validating them through data collection, along with honing my data analysis skills.

The project also introduced me to the fundamentals of AI, sparking my interest in researching and designing AI-driven experiences. Given the emerging nature of this technology, I explored different modalities to enhance transparency and explainability in AI design, laying a foundation for creating more user-centric and trustworthy AI systems.

Additionally, I iterated extensively on how users interact with AI-generated suggestions—focusing on how the process of viewing and accepting suggestions could be made more intuitive and seamless.

2024 Ravi Jangir. All Rights Reserved.
