Based out of England, UK

Ava Speaker Diarization

Revolutionizing accessibility and inclusivity in communication methods for Deaf and hard of hearing individuals

AI-based research for captioning and speaker identification feature for Deaf and Hard-of-Hearing individuals on iOS.

Impact

Competitive advantage over other captioning software.

Role

AI model discovery, Product Designer

Duration

4 months

Tools

Figjam, Figma, Notion

Year

2022

Challenge

Deaf and hard-of-hearing users who used the Ava app in solo mode were unable to identify "who said what" when multiple speakers were talking in a group, unless they switched to a group conversation in which every speaker was connected to the Ava app on their own device.

Goals

• Let users easily identify speakers with a single device (in a solo conversation), with as little impact on the conversation flow as possible.

• Let users easily rename a speaker, or reassign a transcript to themselves or another user, if the backend attributes a speaker wrongly.

• Communicate the backend's uncertainty about a speaker's identity during a conversation. Additionally, let users easily invite an identified speaker to their conversation for better accuracy.
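
The rename and reassign goals above can be sketched as operations on a list of transcript segments. This is a hypothetical illustration, not Ava's actual data model; the segment structure, function names, and speaker labels are all assumptions.

```python
# Hypothetical sketch of the rename / reassign goals: each transcript
# segment carries a speaker label; users can rename a speaker across the
# whole transcript, or reassign a single misattributed segment.

segments = [
    {"text": "Shall we start?", "speaker": "Speaker 1"},
    {"text": "Yes, go ahead.", "speaker": "Speaker 2"},
]

def rename_speaker(segments, old, new):
    """Rename a speaker everywhere in the transcript."""
    for seg in segments:
        if seg["speaker"] == old:
            seg["speaker"] = new

def reassign_segment(segments, index, speaker):
    """Reassign one misattributed segment to another speaker."""
    segments[index]["speaker"] = speaker

rename_speaker(segments, "Speaker 1", "Maya")   # user names a guessed speaker
reassign_segment(segments, 1, "Maya")           # user corrects an attribution
```

Keeping both operations cheap matters here: corrections happen mid-conversation, so they must not interrupt the live transcript.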

Technical constraints

• The backend cannot immediately recognize a new speaker during a conversation. Initially, it guesses the speaker's identity until it has received enough audio to meet the assigned threshold for the new speaker.

• For the first version, the backend will not be able to retain the voice ID of identified speakers in the conversation, even if the user has already named them.

• For the first version, the design should be compatible with the iOS app only.
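
The first constraint — a speaker stays a guess until enough audio meets the threshold — can be sketched as a small state tracker. This is a minimal illustration of the behaviour described above, not Ava's backend; the threshold value, class, and labels are assumptions.

```python
# Hypothetical sketch: a new voice remains "tentative" until enough audio
# has accumulated to meet an assigned per-speaker threshold, at which
# point the backend commits to the identity.

from dataclasses import dataclass, field

@dataclass
class SpeakerTracker:
    threshold_seconds: float = 3.0              # assumed audio threshold
    heard: dict = field(default_factory=dict)   # speaker id -> seconds heard

    def observe(self, speaker_id: str, audio_seconds: float) -> str:
        """Accumulate audio for a speaker and return the label to display."""
        self.heard[speaker_id] = self.heard.get(speaker_id, 0.0) + audio_seconds
        if self.heard[speaker_id] >= self.threshold_seconds:
            return speaker_id           # confident: show the identity plainly
        return f"{speaker_id}?"         # still guessing: surface uncertainty

tracker = SpeakerTracker()
tracker.observe("Speaker 1", 1.5)   # below threshold: label is "Speaker 1?"
tracker.observe("Speaker 1", 2.0)   # threshold met: label is "Speaker 1"
```

The design implication is the trailing "?": the UI has to show a provisional label for the first few seconds of a new voice rather than silently committing to a possibly wrong identity.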

Plan & Product Specification

This research and discovery project involved close collaboration with the Product and AI teams to draft the product specification and test the underlying AI model.

Due to legal constraints, I've omitted much of the design process, but I've compiled a detailed article and guide on designing intuitive real-time experiences for AI and ML systems.

Product Showcase

Assign speaker experience

Here is a GIF of the product demo illustrating the user's experience when an identified speaker is named during a conversation.

Summary and Learnings

I gained valuable insights from this project, honing skills in iteration, refining product specifications, and collaborating closely with the AI team.

My primary challenge was designing the conversation experience to intuitively express speaker uncertainty in real time.

To tackle this challenge, I collaborated closely with the AI team to test the prototype. This allowed me to better understand not only user needs but also what the AI side of the project could and could not deliver. I've documented my approach and learnings on designing intuitive experiences for AI and ML systems in my latest article, Design for Uncertainty.