Step 7: Preference Discovery - Research & Best Practices
Last Updated: December 2024
Purpose: Industry research and best practices reference for AI-powered conversational preference discovery
This document captures research on AI-powered conversational preference discovery systems. It serves as a reference for what excellence looks like in this space, independent of our current implementation state.
Industry Best Practices
1. Conversational UX Fundamentals
Source: Smashing Magazine - Conversational AI Design Guide
| Principle | Description | Priority |
|---|---|---|
| Design for user intent | Focus on what user wants to achieve, not UI tasks | P1 |
| Context as gold | Remember, reuse, and verify prior information | P1 |
| Transparency | Show what the system is doing, why, and how | P1 |
| User control | Confirm, undo, cancel, override capabilities | P1 |
| Consistency | Reduce friction with predictable patterns | P2 |
| Guardrails | Safe boundaries and ethical defaults | P2 |
2. Interaction Patterns
Source: WillowTree - 7 UX/UI Rules for AI Assistants
| Pattern | Description | Priority |
|---|---|---|
| AI proposes, user approves | Recommendations with user confirmation | P1 |
| Proactive assistance | AI anticipates needs, user can override | P2 |
| Multi-turn planning | Break down goals with check-ins | P1 |
| Mixed-initiative | Both AI and user can drive conversation flow | P2 |
| Loops over lines | Allow interruptions and returns to previous topics | P2 |
3. Visual UI Elements
Source: AIMultiple - Conversational UI Best Practices
| Element | Purpose | Priority |
|---|---|---|
| Quick-reply buttons | Tap instead of type for common options | P1 |
| Suggestion chips | Show available options visually | P1 |
| Typing indicators | Show "Searching...", "Thinking..." states | P2 |
| Visual formatting | Lists, bold, icons for readability | P2 |
| Confidence cues | "I'm 80% sure...", "Did I understand correctly?" | P2 |
4. Personalization Approaches
Source: Survicate - User Onboarding Design
| Approach | Description | Priority |
|---|---|---|
| Persona-based flow | Different question paths for different user types | P1 |
| Experience filtering | Show relevant options only based on user context | P1 |
| Adaptive UI | Adjust interface based on user preferences/behavior | P2 |
| Progress indicators | Show completion status and remaining steps | P1 |
5. Feedback Mechanisms
Source: Mind the Product - UX Best Practices for AI Chatbots
| Mechanism | Purpose | Priority |
|---|---|---|
| Thumbs up/down | Quick feedback on AI responses | P2 |
| Regenerate button | Get alternative response if current one isn't helpful | P2 |
| Edit responses | Modify AI's understanding of user input | P2 |
| Explicit confirmation | "Did I understand correctly?" for ambiguous inputs | P1 |
6. Confidence Scoring Best Practices
Source: OpenReview - PrefEval
| Practice | Description | Priority |
|---|---|---|
| Multi-factor scoring | Combine linguistic, semantic, and contextual signals | P1 |
| Transparent reasoning | Explain why confidence is high/low | P2 |
| Clarification triggers | Low confidence should prompt follow-up questions | P1 |
| Alternative interpretations | Track other possible meanings of user input | P2 |
| Conflict detection | Identify when preferences contradict each other | P1 |
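The scoring practices in the table above can be illustrated with a minimal sketch. It assumes each signal family has already been reduced to a number between 0 and 1. The weights, the 0.6 clarification threshold, and the names `WEIGHTS`, `ConfidenceResult`, and `score_confidence` are hypothetical and do not come from the cited research.

```python
from dataclasses import dataclass

# Hypothetical weights: the table names the signal families, not how to weight them.
WEIGHTS = {"linguistic": 0.4, "semantic": 0.4, "contextual": 0.2}
CLARIFY_BELOW = 0.6  # assumed threshold for asking a follow-up question


@dataclass
class ConfidenceResult:
    score: float
    reasons: list[str]  # human-readable notes for transparent reasoning
    needs_clarification: bool


def score_confidence(signals: dict[str, float]) -> ConfidenceResult:
    """Combine per-factor signals (each clamped to [0, 1]) into a weighted score."""
    score = 0.0
    reasons = []
    for factor, weight in WEIGHTS.items():
        value = min(max(signals.get(factor, 0.0), 0.0), 1.0)
        score += weight * value
        if value < 0.5:
            reasons.append(f"weak {factor} signal ({value:.2f})")
    return ConfidenceResult(round(score, 3), reasons, score < CLARIFY_BELOW)
```

A missing factor counts as 0.0 here, so sparse evidence lowers the score and pushes the conversation toward a clarifying question instead of a silent guess.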
7. Preference Learning
Source: arXiv - On the Way to LLM Personalization
| Practice | Description | Priority |
|---|---|---|
| Learn from past behavior | Use historical preferences to pre-fill or suggest | P2 |
| Smart defaults for new users | Research-backed defaults based on user type/goal | P2 |
| Preference stability tracking | Know which preferences are consistent vs volatile | P3 |
| Cross-session memory | Remember preferences across multiple interactions | P2 |
8. Open-Ended Capture
Source: ACM - Generating Usage-related Questions
| Practice | Description | Priority |
|---|---|---|
| Entity extraction | Parse free-form text into structured categories | P1 |
| Impact classification | Tag extracted info as critical/important/nice-to-have | P2 |
| Plan generation notes | Generate actionable notes for downstream systems | P2 |
| Prompt hints | Provide examples of what to share in open-ended responses | P2 |
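As a rough illustration of entity extraction with impact classification, the sketch below uses keyword rules as a stand-in for an LLM-based extractor. The categories, keywords, impact labels, and the names `RULES`, `Entity`, and `extract_entities` are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical rules: category -> (impact level, trigger keywords).
RULES = {
    "injuries": ("critical", ["injury", "pain", "surgery"]),
    "equipment": ("important", ["dumbbells", "barbell", "home gym"]),
    "lifestyle": ("nice-to-have", ["travel", "night shift", "kids"]),
}


@dataclass(frozen=True)
class Entity:
    category: str
    impact: str
    text: str  # the sentence the entity was found in


def extract_entities(message: str) -> list[Entity]:
    """Split free text into sentences and tag each with matching categories."""
    found = []
    for sentence in re.split(r"[.!?;]\s*", message):
        lowered = sentence.lower()
        for category, (impact, keywords) in RULES.items():
            if any(re.search(rf"\b{re.escape(k)}\b", lowered) for k in keywords):
                found.append(Entity(category, impact, sentence.strip()))
    return found
```

For example, `extract_entities("I had knee surgery last year. I only have dumbbells at home.")` yields one `injuries` entity tagged `critical` and one `equipment` entity tagged `important`.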
Competitor Analysis
Fitness App Comparison
| Feature | Fitbod | WHOOP | Trainerize | Best Practice Target |
|---|---|---|---|---|
| Onboarding Style | Form wizard | Sensor-first | Form wizard | AI conversational |
| Preference Capture | Checkboxes, dropdowns | Automatic from biometrics | Forms | Natural language + structured |
| Personalization | Algorithm-driven | Data-driven | Coach-driven | Coach persona + context |
| User Education | Minimal | Metrics-focused | Varies by coach | Deep "why" explanations |
| Question Order | Fixed | N/A | Fixed | Tier-based with AI flexibility |
| Adaptation | From workout history | From biometrics | Manual coach adjustment | Real-time conversation |
Fitbod's Approach
Source: Fitbod Blog
"Fitbod begins with a smart onboarding process. It collects details like fitness goals, preferred workout style, equipment available, and experience level."
Key Features:
- Form-based onboarding (not conversational)
- ML generates dynamic plans that evolve
- Learns from workout history and recovery patterns
- Progressive overload intelligence
Opportunity: Provide educational context during preference collection, explaining why each choice matters. Collect data AND teach simultaneously.
WHOOP's Approach
Source: BuddyX Theme - AI Tools for Fitness Coaching
"WHOOP digs deep into your body's data to tell you when you're ready to train hard or when you need to chill out."
Key Features:
- Passive data collection via wearable
- HRV, sleep, and strain analysis
- Recommendations based on recovery metrics
Opportunity: Capture subjective preferences that sensors can't detect (exercise preferences, training philosophy, lifestyle constraints).
Academic Research Insights
LLM-ConvRec Architecture
Source: GitHub - D3Mlab/llm-convrec
"LLM-ConvRec maintains an explicit internal state that tracks user preferences and constraints. This structure improves response consistency, memory retention, and control."
Four-Stage Pipeline (Recommended Architecture):
1. Intent Classification - Determine if user is asking, answering, or commenting
2. State Update - Track answered preferences explicitly
3. Action Selection - Decide next question based on tiers/priority
4. LLM-based Response Generation - Generate conversational coach message
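The four stages can be sketched end to end as follows. This is not the llm-convrec implementation: the intent classifier is a crude heuristic, answers are assumed to arrive as `key=value` strings, and the response generator is a string template standing in for an LLM call. All names (`QUESTIONS`, `ConversationState`, `handle_turn`, and the stage functions) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical question bank: (preference key, tier); lower tiers are asked first.
QUESTIONS = [("training_days", 1), ("equipment", 1), ("cardio_style", 2)]


@dataclass
class ConversationState:
    answered: dict[str, str] = field(default_factory=dict)


def classify_intent(message: str) -> str:
    """Stage 1: heuristic stand-in for a real intent classifier."""
    text = message.strip()
    if text.endswith("?"):
        return "asking"
    if "=" in text:  # this sketch assumes answers look like "key=value"
        return "answering"
    return "commenting"


def update_state(state: ConversationState, message: str, intent: str) -> None:
    """Stage 2: record answers explicitly instead of relying on chat history."""
    if intent == "answering":
        key, _, value = message.partition("=")
        state.answered[key.strip()] = value.strip()


def select_action(state: ConversationState) -> Optional[str]:
    """Stage 3: pick the first unanswered question in the lowest open tier."""
    pending = [(tier, key) for key, tier in QUESTIONS if key not in state.answered]
    return min(pending, key=lambda item: item[0])[1] if pending else None


def generate_response(intent: str, next_key: Optional[str]) -> str:
    """Stage 4: template placeholder where an LLM call would go."""
    if next_key is None:
        return "Thanks, I have everything I need."
    prefix = "Good question. " if intent == "asking" else ""
    return f"{prefix}Next, tell me about your {next_key.replace('_', ' ')}."


def handle_turn(state: ConversationState, message: str) -> str:
    intent = classify_intent(message)
    update_state(state, message, intent)
    return generate_response(intent, select_action(state))
```

Keeping stages 1 to 3 outside the LLM means the next question is chosen deterministically from tracked state, and the model is only responsible for wording.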
Conversational Recommender Systems
Source: ACM - Conversational Style Impact
"The success of these systems is heavily influenced by the preference elicitation process. While existing research mainly focuses on what questions to ask, there is a notable gap in understanding what role broader interaction patterns—including tone, pacing, and level of proactiveness—play."
Key Factors to Consider:
- Tone: Personality and communication style
- Pacing: Question ordering and density
- Proactiveness: Anticipating user needs vs waiting for explicit input
Preference Elicitation Strategies
Source: ACM - Preference Elicitation Strategy for CRS
Key Findings:
- Start with open-ended questions, gradually elicit specifics
- Trade-off between conversation efficiency and information accuracy
- Usage-based questions are effective: "Are you looking for X that is great for Y?"
- Present trade-offs explicitly: "If you choose X, you get Y but it means Z"
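The explicit trade-off phrasing from the last finding can be kept consistent by rendering it from structured option data. The sketch below is one possible shape; `Option`, its fields, and `present_trade_offs` are hypothetical names, and the example wording is illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    label: str    # X: the choice offered to the user
    benefit: str  # Y: what the user gains
    cost: str     # Z: what the choice implies or gives up


def present_trade_offs(question: str, options: list[Option]) -> str:
    """Render a question followed by one explicit trade-off line per option."""
    lines = [question]
    for option in options:
        lines.append(
            f"- If you choose {option.label}, you get {option.benefit}, "
            f"but it means {option.cost}."
        )
    return "\n".join(lines)
```

Storing benefit and cost as separate fields also lets a progress or summary view reuse them later without re-parsing the coach message.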
LLM Preference Following Challenges
Source: OpenReview - PrefEval
"In zero-shot settings, preference following accuracy falls below 10% at merely 10 turns (~3k tokens) across most evaluated models."
Mitigations Required:
- Explicit state tracking (don't rely on LLM memory alone)
- Confidence scoring to catch misunderstandings
- Low confidence should trigger clarification
- Store preferences in database, not just conversation history
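A minimal sketch of the storage mitigation, using Python's built-in `sqlite3` module so preferences live outside the conversation history. The table layout, the 0.6 save threshold, and the function names `open_store`, `save_preference`, and `load_preferences` are assumptions for illustration.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a preference store; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS preferences ("
        "user_id TEXT, key TEXT, value TEXT, confidence REAL, "
        "PRIMARY KEY (user_id, key))"
    )
    return conn


def save_preference(conn, user_id, key, value, confidence, threshold=0.6) -> bool:
    """Persist a preference only when confidence meets the assumed threshold."""
    if confidence < threshold:
        return False  # caller should ask a clarifying question instead
    conn.execute(
        "INSERT OR REPLACE INTO preferences VALUES (?, ?, ?, ?)",
        (user_id, key, value, confidence),
    )
    conn.commit()
    return True


def load_preferences(conn, user_id) -> dict:
    """Return the user's stored preferences as a {key: value} mapping."""
    rows = conn.execute(
        "SELECT key, value FROM preferences WHERE user_id = ? ORDER BY key",
        (user_id,),
    )
    return dict(rows.fetchall())
```

On each turn the stored preferences can be loaded and injected into the prompt, so a long conversation does not depend on the model recalling an answer given many turns earlier.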
Proactive Suggestions
Source: arXiv - COMPASS: User Preferences with Knowledge Graph
Key Concept: When a user answers one preference, surface related preferences proactively.
Example Relationships:
- Training split → rest period recommendations
- Deficit approach → cardio strategy suggestions
- Intensity techniques → deload frequency implications
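The example relationships above can be expressed as a small lookup, shown here as a plain dictionary instead of a full knowledge graph. The preference keys and the names `RELATED` and `suggest_related` are hypothetical identifiers chosen for this sketch.

```python
# Relationship map mirroring the three example relationships listed above.
RELATED = {
    "training_split": ["rest_periods"],
    "deficit_approach": ["cardio_strategy"],
    "intensity_techniques": ["deload_frequency"],
}


def suggest_related(just_answered: str, already_answered: set[str]) -> list[str]:
    """Return related preferences worth raising next, skipping answered ones."""
    return [
        key for key in RELATED.get(just_answered, [])
        if key not in already_answered
    ]
```

For instance, once a user answers `training_split`, the function returns `["rest_periods"]` unless rest periods have already been captured.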
Feature Checklist
Use this checklist to compare implementation against research best practices.
Core Architecture
| Feature | Description | Priority |
|---|---|---|
| Goal-specific question bank | Different questions for different fitness goals | P1 |
| Tiered question system | Required → Important → Optional flow | P1 |
| Educational context | Each question explains "why it matters" | P1 |
| Trade-off presentation | Each option shows impact/consequences | P1 |
| Multi-preference extraction | Extract multiple answers from single message | P1 |
| Intent detection | Distinguish questions from answers | P1 |
| Coach personalization | Goal-specific messaging with personality | P1 |
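A goal-specific question bank with a Required → Important → Optional flow might be modeled as below. The goals, question keys, and the names `Tier`, `QUESTION_BANK`, `next_question`, and `tier_progress` are assumptions for illustration, not a description of any existing implementation.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    REQUIRED = 1
    IMPORTANT = 2
    OPTIONAL = 3


# Hypothetical bank: each goal has its own questions, each assigned to a tier.
QUESTION_BANK = {
    "fat_loss": [
        ("deficit_approach", Tier.REQUIRED),
        ("cardio_strategy", Tier.IMPORTANT),
        ("meal_timing", Tier.OPTIONAL),
    ],
    "muscle_gain": [
        ("training_split", Tier.REQUIRED),
        ("intensity_techniques", Tier.IMPORTANT),
    ],
}


def next_question(goal: str, answered: set[str]) -> Optional[str]:
    """Return the first unanswered question, exhausting lower tiers first."""
    for tier in Tier:
        for key, question_tier in QUESTION_BANK[goal]:
            if question_tier == tier and key not in answered:
                return key
    return None


def tier_progress(goal: str, answered: set[str]) -> dict[str, tuple[int, int]]:
    """Return {tier name: (answered count, total count)} for progress display."""
    counts = {tier.name: [0, 0] for tier in Tier}
    for key, tier in QUESTION_BANK[goal]:
        counts[tier.name][1] += 1
        if key in answered:
            counts[tier.name][0] += 1
    return {name: (done, total) for name, (done, total) in counts.items()}
```

The same `tier_progress` output could feed a tier completion indicator in the UI.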
Confidence & Extraction
| Feature | Description | Priority |
|---|---|---|
| Multi-factor confidence scoring | 5+ factors: linguistic, semantic, contextual, etc. | P1 |
| Confidence threshold | Only save preferences above threshold (e.g., 0.6) | P1 |
| Transparent reasoning | Explain why confidence is high/low | P2 |
| Alternative interpretations | Track other possible meanings | P2 |
| Conflict detection | Identify contradicting preferences | P1 |
| Clarification suggestions | Generate follow-up for low confidence | P1 |
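Conflict detection can start as a rule table checked whenever a preference is saved. The rules below are invented examples, and `CONFLICTS` and `find_conflicts` are hypothetical names; a production system would likely need richer rules than exact value matches.

```python
# Hypothetical rules: each pairs two (key, value) settings that cannot both hold.
CONFLICTS = [
    (("training_days", "2"), ("training_split", "push/pull/legs x2")),
    (("equipment", "bodyweight only"), ("intensity_techniques", "barbell drop sets")),
]


def find_conflicts(preferences: dict[str, str]) -> list[str]:
    """Return a readable description of every rule the preferences violate."""
    hits = []
    for (key_a, value_a), (key_b, value_b) in CONFLICTS:
        if preferences.get(key_a) == value_a and preferences.get(key_b) == value_b:
            hits.append(f"{key_a}={value_a} conflicts with {key_b}={value_b}")
    return hits
```

Each returned description can be turned into a clarification prompt so the user decides which of the two answers should win.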
Open-Ended Capture
| Feature | Description | Priority |
|---|---|---|
| Entity extraction | Parse free-text into structured categories | P1 |
| Category taxonomy | Injuries, equipment, lifestyle, exercises, etc. | P1 |
| Impact classification | Critical vs important vs nice-to-have | P2 |
| Plan generation notes | Actionable notes for plan generation | P2 |
| Prompt hints | Examples of what to share | P2 |
User Control & Revision
| Feature | Description | Priority |
|---|---|---|
| Undo capability | Revert previous answer | P1 |
| Revision detection | Detect "actually I meant..." patterns | P1 |
| Inline revision handling | Handle corrections mid-conversation | P1 |
| Preference summary | Show what's been answered | P2 |
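Revision detection and undo can be prototyped with a phrase list and a per-key answer history. The patterns are assumed examples of "actually I meant..." phrasing and will miss many real corrections; `REVISION_PATTERNS`, `is_revision`, and `AnswerHistory` are hypothetical names.

```python
import re

# Assumed revision cues; a real system would likely pair these with an LLM check.
REVISION_PATTERNS = [
    r"\bactually\b",
    r"\bi meant\b",
    r"\bscratch that\b",
    r"\bon second thought\b",
    r"\bchange (that|my answer)\b",
]
_REVISION_RE = re.compile("|".join(REVISION_PATTERNS), re.IGNORECASE)


def is_revision(message: str) -> bool:
    """Return True when the message looks like a correction of an earlier answer."""
    return bool(_REVISION_RE.search(message))


class AnswerHistory:
    """Keeps every value given for a key so the latest one can be undone."""

    def __init__(self) -> None:
        self._history: dict[str, list[str]] = {}

    def set(self, key: str, value: str) -> None:
        self._history.setdefault(key, []).append(value)

    def current(self, key: str):
        values = self._history.get(key, [])
        return values[-1] if values else None

    def undo(self, key: str):
        """Drop the latest value and return the one now in effect (or None)."""
        values = self._history.get(key, [])
        if values:
            values.pop()
        return values[-1] if values else None
```

Because every value is retained until undone, the same history can back a preference summary view.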
Learning & Prediction
| Feature | Description | Priority |
|---|---|---|
| Learn from past plans | Store and reuse preferences | P2 |
| Predict for new plans | Suggest based on history | P2 |
| Smart defaults (new users) | Research-backed defaults by goal/experience | P2 |
| Workout feedback learning | Adjust based on completed workouts | P3 |
| Preference stability scoring | Track consistent vs volatile preferences | P3 |
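One simple way to score preference stability is the share of past plans that used the most common value. This definition, the 0.7 cut-off, and the names `stability_score` and `predict_default` are assumptions for this sketch and do not come from the cited research.

```python
from collections import Counter
from typing import Optional


def stability_score(history: list[str]) -> float:
    """Fraction of past values equal to the most common one (0.0 if no history)."""
    if not history:
        return 0.0
    most_common_count = Counter(history).most_common(1)[0][1]
    return most_common_count / len(history)


def predict_default(history: list[str], min_stability: float = 0.7) -> Optional[str]:
    """Suggest the usual value for a new plan only when the preference is stable."""
    if stability_score(history) < min_stability:
        return None  # volatile preference: ask again instead of pre-filling
    return Counter(history).most_common(1)[0][0]
```

With a history of `["4", "4", "4", "3"]` the score is 0.75, so `"4"` is offered as the pre-filled answer, while `["hiit", "walking"]` scores 0.5 and produces no suggestion.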
Feedback & Improvement
| Feature | Description | Priority |
|---|---|---|
| Thumbs up/down | Quick feedback on responses | P2 |
| Regenerate capability | Get alternative response | P2 |
| Feedback storage | Store for AI improvement | P3 |
| Regeneration tracking | Track original vs regenerated | P3 |
Proactive Suggestions
| Feature | Description | Priority |
|---|---|---|
| Preference relationships | Map related preferences | P2 |
| Suggestion generation | "Based on X, you might want Y" | P2 |
| User engagement scoring | Adjust proactiveness by engagement level | P3 |
| Suggestion filtering | Only show high-confidence suggestions | P2 |
Visual/UI Elements
| Feature | Description | Priority |
|---|---|---|
| Quick-reply chips | Clickable option buttons | P1 |
| Progress visualization | Tier completion indicators | P1 |
| Typing indicators | "Thinking..." states | P2 |
| Markdown rendering | Format coach messages | P2 |
| Streaming responses | Token-by-token display | P3 |
Advanced Features (Backlog)
| Feature | Description | Priority |
|---|---|---|
| Knowledge graph | Link preferences to exercise database | P4 |
| A/B testing | Test question orders/phrasings | P4 |
| Cross-user patterns | Learn from aggregate user behavior | P4 |
Sources
Industry Best Practices
- Smashing Magazine - How To Design Effective Conversational AI Experiences
- WillowTree - 7 UX/UI Rules for Designing a Conversational AI Assistant
- AIMultiple - Conversational UI: 6 Best Practices
- Springs Apps - 10 Chatbot Best Practices In 2025
- Exotel - Conversational UX 101 Guide for 2025
- Mind the Product - Nine UX Best Practices for AI Chatbots
- Botpress - Conversational AI Design in 2025
Fitness App Research
- Fitbod - Best AI Fitness Apps 2025
- QuickPose.ai - Fitness App Trends 2024
- AppInventiv - 15 Use Cases of AI in Fitness Industry
- BuddyX Theme - Best AI Tools for Fitness Coaching
Academic Research
- GitHub - D3Mlab/llm-convrec: LLM-based Conversational Recommendation Architecture
- OpenReview - Do LLMs Recognize Your Preferences? (PrefEval)
- ACM - Should We Tailor the Talk? Conversational Styles in Preference Elicitation
- ACM - Preference Elicitation Strategy for Conversational Recommender System
- ACM - Generating Usage-related Questions for Preference Elicitation
- arXiv - COMPASS: Unveiling User Preferences with Knowledge Graph and LLM
- arXiv - On the Way to LLM Personalization: Learning to Remember User Conversations
Onboarding Design
- Survicate - User Onboarding Design: How to Get it Right
- UserPilot - AI User Onboarding: 8 Real Ways to Optimize
- Landbot - Onboarding Chatbot Guide
- Specific.app - Great Questions for Chatbot Onboarding
This document should be updated as new research emerges. It serves as a reference standard, not an implementation tracker.