Why We Built Our Own Chatbot
Website visitors have questions. Some are simple; others require context about your services. A contact form works, but it introduces friction. A well-implemented chatbot can answer questions instantly and capture qualified leads at the moment of peak interest.
We decided to build our own instead of using a SaaS solution for three reasons: full control over the experience, native integration with our website, and the opportunity to compare different AI providers.
Multi-Provider Architecture
Instead of committing to a single language model, we implemented support for four providers: Anthropic (Claude), OpenAI (GPT), xAI (Grok), and Google (Gemini).
Each provider has its strengths. Claude excels at following complex instructions. GPT offers broad general knowledge. Grok delivers more direct responses. Gemini balances speed and quality.
The architecture abstracts the differences between APIs through a common interface. Each provider implements the same methods: check availability, send messages, and stream responses.
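A minimal sketch of what such a common interface could look like. The names ChatProvider, ChatMessage, and EchoProvider are hypothetical, and the echo implementation is only a stand-in for a real API client; it is not our production code.

```typescript
// Hypothetical message shape shared by all providers.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// The three operations described above: availability, send, stream.
interface ChatProvider {
  readonly name: string;
  isAvailable(): boolean;
  sendMessage(messages: ChatMessage[]): Promise<string>;
  streamMessage(messages: ChatMessage[]): AsyncIterable<string>;
}

// Stand-in provider used for illustration: it echoes the last message
// instead of calling a real model API.
class EchoProvider implements ChatProvider {
  readonly name: string;
  private readonly apiKey: string | undefined;

  constructor(name: string, apiKey: string | undefined) {
    this.name = name;
    this.apiKey = apiKey;
  }

  // Assumption: a provider counts as available when its API key is configured.
  isAvailable(): boolean {
    return this.apiKey !== undefined && this.apiKey.length > 0;
  }

  async sendMessage(messages: ChatMessage[]): Promise<string> {
    const last = messages[messages.length - 1];
    return `[${this.name}] ${last ? last.content : ""}`;
  }

  // Yields the reply word by word to mimic incremental generation.
  async *streamMessage(messages: ChatMessage[]): AsyncGenerator<string> {
    const reply = await this.sendMessage(messages);
    for (const word of reply.split(" ")) {
      yield word + " ";
    }
  }
}
```

Because the rest of the chatbot only depends on ChatProvider, adding a provider means writing one more class rather than touching the routing or streaming code.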
A/B Testing System
We didn’t want to choose a provider based on intuition. We implemented an A/B testing system that assigns each user to a provider consistently.
The system works as follows:
- When a user opens the chat, we generate a unique identifier
- We combine that ID with the test identifier and calculate a hash
- The hash determines a bucket (0-99) that maps to a provider
- We store the assignment in a cookie to maintain consistency
We currently distribute traffic evenly: 25% for each provider. This allows us to compare metrics such as latency, error rate, perceived quality, and lead conversion.
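The assignment steps above can be sketched as follows. This is an illustration under assumptions: the hash function (32-bit FNV-1a over UTF-16 code units), the ":" separator, and the provider identifiers are all hypothetical choices, since any stable hash with a reasonably uniform distribution would serve the same purpose.

```typescript
type ProviderName = "anthropic" | "openai" | "xai" | "google";

// Even split: each provider owns 25 of the 100 buckets.
// `upTo` is the exclusive upper bound of the provider's bucket range.
const SPLIT: Array<{ provider: ProviderName; upTo: number }> = [
  { provider: "anthropic", upTo: 25 },
  { provider: "openai", upTo: 50 },
  { provider: "xai", upTo: 75 },
  { provider: "google", upTo: 100 },
];

// 32-bit FNV-1a. Chosen here only because it is short and dependency-free.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Combine the test ID and user ID, hash, and reduce to a bucket in 0-99.
function bucketFor(userId: string, testId: string): number {
  return fnv1a(`${testId}:${userId}`) % 100;
}

function providerForBucket(bucket: number): ProviderName {
  if (!Number.isInteger(bucket) || bucket < 0 || bucket > 99) {
    throw new RangeError(`bucket out of range: ${bucket}`);
  }
  for (const slice of SPLIT) {
    if (bucket < slice.upTo) return slice.provider;
  }
  throw new RangeError(`no provider covers bucket ${bucket}`);
}

function assignProvider(userId: string, testId: string): ProviderName {
  return providerForBucket(bucketFor(userId, testId));
}
```

Including the test identifier in the hash input means a new experiment reshuffles users instead of reusing the previous split. The cookie then only caches a result that could be recomputed from the same two IDs.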
Streaming and Fallback
AI responses can take several seconds to generate fully. Waiting for completion before displaying anything would create a poor experience.
We implemented streaming using Server-Sent Events (SSE). The server sends text fragments as the model generates them, and the client renders them progressively. The user sees the response building in real time.
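As a rough sketch of the server side, the helper below turns model output fragments into SSE frames. The function names and the closing "done" event are hypothetical; the framing itself (one "data:" line per line of payload, a blank line to end the event) follows the SSE wire format.

```typescript
// Format one SSE event. Multi-line payloads need one "data:" line per line,
// and every event ends with a blank line.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event !== undefined) {
    lines.push(`event: ${event}`);
  }
  for (const line of data.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}

// Wrap a stream of text fragments into SSE frames, then emit a final
// "done" event (a hypothetical convention) so the client knows when to stop.
async function* toSseFrames(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    yield formatSseEvent(chunk);
  }
  yield formatSseEvent("[DONE]", "done");
}
```

The frames would be written to a response served with the Content-Type text/event-stream. On the client, each event's data is appended to the message being rendered.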
If a provider fails, the system automatically tries the next one in priority order. The user experiences a slight delay but still receives a response.
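A minimal sketch of that fallback loop, assuming providers are passed in priority order. The FallbackProvider shape and the sendWithFallback name are hypothetical and exist only for this illustration.

```typescript
// Hypothetical minimal provider shape for this example.
type FallbackProvider = {
  name: string;
  isAvailable(): boolean;
  sendMessage(prompt: string): Promise<string>;
};

type FallbackResult = { provider: string; reply: string; attempts: number };

// Try each provider in order; return the first successful reply.
async function sendWithFallback(
  providers: FallbackProvider[],
  prompt: string
): Promise<FallbackResult> {
  const errors: string[] = [];
  let attempts = 0;

  for (const provider of providers) {
    if (!provider.isAvailable()) {
      errors.push(`${provider.name}: unavailable`);
      continue;
    }
    attempts += 1;
    try {
      const reply = await provider.sendMessage(prompt);
      return { provider: provider.name, reply, attempts };
    } catch (err) {
      // Record the failure and move on to the next provider.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }

  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Returning the provider name and attempt count alongside the reply makes it straightforward to record fallback events for the metrics discussed later.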
Smart Lead Capture
The chatbot does more than answer questions. It is trained to detect purchase intent signals: questions about pricing, timelines, or the work process.
When it detects genuine interest, it gently suggests that the user leave their details so we can contact them. The form appears integrated within the chat without interrupting the conversation.
We include an invisible honeypot field to filter bots. Human visitors never see the field, so if it arrives with content, we treat the submission as automated and discard it.
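The server-side check can be very small. In this sketch the field name "website" and the LeadForm shape are hypothetical; the only idea taken from the text is that any content in the hidden field marks the submission as automated.

```typescript
// Hypothetical lead form payload. `website` is the hidden honeypot field:
// it is not visible to people, so a real visitor leaves it empty.
type LeadForm = {
  name: string;
  email: string;
  website?: string;
};

// Returns true when the honeypot field contains anything other than whitespace.
function isLikelyBot(form: LeadForm): boolean {
  return (form.website ?? "").trim().length > 0;
}
```

A submission flagged this way can be dropped silently, so the bot receives no signal that it was detected.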
Metrics We Track
For each provider we measure:
- Response latency
- Error and fallback rates
- Number of messages per conversation
- Lead conversion rate
- Positive/negative feedback
These data points will allow us to make informed decisions about which provider performs best for our specific use case.
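To make the comparison concrete, here is one way the per-provider numbers could be aggregated. The record and summary shapes are hypothetical, since we have not described how events are logged; the sketch only shows the arithmetic behind the metrics listed above.

```typescript
// Hypothetical per-conversation record.
type ConversationRecord = {
  provider: string;
  latencyMs: number;
  failed: boolean;
  usedFallback: boolean;
  messageCount: number;
  leadCaptured: boolean;
  feedback?: "positive" | "negative";
};

type ProviderSummary = {
  conversations: number;
  avgLatencyMs: number;
  errorRate: number;
  fallbackRate: number;
  avgMessages: number;
  leadConversionRate: number;
  // null when no conversation for this provider received feedback
  positiveFeedbackShare: number | null;
};

function summarize(records: ConversationRecord[]): Map<string, ProviderSummary> {
  // Group records by provider.
  const grouped = new Map<string, ConversationRecord[]>();
  for (const record of records) {
    const rows = grouped.get(record.provider) ?? [];
    rows.push(record);
    grouped.set(record.provider, rows);
  }

  const summaries = new Map<string, ProviderSummary>();
  grouped.forEach((rows, provider) => {
    const n = rows.length;
    const sum = (pick: (r: ConversationRecord) => number): number =>
      rows.reduce((total, r) => total + pick(r), 0);
    const share = (pred: (r: ConversationRecord) => boolean): number =>
      rows.filter(pred).length / n;
    const rated = rows.filter((r) => r.feedback !== undefined);

    summaries.set(provider, {
      conversations: n,
      avgLatencyMs: sum((r) => r.latencyMs) / n,
      errorRate: share((r) => r.failed),
      fallbackRate: share((r) => r.usedFallback),
      avgMessages: sum((r) => r.messageCount) / n,
      leadConversionRate: share((r) => r.leadCaptured),
      positiveFeedbackShare:
        rated.length === 0
          ? null
          : rated.filter((r) => r.feedback === "positive").length / rated.length,
    });
  });
  return summaries;
}
```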
Security Considerations
API keys are never exposed to the client. All communication with the providers occurs on the server.
Assignment cookies are HttpOnly and Secure. Input validation uses strict Zod schemas with length limits.
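For illustration, a Set-Cookie header value with those attributes could be assembled like this. The function name, the cookie name in the usage note, the Path, the Max-Age, and the SameSite value are assumptions; only HttpOnly and Secure come from the setup described above.

```typescript
// Build a Set-Cookie header value for the A/B assignment.
// HttpOnly keeps the cookie away from client-side scripts;
// Secure restricts it to HTTPS connections.
function buildAssignmentCookie(
  name: string,
  value: string,
  maxAgeSeconds: number
): string {
  return [
    `${name}=${encodeURIComponent(value)}`,
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
    "HttpOnly",
    "Secure",
    "SameSite=Lax", // assumption: not stated in the text
  ].join("; ");
}
```

For example, buildAssignmentCookie("chat_ab", "openai", 2592000) would keep an assignment for 30 days.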
We do not currently store conversations in a database, although this is a planned improvement that would let us analyze patterns and refine our prompts.
What We Learned
Building a multi-provider chatbot is not trivial, but it is not as complex as it seems. The key is to properly abstract the differences between APIs and design for failures from the start.
Streaming dramatically improves the perceived experience. Seeing text appear progressively feels more natural than waiting for a complete response.
And perhaps most importantly: don’t assume which provider is best. Measure it.