In late 2024, Decisions Lab CTO Devansh Gandhi was invited to speak at the St. Gallen Symposium, held at The Chinese University of Hong Kong. As one of the first major forums in the region to address the implications of artificial intelligence, the event brought together academics, MBA students, and industry leaders to explore what it means to build AI systems responsibly.
My talk focused on a fundamental truth that is often overlooked: bias in AI is not a flaw to be eliminated, but a reality that must be understood and intentionally shaped. Every model reflects the data it is trained on, the cultural assumptions embedded within that data, and the objectives of its developers. Much like an accent in spoken language, every AI model ‘speaks’ with a cultural inflection; the question is whether that inflection matches the people it is speaking for. Responsible AI, then, is not about achieving neutrality. It is about making deliberate choices regarding the biases we embed and ensuring they align with the communities and contexts we serve.
The Problem: General Models Are Not Culturally Neutral
At Decisions Lab, we use large language models to simulate how people think, behave, and respond to policy or organizational change. These simulations are designed to help leaders anticipate human outcomes before real-world decisions are made. However, in our early work, we observed a consistent problem: general-purpose models, like those from OpenAI, often failed to reflect the cultural norms of the populations we were modeling.
To test this systematically, we evaluated a series of models on CD Eval, a benchmark designed to measure a model’s cultural dimensions. The results confirmed what our instincts suggested: these models were heavily biased toward Western values and communication patterns.
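To make the idea concrete, here is a rough sketch of how a cultural-dimensions evaluation of this kind can be run. The item file, schema, scoring rule, and model name below are illustrative assumptions, not the actual CD Eval dataset or protocol.

```python
# Illustrative only: the item format and scoring below are assumptions,
# not the official CD Eval protocol or dataset schema.
import json
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def preferred_option(question: str, option_a: str, option_b: str, model: str) -> str:
    """Ask the model to pick between two culturally contrasting responses."""
    prompt = (
        f"{question}\n"
        f"A. {option_a}\n"
        f"B. {option_b}\n"
        "Answer with the single letter A or B."
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    answer = resp.choices[0].message.content.strip().upper()
    return "A" if answer.startswith("A") else "B"

# Hypothetical item file. Each line:
# {"dimension": "power_distance", "question": ..., "option_a": ..., "option_b": ...}
# where option_a reflects a Western-leaning framing and option_b a local one.
tallies: dict[str, Counter] = {}
with open("cultural_items.jsonl") as f:
    for line in f:
        item = json.loads(line)
        choice = preferred_option(
            item["question"], item["option_a"], item["option_b"], "gpt-4o-mini"
        )
        tallies.setdefault(item["dimension"], Counter())[choice] += 1

for dimension, counts in tallies.items():
    total = counts["A"] + counts["B"]
    share = counts["A"] / total
    print(f"{dimension}: {share:.0%} of choices lean toward the Western-framed option")
```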
This finding was not just academic. In practical terms, it meant that simulations intended to represent Hong Kong stakeholders were producing distorted behavioral assumptions: the models overestimated directness, underestimated deference, and often missed subtle contextual cues entirely.
We believe strongly in responsible AI, and cultural fidelity is central to that. That’s why we released CultureKit, an open-source CLI tool that allows developers and researchers to audit the cultural alignment of their own language models. Evaluating bias is not just a research problem. It is a necessary step in building AI that is fair, useful, and accountable to the people it is meant to serve.
The Solution: Designing Bias for Context
Acknowledging the limitations of general-purpose models, we made a strategic decision: instead of trying to eliminate bias, we would design it with intent. If bias is inevitable, then our responsibility is to ensure it reflects the realities of the communities we model.
This led to the development of Dlab-852-Mini, a culturally fine-tuned language model optimized for the Hong Kong context. Rather than relying solely on prompt engineering, we built a complete data pipeline that incorporates localized sources, real-world behavioral patterns, and cultural nuances. In benchmark testing, Dlab-852-Mini proved twice as accurate at matching local attitudes as standard models.
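To give a sense of what the data side of such a pipeline can look like, here is a simplified sketch that packages curated local scenarios into chat-style JSONL records of the kind most fine-tuning APIs accept. The field names, system prompt, and example content are hypothetical, not our production pipeline.

```python
# Simplified sketch of turning localized source material into fine-tuning
# examples; the record fields and system prompt are illustrative assumptions.
import json

system_prompt = (
    "You simulate how a Hong Kong stakeholder is likely to respond to a "
    "workplace or policy change. Reflect local norms around hierarchy, "
    "face, and indirect communication."
)

# Hypothetical curated records: a scenario plus an observed local reaction.
curated_records = [
    {
        "scenario": "Management announces that all internal reports must now be written in English.",
        "observed_reaction": (
            "Publicly acknowledges the change, but raises concerns privately with a "
            "trusted senior colleague rather than in the town hall."
        ),
    },
]

with open("dlab_852_finetune.jsonl", "w", encoding="utf-8") as out:
    for record in curated_records:
        example = {
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": record["scenario"]},
                {"role": "assistant", "content": record["observed_reaction"]},
            ]
        }
        out.write(json.dumps(example, ensure_ascii=False) + "\n")
```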
We’ve applied this model in real-world simulations, including internal policy change scenarios. In one case, we tested how employees would respond to a shift in the workplace communication language. The OpenAI model predicted high acceptance. Our fine-tuned model, by contrast, anticipated resistance, particularly among mid-level staff who would find it hard to adapt to and learn a new language quickly.
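For readers who want to attempt a comparison of this kind themselves, a minimal side-by-side run might look like the sketch below. It assumes both models sit behind an OpenAI-compatible chat interface; the scenario wording and the identifier "dlab-852-mini" are placeholders, not a published endpoint.

```python
# Side-by-side comparison sketch; "dlab-852-mini" is a placeholder identifier
# and would need to point at wherever the fine-tuned model is actually served.
from openai import OpenAI

client = OpenAI()

scenario = (
    "Your company announces that all team meetings will now be held in English "
    "instead of Cantonese. As a mid-level manager, how do you respond over the "
    "first three months? Rate your acceptance from 1 (strong resistance) to 5 "
    "(full acceptance) and explain briefly."
)

for model in ["gpt-4o-mini", "dlab-852-mini"]:  # second name is hypothetical
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": scenario}],
        temperature=0.7,
    )
    print(f"--- {model} ---")
    print(resp.choices[0].message.content)
```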
This is the power of designing with bias. When bias is visible and intentional, it enables realism. And realism is what makes simulations credible, actionable, and responsible.
The Philosophy: Building the Human Layer of AI
At Decisions Lab, we think of our work as building the human layer of AI. While much of the industry focuses on scale, speed, or raw generalization, we focus on how technology reflects the people it is intended to model. This is especially critical when AI systems are used to simulate behavior, test policy, or inform decisions that affect real lives.
Our view is straightforward: bias in AI should not be treated as an error — it should be treated as a design choice. Every model carries bias. The difference lies in whether that bias is unexamined or explicitly constructed to match the context it serves.
We ground our models in real-world data, disclose our assumptions, and use tools like CultureKit and CD Eval to continuously measure how well our systems align with local realities. This philosophy informs every model we train, every simulation we run, and every decision we help evaluate.
This is not an argument for reinforcing stereotypes or building systems that exclude. Designing with bias means surfacing and aligning implicit assumptions with the lived experience of those we seek to model, not codifying them uncritically.
By centering human behavior, we aim not only to improve prediction accuracy but also to make AI more accountable and transparent. In practice, this means developing tools that don’t just perform well on benchmarks, but that understand people, their context, their culture, and their complexity.
The Future: AI Simulations
AI is no longer just a tool for automation. It is becoming a tool for understanding — particularly when applied to human systems like organizations, cities, or communities. At Decisions Lab, we believe AI-powered simulations based on digital twins are essential for making informed, responsible decisions in complex social environments.
By creating digital twins of real-world stakeholders, we allow policymakers, leaders, and organizations to explore how people might react to change before those changes happen. These simulations do not produce perfect predictions. But they offer something equally valuable: plausible behavioral foresight, rooted in local culture and human psychology.
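In code terms, a digital twin can be as simple as a structured persona plus repeated sampling of its reactions. The sketch below illustrates that idea; the persona fields, model name, and sampling parameters are assumptions for illustration rather than our actual implementation.

```python
# Sketch of a "digital twin" as a persona profile plus repeated sampling;
# persona fields, model name, and parameters are illustrative assumptions.
from dataclasses import dataclass
from openai import OpenAI

client = OpenAI()

@dataclass
class StakeholderTwin:
    role: str
    tenure_years: int
    languages: list[str]
    cultural_notes: str

    def system_prompt(self) -> str:
        return (
            f"You are a {self.role} in Hong Kong with {self.tenure_years} years of tenure. "
            f"You work comfortably in {', '.join(self.languages)}. {self.cultural_notes} "
            "Respond in character, including hesitations you would only voice privately."
        )

def sample_reactions(twin: StakeholderTwin, proposed_change: str, n: int = 20) -> list[str]:
    """Sample n plausible reactions from the twin to the proposed change."""
    reactions = []
    for _ in range(n):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; a culturally fine-tuned model would sit here
            messages=[
                {"role": "system", "content": twin.system_prompt()},
                {"role": "user", "content": proposed_change},
            ],
            temperature=0.9,  # higher temperature to spread over plausible behaviors
        )
        reactions.append(resp.choices[0].message.content)
    return reactions
```

Aggregating many such sampled reactions across a panel of twins is what turns individual responses into the kind of behavioral foresight described above.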
As we look ahead, we believe this capability will play a growing role in how society manages complexity. From internal policy shifts to public health messaging to retention marketing, AI simulations can help identify blind spots, test assumptions, and reduce unintended consequences.
This approach raises important questions. Whose values should an AI reflect in a multicultural society? What happens when fairness looks different across contexts? These are questions we cannot avoid — and simulations are one of the few tools we have to explore them before they become real-world failures.
For this to work, the AI we build must reflect the specific people it is meant to represent. That means moving beyond abstract ideals of neutrality, and toward a more honest, intentional practice: designing AI systems that carry bias — but the right bias, for the right reason, in the right place.