Decoding Cross-Lingual Responses: Why Your AI Assistant Switches from Chinese to Korean and How to Fix It

Overview

Have you ever typed a prompt in Chinese to your coding assistant, only to get a reply in Korean? This puzzling behavior is more than a glitch—it’s a window into how large language models (LLMs) handle multilingual input, especially when code vocabulary reshapes the embedding space. In this tutorial, we’ll explore the mechanics behind such cross-lingual responses, then build a practical solution to detect and prevent unwanted language switches. By the end, you’ll understand embedding spaces, token overlap, and how to fine-tune your assistant for consistent language output.

Source: towardsdatascience.com

Prerequisites

To follow along, you'll need:

- Python 3.8 or later
- Basic familiarity with Python and transformer-based language models
- An LLM coding assistant you can call programmatically (any wrapper with a generate-style method works)

Install dependencies:

pip install transformers torch sentence-transformers

Step-by-Step Instructions

Step 1: Understand Embeddings and Language Overlap

LLMs like GPT or CodeLlama represent every token as a vector in a high-dimensional embedding space. When you mix languages—especially in coding contexts—tokens from different languages can occupy similar regions due to overlapping semantics (e.g., common programming keywords like print()). This similarity can cause the model to produce tokens from a different language than expected.

For example, the embeddings for 打印变量 (Chinese for "print variable"), 변수 출력 (Korean for "variable output"), and the English phrase "print variable" all sit close together, pulled into the same neighborhood by the shared programming concept.

If your prompt contains code, the model may anchor to a region where the Chinese and Korean embeddings intersect, and decoding can drift into a Korean response.
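
To make "similar regions" concrete, here is a minimal, standard-library sketch of cosine similarity, the usual closeness measure for embeddings. The 3-dimensional vectors are made up purely for illustration; real embedding vectors have hundreds of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means identical direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-d "embeddings": two nearby vectors and one distant one.
zh_print = [0.9, 0.1, 0.2]   # stand-in for a Chinese "print" phrase
ko_print = [0.8, 0.2, 0.2]   # stand-in for the Korean equivalent
unrelated = [0.0, 1.0, -0.5]

print(cosine(zh_print, ko_print))   # close to 1.0
print(cosine(zh_print, unrelated))  # much lower
```

When two phrases score near 1.0, the model treats them as nearly interchangeable at the representation level, which is exactly what sets up a language switch.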

Step 2: Inspect the Embedding Space

We’ll use sentence-transformers to visualize token similarities. Run this Python script:

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')

sentences = [
    "打印变量",           # Chinese
    "변수 출력",          # Korean
    "print variable",     # English
    "int main()"          # Code snippet
]

embeddings = model.encode(sentences, normalize_embeddings=True)
similarities = embeddings @ embeddings.T  # cosine similarity (vectors are unit-normalized)
print(similarities)

You'll likely see relatively high similarity between the Chinese and Korean programming phrases, driven by their shared semantics. This overlap is the root cause of language switching.

Step 3: Detect Language Switch in Real Time

Build a detection function that monitors the assistant’s output language. We’ll use langdetect (or a simple character-range check). First, install it:

pip install langdetect

Then implement a wrapper for your assistant:

from langdetect import detect
from langdetect.lang_detect_exception import LangDetectException

def check_language(text):
    try:
        lang = detect(text)
    except LangDetectException:
        # langdetect raises on empty or feature-less input (e.g. pure code)
        lang = 'unknown'
    return lang

# Example: when you send a Chinese prompt, check if response language changes
prompt = "如何在Python中打印变量?"   # Chinese
response = assistant.generate(prompt)  # your model call
lang_resp = check_language(response[:50])  # check first 50 chars
if lang_resp == 'ko':
    print("ALERT: Language switch to Korean detected!")
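
The "simple character-range check" mentioned above needs no dependencies at all: Hangul syllables occupy Unicode block U+AC00–U+D7A3, while CJK Unified Ideographs sit at U+4E00–U+9FFF. A sketch (note that CJK ideographs also appear in Japanese and in Korean hanja, so treat "zh" here as "CJK script", not a definitive language ID):

```python
def script_of(text: str) -> str:
    """Classify a string by counting characters in known Unicode blocks."""
    counts = {"ko": 0, "zh": 0, "other": 0}
    for ch in text:
        cp = ord(ch)
        if 0xAC00 <= cp <= 0xD7A3:      # Hangul syllables
            counts["ko"] += 1
        elif 0x4E00 <= cp <= 0x9FFF:    # CJK Unified Ideographs
            counts["zh"] += 1
        elif ch.isalpha():
            counts["other"] += 1
    return max(counts, key=counts.get)

print(script_of("변수 출력"))    # ko
print(script_of("打印变量"))    # zh
```

This is faster and more predictable than a statistical detector on short strings, which is exactly where langdetect is weakest.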

Step 4: Fix with Context Reinforcement

Prevent language switching by adding explicit language instructions in your system prompt. For example:

system_prompt = "You are a helpful coding assistant. Always respond in the same language as the user's last message. If the user writes in Chinese, reply in Chinese."
response = assistant.generate(system_prompt + "\n" + user_input)
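
The same idea can be packaged as a small helper. This is a sketch: the language name is passed in explicitly (for example, derived from the detector in Step 3), and `assistant.generate` stands for whatever single-prompt call your wrapper exposes.

```python
def reinforced_prompt(user_input: str, lang_name: str) -> str:
    """Build a prompt that pins the response language to lang_name."""
    system = (
        "You are a helpful coding assistant. "
        f"Always respond in {lang_name}, the same language as the user's last message."
    )
    return system + "\n" + user_input

# Usage (assuming `assistant` is your model wrapper from earlier):
# response = assistant.generate(reinforced_prompt("如何在Python中打印变量?", "Chinese"))
```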

Alternatively, use logit bias to suppress tokens from undesired languages. Here’s a snippet using Hugging Face transformers:

from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    LogitsProcessor,
    LogitsProcessorList,
)
import torch

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")

# Tokenize input
inputs = tokenizer(prompt, return_tensors="pt")

# Identify the token IDs to suppress. The range below is only a placeholder:
# inspect your tokenizer's vocabulary to find which IDs decode to Hangul.
korean_ids = list(range(50000, 52000))  # placeholder

class SuppressTokens(LogitsProcessor):
    """Subtract a large constant from the logits of the given token IDs."""
    def __init__(self, token_ids, penalty=100.0):
        self.token_ids = token_ids
        self.penalty = penalty

    def __call__(self, input_ids, scores):
        scores[:, self.token_ids] -= self.penalty
        return scores

outputs = model.generate(
    **inputs,
    logits_processor=LogitsProcessorList([SuppressTokens(korean_ids)]),
    max_new_tokens=200,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
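
Detection (Step 3) and context reinforcement can also be combined into a retry loop: if the response comes back in the wrong language, regenerate with a stronger instruction. A minimal sketch, written against two injected callables (`generate` and `detect_lang` are assumptions standing in for your assistant call and your detector, not real library APIs):

```python
def generate_with_guard(generate, detect_lang, prompt, expected_lang, max_retries=2):
    """Regenerate with an explicit instruction when the detected language differs."""
    response = generate(prompt)
    for _ in range(max_retries):
        if detect_lang(response) == expected_lang:
            return response
        # Escalate: prepend a strict language instruction and try again.
        prompt = f"Respond strictly in {expected_lang}.\n{prompt}"
        response = generate(prompt)
    return response

# Usage (assuming the Step 3 pieces):
# response = generate_with_guard(assistant.generate, check_language, prompt, "zh-cn")
```

Capping retries keeps latency bounded when the model is stubbornly anchored to the wrong language.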

Step 5: Train Explicit Language Embedding

For a permanent solution, fine-tune the model with language-annotated data. Collect paired examples where the language tag is prepended. Example training data:

"[LANG_ZH] 打印变量" → "打印变量"      (Chinese: "print variable")
"[LANG_KO] 변수 출력" → "변수 출력"     (Korean: "variable output")
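
Building the tagged strings is mechanical; here is a small hypothetical helper following the [LANG_XX] convention above:

```python
def tag_example(lang: str, text: str) -> str:
    """Prepend a [LANG_XX] tag so the model can condition on the target language."""
    return f"[LANG_{lang.upper()}] {text}"

pairs = [("zh", "打印变量"), ("ko", "변수 출력"), ("en", "print variable")]
tagged = [tag_example(lang, text) for lang, text in pairs]
print(tagged[0])  # [LANG_ZH] 打印变量
```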

Fine-tune using the standard causal LM loss so the model learns to associate each tag with its output language. If you use bracketed tags like [LANG_ZH], register them as special tokens (tokenizer.add_special_tokens) and call model.resize_token_embeddings(len(tokenizer)) so each tag gets its own learned embedding instead of being split into sub-tokens. Use Trainer from Hugging Face:

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=multi_lang_dataset,  # your custom dataset
)
trainer.train()

Common Mistakes

Here are pitfalls to avoid:

- Swallowing every exception in the detector: langdetect raises on empty or code-only text, so handle that case explicitly instead of using a bare except.
- Running language detection on very short strings: a few characters are not enough signal, so check a reasonable window of the response.
- Hard-coding token ID ranges for logit bias: vocabulary layouts differ between tokenizers, so always inspect your tokenizer's vocabulary first.
- Forgetting to register language tags as special tokens before fine-tuning, which lets the tokenizer split them into meaningless pieces.

Summary

Language switching in coding assistants happens because code vocabulary merges embedding spaces across languages. By detecting the switch, reinforcing language context, and optionally fine-tuning, you can ensure consistent responses. This tutorial gave you a hands-on path from theory to implementation—now you can debug your AI assistant when it starts replying in Korean to your Chinese prompts.
