Reverse Engineering Pricing: How Big Data Discrimination Algorithms Work

The Trip.com case opened a public discussion about what developers have long suspected: travel platform algorithms don't just “find the best price” — they actively profile each user and return a different JSON response depending on dozens of signals. In this article, we will analyze the technical stack of such systems — from SDK-level fingerprinting to ML models that calculate your maximum willingness to pay.

This material is part of a series on algorithmic transparency. The context of the Trip.com case is in our analytical article on the SAMR 2026 antitrust investigation. Practical protection methods are in the technical guide “How to Bypass Travel Algorithms”.

📌 System Architecture: General Overview

Before diving into the details, it's important to understand the big picture. A modern dynamic pricing system in an OTA is not a single algorithm, but a chain of at least five independent subsystems, each of which receives its piece of data and passes the result further.

A typical request path from the moment a user opens the app to the moment they see the price:

  1. Signal Collection Layer — collecting signals from the device (fingerprint, geolocation, connection type)
  2. Identity Resolution Service — matching the current session profile with the long-term user profile
  3. Pricing Engine — calculating the base price taking into account supply and demand
  4. Personalization Layer — adjusting the price based on the specific user's ML profile
  5. Ranking Service — ordering results taking into account commission and business rules

This entire chain must fit within 200–400 ms, which is considered an acceptable wait for a mobile application. That is why the architecture relies on microservices with aggressive caching and asynchronous processing.

Component interaction diagram:


[Client: iOS/Android/Web]
        |
        | HTTP/2 + TLS
        v
[API Gateway]  ←→  [Auth & Session Service]
        |
        | gRPC (internal network)
        v
[Orchestration Service]
    /       |        \
   /        |         \
[Signal   [Identity   [Pricing
Collection] Resolution] Engine]
   \        |         /
    \       |        /
     v      v       v
  [Personalization ML Service]
              |
              v
        [Ranking Service]
              |
              v
     [API Response Builder]
              |
              v
  {"hotels": [...], "price": X}

Note: the price a user sees in the JSON response is already the result of at least four services. None of them "discriminates" individually — but together they can return a different price for two people searching for the same hotel on the same night.
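The request path above can be sketched as a small orchestrator. This is a minimal illustration, not any platform's code: every function name, the stub return values, the +10% adjustment, and the 0.4 s budget constant are assumptions chosen to mirror the stages and the 200–400 ms target. The ranking step is left out to keep the sketch short.

```python
import asyncio

LATENCY_BUDGET_S = 0.4  # upper end of the 200-400 ms target (assumed constant)

# Hypothetical stubs standing in for the three upstream services.
async def collect_signals(request):
    return {"device_tier": "high", "network": "wifi"}

async def resolve_identity(request):
    return {"user_id": request["session_id"], "price_tier": "premium"}

async def base_price(request):
    return 150.0

async def personalize(price, signals, identity):
    # Illustrative adjustment only: +10% for the "premium" tier.
    return price * (1.10 if identity["price_tier"] == "premium" else 1.0)

async def handle_search(request):
    # The three upstream calls are independent, so they run concurrently
    # and share a single timeout.
    signals, identity, price = await asyncio.wait_for(
        asyncio.gather(
            collect_signals(request),
            resolve_identity(request),
            base_price(request),
        ),
        timeout=LATENCY_BUDGET_S,
    )
    final = await personalize(price, signals, identity)
    return {"hotels": [{"id": request["hotel_id"], "price": round(final, 2)}]}

response = asyncio.run(handle_search({"session_id": "s1", "hotel_id": "h42"}))
print(response)
```

Running the independent services concurrently is what makes the budget realistic: the total wait is bounded by the slowest upstream call rather than by the sum of all three.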

📌 Dynamic Pricing Architecture: Real-time Microservices

The Pricing Engine is the heart of the system. Its task is to set a base price before personalization makes its adjustments. This base price is already dynamic: it changes every minute or even every second depending on market conditions.

Data Sources for the Pricing Engine:

The system simultaneously processes several data streams. Historical demand data: aggregated booking data for similar periods in previous years, taking into account seasonality, holidays, and local events. Real-time demand signals: the current number of active sessions and searches, the frequency of requests to a specific hotel, and the number of hotel page views in the last hour. Competitor monitoring: automatic scanning of competitor prices (Meituan, Fliggy, Booking.com) by querying their public APIs or scraping their web pages; this is what became the subject of accusations in the Trip.com case.

Mathematical Model:

Most modern dynamic pricing systems are based on the concept of price elasticity — the sensitivity of demand to price changes. The basic formula:


Price_optimal = Base_cost × (1 + margin_target) × Demand_multiplier × Competitive_index

where:
  Base_cost       = cost of the room for the OTA (net rate from the hotel)
  margin_target   = platform's target margin (usually 10–25%)
  Demand_multiplier = current demand / average demand (1.0 = normal)
  Competitive_index = adjustment relative to competitor prices

But this is only the first level. The Demand_multiplier itself is the result of a separate ML model that forecasts demand based on dozens of factors: weather, local events, flights to the city, social media activity. For example, if the number of posts with a certain city's hashtag sharply increases on Instagram — the system interprets this as a signal of increased demand and increases the multiplier.
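As a worked example, the basic formula can be evaluated directly. The input values below are assumptions picked for illustration (net rate of 100, 15% margin, demand 30% above average, a 5% downward competitive adjustment); they are not taken from any real platform.

```python
def optimal_price(base_cost, margin_target, demand_multiplier, competitive_index):
    """Price_optimal = Base_cost * (1 + margin_target) * Demand_multiplier * Competitive_index"""
    return base_cost * (1 + margin_target) * demand_multiplier * competitive_index

# Assumed inputs, for illustration only.
price = optimal_price(
    base_cost=100.0,        # net rate from the hotel
    margin_target=0.15,     # within the 10-25% range mentioned above
    demand_multiplier=1.3,  # demand is 30% above average
    competitive_index=0.95, # 5% adjustment against competitor prices
)
print(f"{price:.3f}")
```

With these inputs the result is roughly 142, so a room that costs the platform 100 is listed about 42% higher before any personalization is applied.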

Microservice Architecture and Latency:

Each microservice in the chain adds latency. To fit within 200–400 ms, platforms use several strategies. Pre-computation — prices for popular dates and hotels are calculated in advance and cached in Redis or Memcached; upon request, the system simply returns the cached value without running the full chain. Async enrichment — the response is returned with a base price, and personalization is "added" asynchronously, updating the UI via WebSocket or on the next request. Feature store — user ML features (their price category, solvency score) are stored in a separate repository and updated not in real-time, but in batches — once every few hours.

It is precisely because of the feature store that the same person can see different prices in different sessions: if a batch update of the ML profile runs between sessions (for example, after new data about a click on a premium hotel), the system may move them to a different price category.
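The batch effect described above can be reproduced with a toy feature store. Everything here is hypothetical: the in-memory dictionaries, the tier multipliers, and the rule that one click on a premium hotel promotes a user are assumptions made only to show the timing.

```python
# Hypothetical in-memory feature store: profiles change only when a batch job runs.
feature_store = {"user-1": {"price_tier": "standard"}}
pending_events = []  # events wait here until the next batch

TIER_MULTIPLIER = {"budget": 0.95, "standard": 1.00, "premium": 1.10}  # assumed values

def record_event(user_id, event):
    pending_events.append((user_id, event))

def run_batch_update():
    # Assumed rule: a click on a premium hotel promotes the user to "premium".
    for user_id, event in pending_events:
        if event == "click_premium_hotel":
            feature_store[user_id]["price_tier"] = "premium"
    pending_events.clear()

def quote(user_id, base_price):
    tier = feature_store[user_id]["price_tier"]
    return round(base_price * TIER_MULTIPLIER[tier], 2)

session_1 = quote("user-1", 150.0)             # before the click
record_event("user-1", "click_premium_hotel")
same_session = quote("user-1", 150.0)          # click recorded, profile not yet updated
run_batch_update()                             # runs "a few hours later"
session_2 = quote("user-1", 150.0)             # after the batch
print(session_1, same_session, session_2)
```

The price does not react to the click within the same session; it changes only once the batch job has rewritten the stored profile.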

Source: Kitrum: AI-Powered Dynamic Pricing in Travel, AIMultiple: Dynamic Pricing Algorithms.

📌 Device Fingerprinting: Why iOS Costs More Than Android

This is the most debated and least publicly documented part of the system. Research from the University of Chicago and analysis by the Wall Street Journal showed that device type and operating system statistically correlate with willingness to pay. iPhone owners, on average, have higher incomes than owners of mid-range Android devices. The algorithm knows this.

What the Signal Collection Layer Gathers:

At the mobile SDK level, the system collects the following signals:


// iOS SDK collects:
{
  "idfv": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",  // Identifier for Vendor
  "device_model": "iPhone16,2",    // iPhone 15 Pro Max
  "os_version": "17.4.1",
  "screen_resolution": "2796x1290",
  "timezone": "Asia/Shanghai",
  "language": "zh-Hans-CN",
  "network_type": "wifi",          // wifi vs cellular
  "carrier": "China Mobile",
  "app_version": "8.56.0",
  "jailbreak_detected": false
}

// Android SDK collects:
{
  "gsf_id": "XXXXXXXXXXXXXXXX",    // Google Services Framework ID
  "device_model": "Redmi Note 12", // mid-range device
  "manufacturer": "Xiaomi",
  "os_version": "13",
  "screen_resolution": "2400x1080",
  ...
}

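Note: iOS returns IDFV (Identifier for Vendor), Android returns GSF ID (Google Services Framework Identifier). Both are stable identifiers that survive application restarts and cache clearing. They are not permanent, though: IDFV is reset once the user removes all of the vendor's apps from the device and reinstalls one, and GSF ID typically changes after a factory reset.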

How the device model gets into the pricing logic:

Directly, almost never. A developer is unlikely to write if (device == "iPhone15") price *= 1.15: that would be too obvious and too easy to prove. Instead, the device model becomes one of the features of an ML model that is trained on real booking data.

The logic is as follows: the system learns from millions of transactions and identifies a pattern:


Users with iPhone 14 Pro and above + Wifi + location "Pudong, Shanghai"
  → conversion to booking at $150: 34%
  → conversion to booking at $180: 31%
  → delta insignificant → can show $180

Users with Redmi Note 11 + Cellular + location "Putuo, Shanghai"
  → conversion at $150: 28%
  → conversion at $180: 14%
  → delta significant → optimal price $150

No one "discriminates" by device. The model simply optimizes conversion — and finds that different devices correspond to different price sensitivities. The result is the same: owners of expensive iPhones see higher prices. Source: University of Chicago Law Review: Algorithmic Price Discrimination.
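The two segments above can be checked with simple arithmetic. The sketch below assumes the objective is expected revenue per visitor (price multiplied by conversion rate); the article only says the model "optimizes conversion", so this objective and the function name are assumptions.

```python
def best_price(conversion_by_price):
    """Return the price with the highest expected revenue per visitor (price * conversion)."""
    return max(conversion_by_price, key=lambda p: p * conversion_by_price[p])

# Conversion rates taken from the two segments described above.
segment_a = {150: 0.34, 180: 0.31}  # iPhone 14 Pro and above, Wi-Fi, Pudong
segment_b = {150: 0.28, 180: 0.14}  # Redmi Note 11, cellular, Putuo

print(best_price(segment_a))  # 150 * 0.34 = 51.0 vs 180 * 0.31 = 55.8
print(best_price(segment_b))  # 150 * 0.28 = 42.0 vs 180 * 0.14 = 25.2
```

The function never looks at the device model. It only sees that one segment loses 3 percentage points of conversion at the higher price while the other loses half of its bookings, and that is enough to produce different prices.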

Browser fingerprinting for the web version:

On the web platform, instead of SDK identifiers, a combination of browser signals is used. The system collects the User-Agent, screen resolution and color depth, the list of installed fonts (typically inferred by measuring rendered text, for example via the Canvas API), the WebGL renderer (which reveals the GPU model), browser time zone and language, and the presence of an ad blocker. The combination of these parameters forms a unique "browser fingerprint" that persists even after cookies are cleared. Modern fingerprinting systems are reported to reach 99% re-identification accuracy, meaning the system recognizes the same browser with 99% probability even after a complete data clear. Source: Group-IB: Device Fingerprinting.
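A minimal sketch of why clearing cookies does not help: the identifier is derived from the signals themselves, not from anything stored in the browser. The signal names and values below are made up, and hashing a canonical JSON encoding is a simplification; production systems are likely to use fuzzy matching, because individual signals drift over time.

```python
import hashlib
import json

def browser_fingerprint(signals):
    """Hash a canonical JSON encoding of the signals into a stable identifier."""
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical signal values; no cookie or stored ID is part of the input.
visit_1 = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080x24",
    "fonts": ["Arial", "Noto Sans CJK", "Times New Roman"],
    "webgl_renderer": "Example GPU 3000",
    "timezone": "Asia/Shanghai",
    "language": "zh-CN",
    "ad_blocker": True,
}
# Same browser after clearing cookies: the signals arrive in a different order,
# but their values are unchanged.
visit_2 = dict(reversed(list(visit_1.items())))
other_device = {**visit_1, "webgl_renderer": "Another GPU 500"}

print(browser_fingerprint(visit_1) == browser_fingerprint(visit_2))
print(browser_fingerprint(visit_1) == browser_fingerprint(other_device))
```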

📌 Personalization via ML: How the Price "Pain Threshold" is Calculated

"Pain threshold" is an informal name for the concept of Willingness To Pay (WTP): the maximum price a specific user is willing to pay without abandoning the purchase. Defining this threshold for each user is the key task of the personalization layer.

What data forms the WTP profile:

Behavioral signals of the current session: number of repeated views of the same hotel (the more, the higher the WTP), time spent on the hotel page, scrolling to the "reviews" or "room details" section (a signal of serious interest), clicking "compare rooms" or "view all photos", opening the booking page without completing the purchase (abandoned checkout — a powerful signal).

Long-term signals from the profile: average check of previous bookings (if an account exists), class of hotels previously viewed, booking frequency (loyal customer vs. one-time), reaction to past discounts (took a 5% discount? — WTP is slightly lower).

External signals: IP address and geolocation (used as a proxy for income in the area of residence), device model and its market value, and the mobile operator's tariff plan type (if available via SDK).

How it's technically implemented:

The most common approach is gradient boosting models (XGBoost, LightGBM) or neural networks trained on real conversion data. The model receives a feature vector and returns a score — a conditional "price category" for the user:

Input features vector:
[device_tier, session_depth, repeat_views, booking_history_avg,
 geolocation_income_index, time_to_checkin, group_size, ...]

Output:
{
  "price_tier": "premium",        // or "standard", "budget"
  "wtp_score": 0.82,              // 0.0 = very price-sensitive, 1.0 = insensitive
  "discount_sensitivity": 0.23,   // how much a discount affects conversion
  "urgency_factor": 0.61          // are there signs of an urgent search
}

This score feeds a multiplier that is applied to the base price produced by the Pricing Engine. For example, if wtp_score > 0.75, the system might show a price 8–15% higher than the base; if wtp_score < 0.3, it might automatically show a discount or a "special offer".
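The thresholds in this example can be turned into a small mapping from score to multiplier. The 0.75 and 0.3 cut-offs and the 8–15% range come from the text; the linear ramp between 8% and 15% and the 5% size of the "special offer" are assumptions.

```python
def wtp_multiplier(wtp_score):
    """Map a wtp_score in [0.0, 1.0] to a price multiplier."""
    if not 0.0 <= wtp_score <= 1.0:
        raise ValueError("wtp_score must be between 0.0 and 1.0")
    if wtp_score > 0.75:
        # Assumed shape: linear ramp from +8% just above 0.75 to +15% at 1.0.
        return 1.08 + (wtp_score - 0.75) / 0.25 * 0.07
    if wtp_score < 0.3:
        return 0.95  # assumed 5% "special offer"
    return 1.0

base = 150.0
for score in (0.2, 0.5, 0.82, 1.0):
    print(score, round(base * wtp_multiplier(score), 2))
```

For the wtp_score of 0.82 shown in the sample output, this mapping gives a markup of just under 10% over the base price.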

Abandoned checkout as a trap:

One of the most debated patterns is the reaction to an abandoned booking. Studies show two opposing strategies. "Lure" strategy: after an abandoned booking, the system sends a push notification with a 5–10% discount — confirming that the "real" price was actually lower. "Pressure" strategy: instead, the system raises the price or shows "only 1 room left" — knowing that the user is already interested and less likely to go to a competitor. Which strategy to apply is decided by A/B testing at the service configuration level. Source: Personalized Pricing: History, Economics, and Impact of AI-Driven Price Discrimination.
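A/B assignment of this kind is commonly done with deterministic hashing, so that a user stays in the same group across sessions. The sketch below is an assumption about how such a split could look; the experiment name and function are hypothetical.

```python
import hashlib

STRATEGIES = ("lure", "pressure")

def assign_strategy(user_id, experiment="abandoned_checkout_v1"):
    """Deterministically bucket a user: the same inputs always yield the same strategy."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return STRATEGIES[digest[0] % len(STRATEGIES)]

counts = {name: 0 for name in STRATEGIES}
for i in range(1000):
    counts[assign_strategy(f"user-{i}")] += 1
print(counts)
```

Including the experiment name in the hash means a new experiment reshuffles users, while within one experiment nobody can switch groups by restarting the app.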

📌 Ranking Bias: Technical Methods of Search Result Manipulation

Even if the price is "fair", the search result can be manipulated. Ranking Bias is a practice where the hotel ranking algorithm in the search results considers not only relevance to the user but also the commercial interests of the platform.

How ranking factors are officially explained:

Most OTAs publicly state that ranking depends on: search criteria match, rating and number of reviews, price relative to similar hotels, booking confirmation speed, conversion (CTR + booking rate). SpringerLink research confirms: a hotel in the first position receives many times more clicks than one in the fifth. 80–90% of bookings occur from the first two pages of results. Source: SpringerLink: A Re-rank Algorithm for Online Hotel Search.

What is not mentioned in public documentation:

Commission boost — hotels that agree to a higher commission receive an increased ranking score. This is openly acknowledged by some platforms as a "Preferred Partner Program", but the mechanism is not disclosed. Cloudbeds research confirms: "some channels allow opt-in to higher commissions to push a hotel higher in results — a practice called OTA bias". Source: Cloudbeds: OTA Ranking Optimization.

"Dimming" — reducing a hotel's visibility without explicit notification. A hotel that offers a lower price on its own website or refuses a promotion may receive: a reduced search rank, hiding of special icons and "Popular Choice" badges, reduction in the number of displayed photos in the card, exclusion from email newsletters and push recommendations. Source: MyLighthouse: OTA Ranking Factors.

Promotional participation weight — participation in platform discount promotions is an implicit ranking factor. Hotels that do not participate in "Flash Sale" or "Early Bird" campaigns systematically receive a lower rank in the algorithm. This is the essence of the accusations against Trip.com: "coercion to participate in promotions under threat of traffic reduction".

Technical implementation in API:

At the code level, the Ranking Service is usually implemented as a Learning-to-Rank (LTR) model — a type of supervised learning where the model learns to rank objects relative to each other. For each hotel, a ranking score is formed:

ranking_score = w1 * relevance_score
              + w2 * review_score
              + w3 * price_competitiveness
              + w4 * conversion_rate_historical
              + w5 * commission_tier           // ← hidden commercial factor
              + w6 * promotion_participation   // ← participation in platform promotions
              + w7 * platform_exclusivity      // ← exclusivity of the deal

// Weight coefficients w1..w7 are platform property,
// not disclosed to partners

Hotel partners only see the final rank. The weight coefficients and factor composition are trade secrets. It is precisely because of this opacity that regulators demand Algorithm Transparency: the right of partners to know why their hotel dropped in rank.
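To see how much the commercial terms can matter, the weighted sum can be evaluated for two hypothetical hotels. All weights and feature values below are invented for illustration; a real Learning-to-Rank model is learned from data rather than written as a fixed linear formula.

```python
WEIGHTS = {  # assumed weights, w1..w7 in the order of the formula above
    "relevance_score": 0.30,
    "review_score": 0.20,
    "price_competitiveness": 0.15,
    "conversion_rate_historical": 0.15,
    "commission_tier": 0.10,
    "promotion_participation": 0.05,
    "platform_exclusivity": 0.05,
}
# The same weights with the three commercial factors switched off.
ORGANIC_WEIGHTS = {**WEIGHTS, "commission_tier": 0.0,
                   "promotion_participation": 0.0, "platform_exclusivity": 0.0}

def ranking_score(hotel, weights):
    return sum(weight * hotel[name] for name, weight in weights.items())

def rank(hotels, weights):
    ordered = sorted(hotels, key=lambda h: ranking_score(h, weights), reverse=True)
    return [h["name"] for h in ordered]

hotels = [
    {"name": "A", "relevance_score": 0.9, "review_score": 0.9,
     "price_competitiveness": 0.8, "conversion_rate_historical": 0.7,
     "commission_tier": 0.0, "promotion_participation": 0.0, "platform_exclusivity": 0.0},
    {"name": "B", "relevance_score": 0.8, "review_score": 0.8,
     "price_competitiveness": 0.8, "conversion_rate_historical": 0.7,
     "commission_tier": 1.0, "promotion_participation": 1.0, "platform_exclusivity": 0.0},
]

print(rank(hotels, ORGANIC_WEIGHTS))  # quality signals only
print(rank(hotels, WEIGHTS))          # with commission and promotion factors
```

Hotel A is more relevant and better reviewed, and it ranks first on quality signals alone. Once the commission and promotion terms are included, hotel B overtakes it, and neither hotel can tell why from the final rank.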

💼 From Theory to Practice: What the Trip.com Case Showed

The Trip.com case is unique because, for the first time in Asia, a regulator gained access to the internal documentation of an AI pricing system and classified its operation as an antitrust violation.

Trip.com's "AI Price Adjustment Assistant" is precisely the Pricing Engine with Competitive Monitoring we described above, but with one critical addition: automatic execution without partner consent. The system not only monitored Meituan's prices — it automatically lowered the prices of partner hotels so that Trip.com always appeared cheaper. The hotel learned about the price reduction only after it had occurred — if it learned at all.

Technically, it looked approximately like this:

// Pseudocode for Trip.com AI Price Adjustment logic
function adjustHotelPrice(hotelId, currentPrice) {
  const competitorPrice = scrapeCompetitorPrice(hotelId, "meituan");

  if (currentPrice > competitorPrice) {
    const newPrice = competitorPrice - DELTA;  // e.g., DELTA = 10 yuan
    updateHotelPrice(hotelId, newPrice);       // ← without hotel confirmation
    logAdjustment(hotelId, currentPrice, newPrice, "auto_competitive");
  }
}

// The function is called automatically every N minutes
// for all hotels that signed an agreement to participate
// in the "automatic pricing tool"

The problem is not in the algorithm's logic, which is technically correct. The problem is in the terms of the agreement: hotels were not told that "subscribing to the tool" meant a complete transfer of control over pricing. Refusing the tool, in turn, was punished by a rank reduction, which is the definition of coercion. Source: Pandaily: Trip.com AI Price Adjustment Assistant Shutdown.
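For contrast, here is a sketch of the same competitive logic with an explicit consent step. It is an illustration of the design difference only; the function names, the approval callback, and the DELTA value are hypothetical.

```python
DELTA = 10  # yuan, illustrative

def propose_adjustment(current_price, competitor_price, delta=DELTA):
    """Return a proposed lower price, or None when no change is needed."""
    if current_price > competitor_price:
        return competitor_price - delta
    return None

def apply_adjustment(hotel, competitor_price, hotel_approves):
    """Change the price only if the hotel's approval callback accepts the proposal."""
    proposal = propose_adjustment(hotel["price"], competitor_price)
    if proposal is None:
        return "no_change"
    if not hotel_approves(hotel["id"], hotel["price"], proposal):
        return "rejected_by_hotel"
    hotel["price"] = proposal
    return "applied_with_consent"

hotel = {"id": "h42", "price": 500}
declined = apply_adjustment(hotel, 480, hotel_approves=lambda *_: False)
price_after_decline = hotel["price"]
accepted = apply_adjustment(hotel, 480, hotel_approves=lambda *_: True)
print(declined, price_after_decline, accepted, hotel["price"])
```

The pricing rule is identical to the pseudocode above. The only change is that the write to the hotel's price is gated on a decision the hotel itself makes.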

💼 Code Ethics: Can "Fairness" Be Programmed?

The question is not rhetorical. After the cases of Alibaba, Meituan, and now Trip.com, there is a serious discussion in the industry about how to technically implement "fair" pricing.

Problem 1: Fairness in ML — a mathematically complex task

In machine learning, there are several formal definitions of "fairness," and they can conflict with one another. Individual fairness requires similar users to receive similar prices. Group fairness requires different demographic groups to pay, on average, the same. Counterfactual fairness requires that the price would stay the same if a protected attribute (race, gender, place of residence) were different. Satisfying all of these at once is generally not possible outside of special cases, so any implementation of "fairness" is a compromise between these definitions.
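Of the three definitions, group fairness is the easiest to measure. The sketch below computes the relative gap between average quoted prices per group; the group labels and prices are made-up sample data, and the function name is hypothetical.

```python
from statistics import mean

def group_price_gap(quotes):
    """quotes: iterable of (group, price). Returns (relative gap, per-group mean prices)."""
    by_group = {}
    for group, price in quotes:
        by_group.setdefault(group, []).append(price)
    means = {group: mean(prices) for group, prices in by_group.items()}
    lowest, highest = min(means.values()), max(means.values())
    return (highest - lowest) / lowest, means

# Made-up quotes for the same hotel and night.
sample = [("ios", 180), ("ios", 176), ("android", 150), ("android", 154)]
gap, means = group_price_gap(sample)
print(means, f"{gap:.1%}")
```

A gap of zero would satisfy group fairness on this sample, yet it says nothing about whether two similar individuals inside the same group were quoted similar prices, which is where the definitions start to pull apart.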

Problem 2: Proxy discrimination

Even if "unfair" features (device model, geolocation) are removed from the model, other features can act as their proxies. Search time of day correlates with time zone and location. App language correlates with income level. Mobile operator tariff plan type correlates with demographic profile. The ML model will "find" correlations even after removing explicit discriminatory variables. This is the most complex technical problem of "fair pricing".

Practical approaches to Algorithm Transparency:

Several technical directions are actively being developed. Explainable AI (XAI): tools like LIME or SHAP allow "explaining" the model's decision for a specific example, showing which features most influenced the price. This does not eliminate discrimination but makes it visible. Differential Privacy: a technique that adds calibrated noise during training or to released outputs, which limits the model's ability to "memorize" individual patterns. Adversarial debiasing: training the model against an adversary so that a protected attribute cannot be predicted from the pricing outcome. Source: ScienceDirect: Algorithmic pricing — Implications for marketing strategy and regulation.
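The idea behind per-example explanations can be shown without any library. The sketch below is not LIME or SHAP: it assumes a deliberately simple linear price model, where each feature's contribution relative to an "average user" baseline can be read off directly. The weights, baseline, and feature values are all invented.

```python
# Assumed linear price model: price = BASE_PRICE + sum(w_i * (x_i - baseline_i)).
WEIGHTS = {"device_tier": 12.0, "repeat_views": 3.0, "geolocation_income_index": 20.0}
BASELINE = {"device_tier": 1.0, "repeat_views": 2.0, "geolocation_income_index": 0.5}
BASE_PRICE = 150.0

def predict(features):
    return BASE_PRICE + sum(WEIGHTS[k] * (features[k] - BASELINE[k]) for k in WEIGHTS)

def explain(features):
    """Per-feature contribution to the price, largest absolute effect first."""
    contributions = {k: WEIGHTS[k] * (features[k] - BASELINE[k]) for k in WEIGHTS}
    return dict(sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True))

user = {"device_tier": 2.0, "repeat_views": 5.0, "geolocation_income_index": 0.8}
print(round(predict(user), 2))
for name, value in explain(user).items():
    print(f"{name}: {value:+.2f}")
```

The contributions add up exactly to the difference between this user's price and the baseline price. For gradient boosting or neural models that additive breakdown is no longer available by inspection, which is the gap tools like SHAP are designed to fill.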

Regulatory response: from accusations to standards

After the Trip.com case, SAMR signals its intention to develop technical standards for pricing algorithms. These will likely include: mandatory algorithm audits for platforms with over 30% market share, the right of partners to explanations for price or rank changes, prohibition of automatic adjustments without partner confirmation, logging of all pricing decisions with the possibility of regulator verification.
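The logging requirement is the most straightforward of these to prototype. One possible approach, sketched here as an assumption rather than a description of any proposed standard, is a hash-chained log in which each entry commits to the previous one, so that later edits are detectable.

```python
import hashlib
import json

GENESIS = "0" * 64

def _entry_hash(prev_hash, decision):
    payload = json.dumps(decision, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_decision(log, decision):
    """Append a pricing decision, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"decision": decision, "prev_hash": prev_hash,
                "hash": _entry_hash(prev_hash, decision)})

def verify(log):
    """Recompute the chain; any edited or removed entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["decision"]):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_decision(log, {"hotel": "h42", "old": 500, "new": 470, "reason": "auto_competitive"})
append_decision(log, {"hotel": "h42", "old": 470, "new": 480, "reason": "manual"})
print(verify(log))
```

An auditor who holds only the latest hash can detect whether any earlier decision was edited afterwards. Truncating the log from the end is not caught by the chain alone, so the latest hash would need to be published or escrowed separately.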

💼 Regulatory Vector: What's Changing for Developers

If you are developing pricing or personalization systems — here's what you should already be considering in your architectural decisions.

EU AI Act (entered into force 2024):

Systems that use biometric data or behavioral patterns to "manipulate user behavior in a way that causes them harm" are classified as High Risk or Unacceptable Risk. Personalized pricing does not yet fall into the prohibited zone, but platforms operating in the EU should already plan for documentation and auditing of such systems.

Chinese "Algorithm Recommendation Rules" (2022, updated 2024):

Require platforms to: provide users with the option to opt out of personalized recommendations, explain the algorithm's operating principles upon request, and not use Big Data for discriminatory pricing ("殺熟" — "killing loyal customers" with higher prices). This last requirement became the legal basis for the Trip.com investigation.

❓ Frequently Asked Questions (FAQ)

Do all OTAs use personalized pricing?

The technical capability exists in most large OTAs (Booking.com, Expedia, Airbnb, Trip.com). How aggressively they use it varies. The most documented cases concern airline tickets (Delta, American Airlines) and travel platforms in China.

Is personalized pricing legal?

In most jurisdictions — yes, if it does not violate anti-discrimination laws and is not a result of abuse of a dominant position. China's "Algorithm Recommendation Rules" and the EU AI Act create new restrictions, but there is no complete ban.

What is "殺熟" (shā shú) and why is it important?

"Killing the loyal" is a Chinese term for the practice where a platform shows higher prices to regular customers, knowing that they are less likely to compare prices. It is technically implemented through a WTP model: high booking frequency → high wtp_score → higher price. This practice is the subject of a specific prohibition in Chinese law.

How can I check if personalization is being applied to me?

The simplest method is to compare the price when logged in and logged out, from different devices, or via a VPN with a different geolocation. If the difference is significant (more than 5–10%), this is a signal of personalization. A detailed technical guide is in our article "How to Bypass Travel Algorithms".
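The comparison itself is a one-line calculation. In the sketch below the quote labels and prices are made up, and the 5% default threshold is the lower bound of the range mentioned above. Quotes should be collected at the same moment, since the base price itself changes with demand.

```python
def personalization_signal(quotes, threshold=0.05):
    """quotes: {context: price} for the same hotel, dates, and moment in time.
    Returns (relative spread, whether it exceeds the threshold)."""
    prices = list(quotes.values())
    spread = (max(prices) - min(prices)) / min(prices)
    return spread, spread > threshold

# Made-up quotes collected within the same minute.
quotes = {
    "logged_in_iphone": 180,
    "logged_out_desktop": 160,
    "vpn_other_region": 158,
}
spread, flagged = personalization_signal(quotes)
print(f"spread: {spread:.1%}, possible personalization: {flagged}")
```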

Can a developer "build" a fair pricing algorithm?

Technically — partially. You can remove explicitly sensitive features, add fairness constraints to the loss function, and conduct regular demographic parity audits. But due to proxy discrimination and mathematical contradictions between definitions of fairness — complete "fairness" remains an open research problem.

✅ Conclusions

Dynamic pricing in modern OTAs is not a single algorithm but an ecosystem of interconnected services. Each of them appears harmless on its own, but together they can produce systematic price discrimination.

Three technical conclusions for developers:

  1. Device fingerprinting is no longer just fraud prevention. The same signals collected for identifying fraudsters are used to build a price profile. Architecturally, these systems often use shared infrastructure — which blurs the line between security and discrimination.
  2. "The algorithm decided" is not an argument. Regulators increasingly demand an explanation not only of the result but also of the mechanism. Audit trail and XAI are no longer nice-to-have, but an architectural requirement for platforms with significant market share.
  3. Partner consent is a technical requirement, not just a legal one. The Trip.com case showed: automation without explicit consent becomes a legal risk even if the system technically works correctly.

If you are interested in the practical side — how to protect yourself from these algorithms when booking — read our next material: "How to Bypass Travel Algorithms: A Technical Guide to Protecting Against Overpayments".

📎 Sources

  1. ScienceDirect: Algorithmic pricing — Implications for marketing strategy and regulation (2025)
  2. University of Chicago Law Review: Algorithmic Price Discrimination
  3. Personalized Pricing: History, Economics, and Impact of AI-Driven Price Discrimination (2025)
  4. Kitrum: How AI-Powered Dynamic Pricing Keeps Travel Companies Ahead (2025)
  5. AIMultiple: Dynamic Pricing Algorithms in 2026 — Top 3 Models
  6. Group-IB: Device Fingerprinting — Complete Guide (2026)
  7. Cloudbeds: 4 Steps to Optimizing Your Hotel's OTA Ranking (2025)
  8. MyLighthouse: 6 Ways to Increase Your Property's Ranking in OTA Searches
  9. SpringerLink: A Re-rank Algorithm for Online Hotel Search
  10. Pandaily: Trip.com to Shut Down AI Price Adjustment Assistant (2026)
  11. Hotel Management: AI and the End of Rate Parity (2025)
  12. Medium: The Mathematical Alchemy of AI-Driven Airline Pricing (2025)

Material prepared by the editorial team of WebСraft.org. Publication date: March 15, 2026. Pseudocode in the article is illustrative and does not reflect the actual code of any specific platform.
