Constructing a resilient crypto trading database: Data from faulty WebSockets and slow-responding REST APIs

date
May 26, 2023
slug
crypto-trading-database-from-faulty-websockets-and-slow-rest-apis
status
Published
tags
distributed-system
web-scraping
system-design
event-driven-architecture
summary
Building historical trading data faces problems from the unreliability of crypto API endpoints, especially WebSockets. This hinders the retrieval of accurate high-frequency data, like pricing information. Our mission is to calculate a precise rolling 5-minute VWAP by ticker, correcting historical data for trade errors and ensuring accuracy
type
Post

Faulty WebSocket and slow REST APIs

While building a historical trading database from external providers' data, I ran into a common problem: trade records from crypto API endpoints, and WebSocket endpoints in particular, are highly unreliable. That makes it hard to pull high-frequency data such as pricing information, where real-time calculations need to be accurate. The mission this time is to take in trade data and calculate a rolling 5-minute Volume Weighted Average Price (VWAP) by ticker, optimizing for accuracy. We do not just calculate the latest 5-minute rolling VWAP; we also correct historical time points once a data inconsistency is detected.
We have two sources of crypto trading data from the same data provider:

WebSocket

Endpoint: ws://x.x.x.x:80/stream
Structure of each trade order

REST API

Endpoint: https://x.x.x.x:80/api
Maximum number of trade orders in each query response is 100

Problem with unreliable connections

WebSocket:
• Returns many duplicated trade orders
• Disconnects intermittently
• Returned data may have an invalid structure

REST API:
• No duplication, so it can be used as the source of truth
• Frequently takes too long to respond (timeout errors)
Historical trading data: Overview of System Design with required APIs and components

Analysis and proposed approaches

Deal with unreliable WebSocket connection

Set up a WebSocket handler that continuously listens for updates from the WebSocket connection. Since the WebSocket is unreliable, it's important to handle connection drops, timeouts, and errors gracefully. Implement reconnection logic to reconnect if the connection is lost

Deal with slow-response REST API

First solution: a retryable HTTP client
• Retry mechanism: Retry failed requests so that data can still be queried even when the API responds slowly.
• Timeouts: Set appropriate timeouts on the REST API client so that it never waits indefinitely on an unresponsive or slow call.
• Backoff strategy: Control the interval between consecutive failed requests. This avoids overwhelming the REST API and gives it time to recover.
• Logging and monitoring: Track the performance and health of the REST API client, and log important events and errors for debugging.

The retryable-http library (for the Go programming language) offers all of these features. On its own, however, it is still not enough for our needs.
Second solution: retry fetching data with a message queue
Once the retryable HTTP client has exhausted its retries, the request is not discarded. Instead, the system falls back to a second retry path built on Redis and RabbitMQ. The request is placed in a queue and picked up by a dedicated message consumer. If the request fails again, it is returned to the queue for another round. This loop continues until the request succeeds or reaches the maximum number of retries configured in the system.
Publish delayed message with RabbitMQ: Before handing a request to the second retry path, we do not place it in the queue immediately. We introduce a deliberate delay and publish it as a delayed message, so that a suitable interval passes between the last retry of the first solution and the next attempt.
Two solutions of retryable API query requests

Data Inconsistency

The REST API's trading data serves as the authoritative source. When trades are received, they are validated and compared with the records in the local database for the corresponding time window. If any inconsistency is found among trades sharing the same Trade ID, the "integrity validation" module alerts the system to flag the associated ticker until all conflicts are resolved. Notifications about inconsistent data are published to a queue and consumed by a dedicated message consumer.

Several scenarios can lead to data inconsistency:
• Trade details mismatch: The details of a trade received from the REST API do not match the corresponding record in the local database.
• Difference in the number of trades: The number of trades in the REST API data differs from the number recorded in the local database for the same time window. This can happen when a trade is missing from the local database, or when the local database holds redundant trades caused by trade order reversals.
Flagging the token ticker once a data inconsistency is detected, then resolving conflicts
 

© tonybka 2023 - 2025