The 3 Levels of Data Aggregation in Chainlink Price Feeds

Smart contracts define the ways in which user funds can be handled by an on-chain protocol, while the oracle is the source of off-chain information that definitively decides how those funds will ultimately be transferred. External events and the on-chain contract logic come together to form a complete contractual process, which makes the oracle mechanism just as critical to proper execution of a smart contract as its underlying code.

DeFi protocols often depend on market data to trigger on-chain events, most commonly asset prices, but also other types of information such as total crypto market cap, FX rates, and the reserve balances of asset-backed tokens. Smart contracts use this data to execute important on-chain actions involving user funds, such as deciding whether to liquidate a collateralized loan, determining the fair market exchange rate for a synthetic asset swap, or choosing when to rebalance a portfolio that follows an automated trading strategy.

Chainlink Price Feeds have become the most widely used price oracles in the DeFi market, already securing billions of dollars in value for leading and emerging protocols such as Aave, Synthetix, and Yearn. They are purpose-built to give DeFi applications a high level of price oracle security, reliability, and data quality. These properties result from a set of design choices: decentralization at both the oracle node and data source levels, selection of secure node operators and premium data sources, on-chain performance and reliability that anyone can verify, and crypto-economic incentives for security. For a more in-depth look at Chainlink Price Feeds, read our deep dive on the importance of data quality for DeFi smart contracts.

In this article, we examine the data quality and oracle security of Chainlink Price Feeds by focusing on the three types of aggregation that take place for each price update: at the data source, node operator, and oracle network level. By understanding the multiple layers of redundancy purposely built into each Chainlink Price Feed, it becomes clear why they currently secure a large share of the DeFi landscape.

The basic flow of how crypto price data gets on-chain for DeFi

Data Source Aggregation

The first component of a Chainlink Price Feed is the set of data sources that the Chainlink oracles use to obtain price data. Raw price data generally comes from off-chain centralized exchanges (e.g. Coinbase, Binance) and on-chain decentralized exchange protocols (e.g. Uniswap, Kyber). Data aggregators (e.g. BraveNewCoin, CoinGecko) collect raw data from across these exchanges and refine it, for example by weighting prices by volume, liquidity, and time differences, removing outliers, filtering out fake exchange volume, and monitoring for exchange downtime. The key to a reliable source of price data is full market coverage: each price point should represent a refined aggregate of all trading environments rather than a single exchange or a small group of exchanges. Narrow coverage leaves the price open to manipulation on one venue and to inaccuracies when trading volume shifts between venues.

To ensure a high degree of tamper-resistance and reliability, Chainlink Price Feeds exclusively pull data from premium data aggregators. This means that each data source represents a refined, volume-adjusted price point aggregated from across centralized and decentralized exchanges, making it inherently resistant to attack vectors such as flash loans and to abnormal price deviations on any single exchange.
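The kind of refinement a data aggregator performs can be illustrated with a minimal sketch. This is not the method of any particular aggregator: the `ExchangeQuote` shape, the `refined_price` function, the 5% outlier cutoff, and the sample numbers are all assumptions chosen for illustration. It drops quotes that sit far from the median and then weights the remaining prices by volume.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class ExchangeQuote:
    """One exchange's last price and traded volume (hypothetical shape)."""
    exchange: str
    price: float
    volume: float


def refined_price(quotes: list[ExchangeQuote], max_deviation: float = 0.05) -> float:
    """Volume-weighted average price after discarding outlier quotes.

    A quote counts as an outlier when it differs from the median quote
    by more than `max_deviation` (a fraction, so 0.05 means 5%).
    The cutoff is an illustrative assumption, not a published parameter.
    """
    if not quotes:
        raise ValueError("no quotes available")

    mid = median(q.price for q in quotes)
    kept = [q for q in quotes if abs(q.price - mid) / mid <= max_deviation]

    total_volume = sum(q.volume for q in kept)
    if total_volume == 0:
        raise ValueError("no volume among accepted quotes")

    # Weight each accepted price by the volume traded at that price.
    return sum(q.price * q.volume for q in kept) / total_volume


# Made-up sample data: three liquid venues and one thin, manipulated one.
quotes = [
    ExchangeQuote("exchange_a", 2000.0, 500.0),
    ExchangeQuote("exchange_b", 2002.0, 300.0),
    ExchangeQuote("exchange_c", 1998.0, 200.0),
    ExchangeQuote("thin_market", 2600.0, 1.0),
]
price = refined_price(quotes)
```

In this sample the thin market's quote is filtered out before weighting, so a price spike on one low-volume venue does not move the refined price.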

Node Operator Aggregation

The second component of a Chainlink Price Feed is the on-chain response of each of the individual oracle node operators. These node operators consist of professional DevOps teams who have experience operating mission-critical blockchain infrastructure that already secures billions of dollars in on-chain value. They are responsible for running the Chainlink core software that’s used to source and broadcast external market data on the blockchain.

Node operators within Chainlink Price Feeds source price data from multiple independent data aggregators and take the median (middle) value of the results, which mitigates outliers and API downtime. This means that not only does each individual data source reflect an aggregated price point from all trading environments, but each individual node’s response is itself an aggregate of multiple data sources, further preventing any single source from being a point of failure.

Chainlink node operators take a median value from multiple data aggregators
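The node-level step described above can be sketched as follows. This is a simplified illustration rather than the Chainlink node software: the `node_answer` function, the `min_sources` parameter, and the stub fetchers standing in for aggregator APIs are all hypothetical.

```python
from statistics import median
from typing import Callable, Optional


def node_answer(
    sources: dict[str, Callable[[], float]],
    min_sources: int = 2,
) -> Optional[float]:
    """Query every data source, skip the ones that fail, return the median.

    Returns None when fewer than `min_sources` sources answered, so the
    node never reports a value that rests on too little data.
    """
    prices = []
    for fetch in sources.values():
        try:
            prices.append(fetch())
        except Exception:
            # API downtime or a malformed response: ignore this source.
            continue

    if len(prices) < min_sources:
        return None
    return median(prices)


def offline() -> float:
    """Stub for an aggregator whose API is currently unreachable."""
    raise ConnectionError("API unavailable")


# Stub fetchers with made-up values; a real node would call HTTP APIs.
sources = {
    "aggregator_a": lambda: 2000.1,
    "aggregator_b": lambda: 2000.4,
    "aggregator_c": offline,
    "aggregator_d": lambda: 2150.0,  # one source reports an outlier
}
answer = node_answer(sources)
```

With one source offline and one reporting an outlier, the median of the three remaining values is still the middle quote, so neither failure changes the node's answer much.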

Oracle Network Aggregation

The third component of a Chainlink Price Feed is the oracle network as a whole. An oracle network defines how the collection of nodes works together to create a single reference data point on-chain, which generally involves aggregating the responses of all the individual nodes. The most common form of aggregation is taking the median of the reported values once a predefined number of nodes have responded. Aggregation can take many forms, however, and can be performed either on-chain or off-chain depending on the throughput and cost of the underlying blockchain network.

Chainlink Price Feeds aggregate the responses of numerous security-reviewed node operators and take the median, requiring a predefined minimum number of nodes to respond before an on-chain price update is triggered (e.g. a minimum of 14 out of 21 in the example below). This type of oracle network aggregation ensures that the network as a whole maintains high uptime and resistance to manipulation in its delivery of data on-chain, even in the unlikely scenario that a few nodes or data sources go offline or attempt malicious activity.

The ETH/USD Chainlink Price Feed oracle network
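The network-level rule (take the median once enough nodes have responded) can be sketched in a few lines. This is an illustration under stated assumptions, not the aggregator contract itself: `aggregate_round` and the node names are hypothetical, and the 14-of-21 threshold is taken from the example in the text.

```python
from statistics import median
from typing import Optional

TOTAL_NODES = 21     # size of the example network
MIN_RESPONSES = 14   # responses required before an update is accepted


def aggregate_round(
    responses: dict[str, float],
    min_responses: int = MIN_RESPONSES,
) -> Optional[float]:
    """Return the median of node responses, or None if too few arrived."""
    if len(responses) < min_responses:
        return None  # not enough nodes responded; no price update
    return median(responses.values())


# Made-up round: 15 honest nodes, 3 faulty nodes, 3 of the 21 offline.
responses = {f"honest_{i}": float(2000 + i) for i in range(15)}
responses.update({f"faulty_{i}": 1.0 for i in range(3)})

price = aggregate_round(responses)

# The same round with only 13 responses does not meet the threshold.
too_few = aggregate_round(dict(list(responses.items())[:13]))
```

Because a median can only be pulled outside the range of honest values when at least half of the inputs are bad, the three faulty reports in this sample leave the result inside the range the honest nodes reported.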

By incorporating aggregation at the data source, node operator, and oracle network levels of Chainlink Price Feeds, DeFi applications receive industry-grade security and reliability for the price data they reference when deciding how to manage user funds. It is for this reason that Chainlink Price Feeds have become the most widely used source of secure on-chain price data in the DeFi economy, securing billions in on-chain value.

With security and reliability architected into all layers of the Chainlink Network, dApps consuming Chainlink Price Feeds can rely on their contracts executing as expected, setting a solid foundation from which to scale and secure more value for users.

If you’re a developer and want to quickly connect your application to Chainlink Price Reference Data, visit the developer documentation and join the technical discussion in Discord. If you want to schedule a call to discuss the integration in more depth, reach out here.
