Crypto 101 with Vin: Data Oracles

Feb 23, 2022

To build truly useful applications and services, more often than not there will be a need to get data from an external source outside the blockchain or to send data to some external recipient. In essence, data oracles allow individuals, autonomous or otherwise, to interact with data from systems outside their own.

Some Real World examples…

A good analogy is payment integrations in software. A lot of companies need ways to send requests for payment and receive the payment. They typically won’t have the resources to build out their own solution to get payment from banks so they use some kind of third party between themselves and the banks they need payment from.

In this analogy, the company provides a product or service and needs data to see whether a payment can be processed, which makes it a data consumer. The payment gateway is an oracle that sits between the company and the customer’s payment processor (PayPal, Visa, etc.), which acts as the data source.

Now let’s step away from payments, since the blockchain can already take care of that. Another system of data we can look at is the US National Weather Service, which can be considered a data oracle. It collects weather data from different types of sensors and tools placed across the United States and deposits it into a huge repository. Those sensors and measurement tools are analogous to the data sources, and the repository itself is just a database. Weather services like AccuWeather, The Weather Channel, or your local news service need accurate real-time data to make their forecasts and predictions, so they are the data consumers. The data repository maintained by the Weather Service is what they draw on.

Oracles in Crypto

What Do Oracles Do?

They give smart contracts the capability to communicate with data sources not available on a blockchain. In short, smart contracts are pieces of code on the blockchain that can do a variety of tasks given the correct conditions and parameters. Some of these tasks depend on data that is not directly visible on a blockchain.

An Example: Algorithmic Stable Coins

In decentralized finance there is a need for crypto-assets that are pegged 1-to-1 to certain real-world assets like USD. One way to achieve this stable price is to use a reserve/collateral that backs each stable coin with 1 USD. Algorithmic stable coins do not do that. They rely on external price feeds as an input to an algorithm that determines how to contract or expand the supply of the stable coin so that its price can actually remain at $1. I won’t explain the math behind that.
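Without going into the math any real protocol uses, a toy sketch can show where the oracle's price feed enters the picture. Everything here is an assumption made for illustration: the function name adjust_supply, the damping parameter, and the simple proportional rule are hypothetical and do not describe any specific stable coin.

```python
TARGET_PRICE = 1.00  # the peg, in USD


def adjust_supply(current_supply: float, oracle_price: float, damping: float = 1.0) -> float:
    """Return a new total supply given the price reported by an oracle.

    Hypothetical rule: scale supply in proportion to how far the
    reported price sits from the peg. Real protocols are more involved.
    """
    if oracle_price <= 0:
        raise ValueError("oracle price must be positive")
    deviation = (oracle_price - TARGET_PRICE) / TARGET_PRICE
    return current_supply * (1 + damping * deviation)


# Price above the peg: supply expands, pushing the price back down.
print(adjust_supply(1_000_000, 1.05))
# Price below the peg: supply contracts, pushing the price back up.
print(adjust_supply(1_000_000, 0.95))
```

The point of the sketch is that the algorithm is only as good as oracle_price. If the feed is wrong, the supply moves in the wrong direction.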

Use Cases:

Algorithmic stable coins
On-Chain Insurance Policy issuance & Evaluation
Decentralized Exchange rates
Collateralized loans
Synthetic Assets
Randomness in Games and NFT minting

There are more ways to use oracles than listed here, but this is just to illustrate how widely used they are.

How Do Oracles Work?

The following is a high-level overview of a smart contract requesting data.

From Chainlink whitepaper: “A hybrid smart contract (3) consists of two complementary components: an on-chain component SC (1) , resident on a blockchain, and an off-chain component exec (2) that executes on a DON.”

Smart contracts need to make a request to an oracle.
The request isn’t sent directly to the oracle, but rather to a protocol that decides which specific oracles are needed.
Once the oracles are selected, they process the data and report it back to another system that validates the data.
Validation includes ensuring that the data is accurate and untampered with, by comparing it against data from multiple sources if possible.
Once the data is validated, it gets sent back to the blockchain where the request originated.
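The steps above can be sketched in a few lines of Python. This is a simplified simulation, not Chainlink's actual protocol: the registry of oracles, the alphabetical selection rule, the 2% tolerance, and the function names select_oracles, validate, and fulfil_request are all assumptions made up for the example.

```python
from statistics import median
from typing import Callable, Dict, List

# Hypothetical oracle: just a function that returns a reported price.
Oracle = Callable[[], float]


def select_oracles(registry: Dict[str, Oracle], needed: int) -> List[Oracle]:
    """The protocol, not the contract, decides which oracles answer."""
    if len(registry) < needed:
        raise ValueError("not enough oracles registered")
    # Assumed selection rule: first `needed` oracles by name.
    return [registry[name] for name in sorted(registry)[:needed]]


def validate(reports: List[float], tolerance: float = 0.02) -> float:
    """Compare reports against each other and keep those near the median."""
    mid = median(reports)
    agreeing = [r for r in reports if abs(r - mid) <= tolerance * abs(mid)]
    # Require a strict majority to agree before trusting the result.
    if len(agreeing) * 2 <= len(reports):
        raise ValueError("oracles disagree too much")
    return median(agreeing)


def fulfil_request(registry: Dict[str, Oracle], needed: int = 3) -> float:
    """Select oracles, collect their reports, and return the validated value."""
    oracles = select_oracles(registry, needed)
    reports = [oracle() for oracle in oracles]
    return validate(reports)


registry = {
    "oracle_a": lambda: 1.001,
    "oracle_b": lambda: 0.999,
    "oracle_c": lambda: 1.000,
    "oracle_d": lambda: 1.500,  # an outlier that validation should ignore
}

print(fulfil_request(registry, needed=4))
```

Taking the median of the agreeing reports means a single bad oracle, like oracle_d here, cannot move the value that is sent back to the contract.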

Types of Oracles

This isn’t an exhaustive list of properties that oracles have, but it covers the overarching properties that can generally describe one.

Software vs Hardware: Is the data a result of online sources, databases, and servers, or the result of devices that relay information about the real world?

Inbound vs Outbound: Is data being brought onto the blockchain or sent off the blockchain?

Centralized vs Decentralized: Are there few sources of data or many?

The Oracle Dilemma

By now you should have a general understanding of the why, what, and how of oracles. As with all things, nothing is without its issues, so some extra considerations have to be made.

How can we be sure an oracle is providing truthful data?

There isn’t an easy way to answer this without first knowing some of an oracle’s history. We’ll explore a solution that uses reputation as an indicator.

An oracle network can assign each oracle a reputation value from 1 to 10, with 1 being the least reliable and 10 being the most reliable. Let’s say all oracles get a reputation of 5 when registering with the network. As time goes on and an oracle consistently provides accurate data when compared against other sources, its reputation goes up. The reputation value can then be used in deciding whether data from an oracle can be trusted even if it differs slightly from other sources.

The reverse can also happen. If an oracle consistently returns inaccurate data, its reputation can drop, and if it gets sufficiently bad, the oracle may even get dropped from the network. The more often an oracle gets called upon, the more it can get paid, which incentivizes it to keep providing accurate data and keep its reputation up.
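Here is a minimal sketch of that reputation idea. The 1 to 10 scale and the starting value of 5 come from the description above; the step size of 0.5, the 1% accuracy tolerance, the drop threshold of 2, and the names OracleRecord and score_report are assumptions picked only to make the example concrete.

```python
from dataclasses import dataclass

MIN_REP, MAX_REP, DEFAULT_REP = 1, 10, 5
DROP_BELOW = 2    # hypothetical: below this, the oracle is removed
TOLERANCE = 0.01  # hypothetical: within 1% of consensus counts as accurate


@dataclass
class OracleRecord:
    name: str
    reputation: float = DEFAULT_REP
    active: bool = True


def score_report(record: OracleRecord, reported: float, consensus: float,
                 step: float = 0.5) -> None:
    """Raise or lower reputation depending on how close a report was
    to the value the other sources agreed on."""
    accurate = abs(reported - consensus) <= TOLERANCE * abs(consensus)
    change = step if accurate else -step
    # Keep the score inside the 1 to 10 scale.
    record.reputation = max(MIN_REP, min(MAX_REP, record.reputation + change))
    if record.reputation < DROP_BELOW:
        record.active = False  # dropped from the network


honest = OracleRecord("honest")
sloppy = OracleRecord("sloppy")
for _ in range(7):
    score_report(honest, reported=100.0, consensus=100.0)
    score_report(sloppy, reported=120.0, consensus=100.0)

print(honest)
print(sloppy)
```

A network could then prefer oracles with a higher reputation when selecting who answers a request, which ties payment to a history of accurate reports.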

What if the data is centralized?

This is an issue because there would be very few data sources to compare against, and the reputation idea we used earlier becomes moot. The oracle becomes a point of contention since it hands a disproportionate amount of power to whoever maintains the centralized data source. The use of centralized data becomes unavoidable when looking at certain real-world data sets.

Social trend analytics is a very good example of this issue at the moment. Let’s say you are building a decentralized service that helps identify emerging social trends. Because most social media is maintained by companies in 2022, you would have to rely on media like Twitter, Instagram, or Google search analytics. Because a few companies have access to a majority of the data, they become a bottleneck. They have the ability to censor certain data on their platforms, making it hard to verify against other sources.

You can try to build decentralized social media and use that, but it’s not as easy for data that would be hard to collect without an initiative funded by a government or the like, such as in the weather example we used. All we can do is trust that data, unless we can afford to and are willing to collect and share the data ourselves.

In conclusion,

This problem falls into the same category as the blockchain trilemma and future innovation will continually try to increase the availability of trustworthy data. As more services and applications turn towards smart contracts, data oracles will inevitably see more usage. Blockchain tech absolutely needs reliable oracles to thrive.

Thanks for reading and check in next time.

Kakavarna.eth — Automation Engineer(9–5), crypto investor, and gamer with opinions