Squid Rate AI

Squid Rate AI is a DeFi analytics tool designed to predict supply and borrow rates on Compound V2, utilizing the Squid SDK and GizaTech's SDK to develop ZKML models.



Description

Rationale   

The decentralized finance (DeFi) sector is transforming traditional financial systems by enabling transparent and efficient financial transactions on the blockchain. Among the prominent protocols, Compound V2 allows users to lend and borrow cryptocurrencies. However, the dynamic interest rates for supply and borrow operations depend heavily on market conditions, which makes predicting them crucial for maximizing returns and minimizing risks. Initially, this project relied on GizaTech's dataset to build predictive models. Now, by leveraging Squid's SDK to gather real-time blockchain transaction data for tokens like WBTC and USDT, the dataset has been significantly enriched, leading to improved prediction performance. This real-time data addresses the need for accurate, timely predictions, giving users a competitive edge in managing DeFi investments.

The Solution   

Squid Rate AI combines advanced machine learning models with enriched blockchain datasets to predict supply and borrow rates on Compound V2. Using GizaTech's Zero-Knowledge Machine Learning (ZKML) framework, the solution enhances privacy while ensuring accurate predictions. By integrating Squid's SDK, Squid Rate AI leverages real-time blockchain data on WBTC and USDT transactions, which enriches the dataset and enables more precise forecasting. These predictions allow users to make informed decisions, optimize their portfolios, and manage risks in the ever-evolving DeFi market. Squid Rate AI offers reliable insights, supporting users with predictive analytics that leverage both historical and real-time data.


Value Proposition   

Squid Rate AI provides DeFi users with accurate predictions of Compound V2 supply and borrow rates by combining the Compound V2 dataset with real-time blockchain data. The use of Squid's SDK to gather token-specific transaction data makes the predictions more relevant and timely. This enriched dataset enhances the precision of the machine learning models, helping users optimize their lending and borrowing strategies. Whether for risk management, portfolio optimization, or algorithmic trading, Squid Rate AI empowers users to stay ahead of market fluctuations and make data-driven decisions that maximize their returns.


Technologies Used   

  • Squid SDK: Enables the collection of real-time blockchain transaction data for tokens like WBTC and USDT, enriching the dataset for better prediction accuracy.
  • GizaTech's ZKML Framework: Ensures secure, privacy-preserving predictions through Zero-Knowledge Machine Learning, which safeguards user data while enhancing model performance.
  • GizaTech's Compound V2 Dataset: Historical supply and borrow rates from the Compound V2 lending protocol, used as the base dataset for model training.
  • PyTorch: Used to train the models that forecast supply and borrow rates from the historical GizaTech dataset and the real-time data extracted with the Squid SDK.


Squid SDK Data Extraction

The following contract addresses were passed to the EvmBatchProcessor constructor to extract data specifically aimed at enriching GizaTech's Compound V2 dataset. Given the volume of the dataset, USDT and WBTC were the primary focus of the data analysis, although other tokens were also considered during the mining process.

export const contractAddresses = {
    Matic: '0x7D1AfA7B718fb893dB30A3aBc0Cfc608AaCfeBB0'.toLowerCase(),
    WBTC: '0x2260FAC5E5542a773Aa44fBCfeDf7C193bc2C599'.toLowerCase(),
    USDT: '0xdAC17F958D2ee523a2206206994597C13D831ec7'.toLowerCase(),
    USDC: '0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48'.toLowerCase(),
    WETH: '0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2'.toLowerCase(),
    DAI: '0x6B175474E89094C44Da98b954EedeAC495271d0F'.toLowerCase()
};


Engineering the Extracted Dataset (WBTC and USDT)

The total amount traded each day for each token was aggregated over the following market activity timeframes to enrich the information that can be extracted from the original Compound V2 dataset; a minimal sketch of this aggregation follows the list below.

  1. Asia-Pacific Market Activity (00:00 UTC - 08:00 UTC): captures trading activity primarily from major financial hubs in the Asia-Pacific region, including Tokyo, Hong Kong, Shanghai, and Australia.
  2. European Market Activity (07:00 UTC - 16:00 UTC): covers trading hours in key European financial centers such as London, Frankfurt, and Paris, where significant market activity occurs.
  3. US Market Activity (13:00 UTC - 21:00 UTC): aligns with the active trading hours in the United States, covering major markets like New York and Chicago.
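
Below is a minimal sketch of this aggregation, assuming the Squid-extracted transfers have been loaded into a pandas DataFrame; the file name and column names (timestamp, amount) are illustrative, not the exact ones used in the repository. Note that the sessions overlap, so a transfer can count toward more than one window.

import pandas as pd

# Illustrative input: one row per transfer with a UTC timestamp and a token amount.
transfers = pd.read_csv("usdt_transfers.csv", parse_dates=["timestamp"])
transfers["date"] = transfers["timestamp"].dt.date
hour = transfers["timestamp"].dt.hour

# Overlapping market-activity windows (UTC).
sessions = {
    "apac_volume": (hour >= 0) & (hour < 8),    # Asia-Pacific: 00:00-08:00 UTC
    "eu_volume":   (hour >= 7) & (hour < 16),   # Europe:       07:00-16:00 UTC
    "us_volume":   (hour >= 13) & (hour < 21),  # US:           13:00-21:00 UTC
}

# Daily traded amount per session; days with no activity in a session become 0.
daily_volume = pd.DataFrame({
    name: transfers.loc[mask].groupby("date")["amount"].sum()
    for name, mask in sessions.items()
}).fillna(0.0)

print(daily_volume.head())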


Data Synchronization

To synchronize the timeframe of Squid's extracted dataset for the two tokens (USDT and WBTC) with that of the Compound V2 dataset, we first extracted the oldest and latest timestamps from the extracted dataset for each token:

Oldest Timestamps: {'USDT data': Timestamp('2017-11-28 15:38:10+0000', tz='UTC'), 'WBTC data': Timestamp('2023-08-26 16:30:59+0000', tz='UTC')}

Latest Timestamps: {'USDT data': Timestamp('2024-08-07 18:11:59+0000', tz='UTC'), 'WBTC data': Timestamp('2024-07-31 05:25:11+0000', tz='UTC')}

We then derived the global oldest and latest timestamps across both tokens, which resulted in the following:

Global Oldest Timestamp: 2023-08-26 16:30:59+00:00
Global Latest Timestamp: 2024-07-31 05:25:11+00:00

Both token datasets must be synchronized to the same timeframe to avoid null values when they are integrated with the Compound V2 dataset. We then use the Global Oldest Timestamp and Global Latest Timestamp to truncate the original Compound V2 dataset to these bounds. The tradeoff is that we lose some of the information in the original dataset's wider timeframe, but in exchange we can focus on a specific window in which to study how both the supply and borrow rates behave.
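
A minimal sketch of this synchronization step is shown below, assuming the extracted transfers and the Compound V2 rates are available as pandas DataFrames with a timestamp column; the file and column names are illustrative assumptions.

import pandas as pd

# Illustrative inputs; file and column names are assumptions, not the repository's exact ones.
usdt = pd.read_csv("usdt_transfers.csv", parse_dates=["timestamp"])
wbtc = pd.read_csv("wbtc_transfers.csv", parse_dates=["timestamp"])
compound = pd.read_csv("compound_v2_rates.csv", parse_dates=["timestamp"])

# The shared window is bounded by the latest of the per-token oldest timestamps
# and the earliest of the per-token latest timestamps.
global_oldest = max(usdt["timestamp"].min(), wbtc["timestamp"].min())
global_latest = min(usdt["timestamp"].max(), wbtc["timestamp"].max())

# Truncate the Compound V2 dataset to the shared window so the merge introduces no nulls.
in_window = (compound["timestamp"] >= global_oldest) & (compound["timestamp"] <= global_latest)
compound_sync = compound.loc[in_window]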


Enriched vs. Non-Enriched Results (Supply Rate Model)

Based on the Jupyter notebook in the code repository, here is a comparison of the supply rate models trained on the enriched Compound V2 dataset and on the non-enriched one.

Original Compound v2 Dataset Results:

Training RMSE: 0.37868252396583557
Training R-squared (R²) score: 0.8565995572197411
Validation RMSE: 0.663108229637146
Validation R-squared (R²) score: 0.7677438088501785
Test RMSE: 1.4575222730636597
Test R-squared (R²) score: -5.9971057203683245

Enriched Compound V2 Dataset (with Squid Extracted Data) Results:

Training RMSE: 0.1319979578256607
Training R-squared (R²) score: 0.9825765381387938
Validation RMSE: 0.5409678220748901
Validation R-squared (R²) score: 0.8454242812656314
Test RMSE: 0.8830604553222656
Test R-squared (R²) score: -1.5684374082940638


Percentage Improvement from Original to Enriched Dataset

  • Training RMSE Improvement: 65.16%
  • Training R² Improvement: 14.71%
  • Validation RMSE Improvement: 18.41%
  • Validation R² Improvement: 10.12%
  • Test RMSE Improvement: 39.40%
  • Test R² Improvement: 73.85%
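
For reference, the figures above can be reproduced from the reported metrics with the usual relative-change formulas: the RMSE improvement is the percentage reduction (lower is better), while the R² change is taken relative to the magnitude of the original score so that negative R² values are handled sensibly. A small sketch:

# Reported metrics (rounded) from the results above: original vs. enriched dataset.
original = {"train_rmse": 0.3787, "val_rmse": 0.6631, "test_rmse": 1.4575,
            "train_r2": 0.8566, "val_r2": 0.7677, "test_r2": -5.9971}
enriched = {"train_rmse": 0.1320, "val_rmse": 0.5410, "test_rmse": 0.8831,
            "train_r2": 0.9826, "val_r2": 0.8454, "test_r2": -1.5684}

for metric in original:
    if "rmse" in metric:
        # RMSE: percentage reduction relative to the original value.
        pct = (original[metric] - enriched[metric]) / original[metric] * 100
    else:
        # R²: change relative to the magnitude of the original score.
        pct = (enriched[metric] - original[metric]) / abs(original[metric]) * 100
    print(f"{metric}: {pct:.2f}% improvement")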


Note: The R² score on the test set may not be ideal, but this is largely due to the data synchronization constraints with the Squid Extracted Dataset, limiting the timeframe we could capture. The primary goal of this analysis was to demonstrate whether integrating Squid's data could enhance the model's predictive performance. The results show that it does, as seen by the significant improvements in both training and validation scores when comparing the original and enriched datasets. This suggests that, in the future, as more data is extracted over time, the model's predictive capabilities can be further enhanced by capturing a broader and more comprehensive timeframe during training.


Giza WorkSpaces Action

Training of Supply Rate Prediction Model Enriched with Extracted Data from Squid

The enriched trained supply rate model (built with PyTorch) is converted to ONNX format. This allows for seamless transpilation of the model into Cairo, the language used for generating proofs of computation for model inferences. The model must be converted and deployed to an endpoint before running inferences using the Giza CLI. The same process will also be applied to the borrow rate model.
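
As an illustration of the conversion step, a PyTorch model can be exported to ONNX as sketched below; the architecture, feature count, and file name are placeholder assumptions rather than the project's actual model, and the transpilation to Cairo and deployment happen afterwards through the Giza CLI.

import torch
import torch.nn as nn

NUM_FEATURES = 8  # assumed feature-vector size; the real value comes from the enriched dataset

# Placeholder architecture standing in for the trained supply rate model.
class SupplyRateModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_FEATURES, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, x):
        return self.net(x)

model = SupplyRateModel()
model.eval()

# Export to ONNX so the model can be transpiled to Cairo and deployed with the Giza CLI.
dummy_input = torch.randn(1, NUM_FEATURES)
torch.onnx.export(model, dummy_input, "supply_rate_model.onnx",
                  input_names=["input"], output_names=["output"])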


The following code snippet shows how we can generate a zero-knowledge proof from the inferences/predictions of our transpiled and deployed model using the GizaModel module from the giza actions library.

prediction_result, proof_id = prediction_model.predict(
    input_feed={"input": data_for_prediction},
    verifiable=True,  # set to True to generate a proof id for the inference
)
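
For context, a minimal sketch of how the deployed model referenced above might be instantiated before calling predict; the import path follows the giza-actions library, and the model and version ids are placeholders produced at deployment time, not the project's actual values.

from giza_actions.model import GizaModel

MODEL_ID = 123   # placeholder; assigned when the model is transpiled and deployed via the Giza CLI
VERSION_ID = 1   # placeholder version id

prediction_model = GizaModel(id=MODEL_ID, version=VERSION_ID)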


We can now verify the proof using the following command on the Giza CLI:

giza verify --proof-id PROOF_ID


Code Repository:

Squid Rate AI Repository

Squid Rate AI Jupyter Notebook


About the Developers

Team Lead

"I am into Robotics, Electronics, Programming, Data Science, Data Analysis, Business Intelligence, Machine Learning, Deep Learning, and ZKML Solutions & Applications"


Recent Awards

  • Connext AI Solutions Hackathon 1st Place (2024) (Developed an AI chatbot using the company's finance and payroll documents.)
  • Gizathon Top AI Action 2nd Place (2024)
  • Starknet Infra Hackathon Overall Best Project (2023)
  • FWD Regional Insurtech Data Hackathon Top 10 Finalist (2022) (Developed an ML model for customers' insurance product preferences.)
  • Build the Future Hackathon 1st Place (2022)
  • Fishackathon (Wild Fisheries) Finalist (2022)
  • Galileo Hackathon Philippines 1st Place (2021)
  • Planetary Health Hackathon 1st Place (2021)
  • Taikai Top 100 Builders (Rank 26)

Developer


I am Mark Lloyd Cuizon, a driven and dedicated tech enthusiast, with a strong foundation in AI, machine learning, data analysis, and software development. My passion lies in applying cutting-edge technology to solve real-world problems, particularly in the fast-evolving world of AI. I thrive in dynamic environments where I can leverage my strong problem-solving skills, keen attention to detail, and adaptability. 


Prominent Hackathon Participations

  • Connext AI Solutions Hackathon Participant (2024): Developed a chatbot based on payroll and finance documents.
  • Starknet Winter Hackathon Participant (2024): Developed Prescription Insight - Utilizing AI to assess substance abuse risk in patients via confidential analysis of demographics, traits, and history, leveraging GizaTech's Orion and AI actions for secure ML models.


