# Tide Performance Specification

## Infrastructure requirements

We need to build an ETL (Extract, Transform, Load) pipeline:

- Extract: multiple data sources
  - on-chain data (transactions + logs)
  - Web2 performance data (potentially via external APIs)
  - Tide Analytics DB (Postgres)
  - Tide backend DB (Postgres)
- Transform: aggregation and filtering layer
  - perform snapshots (`date_trunc` aggregation)
  - joins
- Load: make a real-time cached API available (?)

## Base queries

- total wallets over time
- daily active wallets -> snapshot
- weekly active wallets -> snapshot
- transactions
- wallet retention cohort

## MSP (minimum sellable product)

1. Add a contract (chain? address?).
2. Start fetching **on-chain contract interactions** up to X time ago and store them somewhere (DB? cache?).
3. Set up an automatic cron job to refresh that data every X time (sketched in code after the ShroomSDK example below).
4. The output is our "on-chain" data source, which we can integrate with the others, such as our Analytics DB.
5. If we need more queries or aggregations, we can run them on the data already sitting in our DB instead of refetching everything every time (refetching is slow, hits rate limits, and costs money).
6. Transform/load: try to use ETL tools to build an API. Otherwise, build simple APIs that read from our DB and that our frontend can consume.
7. Add a campaign to track.
   - We need the platform, the start/end dates, and the address/chain of the campaign (if the reward is off-chain, this could be a problem).

### Example using the ShroomSDK API

```sql
select
    date_trunc('day', txs.block_timestamp) as day,
    txs.from_address
from core.fact_transactions as txs
where to_address = '0x1111111254eeb25477b68fb85ed929f73a960582'
order by block_timestamp desc
...
```
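To wire MSP steps 2, 3 and 5 together, a query like the one above can run on a schedule and land its rows in our own Postgres, so later aggregations hit local data instead of refetching. Below is a minimal TypeScript sketch, assuming the `@flipsidecrypto/sdk` (ShroomDK), `node-cron`, and `pg` packages; the `onchain_interactions` table, the env vars, and the exact SDK result shape (`result.records`) are assumptions to validate:

```typescript
// Minimal sketch of MSP steps 2-3: pull contract interactions from
// Flipside on a schedule and upsert them into our Postgres store.
// The table name `onchain_interactions` and env vars are placeholders.
import { Flipside } from "@flipsidecrypto/sdk";
import cron from "node-cron";
import { Pool } from "pg";

const flipside = new Flipside(
  process.env.FLIPSIDE_API_KEY!,
  "https://node-api.flipsidecrypto.com"
);
const pool = new Pool(); // connection settings read from PG* env vars

async function syncInteractions(chain: string, contract: string) {
  // Same shape as the ShroomSDK example above, limited to a recent
  // window so the recurring job only backfills what changed.
  const result = await flipside.query.run({
    sql: `
      select tx_hash, block_timestamp, from_address
      from ${chain}.core.fact_transactions
      where to_address = LOWER('${contract}')
        and block_timestamp >= CURRENT_DATE - interval '1 day'
    `,
    ttlMinutes: 60,
  });

  for (const row of result.records ?? []) {
    // Idempotent upsert keyed on tx_hash, so reruns are safe.
    await pool.query(
      `insert into onchain_interactions (tx_hash, block_timestamp, from_address, contract)
       values ($1, $2, $3, $4)
       on conflict (tx_hash) do nothing`,
      [row.tx_hash, row.block_timestamp, row.from_address, contract]
    );
  }
}

// MSP step 3: refresh "every X time" -- hourly here, as an example.
cron.schedule("0 * * * *", () =>
  syncInteractions("arbitrum", "0xb4315e873dBcf96Ffd0acd8EA43f689D8c20fB30")
);
```

The upsert on `tx_hash` keeps reruns idempotent, so overlapping fetch windows are harmless.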
### Ability to filter data (not in the MSP)

- Ideally, we need to be able to filter the data across many dimensions: time, retention, other on-chain stats (e.g. usage of other protocols), labels, minimum balances, etc.
- Use the already existing analytics indexer to save data about users of the tracked protocol (balances, tokens, labels, etc.).
  - We can use the `internal-ref` field to distinguish these users in our analytics setup (for instance, use the tracked address as the internal ref, or something similar).
- Filter by:
  - users with a net worth > 100 USD
  - users that also use protocol X (advanced)

---

## Flipside Crypto queries / Exploration

### Total wallets over time

```sql
with transactions as (
    select
        -- note: counts transactions per day, not distinct wallets,
        -- so repeat users inflate the running total
        date_trunc('day', block_timestamp) as day,
        count(from_address) as user_count
    from arbitrum.core.fact_transactions
    where to_address = LOWER('0xb4315e873dBcf96Ffd0acd8EA43f689D8c20fB30')
    group by day
)
select
    day,
    sum(user_count) over (
        order by day asc
        rows between unbounded preceding and current row
    ) as total_wallets_over_time
from transactions
order by day desc
limit 7
```

### Daily active wallets

```sql
-- users: last week, hourly snapshots
select
    date_trunc('hour', block_timestamp) as tx_date,
    count(distinct from_address) as users
from arbitrum.core.fact_transactions
where to_address = LOWER('0xb4315e873dBcf96Ffd0acd8EA43f689D8c20fB30')
    and block_timestamp >= CURRENT_DATE - interval '7 day'
group by tx_date
order by tx_date desc
```

### Weekly active users

```sql
-- users: last month, daily snapshots
select
    date_trunc('day', block_timestamp) as tx_date,
    count(distinct from_address) as users
from arbitrum.core.fact_transactions
where to_address = LOWER('0xb4315e873dBcf96Ffd0acd8EA43f689D8c20fB30')
    and block_timestamp >= CURRENT_DATE - interval '30 day'
group by tx_date
order by tx_date desc
```

### Transactions in the past week

```sql
-- transactions: last week, hourly snapshots
select
    date_trunc('hour', block_timestamp) as tx_date,
    count(*) as txs
from arbitrum.core.fact_transactions
where to_address = LOWER('0xb4315e873dBcf96Ffd0acd8EA43f689D8c20fB30')
    and block_timestamp >= CURRENT_DATE - interval '7 day'
group by tx_date
order by tx_date desc
```

### Wallet retention cohort

```sql
with transaction as (
    select from_address, block_timestamp
    from ${chainName}.core.fact_transactions
    where to_address = LOWER('${contractAddress}')
),
-- each wallet's first week of activity (cohorts from the last 11 weeks)
user_first_tx as (
    select
        MIN(date_trunc('week', block_timestamp)::date) as cohort_week,
        from_address as address
    from transaction
    group by address
    having cohort_week >= date_trunc('week', CURRENT_DATE) - interval '77 day'
),
-- weeks (relative to the cohort week) in which each wallet was active
user_activity as (
    select
        from_address as address,
        DATEDIFF('week', uft.cohort_week, date_trunc('week', t.block_timestamp)::date) as week_number
    from transaction t
    join user_first_tx uft on t.from_address = uft.address
    group by 1, 2
    having week_number < 12
),
cohort_size as (
    select uft.cohort_week, count(*) as num_users
    from user_first_tx uft
    group by 1
),
retention_table as (
    select
        uft.cohort_week,
        ua.week_number,
        count(*) as num_users
    from user_activity ua
    join user_first_tx uft on ua.address = uft.address
    group by 1, 2
)
select
    rt.cohort_week,
    cs.num_users as total_users,
    rt.num_users as week_users,
    rt.week_number,
    rt.num_users::float * 100 / cs.num_users as percentage
from retention_table rt
join cohort_size cs on rt.cohort_week = cs.cohort_week
where rt.cohort_week is not null
order by 1, 4
```

### Daily transactions

```sql
SELECT
    tx_hash,
    block_number,
    block_timestamp,
    origin_function_signature,
    from_address,
    to_address,
    input_data
FROM optimism.core.fact_transactions
WHERE to_address = LOWER('0xB0D502E938ed5f4df2E681fE6E419ff29631d62b')
    AND block_timestamp > date_trunc('day', CURRENT_DATE) - interval '1 day'
    AND status = 'SUCCESS'
```
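For MSP step 6, if off-the-shelf ETL tooling does not pan out, snapshots like the ones above translate directly into small read endpoints over our own Postgres copy. A minimal sketch, assuming Express and `pg`, and reusing the hypothetical `onchain_interactions` table from the sync sketch earlier; the route shape is illustrative:

```typescript
// Minimal sketch of MSP step 6: a simple API over our own Postgres copy
// of the on-chain data, mirroring the "Daily active wallets" query above.
import express from "express";
import { Pool } from "pg";

const app = express();
const pool = new Pool(); // connection settings read from PG* env vars

app.get("/api/contracts/:address/active-wallets", async (req, res) => {
  // Hourly distinct-wallet snapshots for the last 7 days,
  // computed on local data instead of refetching from the chain.
  const { rows } = await pool.query(
    `select date_trunc('hour', block_timestamp) as tx_date,
            count(distinct from_address) as users
     from onchain_interactions
     where contract = LOWER($1)
       and block_timestamp >= CURRENT_DATE - interval '7 day'
     group by tx_date
     order by tx_date desc`,
    [req.params.address]
  );
  res.json(rows);
});

app.listen(3000);
```

Since the underlying table only changes when the cron job runs, responses can be cached aggressively (e.g. an in-process TTL cache or a CDN), which covers the "real-time cached API" goal well enough for an MSP.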
## Galxe Queries

Galxe has a useful API with a playground at https://graphigo.prd.galaxy.eco/, which allows us to experiment with GraphQL queries.

### Get NFT Holders

```graphql
query nftHolders {
  campaign(id: "GC1KnUyJof") {
    nftHolderSnapshot {
      holders(block: 28056624, first: 1000, after: "6641147") {
        totalCount
        list {
          id
          holder
        }
      }
    }
  }
}
```

Here, `block` is the block number on the campaign's chain, `first` is the number of results to return (capped at 1000), and `after` is a `tokenId` of the NFT collection, used as a pagination cursor.

### Dox any Galxe participant

```graphql
query ciao {
  campaign(id: "GCzVZUmiqX") {
    id
    numberID
    numNFTMinted
    participants @skip(if: false) {
      participants(first: 1000, after: "-1", download: false) {
        list {
          id
          address
          email
          solanaAddress
          aptosAddress
          discordUserID
          twitterUserID
          twitterUserName
          accessToken
          __typename
        }
      }
      participantsCount
    }
  }
}
```

Here, `after` is just a progressive result offset: `after: "0"` skips the first result, and so on. Try this on the playground to see how much information you can get.

## Campaign Stats

This query returns the following example data:

![](https://hackmd.io/_uploads/HkVALFkSh.png)

```sql
WITH transactions AS (
    SELECT
        tx_hash,
        from_address,
        block_timestamp,
        -- bucket each transaction relative to the campaign window
        -- (hard-coded here: campaign ran from 30 to 15 days ago)
        (
            case
                when block_timestamp <= CURRENT_DATE - interval '30 day'
                    then 'pre_campaign'
                when block_timestamp > CURRENT_DATE - interval '30 day'
                    and block_timestamp <= CURRENT_DATE - interval '15 day'
                    then 'mid_campaign'
                when block_timestamp > CURRENT_DATE - interval '15 day'
                    then 'post_campaign'
            end
        ) as timeframe
    FROM arbitrum.core.fact_transactions
    WHERE to_address = LOWER('0x53Bf833A5d6c4ddA888F69c22C88C9f356a41614')
),
badge_txs as (
    select tx_hash, from_address, block_timestamp, timeframe
    from transactions
    where from_address IN (
        -- here we should add the list of badge-holder addresses
        '0x93aaa973b33b9564b54e0480b4200799eac171cd',
        '0x60ed00da69f9f66803e7f2c99e414d3b138be0d6'
    )
)
SELECT
    tx.timeframe,
    count(DISTINCT tx.from_address) as users_count,
    count(distinct btx.from_address) as badge_count,
    count(distinct tx.tx_hash) as txs,
    count(distinct btx.tx_hash) as holders_txs,
    holders_txs * 100 / txs as perc
from transactions tx
join badge_txs btx on tx.timeframe = btx.timeframe
group by tx.timeframe
```

This instead is the same data, but aggregated by day:

```sql
WITH transactions AS (
    SELECT
        tx_hash,
        from_address,
        block_timestamp,
        date_trunc('day', block_timestamp) as day
    FROM arbitrum.core.fact_transactions
    WHERE to_address = LOWER('0x53Bf833A5d6c4ddA888F69c22C88C9f356a41614')
),
badge_txs as (
    SELECT tx_hash, from_address, block_timestamp, day
    FROM transactions
    WHERE from_address IN (
        -- here we should add the list of badge-holder addresses
        '0x93aaa973b33b9564b54e0480b4200799eac171cd',
        '0x60ed00da69f9f66803e7f2c99e414d3b138be0d6'
    )
)
SELECT
    tx.day,
    count(DISTINCT tx.from_address) as users_count,
    count(distinct btx.from_address) as badge_count,
    count(distinct tx.tx_hash) as txs,
    count(distinct btx.tx_hash) as holders_txs
FROM transactions tx
LEFT JOIN badge_txs btx on tx.day = btx.day
group by tx.day
ORDER BY day asc
```
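The badge-holder address lists hard-coded in the two queries above could come straight from the Galxe API described earlier. A minimal TypeScript sketch, assuming the `https://graphigo.prd.galaxy.eco/query` endpoint accepts plain POSTs like the playground does (auth requirements unverified) and reusing the `nftHolders` query; the `after: "0"` starting cursor is an assumption, and full pagination is left out:

```typescript
// Minimal sketch: pull a campaign's NFT holders from the Galxe GraphQL API
// (same query as "Get NFT Holders" above) to build the address list that
// the Campaign Stats queries expect in `badge_txs`. Uses Node 18+ fetch.
const GALXE_ENDPOINT = "https://graphigo.prd.galaxy.eco/query";

async function fetchNftHolders(
  campaignId: string,
  block: number
): Promise<string[]> {
  const query = `query nftHolders {
    campaign(id: "${campaignId}") {
      nftHolderSnapshot {
        holders(block: ${block}, first: 1000, after: "0") {
          totalCount
          list { id holder }
        }
      }
    }
  }`;

  const res = await fetch(GALXE_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  const { data } = await res.json();

  // Lowercase to match Flipside's lowercased address columns.
  return data.campaign.nftHolderSnapshot.holders.list.map(
    (h: { holder: string }) => h.holder.toLowerCase()
  );
}

// The result can be interpolated into the IN (...) list of the
// Campaign Stats queries above (or, better, loaded into a temp table).
fetchNftHolders("GC1KnUyJof", 28056624).then((addresses) =>
  console.log(addresses.length, "badge holders")
);
```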