Note: This is a community-generated transcript. Some parts of the text have been rephrased for better context. Full recording here: https://www.youtube.com/watch?v=4qj23DS55zU

---

![frok ascii](https://hackmd.io/_uploads/B1rbeDE4xe.jpg)

*Cogsy: Welcome to The Llamas Graze with Knows from Frok. This has been a long time coming. I think the last time we talked was about six to eight months ago, and it was a marathon two-and-a-half to three-hour session. I didn't even bother editing it; I just threw the whole thing up on YouTube. This time, we've decided to break it up into several episodes. Hopefully, in the next week or two, we'll do a live test of the swapping mechanism, and I'll record that to use for promo clips on YouTube.*

*First, welcome, and thank you for coming back. I want to start by giving you a chance to tell us about the Frok project, as many people are new to TheLLamas.io and Frok.com. What is Frok, before we dive into the changes since we last spoke?*

# **What is Frok?**

It's been close to a year since I was last on "The Graze"—not six or eight months, more like 11 months, give or take. A lot has changed since then. I'll try to keep it brief because there's a long history to what Frok was before, but I'll cut right to what it is today. I see a few existing FROK holders here, but also some new faces.

Essentially, to summarize it in two sentences, Frok aims to deliver **variable levels of intelligence at a variable price**. To give you an idea of what that means, imagine you have a really tough problem—say, a huge, complex codebase with math involved. You've been fighting with Cursor agents and different models for two or three hours; you're exhausted, and maybe there are time constraints. Or perhaps you need to write a really good essay for your university.

The "variable levels" from the user's perspective mean something like this: You have $5 worth of FROK tokens. You can say to Frok: "Here are 10 different documents—code files or context from a university project—and you need to do this and this. Here's five bucks." The moment you send it all to Frok, you're effectively renting **multi-million-dollar infrastructure for a minute or two**. On those GPUs, various AI models are running, trying to achieve the best quality to solve your problem.

# **Frok's Differentiator: Pay-for-Premium Inference**

*Cogsy: Compare that to currently popular AI models and services like Claude and ChatGPT: they have a flat pricing model where you pay a subscription fee and never get to bump the quality. You have no control over how much inference (computing power) they push toward your answer; you're stuck in their pipeline, and they have to load-balance it on their side (to serve other users' inference requests at the same time). What Frok is going to do is allow you to **pay for premium inference**.*

I'll get into a few other details, but when it comes to consumer products, a $20 subscription, or even ChatGPT Pro at $200 a month, gives you no opportunity to decide how much computing power should be used or how much intelligence should be applied to a particular problem. You can give the model a hint and say, "Don't overcomplicate this; it's a simple task, so you don't have to think for two minutes about it," or flag that it's a really hard one. That is a passive way to control it, to some extent.
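To make the "variable intelligence at a variable price" idea concrete, here is a minimal sketch of what such a request could look like. Everything in it is hypothetical—the endpoint, the field names, the budget parameter are all invented for illustration; Frok has not published this interface.

```typescript
// Hypothetical sketch only: Frok's real API is not public, so every name
// below (endpoint, fields) is invented for illustration.
interface FrokJob {
  instructions: string; // "you need to do this and this"
  documents: string[];  // e.g., 10 code files or university-project context
  budgetFrok: number;   // how much FROK to spend on this one problem
}

async function solveWithBudget(job: FrokJob): Promise<string> {
  // The budget, not a fixed subscription tier, decides how much compute
  // (batch width, number of refinement passes) is applied to the problem.
  const res = await fetch("https://api.frok.example/v1/solve", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(job),
  });
  const { answer } = await res.json();
  return answer;
}
```

The point of the sketch is the `budgetFrok` field: the same request with `budgetFrok: 0.5` or `budgetFrok: 5` would buy very different amounts of compute.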
There are also the API services, which operate on a different model; the consumer chat product isn't the same offering as what's sold through the API. From this perspective, you could say, "Alright, I'll build a mini-agent for my particular problem that uses their API to chain prompts and orchestrate different steps, say, in the process of writing an essay or solving a particular bug." You could do that, but it would still cost you comparatively a lot of money for what you get.

The key factor that distinguishes Frok is this: **If you have a hard problem, all those hundreds or thousands of steps are done in one place**. It's not going back and forth like with Cursor. With Cursor, you have your codebase, or you can even write essays in Cursor — I do a lot of writing there — and the agent goes back and forth, edits a few files, updates are reflected in your interface, and then some stuff is sent again to Cursor's servers, from there to Anthropic, Google, or OpenAI, and it goes back and forth, back and forth. Sometimes these agents run for five to ten minutes.

# **Faster And Better Output Quality**

*Cogsy: One of the things you said earlier, regarding load balancing, was almost like decentralized load balancing, where you pay for what you want to use at that time and what you need, without doing 75 iterations on the same chatbot hoping for the answer you're looking for. You mentioned an Ethereum-type fix with a gas market, where you pay as you need, or the "tip" concept where you bump up your gas to put yourself first in the pool.*

We'll get into that in a moment. I just want to make sure the baseline idea is understood. As I said, you have this very complex problem, and the distinction (when using Frok compared to other solutions) is this: When you send a batch of files and instructions to the model, if there are 100 or 1,000 steps that need to be done because it's a hard problem, all those steps and iterations happen **at the same location where the inference is also happening**. Doing it this way not only saves latency between, say, Europe and the US (where the servers are) and avoids the network overhead of going back and forth; it also lets you do batch inference on GPUs. You can produce 100 answers to the same question at roughly the same cost. It doesn't matter much to the GPU whether you feed it one prompt and take one answer or run a batch; generating 100 outputs might cost maybe twice as much in electricity and time as generating one. **You can compress hundreds to thousands of steps very densely in one place and process it all there!**

The levels of quality you can achieve if you do that surpass everything. Take the best models—Gemini 2.5 Pro, DeepSeek, whatever. With a single send-something-out, get-something-back exchange, you can't reach that level of quality, and throwing $5 of compute at one prompt isn't something those services normally let you do. If you prompt once through a pro subscription (e.g., Gemini / Claude models), the maximum they spend on your problem is maybe 5 to 10 cents in compute. The quality boost doesn't scale linearly with how much you're paying, but a slight difference in intelligence level can make a huge difference for the problem you're working on, and also in the intermediate steps.

To be fair to Anthropic/OpenAI/Google, we are not saying, "Look, we're beating all the benchmarks, outperforming the best models from Anthropic, OpenAI, and whatnot for a dollar." They are very likely more cost-efficient in absolute terms: compare putting a human to work on a problem for one hour with Opus 4 against paying $10 for Frok to deliver the result in two minutes. In other words: you may get more bang for your buck with Gemini/Claude/ChatGPT, and Frok may cost you more in input/output token costs. But Frok can give you better output quality, since you as the user decide how much compute firepower you want from the model, and you get the result at a much faster token-per-second speed.
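To ground the batch-inference point from above: many OpenAI-compatible inference servers expose an `n` parameter that returns several completions for a single prompt, so the expensive prompt processing is shared across all samples. A minimal sketch, where the endpoint and model name are placeholders and not Frok's actual service:

```typescript
// Batched sampling: one question, many candidate answers in a single
// GPU-resident call. Sketch against a generic OpenAI-compatible endpoint;
// the URL and model name are placeholders.
async function sampleCandidates(prompt: string, n = 100): Promise<string[]> {
  const res = await fetch("https://api.example.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.API_KEY}`,
    },
    body: JSON.stringify({
      model: "some-open-model",
      messages: [{ role: "user", content: prompt }],
      n,                // all n samples share one prompt prefill on the GPU,
                        // so the marginal cost per extra answer is small
      temperature: 0.9, // encourage diversity across the candidates
    }),
  });
  const data = await res.json();
  return data.choices.map((c: { message: { content: string } }) => c.message.content);
}
```

A pipeline can then rank or merge those 100 candidates and keep only the best one, which is exactly the kind of step that is cheap when everything runs in one place.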
*Cogsy: It's a unique fix to be able to throw that kind of money at the GPU or compute space and say, "Instead of giving you a nickel, we're going to give you $5... go solve this." Of course, you're going to blow everyone out of the water, and that's the user's choice to spend that.*

Absolutely.

# **Improved User Experience**

For those who use Cursor, Frok also solves a UX issue: you have three, four, or five different LLM models, you have to test them over a few days to figure out what each one does best, and then you switch back and forth as you progress. The ideal UX would be a single slider where you decide the level of output quality you desire. Maybe you use Frok for a week and get a feel for it: a 2x factor on the slider gives you one level of quality, a 100x factor gives you another. Then you can configure, say, three or four different buttons or levels, and as you chat, whatever your problem is, you just press different buttons, and different costs and levels of intelligence will be applied.

# **The Case For Pricing Inference Using the Ethereum Gas Model**

Now, to touch back on what you said about gas and transition to the economics part: building something like this, or offering it, is a challenge that already exists for Anthropic and OpenAI at this very moment. Anthropic has the Opus 4 model, and it may sound crazy, but if you use the Anthropic models while the USA is asleep, they perform much better. The token-per-second speed is faster, and if you really pay attention, the quality is also a little bit better. What you can derive from that is the struggle they have: a ton of customers want to use the models, but they have to distribute the load in real time while keeping a fixed price. It's tough to predict. For some users, faster token-per-second speeds or better quality would be worth paying more; for others, it would be too expensive. So they have to pinpoint one price, and that's one problem. The other problem is distributing the compute load across various data centers, and across the GPU clusters within a data center. There's a lot of complexity that goes into that while also managing pricing.

You have the same issue not just for the providers offering this inference, but also for the builders using those APIs to build products on top of it (pick any AI agent out there in the crypto space). The provider has to price the API usage somehow and apply different levels of intelligence by, for example, deciding how many chained prompts to run when a user asks a crypto AI agent: "I want to stake some USDC tokens in an Aave pool." How many steps do you need? Do you spend three prompts and risk the agent failing to figure out how to do it... or do you try ten prompts, at which point the latency degrades to 20 seconds, which is bad? So the provider has to balance a lot of things!

# **Elastic Inference, Pay-Per-Use Model**

So, for us, we're turning this problem around a bit by taking an example from Ethereum. Ethereum also has to balance very limited block space: the target is 15 million gas per block, the level the chain should ideally sustain, and there's a hard upper limit of 30 million gas per block. The longer blocks stay above the 15-million-gas target, the higher the fees climb. The idea is that once fees go up, people use the chain less, and the load falls back to the optimal target.
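For reference, this is (in simplified form) the EIP-1559 base-fee rule Ethereum actually uses. Applying the same feedback loop to a compute pool, with prompts standing in for transactions, is the analogy being drawn here, not a published Frok spec.

```typescript
// Simplified EIP-1559 base-fee update. Blocks fuller than the gas target push
// the base fee up (at most +12.5% per block); emptier blocks push it down.
// On Ethereum, the base fee itself is burned.
const ADJUSTMENT_DENOMINATOR = 8n; // 1/8 = 12.5% maximum change per block

function nextBaseFee(baseFee: bigint, gasUsed: bigint, gasTarget: bigint): bigint {
  if (gasUsed === gasTarget) return baseFee;
  if (gasUsed > gasTarget) {
    const delta = (baseFee * (gasUsed - gasTarget)) / gasTarget / ADJUSTMENT_DENOMINATOR;
    return baseFee + (delta > 0n ? delta : 1n); // always rises by at least 1 wei
  }
  const delta = (baseFee * (gasTarget - gasUsed)) / gasTarget / ADJUSTMENT_DENOMINATOR;
  return baseFee - delta;
}

// Example: a full 30M-gas block against the 15M target raises the fee by 12.5%.
// nextBaseFee(10_000_000_000n, 30_000_000n, 15_000_000n) === 11_250_000_000n
```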
That way, everybody at any point has the chance to send in a transaction and get it processed, albeit at a variable price. One minute it could cost 5 cents to swap on mainnet; then there's a burst of volatility, MEV bots doing arbitrage, a flood of transactions, and the fees go up. Applying the earlier Aave staking example to this temporary high-gas scenario: it's not that you're left unable to unstake your USDC tokens and sell them. It would just cost, say, $2 instead of 10 cents during that volatile period. This is one perk of the balancing mechanism Ethereum has, which we are trying to apply here.

The other perk is the same as in the Ethereum case. Say you bought $500 worth of $FROK tokens for inference usage, but then some power users come along with a lot more $FROK than you, willing to pay a lot of money to solve their own complex problems at a 200x factor on the slider, and they end up occupying, say, 60-70% of the compute capacity on the backend. For all the other users like yourself who are less willing to pay this temporary increase in real-time inference cost, that moment is an inconvenience (unless you're willing to pay, say, 10x the price), which could make the model look bad. But look at Ethereum during periods of high gas:

- The gas fee paid in $ETH gets burned, reducing the $ETH supply (deflationary).
- Real USD went into purchasing that $ETH (supporting $ETH price appreciation), so in some sense every $ETH holder benefits from the fact that somebody is consuming a lot of gas.

In the same way, if somebody fires off $100 prompts every minute or so and the gas fees in $FROK go up, you are out-priced in that moment, but you know that $100 of $FROK is being 'burned' every minute (non-inflationary).

*Cogsy: As a FROK holder, you get the benefit of price appreciation from the 'burn' reducing the circulating supply. So it might be an inconvenience for you for two minutes not to be able to run a prompt affordably, but if you're a $FROK holder, you're happy because $FROK tokens are being 'burned'.*

Yes, the tokens are being 'burned', and you also know that ETH went into the ETH-FROK LP pool to purchase those $FROK tokens. That way, things stay balanced, and we also make sure that everybody can use it at any point in time (if they're willing to pay the higher fee). As we mentioned, there's currently no service on the web that lets you rent, say, 64 GPUs for 2 minutes to solve your problem. It doesn't exist, except for ours.

*Cogsy: The other thing we're going to see over the next six to eight months is that the service we get from ChatGPT's load balancing will get even worse as more people start using LLMs. We'll see even worse degradation of service, as you say. So it's even more important to do this now, before we hit mass adoption of chatbot-type things; it's going to get worse before it gets better.*

Absolutely. And I assume they might already be working on similar variable pricing. There's nothing that prevents them from doing so.
It's just that if they implement something like this, and you're out-priced at some point because users are overloading the models and prices go up, you're not compensated for it. Their inference is payable only in USD, and during periods of high usage they can't exactly reward you in US dollars for being out-priced. Either they go down the Web3 route and use the same model (offering their own gas token similar to ETH or FROK), which I think is very unlikely, or they implement a suboptimal version of this variable-price compute.

I also doubt there will ever be an opportunity there to tap in and shoot out those $5 to $10 prompts. It's a niche I think they cannot afford, and not just because of the suboptimality of traditional finance rails. Even if you don't deposit a thousand bucks up front, but instead say, "Okay, Anthropic may deduct $50 as soon as my balance hits five bucks," each such charge to your credit card costs a minimum of 20 to 50 cents, plus a percentage that goes to Visa, Mastercard, the banks, and the clearing facilities—all the parties involved—which cuts into the profit margin.

*Cogsy: There's already so much slippage in the way they currently have the pricing set up. If there were a panic or an emergency credit card charge, they wouldn't be able to batch it the way they batch all their regular subscription charges for a discount; they'd be eating 3% to 6% on a one-off card charge. It doesn't work that way. OpenAI's business model is not set up for surge pricing, whereas Frok is built from the beginning to address this. Web3 (crypto) has a much better way to solve this, and that's what you're coming up with.*

# **Frok Swap: A Crucial Onboarding Tool**

*Cogsy: Before we talk about Frok Swap, did you want to discuss anything more about tokenomics?*

I think for today, we've covered it sufficiently. Perhaps in the next episode, we can dive deeper. As I said, this variable gas is very similar to Ethereum; just think of every transaction as a prompt, and of block space as the capacity of the compute pool in the background.

*Cogsy: It's a really cool way to lay it out and describe it, and it's a very unique fix for something we need. So, what we've been seeing most recently from you in chat has been all about Frok Swap. Can you talk about why Swap was needed to help Frok launch?*

Yes, you could say it was somewhat of a detour over the last few months, but by now, it's 100% a worthwhile detour. Building a system application, consumer dApp, whatever you call it, is one part; the other part is getting people to use it—going onto the page, connecting their wallet, purchasing some tokens. That step is just as important as solving the problems I elaborated on over the last few minutes. From a builder's perspective, it shouldn't be ignored. A lot of builders out there focus on their products, but they should dedicate at least the same level of attention to how their products get adopted, by putting themselves in the shoes of a new user who has just discovered the protocol. In a lot of cases, even in recent releases of dApps, the experience is: you go onto the site, and there's a registration button, and very often there isn't even a "connect your wallet"; it's "email and password."
These protocols label themselves as Web3 and dApps, but they have nothing to do with Web3 at all, except maybe that you can buy the project's token, which offers no utility; it just exists. So you're bagholding a token, and adjacent to that, there's a product where you use an email and password to access a chatbot. It's really weird. In other cases, you have a button that points to Uniswap, which, for a non-Web3 native, is definitely confusing.

*Cogsy: Yeah, they're going to send me to an external site to go buy a token that I don't have any idea what to do with or where it even is.*

Yes, it's confusing. Perhaps I can paint a short picture of what I think should happen in the ideal dApp. You go into the dApp, and you can connect any type of wallet. As soon as you connect, you have a complete overview of all your holdings across all the different chains, and you see all your tokens and their respective prices. If using a token is a central part of the dApp's product, then the dApp has to lay out all the rails for the customer to get started. So for Frok, this would be:

1. Connect your wallet >> you see your balances;
2. You pick a source token like ETH or USDC;
3. You pick a destination token, e.g. $FROK >> within two seconds you get quotes from different aggregators, with the best quote already automatically selected (maybe with some pre-selection button for $20);
4. You click on the quote you want;
5. You click on a button to sign a single message which says, "Yeah, I'm accepting the terms and conditions, I'm ready to spend that amount of ETH for that amount of $FROK tokens, and I'm also putting them into the escrow contract."

**This signature (step 5) also logs you in** (see the sketch below). So you sign ONE signature, and as soon as you click "Sign TX", then "Confirm", basically two or three seconds go by. Within those three seconds:

- The dApp view for the Chat UI is loaded;
- The transaction is mined;
- Your entire portfolio (including your interests) is analyzed; e.g., if you hold CRV/CVX tokens, Frok will show you four example questions related to Curve and Convex on the next view (things you might want to ask Frok).

All of this happens within three seconds, and then, boom, you see the dApp's UI with the example questions, and you can get started right away.
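Here is a minimal sketch of what that single login-plus-authorization signature could look like, using viem's standard `signMessage` (EIP-191 personal sign). The message format and the escrow wording are invented for illustration; Frok's actual contract and message have not been published.

```typescript
import { createWalletClient, custom } from "viem";
import { base } from "viem/chains";

// Sketch of step 5: one signature that both authorizes the swap terms and
// logs the user in. The message text below is invented for illustration.
async function signTermsAndLogin(
  account: `0x${string}`,
  quote: { spendEth: string; receiveFrok: string }
) {
  const wallet = createWalletClient({
    account,
    chain: base,
    transport: custom((window as any).ethereum),
  });

  const message =
    `I accept the terms and conditions.\n` +
    `Spend ${quote.spendEth} ETH for ${quote.receiveFrok} FROK (into escrow).\n` +
    `Issued at: ${new Date().toISOString()}`;

  // EIP-191 personal sign: the backend can recover the signer's address from
  // this signature, so it doubles as a login credential; no email or password.
  return wallet.signMessage({ account, message });
}
```

One signature, one transaction, and the session is authenticated against the wallet address rather than an account database.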
People say crypto is complex, but if you just look at the number of clicks and the time that passes, it's 100% more efficient and convenient than, say, OpenAI. You know: sign in, sign up with Google, then your data, then confirm, then next, email verification, then payment.

*Cogsy: You have to confirm it, get your phone, try not to get SIM-swapped in the process. So, I think what you just said is that Frok Swap is a necessary tool for the onboarding process. It's an onboarding tool. Is there also going to be a credit card option, so people can buy $FROK with a credit card and start using it?*

For the start, no, but solutions from Coinbase exist, so we're looking in that direction, since we're already on the Base L2 chain, and Coinbase is huge in the US but also globally.

*Cogsy: There'll be a Coinbase dApp that you can plug into the front page of Frok, so people can enter their credit card information there without being sent to Stripe or clicking through to a different window; it'll all still be within Frok Swap.*

Stripe is doing a lot of things with crypto lately, but I haven't looked into that. There are ways to include it, though. So there's the crypto rail you can use, and enabling classic credit cards or Apple Pay as a way to purchase the tokens is also on the table, but certainly not for the launch (maybe in the weeks after).

# **Frok Swap Capabilities and User Experience**

*Cogsy: What can you swap into $FROK from the Frok Swap main page? Can you pull from a bunch of different chains and use the different tokens to just claw everything in for your gas token (i.e., swapping all your dust tokens into $FROK)?*

As long as there's liquidity for the source token, you're getting aggregation from KyberSwap, OpenOcean, Matcha/0x, and Odos. We also have in-house aggregation coming, but that's only for USDC, wrapped ETH, and cbBTC into $FROK. As long as the source tokens can be swapped on the different aggregators, they can be swapped on the Frok interface as well. As for bridging tokens from other chains: whatever current technology allows you to bridge from any chain to Base, that will work for Frok Swap as well. And all of that will be shown on the main Frok.com website when you first connect your wallet.

*Cogsy: One of the cool edge things that I think will make this special is that when you first connect, Frok is going to analyze your Web3 presence and say, "Oh, you've got $3 worth of Avalanche tokens sitting over there; we can pull those in by bridging and swapping them to $FROK." You can almost clean up all your old tokens scattered across different chains, bring them all in, swap them to $FROK, and start using it immediately.*

*It's about a smooth and fast onboarding process, because you have one shot at these users; it's got to be good from the start, when they first land on the site.*

*For anybody who's listening to the call and hasn't looked at the site, go to Frok.com. I pulled it up on my widescreen monitor, and it's really badass; the site looks great. You've done really good work on it.*

It will get even better. The Swap will also be accessible as a standalone app (via swap.frok.com), so you don't have to ask the model to perform the swap. You have an overview of your balances; you don't have to search for the source token; you can just click on it, and you can only click on tokens you actually own. About 6,000 tokens are supported, and the coverage is really, really great. I can all but guarantee that no token will be missed, except maybe one that launched two days ago, and the coverage is wide across all the chains, including Solana.

There's a lot of new technology that went into making it butter-smooth. I would even dare to say it's smoother than clicking around in the parts of your system that don't touch the internet: system settings, navigating through your phone or your Windows/Mac computer. The page is also a Progressive Web App (PWA). The first time you go to swap.frok.com, the entire code is fetched from the server once; every subsequent visit loads from your browser's cache, within about 100 milliseconds. Then your entire holdings across all the chains, including prices, load within a second, and quotes take a maximum of two seconds as well. The new version I'm working on right now (soon to be released) will be so snappy.
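The load-from-cache behavior described above is the classic PWA pattern: a service worker stores the app shell on the first visit and serves it locally afterwards. A minimal sketch of that generic technique follows; this is not Frok's actual code, and the event types are simplified.

```typescript
// sw.ts — minimal cache-first service worker, the standard pattern behind
// PWAs that load from the browser cache in ~100 ms on repeat visits.
const CACHE = "app-shell-v1";

self.addEventListener("install", (event: any) => {
  // Fetch the app shell once and store it locally.
  event.waitUntil(
    caches.open(CACHE).then((cache) => cache.addAll(["/", "/app.js", "/app.css"]))
  );
});

self.addEventListener("fetch", (event: any) => {
  // Serve from the cache first; fall back to the network only on a miss.
  event.respondWith(
    caches.match(event.request).then((hit) => hit ?? fetch(event.request))
  );
});
```

With this in place, only the first page load pays the network cost; after that, the shell is local and only live data (balances, prices, quotes) goes over the wire.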
Even if you don't consciously pay attention to those things, there's a really strange positive feeling; it feels almost too fast, because it's unusual to see something so responsive. This is, I would say, another perk we will maintain for the Chat app as well. Everything will feel butter-smooth, the way you're used to with Apple.

# **Future Plans and Q&A**

*Cogsy: I wanted to give you a chance to say we can wrap it now if you'd like. We're a little over 35 minutes in, and I want to keep it short for you. I really think this is a great intro and first episode to catch people up to where we are currently. We have a lot of questions for the next episode, and we obviously want to do a live demo of the swapping mechanism so you can show us what you've been talking about. I'm excited to see that, and it's really been an awesome journey. So thank you for coming on and keeping us up to date. Any closing thoughts?*

No, but I'm open to a small Q&A...

*Cogsy: I have one if you want me to do a quick one. What about future funding? Do you have plans to do any raises, or did you have plans for any VC stuff? Has any of that ever popped up?*

Yes, so we have runway. It suffices, but we definitely want to bump up our development capacity. We already have one developer joining us; I'd say it was somewhat coincidental, getting him on board full-time over the last few weeks. And then three or four more people after that. When it comes to that, I guess it may be accompanied by some efforts to get funding.

*Cogsy: So that's the first real hire you guys have ever had, right? It's just been the two of you until now, so it's exciting to get some more talent on board, and they must be really good to make the cut with you. I can't even imagine trying to interview with you two; I would definitely fail.*

He's a good friend of mine. We've been in touch over the last months, and it happened naturally. He was helping with a lot of things, then more and more of it, and at some point he naturally slid into joining the team full-time; it wasn't a proactive effort on anyone's part.

*Cogsy: That's awesome... great to hear. Congratulations on that one.*

# **Frok Swap Launch Timeline**

*Cogsy: Wen launch?*

So, for the Legionnaires, I hope in the next 24 hours we can get the next version out. This will be the one where I tell everyone: please try to break the swap. Click buttons very fast, look for bugged states, go wild on the UI, close and reopen the tab, and whatnot. If everything works fine, and it should, I hope, then we can release it a day later.

*Cogsy: That's exciting, and it's coming very soon. For anybody listening on Frok's or the Llamas' Discord, we'll have everything cross-posted for the launch of the open testing of Frok Swap.*

There will also be a proper release for the general public. For now it's just internal testing for the Legionnaires, but after that, it will be live on swap.frok.com for everybody to use.

*Cogsy: Alright, thank you, Knows. Appreciate your time, and we will look forward to episode two.*

Excited as well. Thank you to the listeners for joining and chiming in. I guess, then, on to the next graze.

*Cogsy: Alright, have a good night.*