# AI in a Day: GPT-3, Stable Diffusion, Embeddings, Fine-Tuning
1
00:00:00,000 --> 00:00:17,000
Hello everybody. Welcome to this AI stream, where I’ll go through what I’ve learned and what I’ve been up to in AI: building little experiments here and there, trying to wrap my head around what’s going on.
2
00:00:17,000 --> 00:00:25,000
I find the best way to do that is to just build small projects. So if people have questions, feel free to drop them in the chat.
3
00:00:25,000 --> 00:00:39,000
I planned to spend 15 minutes or so going through my journey over the last month or two. Then I’m curious what questions people have, and I’ll go through all the code so you can see how some of this stuff works behind the scenes.
4
00:00:39,000 --> 00:00:44,000
Awesome. Let me share my screen and get started.
5
00:00:44,000 --> 00:00:52,000
I’ve been downloading this new tool. Actually, let’s see if it worked. What would you like to do? Create a new program. Oh my gosh, it worked.
6
00:00:52,000 --> 00:01:01,000
Okay, so there’s this new thing. I want to write a script to download a bunch of tweets, to do a kind of Ask My Book for a Twitter user, for example.
7
00:01:01,000 --> 00:01:18,000
So anyway, there’s this new app called GPT Labs. UPG? I don’t know what that stands for. Anyway, I saw it, and I started it literally two minutes before the stream started. I just typed in that I want a language, or sorry, a program in Python,
8
00:01:18,000 --> 00:01:26,000
and I want this program to download all of balajis’s tweets (I think; I just guessed his handle) as a CSV with one tweet per row. Right?
9
00:01:26,000 --> 00:01:42,000
So basically there are all these tweets on the internet, and there’s a Twitter API. I would have to create a developer app on Twitter and then write a Python script that loops through some endpoint ad infinitum until it gets all the tweets, and writes them to a CSV file.
10
00:01:42,000 --> 00:01:53,000
That would probably take me an hour or two. It’s not hard, but most of that time is just spent Googling things like, how do I do this with the requests API in Python? Right?
11
00:01:53,000 --> 00:02:10,000
So anyway, let’s see what it spit out. Import. So this is presumably a library to hit Twitter’s API, and this is the CSV library, which I’m familiar with because I used it to build Ask My Book.
12
00:02:10,000 --> 00:02:22,000
It creates that; that’s pretty simple. It creates a list object that’s going to hold all the tweets, which will be these tweet objects from Tweepy.
13
00:02:22,000 --> 00:02:34,000
First it’s going to do an initial request with the maximum count, 200. This is pretty impressive, to be honest. I love these comments. I’m really just reading the comments.
14
00:02:34,000 --> 00:02:44,000
I’m reading the code and saying it out loud, but I realize it’s literally just telling you that, so I don’t even necessarily need to talk. Good if you’re deaf. Oh, Julie, you have a question. Awesome.
15
00:02:44,000 --> 00:02:50,000
How do I allow you to speak? Let’s do this.
16
00:02:50,000 --> 00:02:56,000
Allowed to talk. You’re allowed to talk. Am I doing something wrong?
17
00:02:56,000 --> 00:03:01,000
Sup, Julie? Can you hear me? Oh, she left.
18
00:03:01,000 --> 00:03:07,000
Oh no, she’s here. Perhaps she accidentally raised her hand. I’ll keep going.
19
00:03:07,000 --> 00:03:14,000
If people have questions, there is a Q&A widget. So use the Q&A widget or use the chat. Anyway, I’ll keep going.
20
00:03:14,000 --> 00:03:17,000
Okay. Get the tweets. That makes sense to me.
21
00:03:17,000 --> 00:03:31,000
And then, okay, so this is a new function, but I assume it basically takes the list that was retrieved and puts it in this list, and you get the last ID.
22
00:03:31,000 --> 00:03:40,000
And no, Patrick, this is actually a new thing called UPG, which is similar to, and probably based on the exact same API as, Codex from OpenAI.
23
00:03:40,000 --> 00:03:51,000
And then basically, keep doing this until there’s an API request that doesn’t return any tweets, because you’ve reached the end.
24
00:03:51,000 --> 00:04:01,000
Keep doing this. And then once you’ve run through this function, that basically means this thing is now full of tweet objects.
25
00:04:01,000 --> 00:04:14,000
And then this goes through that, and instead of these tweet objects, which are just made up by Tweepy, it turns them into these dictionaries in Python, or tuples, or whatever they are.
26
00:04:14,000 --> 00:04:23,000
Each one contains the ID of the tweet, the time it was tweeted, and, oh yeah, it’s a 2D array, I guess.
27
00:04:23,000 --> 00:04:29,000
And then the tweet itself, the text of the tweet. And then it writes this to a CSV.
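The paging loop walked through above can be sketched roughly as follows. This is not the generated script, which is only shown on screen; it is a minimal reconstruction of the same idea. `fetch_page` is a hypothetical callable standing in for the Twitter request (with Tweepy it would wrap a user timeline call), and the fake timeline is made-up data so the sketch runs without any credentials.

```python
import csv
import io
from types import SimpleNamespace


def fetch_all_tweets(fetch_page, page_size=200):
    """Page backwards through a timeline until a request comes back empty.

    fetch_page(count, max_id) is a hypothetical callable that returns a list
    of objects with .id, .created_at and .text, newest first.
    """
    all_tweets = []
    max_id = None  # None means "start from the newest tweet"
    while True:
        page = fetch_page(count=page_size, max_id=max_id)
        if not page:
            break  # an empty page means we reached the end
        all_tweets.extend(page)
        # Next time, ask only for tweets strictly older than the oldest seen.
        max_id = all_tweets[-1].id - 1
    return all_tweets


def write_csv(tweets, out_file):
    """Flatten tweet objects into rows (a 2D list) and write one per line."""
    rows = [[t.id, t.created_at, t.text] for t in tweets]
    writer = csv.writer(out_file)
    writer.writerow(["id", "created_at", "text"])
    writer.writerows(rows)
    return len(rows)


# Made-up timeline of 450 tweets, newest first, standing in for the real API.
FAKE_TIMELINE = [
    SimpleNamespace(id=i, created_at="2022-12-01", text=f"tweet {i}")
    for i in range(450, 0, -1)
]


def fake_fetch_page(count, max_id=None):
    older = [t for t in FAKE_TIMELINE if max_id is None or t.id <= max_id]
    return older[:count]


tweets = fetch_all_tweets(fake_fetch_page)
buffer = io.StringIO()  # an in-memory file; a real script would open a .csv
written = write_csv(tweets, buffer)
print(written)
```

Passing the request in as a function keeps the loop itself independent of any particular Twitter client, which also makes it easy to try out offline.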
28
00:04:29,000 --> 00:04:39,000
Yeah, that’s exactly right. That’s pretty crazy. That probably would have saved me an hour of my time.
29
00:04:39,000 --> 00:04:43,000
Let’s say I’m an engineer who gets paid a hundred bucks an hour: that’s a hundred dollars in value right there. Pretty cool.
30
00:04:43,000 --> 00:04:48,000
Anyway, that was just a tangent because I had it open. I wanted to talk about my journey.
31
00:04:48,000 --> 00:04:52,000
Patrick says, wouldn’t it be great to pipe the comments to a speech synthesis AI?
32
00:04:52,000 --> 00:05:05,000
Yeah, in theory. One of the cool things is that AI has become multimodal. This is what makes it general instead of just specific, which is pretty cool. For example, I vastly prefer text: reading docs, trying stuff, copy-pasting, and all that.
33
00:05:05,000 --> 00:05:15,000
But some people prefer audio, right? Without a human having to do the translation, you can now pick your format, which I think is pretty cool. Ask My Book is an example.
34
00:05:15,000 --> 00:05:22,000
Anyway, I wanted to go through my AI journey. I know some people have seen some of this stuff, so I think it will provide some good context.
35
00:05:22,000 --> 00:05:34,000
I think this is the first moment when a lot of people’s minds got blown: GPT-3 came out, and they released this thing where you could effectively type into a text box and get a result.
36
00:05:34,000 --> 00:05:45,000
The way this works, simply, is that it makes a guess, based on the prior text, at what the next word or token would be, based on a snapshot of the internet.
37
00:05:45,000 --> 00:05:52,000
Right. So for example, if I say one, two, the AI is going to be pretty good at guessing that the next thing is three, right? Or A, B.
38
00:05:52,000 --> 00:05:59,000
That’s really simple. Obviously the complexity ramps up pretty quickly in language. But that’s simply what it does.
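To make the "guess the next token" idea concrete, here is a toy sketch. It is an assumption-laden illustration, not how GPT-3 works internally: GPT-3 is a large neural network, while this simply counts which token most often followed a given two-token context in a tiny made-up corpus.

```python
from collections import Counter, defaultdict


def train(tokens, context_size=2):
    """Count which token follows each context of `context_size` tokens."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - context_size):
        context = tuple(tokens[i:i + context_size])
        counts[context][tokens[i + context_size]] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent follower of `context`, or None if unseen."""
    followers = counts.get(tuple(context))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# A tiny made-up "snapshot of the internet".
corpus = "one two three four one two three A B C D A B C".split()
model = train(corpus)

print(predict_next(model, ["one", "two"]))  # three
print(predict_next(model, ["A", "B"]))      # C
```

The real model replaces the lookup table with learned weights, which is what lets it make reasonable guesses for contexts it has never seen verbatim.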
39
00:05:59,000 --> 00:06:13,000
And I think what was so powerful about this to me was that it made AI accessible. It’s kind of no-code AI, right? Before, you would have to write a bunch of Python or download some model or I don’t even know what. But this really makes sense to me, because it’s just,
40
00:06:13,000 --> 00:06:22,000
oh, I just have to send an API request with this string, right? All of this text. And then it’ll return more text, which is the answer.
41
00:06:22,000 --> 00:06:35,000
It’s this very simple call-and-response, question-and-answer thing, which is how APIs work. Right? So for example, let’s say I wanted an app that made me more excited, right?
42
00:06:35,000 --> 00:06:50,000
So there’s a text box that you type into, and then there’s another text box where the result comes out. You just have to say "the following is a conversation with," or "the following is a set of," and you might not even need this, actually.
43
00:06:50,000 --> 00:07:09,000
It would be something like this, where you’d say "input, output," or "the following is a list of inputs and outputs," where the input is "happy" and the output is "super happy." Right?
44
00:07:09,000 --> 00:07:25,000
And then you might even want to be more specific. You say "a list of five pairs of inputs and outputs," or let’s say two pairs, just to simplify this. And then the input would be the user’s input, right? Whatever the user put in, let’s say "user input" or something.
45
00:07:25,000 --> 00:07:38,000
And then the output, and you would just leave this blank, right? So this is almost what you would send to the API, and the API would just return whatever the answer is. Let’s say this is the string that I would actually send from my local script.
46
00:07:38,000 --> 00:07:47,000
And then the input from my local script would be something like, "My name is Sahil Lavingia and I am the founder of Gumroad."
47
00:07:47,000 --> 00:07:57,000
And then the thing would basically spit back the rest, whatever it would guess, right? And it might make some mistakes, because it might think, oh, "super" is supposed to be added.
48
00:07:57,000 --> 00:08:09,000
And then it would be "the super founder of Gumroad," or something like that. But this is really cool to me because it’s freeform. It’s very dynamic. I think a lot of people think about programming as this very hard-and-fast, math-y sort of thing.
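The prompt being typed out here can be sketched as a small string builder. This is only an illustration under assumptions: the exact wording of the prompt is reconstructed from what is said aloud, and `complete` is a hypothetical callable standing in for whatever sends the string to a text-completion API and returns the text that comes back.

```python
def build_prompt(user_input, examples=(("happy", "super happy"),)):
    """Build a few-shot prompt that ends on a blank output for the model to fill."""
    total = len(examples) + 1  # the worked examples plus the unfinished pair
    lines = [f"The following is a list of {total} pairs of inputs and outputs.", ""]
    for example_input, example_output in examples:
        lines += [f"Input: {example_input}", f"Output: {example_output}", ""]
    # The last pair is left open: the model's guess at what comes next
    # is the answer we are after.
    lines += [f"Input: {user_input}", "Output:"]
    return "\n".join(lines)


def excite(user_input, complete):
    """Send the prompt through `complete` (hypothetical) and tidy the reply."""
    return complete(build_prompt(user_input)).strip()


print(build_prompt("My name is Sahil Lavingia and I am the founder of Gumroad."))
```

Because the model only continues text, the "program" is really the examples in the prompt, which is why a single example can lead to odd generalizations like inserting "super" in the wrong place.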
49
00:08:09,000 --> 00:08:22,000
There’s no math here, at least not that you see. It’s this much fuzzier thing. And there’s a great blog post about this, I believe called Software 3.0, which I highly recommend. I think it’s this blog post. No, it’s Software 2.0, which everyone should read.
50
00:08:22,000 --> 00:08:35,000
He talks about how we’re used to writing code in this very deterministic way, right? You say print one plus one, and it prints. All of a sudden you’re giving input like: this is a conversation between a math teacher and a math student.
51
00:08:35,000 --> 00:08:45,000
The math teacher says, what is one plus one? The math student replies, two. Obviously you wouldn’t use the API for that. Math is not fuzzy, right?
52
00:08:45,000 --> 00:08:57,000
So generally it’s not a good fit. But for plenty of other domains that are fuzzier, like music or art, all of a sudden you can have a similar interaction and get some responses, which is pretty cool.
53
00:08:57,000 --> 00:09:06,000
He talks about the math behind it: what that actually looks like and means, and what’s actually happening behind the scenes. But pretty simply, that’s the transition that I think is pretty cool.
54
00:09:06,000 --> 00:09:16,000
Ooh, someone else raised their hand. Let’s see if I can see that list and allow you to talk. Where are you? Oh, you unraised your hand.
55
00:09:16,000 --> 00:09:24,000
Anyway, so I saw this and I was like, this is cool. I experimented with it like a bunch of people did, and nothing really came of it. I don’t know, I think it’s cool.
56
00:09:24,000 --> 00:09:30,000
I think I started using Copilot, but I didn’t have anything that I really wanted to build. I didn’t feel like, oh, I can go build something.
57
00:09:30,000 --> 00:09:50,000
And then fast-forward about two years. I think DALL-E came out last year, and then I saw this from Pieter Levels, which I thought was really cool and a really functional use case for DALL-E, which is similar in a sense to GPT-3, but with images instead of text, and with this similar fuzziness.
58
00:09:50,000 --> 00:10:01,000
And then what really changed the game was Stable Diffusion, right? For folks who don’t know, DALL-E is kind of a GPT-3 for images, released by OpenAI, which is really awesome.
59
00:10:01,000 --> 00:10:10,000
Everyone’s seen this, I’m sure. And the joke is that it’s called OpenAI, but it’s not actually open source, right? It’s open in certain ways, but not in others.
60
00:10:10,000 --> 00:10:20,000
What really changed the game was, I believe in late September, the Stable Diffusion release. Let’s see when that came out.
61
00:10:20,000 --> 00:10:32,000
I guess August 22nd. So sorry, I was off by a month. In my eyes it’s probably not as good, quote unquote, as DALL-E in terms of the images it produces being photorealistic, et cetera.
62
00:10:32,000 --> 00:10:41,000
But the real core difference is that it’s open source, right? Or parts of it, I think, are at least sort of open source. There’s some binary kind of model. But what open source means is you can look inside of it.
63
00:10:41,000 --> 00:10:49,000
And what that means is you can spin up your own version of DALL-E by almost unlearning. Imagine a brain.
64
00:10:49,000 --> 00:10:55,000
Imagine if the brain were like a crepe cake or something like that, right? Where you have layers of understanding that build up over time.
65
00:10:55,000 --> 00:11:05,000
Let’s say, for example, on a very base level you have the laws of physics, and then you have the periodic table of elements, and maybe at some level of abstraction you have ping pong, right?
66
00:11:05,000 --> 00:11:15,000
Which is based on all these things, but you don’t have to actually think about them, right? If you want to make up a new game, you can take ping pong and say it’s like ping pong, except really big and with tennis balls.
67
00:11:15,000 --> 00:11:23,000
And we don’t use a table, so we just take the table out. And we call it tennis now, right? That’s the origin of tennis, for folks who don’t know.
68
00:11:23,000 --> 00:11:31,000
And you don’t have to learn the laws of physics to do that. You can just change the rules at a very high level of abstraction and come up with a very new game.
69
00:11:31,000 --> 00:11:39,000
And that’s what open source allows: you can peel back the onion, the one, two, three, four layers of the crepe cake, and then change that.
70
00:11:39,000 --> 00:11:43,000
And then all of a sudden you have a new model that can do new things.
71
00:11:43,000 --> 00:11:46,000
And that’s really what enables all of this kind of stuff, right?
72
00:11:46,000 --> 00:11:57,000
Like you’ve seen here, this is a model that is, for example, like DALL-E, but trained on, I assume in this case, a lot of catalog imagery, right?
73
00:11:57,000 --> 00:12:09,000
The kind you would see from Ikea or other brands. Anyway, that really got my attention, because all of a sudden, if I can think of some vertical use case, I can start experimenting with this.
74
00:12:09,000 --> 00:12:19,000
And then I saw this, which I’m sure a lot of people have seen, and I think it grabbed a lot of people’s attention: this Pokemon generator, which I believe we can actually try.
75
00:12:19,000 --> 00:12:28,000
And this is a Yoda Pokemon. What happens if you put my name in? Some weird stuff might happen.
76
00:12:28,000 --> 00:12:37,000
Oh, did I disable chat? Oh, you’re going to have to ask questions then, huh? I’m sorry. Oh, wait, I just enabled it. So people can chat now. Sorry about that.
77
00:12:37,000 --> 00:12:46,000
Sorry, everybody. So this is me as a Pokemon, like "water Sahil," to at least show you that the prompt is having some impact, right?
78
00:12:46,000 --> 00:12:53,000
But imagine that similar kind of fuzziness, right? I can just type a prompt and get an answer, which I think is pretty cool.
79
00:12:53,000 --> 00:12:59,000
So let’s see what this comes up with. A water Sahil. Awesome. People are using the chat. Great. Yeah, this is pretty cool.
80
00:12:59,000 --> 00:13:13,000
It looks like a face almost, right? The eyes and the nose and the mouth. Weird, right? It’s learned this somehow. It has picked up this pattern where you have eyes on the top, you have this kind of nose object, and then some sort of mouth.
81
00:13:13,000 --> 00:13:27,000
And there’s some symmetry. It has definitely picked up on that kind of symmetry. But anyway, that was really cool. That, I think, is mind-boggling to a lot of people, right? Now it’s almost like we don’t even think about it, because we’ve seen it with a bunch of these kinds of examples.
82
00:13:27,000 --> 00:13:38,000
But this image: you can’t find anything too similar to this on the internet. It’s crazy, right? It would be like googling "water fish" and finding nothing.
83
00:13:38,000 --> 00:13:50,000
It’s just a thing that doesn’t happen, but we get used to it so quickly. Anyway, that really got my attention. And the cool thing he did is he actually open sourced all of this, right? He released this; you can go find it from that tweet.
84
00:13:50,000 --> 00:14:00,000
What I always do is just copy. My goal is to do exactly what this person did, and then I can modify my approach and come up with something else.
85
00:14:00,000 --> 00:14:14,000
And in theory, my view is that all I need to do is find a different set of labeled images, right? In his case, he had to find these Pokemon images. Let me see if I can find the actual ones.
86
00:14:14,000 --> 00:14:18,000
He actually has a good, where is it? He has some thread somewhere.
87
00:14:18,000 --> 00:14:25,000
But basically you have to find these images. Where does he talk about this?
88
00:14:25,000 --> 00:14:34,000
Oh yeah, here it is. I think this one. Yes. So he talks about what he had to do, right? He had to get these images and then label them all.
89
00:14:34,000 --> 00:14:45,000
And he has a CSV file, let’s say, of images and the labels for those images, right? I think there are 200 Pokemon or something like that, or 1,000 Pokemon, that he did.
90
00:14:45,000 --> 00:14:50,000
So it’s like, okay, cool. All I have to do is find a thousand tagged images of some other thing.
91
00:14:50,000 --> 00:14:57,000
And then I can, in theory, have a similar UI here that would allow me to generate whatever. Right. And so that was cool.
92
00:14:57,000 --> 00:15:02,000
And then I basically copied this tutorial, and I spent quite a while trying to figure out how to do it.
93
00:15:02,000 --> 00:15:10,000
And I ended up spinning up this Lambda Labs instance. I did this locally initially. I did it with Pokemon.
94
00:15:10,000 --> 00:15:20,000
So if you literally just copy-paste this code and go through this, you’ll end up with this, and then you can very easily run these commands locally in your console and get these images spit out.
95
00:15:20,000 --> 00:15:31,000
And then once I did that, I was like, okay, that’s cool. Now I need a set of images that I can use that are tagged with something else.
96
00:15:31,000 --> 00:15:46,000
And so I had this idea of going to the progress pics subreddit on Reddit, and I downloaded a thousand of these, similar to that script for scraping Balaji’s tweets or whatever.
97
00:15:46,000 --> 00:16:02,000
I similarly wrote a script, pulled a thousand of these images, and labeled them with the titles of the posts. And you can tell they’re labeled, right? Sorry if these are NSFW or whatever. They’re labeled, right.
98
00:16:02,000 --> 00:16:07,000
They’re like: female, 30 years old, 5’2, et cetera.
99
00:16:07,000 --> 00:16:17,000
So I was like, okay, obviously I have no idea how this model is going to turn out. It’s going to be weird, but at least there should be some significance.
100
00:16:17,000 --> 00:16:26,000
And so that’s what I did. I built this thing, and I’ll show you the code that I wrote to do it, because it’s on GitHub.
101
00:16:26,000 --> 00:16:27,000
Yeah, right here.
102
00:16:27,000 --> 00:16:41,000
And so I took that repo, right, and I just cloned it and started working on the changes that I would need. For example, I write my own READMEs, even though no one’s going to read them, just so that if my computer gets destroyed or something,
103
00:16:41,000 --> 00:16:43,000
I can spin back up.
104
00:16:43,000 --> 00:16:46,000
But basically I took his README and I just started adding my own stuff. Right.
105
00:16:46,000 --> 00:16:57,000
So for example, he didn’t talk too much about how you need to actually label these things. And so I wrote a script called prep images that takes all those images and converts them into the right format.
106
00:16:57,000 --> 00:17:02,000
I wrote a conversion script, and things like, for example, if you want to split the images. All of this stuff you have to do, right.
107
00:17:02,000 --> 00:17:09,000
So this is all stuff that has nothing to do with AI. You just need a thousand images that are labeled nicely.
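As a rough idea of what this kind of preparation step involves, here is a minimal sketch. It is not the prep images script from the stream, whose contents are not shown. The CSV column names, the JSON Lines output, and the `file_name` and `text` field names are all assumptions chosen for illustration, and the sample rows are made up.

```python
import csv
import io
import json


def rows_to_records(csv_file):
    """Turn (file_name, title) CSV rows into image/caption records."""
    records = []
    for row in csv.DictReader(csv_file):
        caption = " ".join(row["title"].split())  # collapse stray whitespace
        if not caption:
            continue  # an image without a label is no use for training
        records.append({"file_name": row["file_name"], "text": caption})
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines, one image/caption pair per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


# Made-up sample standing in for the scraped CSV of images and titles.
SAMPLE = io.StringIO(
    "file_name,title\n"
    "0001.jpg,A yellow  electric mouse\n"
    "0002.jpg,\n"
    "0003.jpg,A blue water creature\n"
)

records = rows_to_records(SAMPLE)
print(to_jsonl(records), end="")
```

None of this touches a model; it only guarantees that every image that reaches training has a clean caption attached.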
108
00:17:09,000 --> 00:17:21,00000:17:09,000 --> 00:17:21,000
And so you test it like this, and then I was using Lambda Labs. Lambda Labs is imagine AWS, but for GPUs, basically is a kind of a simple way to say GPU instances.所以你這樣測試,然後我在用Lambda Labs。Lambda Labs是想像中的AWS,但對於GPU來說,基本上是一種簡單的方式,說是GPU實例。
109
00:17:21,000 --> 00:17:33,00000:17:21,000 --> 00:17:33,000
You would just similar to spin up an instance, you would spin up a GPU and then you can SSH into it and basically use it like a local computer, except it’s in the cloud and has a gazillion times more power than your local thing.你將只是類似於旋轉一個實例,你將旋轉一個GPU,然後你可以SSH到它,並基本上使用它像一個本地電腦,除了它是在雲和有一個 gazillion倍的功率比你的本地東西。
110
00:17:33,000 --> 00:17:48,00000:17:33,000 --> 00:17:48,000
So I had to figure out how to do that and then train the model. And so you basically run this thing and then it spends like 24 hours training and you, it spits back this checkpoint, which is basically like a binary file that has all of this.所以我必須弄清楚如何做到這一點,然後訓練這個模型。因此,你基本上運行這個東西,然後它花了24小時訓練,你,它吐回這個檢查點,這基本上是像一個二進制文件,有所有的這些。
111
00:17:48,000 --> 00:17:55,00000:17:48,000 --> 00:17:55,000
It’s like a brain. The way I think about it, it’s the thing, like an API, that you can ask questions, and then you have a script.
112
00:17:55,000 --> 00:18:07,00000:17:55,000 --> 00:18:07,000
So I basically took this script and ran it. And so I took that prompt and I made my own prompt, which is based off my own weight loss and said, male, 30 years old, 5’7, et cetera.所以我基本上把這個指令碼運行了一遍。因此,我把那個提示和我自己的提示,這是基於我自己的減肥,並說,男性,30歲,5'7,諸如此類。
113
00:18:07,000 --> 00:18:16,00000:18:07,000 --> 00:18:16,000
And it, I want to wonder what is this? And let me, let me pause my share and see if I can actually find an image that I sent someone.而它,我想知道這是什麼?讓我,讓我暫停我的分享,看看我是否真的能找到一個圖像,我給別人。
114
00:18:16,000 --> 00:18:27,00000:18:16,000 --> 00:18:27,000
Cause they’re pretty, pretty grotesque to be honest. Let’s see if I can find one, but again, you don’t know if this is going to be any good until I’ll show you a lot less than grotesque one.因為說實話,它們非常、非常怪異。讓我們看看我能不能找到一個,但還是那句話,你不知道這是否會有什麼好處,直到我給你看一個不那麼怪誕的東西。
115
00:18:27,000 --> 00:18:33,00000:18:27,000 --> 00:18:33,000
Yeah. So here’s one that, yeah, so this is one and literally that, that text generated this image.是的。所以這裡有一個,是的,所以這是一個,從字面上看,那個文字產生了這個圖像。
116
00:18:33,000 --> 00:18:40,00000:18:33,000 --> 00:18:40,000
And obviously there’s nothing useful here. I don’t know what you would do here, but it’s like crazy to me that like it picked up quite a few things.而且顯然這裡沒有什麼有用的東西。我不知道你會在這裡做什麼,但對我來說,這就像瘋了一樣,就像它撿到了相當多的東西。
117
00:18:40,000 --> 00:18:49,00000:18:40,000 --> 00:18:49,000
Like I would say, for example, if you put in someone who’s female 85, who’s 500 pounds going down to 200, it’ll create a very different image.就像我說的,例如,如果你把一個人誰是女性85,誰是500磅下降到200,它會建立一個非常不同的形象。
118
00:18:49,000 --> 00:19:00,00000:18:49,000 --> 00:19:00,000
And it won’t be, it’ll be like this in the sense that it’s like weird and has artifacts and is actually not that specific to like the specific weight or whatever, but it’s clearly there, right?而且它不會,它會像這樣的意義上,它像怪異的,有人工製品,實際上不是那麼具體的像具體的重量或什麼,但它顯然是存在的,對嗎?
119
00:19:00,000 --> 00:19:06,00000:19:00,000 --> 00:19:06,000
Where it’s able to pick up certain things like age overall, maybe, maybe it’s to the hundreds of pounds or something like that.在那裡,它能夠拿起某些東西,如年齡的整體,也許,也許它是到數百英鎊或類似的東西。
120
00:19:06,000 --> 00:19:17,00000:19:06,000 --> 00:19:17,000
The sort of the tolerance level. But anyway, I was pretty proud of this. Like I spun up a model and I now have a thing that could, if I wanted to, I could build a little website that could allow people to upload an image and get this weird thing out.的那種容忍度。但無論如何,我對這個很自豪。就像我旋轉了一個模型,我現在有一個東西可以,如果我想,我可以建立一個小網站,可以讓人們上傳圖片,把這個奇怪的東西拿出來。
121
00:19:17,000 --> 00:19:26,00000:19:17,000 --> 00:19:26,000
Right. And so that was like the first experiment that I ran. Let me see what else I have here that yeah, using, using this example code base.對。因此,這就像我運行的第一個實驗。讓我看看我這裡還有什麼,是的,使用,使用這個示例程式碼庫。
122
00:19:26,000 --> 00:19:31,00000:19:26,000 --> 00:19:31,000
And I was pretty, pretty happy with that, but wasn’t, I didn’t think there was any real market opportunity.我對這一點相當,相當滿意,但不是,我不認為有任何真正的市場機會。
123
00:19:31,000 --> 00:19:47,00000:19:31,000 --> 00:19:47,000
There was no real value in that to the market. And so I was like, okay, what else can I build? And then I had this idea of using basically taking my book, which is about 52,000 words and basically trying to figure out if I could.這對市場沒有真正的價值。所以我就想,好吧,我還能建立什麼?然後我就有了這個想法,基本上把我的書,也就是大約52,000字的書,基本上試圖弄清楚我是否可以。
124
00:19:47,000 --> 00:19:54,00000:19:47,000 --> 00:19:54,000
Initially I was looking at it through this lens, and this lens, I guess I should explain, is an approach called fine-tuning, right?
125
00:19:54,000 --> 00:20:04,00000:19:54,000 --> 00:20:04,000
Which is you take an existing model and you retrain it based on a very limited set of data. And that way it’s good at producing answers to that kind of thing. Right.這是你採取一個現有的模型,你重新訓練它基於一個非常有限的資料集。這樣一來,它就能很好地產生對那種事情的答案。對。
126
00:20:04,000 --> 00:20:17,00000:20:04,000 --> 00:20:17,000
And so my initial approach was like, oh, I can just like fine tune a model on text on my book. Like I can have a bunch of questions and then I can have a bunch of answers and then I can have a model that sort of is able to do that.所以我最初的方法是,哦,我可以在我的書上的文字上微調一個模型。就像我可以有一堆問題,然後我可以有一堆答案,然後我可以有一個模型,能夠做到這一點。
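A minimal sketch of what that question-and-answer training data could look like. It assumes the JSONL prompt/completion layout used by OpenAI's legacy fine-tuning endpoint; the pairs, the `Q:`/`A:` separators, and the function name are placeholders, not taken from the stream.

```python
import json

# Hypothetical hand-written pairs; the second answer is a placeholder.
qa_pairs = [
    ("How do you define community?",
     "A group of people with a shared interest, purpose, or goal."),
    ("Another question about the book?", "(answer written from the book)"),
]

def to_jsonl(pairs):
    """Serialize (question, answer) pairs, one JSON object per line."""
    lines = []
    for question, answer in pairs:
        record = {
            # Separator and leading space are assumed conventions.
            "prompt": f"Q: {question}\nA:",
            "completion": f" {answer}\n",
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

training_file = to_jsonl(qa_pairs)
print(training_file)
```

With only a handful of pairs like this, a fine-tuned model has very little to learn from, which fits the result described next.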
127
00:20:17,000 --> 00:20:26,00000:20:17,000 --> 00:20:26,000
And I have trained that and it just was terrible. It’s not good enough yet. I think it needs to be trained on way more questions. And then I found another approach, which is embeddings.而我已經訓練過了,只是很糟糕。它還不夠好。我認為它需要對更多的問題進行訓練。然後我發現了另一種方法,那就是嵌入法。
128
00:20:26,000 --> 00:20:33,00000:20:26,000 --> 00:20:33,000
And so there’s fine-tuning, which is this one, which has some pros and some cons. And then there’s embeddings.
129
00:20:33,000 --> 00:20:41,00000:20:33,000 --> 00:20:41,000
And embeddings is the thing that I got really excited about because it’s a really good fit for specifically questions and answers.嵌入是我非常興奮的事情,因為它真的很適合於具體的問題和答案。
130
00:20:41,000 --> 00:20:47,00000:20:41,000 --> 00:20:47,000
And the way to think about embeddings is it’s complicated and actually came out in January or something.而思考嵌入的方式是它很複雜,實際上是在1月或什麼時候出來的。
131
00:20:47,000 --> 00:20:53,00000:20:47,000 --> 00:20:53,000
But the way to think about embeddings is if we go back to this prompt, at least the way that I think about it is.但思考嵌入的方式是,如果我們回到這個提示,至少我認為它的方式是。
132
00:20:53,000 --> 00:21:03,00000:20:53,000 --> 00:21:03,000
Right. This is the thing that we’re familiar with, and it’s really good generally. But if I asked it a question like, Question: What is The Minimalist
133
00:21:03,000 --> 00:21:16,00000:21:03,000 --> 00:21:16,000
Entrepreneur about? Answer: it would basically just do the equivalent of a Google search. It wouldn’t be specific to Sahil, who is supposed to be answering this.
134
00:21:16,000 --> 00:21:25,00000:21:16,000 --> 00:21:25,000
It’s supposed to have some content from the book itself. And so what embeddings do is basically allow you to add stuff to the prompt.
135
00:21:25,000 --> 00:21:35,00000:21:25,000 --> 00:21:35,000
And do it in a very smart way, because I can’t just take the entire book and put it in: this is the book, The Minimalist Entrepreneur, 52,000 words, then question and answer.
136
00:21:35,000 --> 00:21:40,00000:21:35,000 --> 00:21:40,000
Right. That would be ideal, actually, because then in theory I don’t even have to use embeddings.
137
00:21:40,000 --> 00:21:46,00000:21:40,000 --> 00:21:46,000
It’s just insanely simple. And I assume that’s where we’ll get to at some point.
138
00:21:46,000 --> 00:21:51,00000:21:46,000 --> 00:21:51,000
But for now, there’s a limit on how much you can stuff into the prompt. It also costs money.
139
00:21:51,000 --> 00:21:54,00000:21:51,000 --> 00:21:54,000
Right. Even if it is possible, you might want to be more efficient with it.
140
00:21:54,000 --> 00:22:05,00000:21:54,000 --> 00:22:05,000
And so basically what embeddings allow you to do is take the most relevant words of those 52,000 and stuff those into the prompt.
141
00:22:05,000 --> 00:22:12,00000:22:05,000 --> 00:22:12,000
Right. So, for example, you could say... actually, I’ll just show you now, kind of the code example.
142
00:22:12,000 --> 00:22:17,00000:22:12,000 --> 00:22:17,000
Let me this one. Right. So this is like a local Python script, which basically does what I was saying before.讓我這一個。對。所以這就像一個本地的Python指令碼,它基本上做了我之前說的事情。
143
00:22:17,000 --> 00:22:29,00000:22:17,000 --> 00:22:29,000
And you can see here. Oh, yeah, I’ve been testing some stuff, but basically, this is an important line, which is it takes the most relevant document sections.
144
00:22:29,000 --> 00:22:39,00000:22:29,000 --> 00:22:39,000
Right. So it pulls these from this embeddings CSV, the context embeddings, which come from the book, which is just a pages CSV somewhere.
145
00:22:39,000 --> 00:22:48,00000:22:39,000 --> 00:22:48,000
Where is that context embeddings? Oh, yeah, here it is. So it downloads these files. And anyways, yeah, basically fetches them smartly.語境嵌入在哪裡?哦,是的,就在這裡。所以它下載這些文件。不管怎麼說,是的,基本上是聰明地取走它們。
146
00:22:48,000 --> 00:23:01,00000:22:48,000 --> 00:23:01,000
So it would be like doing a really simple Google search, except it’s just better. Right. So at my level of engineering, and I’m not the best engineer on the planet, what I would do is write a really simple search.
147
00:23:01,000 --> 00:23:09,00000:23:01,000 --> 00:23:09,000
Right. If I was coming up with my own A.I., I would basically say I would write a really simple script that would maybe try to pull out nouns from the question.對。如果我想出我自己的人工智慧,我基本上會說我會寫一個非常簡單的指令碼,也許會嘗試從問題中拉出名詞。
148
00:23:09,000 --> 00:23:17,00000:23:09,000 --> 00:23:17,000
Right. So, for example, let’s say the question is, like, how do you define community? I would have to come up with some sort of script that would try to pull out the nouns.對。所以,舉例來說,假設問題是,比如,你如何定義社區?我得想出一些指令碼來,試圖拉出名詞。
149
00:23:17,000 --> 00:23:25,00000:23:17,000 --> 00:23:25,000
And then I would try I would probably search on that fifty two thousand words and I would like probably find all the instances of those words.然後我會嘗試,我可能會在那五萬兩千個詞上搜尋,我想可能會找到這些詞的所有實例。
150
00:23:25,000 --> 00:23:32,00000:23:25,000 --> 00:23:32,000
And then I would pull maybe one hundred words before one hundred words after and then like stuff that into the prompt. And that would be my hacky version.然後我可能會拉出前一百個詞,後一百個詞,然後把它們塞進提示中。這將是我的駭客版本。
151
00:23:32,000 --> 00:23:39,00000:23:32,000 --> 00:23:39,000
And that’s Software 1.0, right, where it’s completely deterministic. It’s just looking for equals. Right.
152
00:23:39,000 --> 00:23:46,00000:23:39,000 --> 00:23:46,000
Can’t do any fuzziness, like, for example, what if I didn’t use the word community, but I use the word group. Right. Or something like that.不能做任何模糊的事情,比如說,如果我沒有使用社區這個詞,但我使用了群體這個詞,怎麼辦?對。或者類似這樣的事情。
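The hacky keyword version described above can be sketched roughly as follows. This is an illustration, not code from the stream: the stopword list is a crude stand-in for "pull out the nouns", and `keyword_windows` and the sample text are made up for the example.

```python
import re

STOPWORDS = {"how", "do", "you", "what", "is", "the", "a", "to", "of"}

def keyword_windows(text, question, window=100):
    """Return word windows around exact matches of the question's keywords."""
    words = text.split()
    # Keep every non-stopword token from the question as a keyword.
    keywords = {w for w in re.findall(r"[a-z]+", question.lower())
                if w not in STOPWORDS}
    snippets = []
    for i, word in enumerate(words):
        if re.sub(r"\W", "", word).lower() in keywords:
            start = max(0, i - window)
            snippets.append(" ".join(words[start:i + window + 1]))
    return snippets

book = "We started as a small group of makers. Over time that community grew."
exact_hit = keyword_windows(book, "How do you define community?", window=3)

# Same idea, but the text only says "group": exact matching finds nothing.
miss = keyword_windows("We started as a small group of makers.",
                       "How do you define community?", window=3)
print(exact_hit, miss)
```

The second call shows the limitation being discussed: a purely exact match cannot connect "community" to "group".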
153
00:23:46,000 --> 00:23:54,00000:23:46,000 --> 00:23:54,000
And so that’s where the magic of A.I. comes in and embeddings comes in. It’s like doing that, which is it’s like asking the question based on this question.因此,這就是人工智慧的神奇之處,也是嵌入的神奇之處。它就像這樣做,也就是它就像根據這個問題來問問題。
154
00:23:54,000 --> 00:24:01,00000:23:54,000 --> 00:24:01,000
Right. Let’s say this question was question equals, let’s say, “What is community, according to you?”
155
00:24:01,000 --> 00:24:06,00000:24:01,000 --> 00:24:06,000
Right. It would basically send that to the book or a file that resembles the book.對。它基本上會把這個問題傳送到書上或類似於書的文件上。
156
00:24:06,000 --> 00:24:15,00000:24:06,000 --> 00:24:15,000
It’s basically the book, but turned into this mathematical representation of all the concepts, talking about all the other concepts in the book and says, hey, based on this question, what are the five?它基本上是一本書,但變成了所有概念的這種數學表示,談論書中的所有其他概念,並說,嘿,基於這個問題,什麼是五?
157
00:24:15,000 --> 00:24:22,00000:24:15,000 --> 00:24:22,000
I believe that’s how this code works. This might just get all the top, let’s say, 50 most relevant sections.
158
00:24:22,000 --> 00:24:30,00000:24:22,000 --> 00:24:30,000
Right. So based on how I fed, how I created the data and the way I did it specifically is that every page is basically a section.對。因此,根據我的喂養方式,我是如何建立資料的,我具體做的方式是,每個頁面基本上是一個部分。
159
00:24:30,000 --> 00:24:39,00000:24:30,000 --> 00:24:39,000
So basically, what are the five most relevant pages from the book, and then go through them until there’s no space left in the prompt.
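A rough sketch of the "most relevant pages" step, assuming relevance is measured by cosine similarity between the question's embedding and each page's embedding. The three-dimensional vectors are toy values; real embeddings would come from an embeddings model and have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_pages(question_embedding, page_embeddings):
    """Return page titles sorted from most to least similar to the question."""
    scored = [(cosine_similarity(question_embedding, vec), title)
              for title, vec in page_embeddings.items()]
    return [title for _, title in sorted(scored, reverse=True)]

# Toy vectors, invented for illustration only.
page_embeddings = {
    "Page 12": [0.9, 0.1, 0.0],
    "Page 57": [0.1, 0.8, 0.1],
    "Page 90": [0.0, 0.2, 0.9],
}
question_embedding = [0.85, 0.2, 0.05]

top_pages = rank_pages(question_embedding, page_embeddings)
print(top_pages)
```

Because the comparison happens between vectors rather than strings, a question about "community" can still land near a page that only talks about a "group".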
160
00:24:39,000 --> 00:24:44,00000:24:39,000 --> 00:24:44,000
And then the prompt itself is also engineered.然後提示本身也是經過設計的。
161
00:24:44,000 --> 00:24:49,00000:24:44,000 --> 00:24:49,000
Right. So, for example, let me just pull in what I actually wrote.對。所以,比如說,讓我把我實際寫的東西拉進來。
162
00:24:49,000 --> 00:24:55,00000:24:49,000 --> 00:24:55,000
This is basically the prompt, right, which is: Sahil Lavingia is the founder of Gumroad, blah, blah, blah, blah, blah.
163
00:24:55,000 --> 00:25:01,00000:24:55,000 --> 00:25:01,000
Context that may be useful. I try to come up with and I’m sure this is like super MVP.語境,可能是有用的。我試圖想出,我相信這就像超級MVP。
164
00:25:01,000 --> 00:25:07,00000:25:01,000 --> 00:25:07,000
Right. So this is like kind of part of what’s really fun is you can there’s like a kind of a new kind of engineering that didn’t exist before.對。因此,這就像一種什麼是真正有趣的部分是你可以有像一種新的工程,以前不存在。
165
00:25:07,000 --> 00:25:11,00000:25:07,000 --> 00:25:11,000
And anyway, so I came up with this simple line: These are questions and answers by him.
166
00:25:11,000 --> 00:25:15,00000:25:11,000 --> 00:25:15,000
Please keep your answers to three sentences minimum. Speak in complete sentences. Stop speaking.
167
00:25:15,000 --> 00:25:22,00000:25:15,000 --> 00:25:22,000
This is like, you’re just trying stuff out. Right. One of the cool things is that all this stuff will eventually get blogged about, and people will figure out what works well.
168
00:25:22,000 --> 00:25:29,00000:25:22,000 --> 00:25:29,000
Right. Right now, we’re all digging for gold in a way. And then: Context that may be useful, pulled from The Minimalist Entrepreneur. This is where these five pages would come from.
169
00:25:29,000 --> 00:25:34,00000:25:29,000 --> 00:25:34,000
So it would say like page one and then it would just say, right.所以它會說像第一頁,然後它就會說,對。
170
00:25:34,000 --> 00:25:41,00000:25:34,000 --> 00:25:41,000
Something like that page 57. And you can just imagine five of these and then question and then that would come.像這樣的第57頁。你可以想像一下,有五個這樣的問題,然後問題就會出現。
171
00:25:41,000 --> 00:25:47,00000:25:41,000 --> 00:25:47,000
What is community, according to you? Or actually, here there would be examples.
172
00:25:47,000 --> 00:25:53,00000:25:47,000 --> 00:25:53,000
So you go back to the code. There’s a bunch of these examples that I stuck in. Right. So there’s all these examples.
173
00:25:53,000 --> 00:26:02,00000:25:53,000 --> 00:26:02,000
So it would be like this. And so I try to stuff it with examples that kind of resemble my voice or whatnot.
174
00:26:02,000 --> 00:26:07,00000:26:02,000 --> 00:26:07,000
And then it would be something like this. Right.然後就會像這樣了。對。
175
00:26:07,000 --> 00:26:11,00000:26:07,000 --> 00:26:11,000
And then, boom, this would be the kind of answer we would get spit back out.
176
00:26:11,000 --> 00:26:16,00000:26:11,000 --> 00:26:16,000
And then that would return that to the user, which is exactly what happens.然後,這將返回給使用者,這正是所發生的。
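Putting the pieces just described together, the prompt has four parts: an instruction header, the context pages, a few example questions and answers, and finally the user's question. The sketch below assumes that layout; the exact wording, the `Q:`/`A:` labels, and the function name are illustrative, not the text used in the real app.

```python
def build_prompt(header, context_pages, examples, question):
    """Assemble header, book context, example Q&As, and the new question."""
    context = "\n".join(f"{title}: {text}" for title, text in context_pages)
    shots = "\n\n".join(f"Q: {q}\n\nA: {a}" for q, a in examples)
    return (
        f"{header}\n\n"
        f"Context that may be useful:\n{context}\n\n"
        f"{shots}\n\n"
        f"Q: {question}\n\nA:"
    )

prompt = build_prompt(
    header="These are questions and answers by the author. "
           "Speak in complete sentences.",
    context_pages=[("Page 57", "(text of a relevant page)")],
    examples=[("(an example question)", "(an answer in the author's voice)")],
    question="What is community, according to you?",
)
print(prompt)
```

The prompt ends on an open `A:` so the completion model continues with the answer itself.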
177
00:26:16,000 --> 00:26:21,00000:26:16,000 --> 00:26:21,000
And so if you go back to the code base, I can explain now you have all the concepts.因此,如果你回到程式碼庫,我可以解釋現在你有所有的概念。
178
00:26:21,000 --> 00:26:26,00000:26:21,000 --> 00:26:26,000
Now it’s just going through it. You go to this website, askmybook.com.
179
00:26:26,000 --> 00:26:32,00000:26:26,000 --> 00:26:32,000
Which is just this input. You ask this question. It’s the minimalist entrepreneur is a book.這只是這個輸入。你問這個問題。這是極簡主義企業家是一本書。
180
00:26:32,000 --> 00:26:38,00000:26:32,000 --> 00:26:38,000
So you don’t have to listen to my voice twice over. And what does this do? This is just a page.因此,你不必聽我的聲音兩次了。而這是做什麼的?這只是一個頁面。
181
00:26:38,000 --> 00:26:44,00000:26:38,000 --> 00:26:44,000
Really simple. Let me view source. We better ignore all the style.真的很簡單。讓我查看來源。我們最好忽略所有的樣式。
182
00:26:44,000 --> 00:26:51,00000:26:44,000 --> 00:26:51,000
But basically, it’s just a form. Right. It’s just this form. It’s just a very you don’t even need JavaScript in theory.但基本上,它只是一個表單。對。它只是這種形式。它只是一個非常你甚至在理論上不需要JavaScript。
183
00:26:51,000 --> 00:26:55,00000:26:51,000 --> 00:26:55,000
And all it does is it takes that question and it sends it to the back end.而它所做的就是把這個問題送到後端去。
184
00:26:55,000 --> 00:26:59,00000:26:55,000 --> 00:26:59,000
There’s also this I’m feeling lucky button that basically just repopulates it with a bunch of stuff.還有這個 "我感覺很幸運 "的按鈕,基本上就是用一堆東西重新填充它。
185
00:26:59,000 --> 00:27:04,00000:26:59,000 --> 00:27:04,000
And you can look at all of this code. Literally, if you copy-paste it, it should work.
186
00:27:04,000 --> 00:27:09,00000:27:04,000 --> 00:27:09,000
It’s not minified or anything. So you can see my ugly code. And then it hits that back end.它沒有被最小化或任何東西。所以你可以看到我的醜陋的程式碼。然後它打到那個後端。
187
00:27:09,000 --> 00:27:13,00000:27:09,000 --> 00:27:13,000
And so let’s go to the back end views and see what happens.因此,讓我們去後端檢視,看看會發生什麼。
188
00:27:13,000 --> 00:27:20,00000:27:13,000 --> 00:27:20,000
So it basically goes and hits this: def ask, in the back end.
189
00:27:20,000 --> 00:27:26,00000:27:20,000 --> 00:27:26,000
And it does exactly like that script. Imagine like I took that local script, because once I had that local script running and working,而且它的動作和那個指令碼完全一樣。想像一下,我把那個本地指令碼,因為一旦我有那個本地指令碼運行和工作。
190
00:27:26,000 --> 00:27:32,00000:27:26,000 --> 00:27:32,000
all I got to do is host that script on the Internet. Right. And basically hit that script and it will return and then show it to the user.我所要做的就是把這個指令碼放在網際網路上。對。基本上點選那個指令碼,它就會返回,然後把它顯示給使用者。
191
00:27:32,000 --> 00:27:36,00000:27:32,000 --> 00:27:36,000
And so it hits this, and I get the question that was asked.
192
00:27:36,000 --> 00:27:39,00000:27:36,000 --> 00:27:39,000
If it doesn’t have a question mark at the end, I add one. And then this is a cache. Right.
193
00:27:39,000 --> 00:27:44,00000:27:39,000 --> 00:27:44,000
So if the question has been asked exactly like it’s been asked before, it just auto returns it.所以,如果這個問題已經被問過了,就像以前被問過的那樣,它只是自動返回。
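The normalize-then-cache behaviour can be sketched like this. An in-memory dict is an assumption made for the example (the stream does not say where the real cache lives), and `fake_answer` stands in for the expensive call to the model.

```python
cache = {}

def normalize(question):
    """Trim whitespace and make sure the question ends with a question mark."""
    question = question.strip()
    return question if question.endswith("?") else question + "?"

def ask(question, answer_fn):
    """Return a cached answer for an exact repeat, else compute and store it."""
    question = normalize(question)
    if question in cache:
        return cache[question]
    answer = answer_fn(question)
    cache[question] = answer
    return answer

calls = []

def fake_answer(question):
    # Records each call so we can see when the cache was bypassed.
    calls.append(question)
    return f"(answer to: {question})"

first = ask("What is community", fake_answer)
second = ask("What is community?", fake_answer)
print(first == second, len(calls))
```

Adding the question mark before the lookup means both spellings share one cache entry, so the model is only called once.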
194
00:27:44,000 --> 00:27:51,00000:27:44,000 --> 00:27:51,000
Otherwise, it does all that stuff that I talked about, which is it fetches the pages and then it fetches those embeddings,
195
00:27:51,000 --> 00:27:55,00000:27:51,000 --> 00:27:55,000
which is that abstract math representation. And then it answers that question.這是一個抽象的數學表示。然後它回答這個問題。
196
00:27:55,000 --> 00:28:02,00000:27:55,000 --> 00:28:02,000
Right. And the way it answers that question is exactly that same thing, which is if you just go to answer question, answer query with question,對。而它回答這個問題的方式也是一樣的,就是如果你只是去回答問題,用問題來回答查詢。
197
00:28:02,000 --> 00:28:06,00000:28:02,000 --> 00:28:06,000
it does the prompt, and this is the OpenAI completion.
198
00:28:06,000 --> 00:28:11,00000:28:06,000 --> 00:28:11,000
All I do is I just send that prompt and some parameters and return whatever returns.我所做的只是傳送那個提示和一些參數,然後返回任何回報。
199
00:28:11,000 --> 00:28:17,00000:28:11,000 --> 00:28:17,000
Right. And if you go to the prompt construction, which is this the def construct prompt, it’ll just take you right here and you’ll see that part.對。如果你去看提示結構,也就是這個def construct提示,它就會帶你到這裡,你會看到這部分。
200
00:28:17,000 --> 00:28:23,00000:28:17,000 --> 00:28:23,000
Right. Which is basically based on those pages, which is that CSV file and those embeddings,對。這基本上是基於那些頁面,也就是那個CSV文件和那些嵌入物。
201
00:28:23,000 --> 00:28:29,00000:28:23,000 --> 00:28:29,000
which is another CSV file, and the question, which is a string of the question the user asked, do that thing.
202
00:28:29,000 --> 00:28:33,00000:28:29,000 --> 00:28:33,000
Right. Go through those pages or go through that, use the embeddings, like basically.對。通過這些頁面或通過那個,使用嵌入,像基本上。
203
00:28:33,000 --> 00:28:43,00000:28:33,000 --> 00:28:43,000
Use the embeddings to figure out what pages are most relevant and then based on how much space, which is 500 tokens, I think, stuff it,使用嵌入來找出哪些頁面是最相關的,然後根據多少空間,也就是500個標記,我想,把它塞進去。
204
00:28:43,000 --> 00:28:50,00000:28:43,000 --> 00:28:50,000
and then add these, add that sort of prompt engineering bit and return that string.然後新增這些,新增那種提示工程位,並返回那個字串。
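A minimal sketch of the "stuff pages in until the space runs out" step, assuming the pages arrive already ranked by relevance and each carries a precomputed token count. The 500-token budget is the figure recalled above; the page data and function name are invented.

```python
MAX_CONTEXT_TOKENS = 500  # the figure recalled in the stream ("I think")

def choose_sections(ranked_pages, max_tokens=MAX_CONTEXT_TOKENS):
    """Take pages in relevance order until the token budget is used up."""
    chosen, used = [], 0
    for title, text, tokens in ranked_pages:
        if used + tokens > max_tokens:
            break  # the next page would not fit, so stop here
        chosen.append(f"{title}: {text}")
        used += tokens
    return chosen

# (title, text, token count), most relevant first; values are made up.
ranked_pages = [
    ("Page 12", "text of the most relevant page", 220),
    ("Page 57", "text of the next page", 200),
    ("Page 90", "text of a less relevant page", 150),
]

context = choose_sections(ranked_pages)
print(context)
```

Here the first two pages fit (420 tokens) and the third would push the total to 570, so it is left out.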
205
00:28:50,000 --> 00:28:55,00000:28:50,000 --> 00:28:55,000
And then it comes back with that string. And I’m not done yet because there’s one more API, which was that voice.然後它就會返回那個字串。而我還沒有完成,因為還有一個API,就是那個聲音。
206
00:28:55,000 --> 00:29:00,00000:28:55,000 --> 00:29:00,000
Right. And which is pretty simple, which is I just take that answer, which is a string, right?對。這是很簡單的,就是我只是把那個答案,也就是一個字串,對嗎?
207
00:29:00,000 --> 00:29:06,00000:29:00,000 --> 00:29:06,000
The answer is like, “The Minimalist Entrepreneur is about...” or whatever. And then I hit this other API with a bunch of parameters.
208
00:29:06,000 --> 00:29:09,00000:29:06,000 --> 00:29:09,000
Doesn’t really matter. They’re just just copy pasted from their docs.其實並不重要。他們只是從他們的文件中複製貼上出來的。
209
00:29:09,000 --> 00:29:14,00000:29:09,000 --> 00:29:14,000
The only things that matter are basically that I’m hitting the right voice, basically using my voice.唯一重要的是,基本上我打的是正確的聲音,基本上用我的聲音。
210
00:29:14,000 --> 00:29:20,00000:29:14,000 --> 00:29:20,000
And then I just have the string of the answer, and it spits back this response, which is a JSON object.
211
00:29:20,000 --> 00:29:25,00000:29:20,000 --> 00:29:25,000
And then it basically just spits back the URL of the audio that I would need to play.
212
00:29:25,000 --> 00:29:30,00000:29:25,000 --> 00:29:30,000
And then I return that as JSON. So I have the question, I have the answer, and then I have the URL of the audio.然後我把它作為JSON返回。所以我有問題,我有答案,然後我有音訊的URL。
213
00:29:30,000 --> 00:29:37,00000:29:30,000 --> 00:29:37,000
And then you go back here in this code, you’ll see. Basically, get the audio.然後你回到這段程式碼中,你會看到。基本上,獲得音訊。
214
00:29:37,000 --> 00:29:46,00000:29:37,000 --> 00:29:46,000
And then get the audio element and then basically play it, use it to play that, basically take that URL, stuff it into that audio element and play play, right?然後得到音訊元素,然後基本上播放它,用它來播放那個,基本上把那個URL,塞進那個音訊元素,然後播放,對嗎?
215
00:29:46,000 --> 00:29:51,00000:29:46,000 --> 00:29:51,000
Start playing that audio at 50 percent volume because no one needs to hear my voice.開始以50%的音量播放那個音訊,因為沒有人需要聽到我的聲音。
216
00:29:51,000 --> 00:29:56,00000:29:51,000 --> 00:29:56,000
And that’s it. That’s really all that this is. And if you look at the lines of code, it’s just this, right?
217
00:29:56,000 --> 00:30:03,00000:29:56,000 --> 00:30:03,000
Obviously there’s a little bit more, and I’ll go through the scripts as well. But the index file is, like, what, 100 lines of code or so.
218
00:30:03,000 --> 00:30:11,00000:30:03,000 --> 00:30:11,000
There’s the views, which is about and I’ll show you there’s a little bit more complexity here, but if you really look at this, it’s probably another 100 lines of code or something like that.還有檢視,這是關於,我會告訴你這裡有一點更複雜,但如果你真的看這個,它可能是另一個100行的程式碼或類似的東西。
219
00:30:11,000 --> 00:30:19,00000:30:11,000 --> 00:30:19,000
And that’s it. Pretty simple. And then I wrote some other scripts. For example, I have a PDF, the manuscript PDF.
220
00:30:19,000 --> 00:30:26,00000:30:19,000 --> 00:30:26,000
I have to convert that into that CSV. So I wrote some code to do that. Just extract the pages, about 50 lines of code.我必須把它轉換為CSV。所以我寫了一些程式碼來做這個。只是提取頁面,大約50行程式碼。
221
00:30:26,000 --> 00:30:33,00000:30:26,000 --> 00:30:33,000
And I’m sure this can be done in 10. It’s not the most efficient. Then, similarly, I take that CSV and I have to generate those embeddings.
222
00:30:33,000 --> 00:30:37,00000:30:33,000 --> 00:30:37,000
This is even simpler, which is literally completely copy pasted from OpenAI.這就更簡單了,這簡直就是完全從OpenAI上複製貼上過來的。
223
00:30:37,000 --> 00:30:47,00000:30:37,000 --> 00:30:47,000
The only thing I change is making sure I pull the right file. And then I have this script, which is the thing that I wrote, that basically uses the embeddings and the pages CSV.
224
00:30:47,000 --> 00:30:54,00000:30:47,000 --> 00:30:54,000
And then once I had that and I was like, OK, cool, I have a local script that I can ask questions. Now I just have to get it onto the Internet.然後一旦我有了這個,我就想,好的,很酷,我有一個本地指令碼,我可以問問題。現在我只需要把它放到網際網路上。
225
00:30:54,000 --> 00:31:04,00000:30:54,000 --> 00:31:04,000
And it’s really simple to do that. I literally just followed this documentation, and then I replaced the index with my own code and wrote that.
226
00:31:04,000 --> 00:31:10,00000:31:04,000 --> 00:31:10,000
Basically took that prompt and that Python script and kind of put it in the back end.基本上,我把那個提示和Python指令碼放在了後端。
227
00:31:10,000 --> 00:31:17,00000:31:10,000 --> 00:31:17,000
Yeah, that’s about it. It took me about 10 hours, I think, all in, to build. Let me see what other things I have to show. I think I showed everything.
228
00:31:17,000 --> 00:31:24,00000:31:17,000 --> 00:31:24,000
So, yeah, this is the example that OpenAI provides. The example they use is questions about the Olympics.
229
00:31:24,000 --> 00:31:29,00000:31:24,000 --> 00:31:29,000
And so they create embeddings off of Wikipedia.
230
00:31:29,000 --> 00:31:34,00000:31:29,000 --> 00:31:34,000
So basically, instead of creating embeddings off Wikipedia, I have to create them off of a PDF of my book.因此,基本上,而不是建立嵌入維基百科,我必須建立他們對我的書的PDF。
231
00:31:34,000 --> 00:31:38,00000:31:34,000 --> 00:31:38,000
Right. A little different. But the end output is the same, which is you have a CSV of all the data.
232
00:31:38,000 --> 00:31:44,00000:31:38,000 --> 00:31:44,000
Right. Basically, the data that the embeddings are computed from. And then similarly, you can see how they do this kind of prompt engineering.
233
00:31:44,000 --> 00:31:48,00000:31:44,000 --> 00:31:48,000
And you can do stuff like this. Right. You could say answer the question as truthfully as possible.而且你可以做這樣的事情。對。你可以說儘可能真實地回答這個問題。
234
00:31:48,000 --> 00:31:54,00000:31:48,000 --> 00:31:54,000
And if the answer is not contained within the text below, respond by saying, “I don’t know.”
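As a sketch, that kind of guard instruction is just a fixed header placed ahead of the context. The header wording below paraphrases what is read out above; the `Context:` and `Q:`/`A:` labels and the function name are assumptions.

```python
HEADER = (
    "Answer the question as truthfully as possible using the provided context, "
    "and if the answer is not contained within the text below, "
    "say \"I don't know.\""
)

def guarded_prompt(context, question):
    """Prefix the context and question with the refusal instruction."""
    return f"{HEADER}\n\nContext:\n{context}\n\nQ: {question}\nA:"

guarded = guarded_prompt("(relevant pages go here)", "Who won the 100m final?")
print(guarded)
```

Narrowing the header like this is one way to make a broad question-answering app stay closer to its source text.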
235
00:31:54,000 --> 00:31:58,00000:31:54,000 --> 00:31:58,000
So, for example, a bunch of people asked me about this. Ask My Book is currently very broad.
236
00:31:58,000 --> 00:32:03,00000:31:58,000 --> 00:32:03,000
And I wanted it to be broad because I’m curious about the kinds of questions that people will ask it.
237
00:32:03,000 --> 00:32:06,00000:32:03,000 --> 00:32:06,000
But I actually think it should be pretty specific. I think ideally and you could totally do this.但實際上我認為它應該是相當具體的。我認為理想的情況下,你完全可以這樣做。
238
00:32:06,000 --> 00:32:13,00000:32:06,000 --> 00:32:13,000
I think someone asked, like, can this link to the page that it gets the information from?
239
00:32:13,000 --> 00:32:21,00000:32:13,000 --> 00:32:21,000
100 percent. It could do that. Right. Because ultimately, at the end of the day, all you have to think about is.百分之百。它可以做到這一點。對。因為最終,在一天結束的時候,你所要考慮的是。
240
00:32:21,000 --> 00:32:34,00000:32:21,000 --> 00:32:34,000
What are you putting in this? Right. So if you are able to take something and put it in here, to fit above the fold, then in theory, GPT-3 should be able to help you out.
241
00:32:34,000 --> 00:32:47,00000:32:34,000 --> 00:32:47,000
So, for example, let’s say. I wanted to return the pages, right? I wanted to I wanted it to say, if you’re interested in learning more, reading more, this is in page.所以,舉個例子,我們說。我想返回頁面,對嗎?我想我想它說,如果你有興趣瞭解更多,閱讀更多,這是在頁面。
242
00:32:47,000 --> 00:32:59,00000:32:47,000 --> 00:32:59,000
Let’s say this is in Chapter three. Ideally, let’s go for the ideal. Right. Which would be I would say community is a group of people with a shared interest, purpose or goal.讓我們說這是在第三章。理想情況下,讓我們去爭取理想。對。這將是我想說的社區是一群有共同興趣、目的或目標的人。
243
00:32:59,000 --> 00:33:13,00000:32:59,000 --> 00:33:13,000
If you’re interested in learning more. Check out pages five to 50, two to seven, chapter three.如果你有興趣瞭解更多。請查看第五至五十頁,第二至七頁,第三章。
244
00:33:13,000 --> 00:33:23,00000:33:13,000 --> 00:33:23,000
We need first or whatever. Right. Something like that. You would basically just have to say, OK, how do I get GPT-3 to do that?
245
00:33:23,000 --> 00:33:30,00000:33:23,000 --> 00:33:30,000
Right. Like how would I get it to respond? And one would be all of these examples. You’d probably want to provide that. Right.對。就像我怎麼能讓它做出反應?而其中一個就是所有這些例子。你可能想提供這個。對。
246
00:33:30,000 --> 00:33:38,00000:33:30,000 --> 00:33:38,000
So you’d probably want to train it with a pattern. Right. And so you would go through and you would do this realistically.
247
00:33:38,000 --> 00:33:48,00000:33:38,000 --> 00:33:48,000
You would get the right answer and then you would you would say, OK, this is page 28. To 29, chapter two, profits, even first.你會得到正確的答案,然後你會你會說,OK,這是第28頁。到29,第二章,利潤,甚至第一。
248
00:33:48,000 --> 00:34:00,00000:33:48,000 --> 00:34:00,000
And, boom, all of a sudden it will start doing it. It will be wrong, because it doesn’t actually know the chapters necessarily, or maybe it’ll be surprisingly right.
249
00:34:00,000 --> 00:34:05,00000:34:00,000 --> 00:34:05,000
I’ve been impressed sometimes with how good it is, but I think at least it’ll start to respond with stuff.有時我對它的好印象很深,但我想至少它將開始用東西來回應。
250
00:34:05,000 --> 00:34:13,00000:34:05,000 --> 00:34:13,000
It’ll start to say, “If you’re interested in learning more, check out pages...” and it’ll just try. The cool thing about AI is it doesn’t have imposter syndrome.
251
00:34:13,000 --> 00:34:20,00000:34:13,000 --> 00:34:20,000
So it’ll try and it will spit something back and it might be stupid, really bad. Right. And then you’d be OK.所以它會嘗試,它會吐出一些東西回來,它可能是愚蠢的,非常糟糕。對。然後你就會沒事了。
252
00:34:20,000 --> 00:34:28,000
It doesn’t know. It’s not fair. You can’t make someone take a test without allowing them to study.
253
00:34:28,000 --> 00:34:32,00000:34:28,000 --> 00:34:32,000
Right. Or think about it like an open book test. So you have to get smart about the embeddings.對。或者把它想成是一個開卷考試。所以你必須對嵌入的東西變得聰明。
254
00:34:32,000 --> 00:34:42,000
And I think that’s where the engineering would come from, which is: you have that manuscript PDF and you have to get the right context into the prompt.
255
00:34:42,000 --> 00:34:49,00000:34:42,000 --> 00:34:49,000
And so what I would do is I would basically go back to that script and so go back to.所以我會做的是,我基本上會回到那個指令碼,所以回到了。
256
00:34:49,000 --> 00:35:01,00000:34:49,000 --> 00:35:01,000
PDF to pages content, right? This is a very simple script. It basically opens up the PDF and then it goes through each page and it extracts it.PDF到頁面內容,對嗎?這是一個非常簡單的指令碼。它基本上打開了PDF,然後它通過每一頁,並提取它。
257
00:35:01,000 --> 00:35:16,000
And all it does, basically, is put it into a CSV where the first column is the page, like page seven, the second column is the text, and then the number of tokens, which I don’t think I actually use, but I copied it from the Internet.
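A minimal sketch of the CSV layout described here, assuming the page texts have already been pulled out of the PDF by some extraction library. The function name `pages_to_csv`, the column names, the sample text, and the word count used as a stand-in for a real token count are all illustrative assumptions, not the script shown on the stream.

```python
import csv
import io


def pages_to_csv(pages):
    """Return CSV text with one row per page: title, content, tokens."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["title", "content", "tokens"])
    for number, text in enumerate(pages, start=1):
        # Collapse newlines and repeated spaces so each page is one cell.
        content = " ".join(text.split())
        # Assumption: a plain word count stands in for a real tokenizer.
        tokens = len(content.split())
        writer.writerow([f"Page {number}", content, tokens])
    return buffer.getvalue()


# Made-up page texts, only to show the shape of the output.
sample_pages = ["Introduction\n5", "Chapter one\nSome text on the page."]
print(pages_to_csv(sample_pages))
```

Keeping one row per page is the simplest thing that works; anything smarter (chapters, sections) can be added as extra columns later.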
258
00:35:16,000 --> 00:35:22,00000:35:16,000 --> 00:35:22,000
So that’s there. And so it’s it’s like this. Right. But it won’t be able to do things like that or the chapters.所以那是有的。所以它是它是這樣的。對。但它不會做這樣的事情或章節。
259
00:35:22,000 --> 00:35:31,000
And so I would have to be clever. It says page one, and then I would have to basically make this script smarter and smarter so that it also says chapter one.
260
00:35:31,000 --> 00:35:38,000
Or you could say page one, chapter one. And so every page is tagged with a thing.
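One way the “page one, chapter one” tagging could be sketched, assuming a chapter starts on any page whose text begins with the word “Chapter”. That heuristic, the regular expression, and the name `tag_pages_with_chapters` are assumptions for illustration; a real manuscript would likely need something more careful.

```python
import re

# Assumption: a new chapter begins on a page that starts with "Chapter <something>".
CHAPTER_HEADING = re.compile(r"^chapter\s+(\w+)", re.IGNORECASE)


def tag_pages_with_chapters(pages):
    """Return (label, text) pairs such as ('Page 2, Chapter one', '...')."""
    current_chapter = None
    tagged = []
    for number, text in enumerate(pages, start=1):
        match = CHAPTER_HEADING.match(text.strip())
        if match:
            current_chapter = match.group(1)
        label = f"Page {number}"
        # Pages before the first chapter heading keep only the page label.
        if current_chapter is not None:
            label += f", Chapter {current_chapter}"
        tagged.append((label, text))
    return tagged


pages = ["Introduction", "Chapter one\nText", "More text", "Chapter two\nText"]
for label, _ in tag_pages_with_chapters(pages):
    print(label)
```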
261
00:35:38,000 --> 00:35:47,00000:35:38,000 --> 00:35:47,000
And then if you provide examples that are referencing this, it should be able to guess and say, oh, yeah, OK, cool.然後,如果你提供的例子是參考這個,它應該能夠猜到並說,哦,是的,好的,很酷。
262
00:35:47,000 --> 00:35:52,000
I’m answering this question based on this, and so I should be able to respond. Oh, Ted, what’s up?
263
00:35:52,000 --> 00:35:55,000
Let me answer some other questions and then I’ll let you chat.
264
00:35:55,000 --> 00:35:59,000
Mim asks, how do the embeddings know what pages are most relevant to the answers?
265
00:35:59,000 --> 00:36:06,000
So how does it know is a good question. That’s like the magic. Let’s see what they say.
266
00:36:06,000 --> 00:36:23,000
It’s an information-dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating point numbers, such that the distance between two embeddings in the vector space is correlated with semantic similarity between two inputs in the original format.
267
00:36:23,000 --> 00:36:33,000
So basically it’s compression. Right. Let’s say you have 50 images: instead of having 50 massive images, you have 50 labels of the images.
268
00:36:33,000 --> 00:36:37,000
Right. So it’s kind of like that, which is, you don’t have 52,000 words.
269
00:36:37,000 --> 00:36:44,000
You have, like, this concept connects to this concept and this concept connects to this concept. And yeah, it’s just much more broad.
270
00:36:44,000 --> 00:36:51,00000:36:44,000 --> 00:36:51,000
I’m sure Ted can actually answer that way better. So let me see if he wants to give it a shot. Ted, what’s up?我敢肯定,特德實際上可以更好地回答這種方式。因此,讓我看看他是否願意給它一個鏡頭。泰德,怎麼了?
271
00:36:51,000 --> 00:36:54,00000:36:51,000 --> 00:36:54,000
Hey, how’s it going? Can you hear me? Yeah.嘿,情況怎麼樣了?你能聽到我嗎?是的。
272
00:36:54,000 --> 00:36:59,00000:36:54,000 --> 00:36:59,000
So what was embeddings? I was typing the question to you in the chat box, so I didn’t hear what the prior question is.那麼,什麼是嵌入?我是在聊天框裡向你打的問題,所以我沒有聽到前面的問題是什麼。
273
00:36:59,000 --> 00:37:05,00000:36:59,000 --> 00:37:05,000
I was wondering if you want to take a stab at answering what embeddings, how embeddings work, what’s magical about them.我想知道你是否想試著回答一下什麼是嵌入,嵌入是如何工作的,它們有什麼神奇之處。
274
00:37:05,000 --> 00:37:11,000
Oh, yes. I think the way you described it, thinking of it as building your own search engine, is great.
275
00:37:11,000 --> 00:37:19,000
Another way to think about what they actually are: you can think of it as the MRI of a neural network when it’s thinking about a thing.
276
00:37:19,000 --> 00:37:24,00000:37:19,000 --> 00:37:24,000
So one way to talk to GPT-3 is to give it some input and then look at what it says back.因此,與GPT-3對話的一種方式是給它一些輸入,然後看看它的回話。
277
00:37:24,000 --> 00:37:31,00000:37:24,000 --> 00:37:31,000
But another way to talk to it is to give it some input and then take an MRI of its brain right in the middle of it, thinking about your input.但另一種和它對話的方式是給它一些輸入,然後在它的大腦中做一個核磁共振成像,思考你的輸入。
278
00:37:31,000 --> 00:37:39,000
And then the reason why that’s interesting is that it gives you back a bunch of numbers. But those numbers have this really useful property.
279
00:37:39,000 --> 00:37:48,000
If you think of the numbers as coordinates in some big space, like you’re plotting them on a graph, similar things tend to be near similar things.
280
00:37:48,000 --> 00:38:03,000
And so when it’s thinking about your question, if you were to plot the set of numbers it gives you back as dots on a giant graph, it’s going to put your question in a similar spot on that graph to where it put the answer to your question.
281
00:38:03,000 --> 00:38:11,00000:38:03,000 --> 00:38:11,000
Because that’s how, you know, when you think about it metaphorically, that’s like what your brain should be doing. And so it’s just a way to like peek into the MRI scan.因為這就是,你知道,當你思考它的比喻,這就像你的大腦應該做的。所以這只是一種方式,就像偷看核磁共振掃描。
282
00:38:11,000 --> 00:38:19,000
And there are some good things about it and some tricky things about it, but it gives you the answer to “what other things do I already know about that are nearby?” That is how I think about it.
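The “nearby things” idea can be sketched with cosine similarity over vectors. The three-dimensional vectors below are made up for illustration (real embeddings come back from an embeddings API and have far more dimensions), and `rank_by_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means 'nearby'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, page_vectors):
    """Return page titles sorted from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), title)
        for title, vector in page_vectors.items()
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored]


# Made-up vectors standing in for real page embeddings.
page_vectors = {
    "Page 1": [0.9, 0.1, 0.0],
    "Page 2": [0.1, 0.9, 0.1],
    "Page 3": [0.0, 0.2, 0.9],
}
question_vector = [0.2, 0.8, 0.1]
print(rank_by_similarity(question_vector, page_vectors))
```

Here the question vector points in nearly the same direction as “Page 2”, so that page ranks first.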
283
00:38:19,000 --> 00:38:20,00000:38:19,000 --> 00:38:20,000
Awesome. Cool.真棒。酷。
284
00:38:20,000 --> 00:38:32,000
Yeah, and I think this is what it might look like if you were to visualize it, which is, yeah, kind of an MRI of how these concepts in your brain may be connected to each other.
285
00:38:32,000 --> 00:38:41,00000:38:32,000 --> 00:38:41,000
It’s kind of like when you smell something and then like your brain is like going to spin up like 20 different memories or something, but you have gazillions of memories, right? Like how is it picking those?這有點像當你聞到一些東西,然後像你的大腦就像要旋轉起來,像20個不同的記憶或東西,但你有數以百萬計的記憶,對嗎?就像它是如何挑選那些?
286
00:38:41,000 --> 00:38:48,000
It has to, because you’d go insane if you just got every memory. So that’s what the brain is good at. It’s good at surfacing the right ones.
287
00:38:48,000 --> 00:38:59,00000:38:48,000 --> 00:38:59,000
When I do it, when I say, when I talk, I’m able to, my brain is constantly like getting all of these weights and able to fill in the next thing I’m going to say based on all the, all the things I’ve heard and all the things I’ve said.當我這樣做的時候,當我說,當我說話的時候,我能夠,我的大腦不斷地像得到所有這些權重,能夠根據所有,所有我聽到的東西和所有我說過的東西來填補我接下來要說的東西。
288
00:38:59,000 --> 00:39:11,000
And that would take just too long. So it has to get efficient. And that’s what I think is cool about GPT-3: you’re really taking 52,000 words, finding the right 500 words, and using those in the prompt.
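A rough sketch of “find the right 500 words and put them in the prompt”, assuming the pages have already been ranked from most to least relevant. The 500-word budget comes from the stream; the prompt layout and the name `build_prompt` are assumptions.

```python
def build_prompt(question, ranked_pages, max_context_words=500):
    """Concatenate the best-ranked pages until the word budget is used up."""
    chosen = []
    used = 0
    for text in ranked_pages:
        words = len(text.split())
        # Stop at the first page that would push the context over budget.
        if used + words > max_context_words:
            break
        chosen.append(text)
        used += words
    context = "\n\n".join(chosen)
    # Hypothetical prompt layout: context first, then the question.
    return f"Context:\n{context}\n\nQ: {question}\nA:"


ranked = ["a b c", "d e", "f g h i"]
print(build_prompt("Why?", ranked, max_context_words=5))
```

Stopping at the first page that does not fit keeps the most relevant text intact rather than truncating a page mid-sentence.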
289
00:39:11,000 --> 00:39:24,00000:39:11,000 --> 00:39:24,000
And so for me, it’s really a simple thing, which is it’s yeah, it’s like Google search, except you’re not searching the whole internet. You’re searching a refined space, but with the power and the flexibility and the fuzziness that Google has, it’s just like, I can’t build my own Google.所以對我來說,這真的是一件簡單的事情,也就是它是的,它就像Google搜尋,除了你不是在搜尋整個網際網路。你是在搜尋一個精緻的空間,但以Google的力量和靈活性以及模糊性,它只是像,我不能建立我自己的Google。
290
00:39:24,000 --> 00:39:33,000
But all of a sudden I can build my own search engine on my little document, and it scales, right? If someone wanted to, you could take that script and do this for Wikipedia, which was mentioned before.
291
00:39:33,000 --> 00:39:44,000
I’m going to talk to Tiago Forte from Building a Second Brain. Ted, have you talked to him? Did you reach out to him? I saw someone reached out to him. So maybe he’s already figured it out, but he has like a million words sitting around.
292
00:39:44,000 --> 00:39:57,000
He’s like the king of that. And so I think it would be great if he could ask a question. Right now he probably just has to do Ctrl+F and say, okay, I don’t know, “woodworking” or something, but you’re going to miss all the places where maybe he misspelled woodworking.
293
00:39:57,000 --> 00:40:13,00000:39:57,000 --> 00:40:13,000
And so anyway, it’s pretty cool. And then Mim asks, is the voiceover dynamically generated? So it is. Yeah. So basically it’s just another API that I hit that you, and I’ll show you actually what it is. It’s called Resemble. It’s pretty cool.所以無論如何,這是很不錯的。然後米姆問,這個配音是動態生成的嗎?所以它是。是的。所以基本上這只是我打的另一個API,我會告訴你它是什麼。它叫做 "相似"。它非常的酷。
294
00:40:13,000 --> 00:40:16,00000:40:13,000 --> 00:40:16,000
It works.它是有效的。
295
00:40:16,000 --> 00:40:26,00000:40:16,000 --> 00:40:26,000
So you can see there’s been 3,500 clips generated. They have a UI. So if you want to do this manually, like if you’re, you have some more manual, smaller scale use case, you can just go in and type the thing. You have your voice in here.所以你可以看到已經生成了3500個片段。他們有一個使用者介面。因此,如果你想手動做這個,比如你,你有一些更多的手動,規模較小的用例,你可以直接進入並輸入這個東西。你有你的聲音在這裡。
296
00:40:26,000 --> 00:40:35,00000:40:26,000 --> 00:40:35,000
So I just sent them some recordings of mine and they were able to spin up this voice and it’s pretty crazy. And then you just type into this and it just, it’s pretty fast. Yeah.因此,我只是給他們發了一些我的錄音,他們能夠旋轉這個聲音,這是很瘋狂的。然後你就在這裡面打字,它就,它就相當快。是的。
297
00:40:35,000 --> 00:40:39,00000:40:35,000 --> 00:40:39,000
I don’t know. Maybe it’s not that fast. It’s slow right now. Hi.我不知道。也許它沒有那麼快。它現在很慢。你好。
298
00:40:39,000 --> 00:40:50,00000:40:39,000 --> 00:40:50,000
Yeah. Sounds like how I would say it. And yeah. What other things are here? No, I think I covered everything. Any other questions?是的。聽起來像我說的那樣。也是。這裡還有什麼東西?沒有,我想我已經涵蓋了一切。還有什麼問題嗎?
299
00:40:50,000 --> 00:41:00,000
“I noticed you are using the Curie model. Why not Davinci?” Basically, it comes down to speed, cost, and how good it is. Right.
300
00:41:00,000 --> 00:41:15,000
So you have to just measure. I think Curie is 10 times cheaper than Davinci. I would think about it like this: Davinci is what you would use if you’re really trying to impress someone, and for anything else you probably want to use Curie or something like that.
301
00:41:15,000 --> 00:41:25,000
And you can see this: there’s Davinci, there’s Curie. It’s actually kind of crazy how big the gaps are. It’s almost like, do you want to hire a first grader or a third grader?
302
00:41:25,000 --> 00:41:39,00000:41:25,000 --> 00:41:39,000
So you can see Ada is both the fastest and the cheapest, but presumably it’s not going to get super smart. So if you’re at, if you’re building, let’s say a math thing, I would like, let’s say a little like chatbot where you can ask it fuzzy math questions and it would answer.所以你可以看到Ada既是最快的,也是最便宜的,但想必它不會變得超級聰明。因此,如果你在,如果你在建立,比方說一個數學的東西,我想,比方說一個有點像聊天機器人,你可以問它模糊的數學問題,它會回答。
303
00:41:39,000 --> 00:41:47,00000:41:39,000 --> 00:41:47,000
I would say it’s probably, you could lean more to this side, but if you’re asking things that are more conceptually interesting, maybe this side, but you can see like the difference, right?我想說的是,你可以更傾向於這一邊,但如果你問的東西在概念上更有趣,也許這一邊,但你可以看到像區別,對嗎?
304
00:41:47,000 --> 00:41:58,000
So this is four ten-thousandths of a dollar (sorry, of a dollar, not of a cent), and this is two cents. It’s something like a 50x difference. It’s a big difference.
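The arithmetic behind the “50x” remark, as a small sketch. The two end-point prices are the ones read out on the stream; treating them as dollars per 1,000 tokens, assigning them to Ada and Davinci, and deriving Curie from the “10 times cheaper than Davinci” comment are assumptions.

```python
from decimal import Decimal

# Assumption: prices are dollars per 1,000 tokens.
PRICE_PER_1K = {
    "ada": Decimal("0.0004"),    # "four ten-thousandths of a dollar"
    "curie": Decimal("0.002"),   # derived from "10 times cheaper than Davinci"
    "davinci": Decimal("0.02"),  # "two cents"
}


def cost(model, tokens):
    """Dollar cost of processing `tokens` tokens with `model`."""
    return PRICE_PER_1K[model] * Decimal(tokens) / Decimal(1000)


ratio = PRICE_PER_1K["davinci"] / PRICE_PER_1K["ada"]
print(ratio)
print(cost("ada", 52_000), cost("davinci", 52_000))
```

`Decimal` is used instead of floats so that the ratio comes out as an exact 50 rather than a value with rounding noise.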
305
00:41:58,000 --> 00:42:11,00000:41:58,000 --> 00:42:11,000
So yeah, you’d want to experiment with those things. And speed is another one. Ted says, what did you include in your embedding lookup? Every sentence, first sentence of paragraphs, just section titles.所以,是的,你會想用這些東西做實驗。而速度是另一個問題。泰德說,你在你的嵌入查詢中包括什麼?每一句話,段落的第一句話,只是章節標題。
306
00:42:11,000 --> 00:42:22,000
So I put in just the page. Literally, page one is the first column: page N, and then the entire text on the page, whatever it is.
307
00:42:22,000 --> 00:42:31,000
So it’s not that smart at all. And I used the one version that I had lying around, so it has all these weird comments and stuff in it, and it’s still pretty decent.
308
00:42:31,000 --> 00:42:43,00000:42:31,000 --> 00:42:43,000
But yeah, I think that’s a good example of something that I think you could severely improve. And I think there’s a huge opportunity for kind of startups to get into this, right? Where you’re able to set up the right sort of format.但是,是的,我想這是一個很好的例子,我認為你可以嚴重改善的東西。我認為有一個巨大的機會,種初創企業進入這個,對不對?在那裡你能夠建立正確的格式。
309
00:42:43,000 --> 00:42:53,00000:42:43,000 --> 00:42:53,000
And so let’s say you have a book, you might want to be able to split it up into sections or chapters or paragraphs. I think there’s a bunch of different things that will work better for different things.因此,讓我們說你有一本書,你可能希望能夠把它分成幾個部分或章節或段落。我認為有一堆不同的東西,對不同的事情會有更好的效果。
310
00:42:53,000 --> 00:43:03,000
In an ideal world, I probably would have included page, chapter, and section, and done it per paragraph, right?
311
00:43:03,000 --> 00:43:12,000
So every paragraph, or actually every section, would be its own line in the CSV instead of every page. I just haven’t done that yet.
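A sketch of the finer-grained layout described here: one row per paragraph, each tagged with page, chapter, and section. It assumes paragraphs are separated by blank lines and that the page, chapter, and section are already known for each page; the field names and `paragraph_rows` are hypothetical.

```python
def paragraph_rows(page_number, chapter, section, page_text):
    """Split one page into paragraphs and tag each with its location."""
    # Assumption: a blank line marks a paragraph boundary.
    paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
    return [
        {
            "page": page_number,
            "chapter": chapter,
            "section": section,
            # Collapse line wraps inside the paragraph into single spaces.
            "content": " ".join(paragraph.split()),
        }
        for paragraph in paragraphs
    ]


rows = paragraph_rows(28, 2, "Intro", "First para\nwraps.\n\n\nSecond para.")
for row in rows:
    print(row)
```

Smaller rows mean each embedding covers one idea instead of a whole page, at the cost of more rows to embed and search.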
312
00:43:12,000 --> 00:43:26,00000:43:12,000 --> 00:43:26,000
Julie asks, is improving the quality of the embedding solely an engineering challenge or is it a writing content dataset challenge too? Could you have written the book differently so that Ask My Book works better?朱莉問,提高嵌入的質量僅僅是一個工程上的挑戰,還是也是一個寫內容資料集的挑戰?你能不能用不同的方式來寫書,讓《問道》的效果更好?
313
00:43:26,000 --> 00:43:47,00000:43:26,000 --> 00:43:47,000
Yeah, I think it’s mostly a, I guess it’s both. Like you can certainly, if you were writing content specifically to train the AI, I would think about it like a database of data, right? Like I would say, I’m writing that, I’m taking that PDF and I’m using that Python script to turn it into this CSV.是的,我認為這主要是一個,我想它是兩者都有。你當然可以,如果你專門寫內容來訓練人工智慧,我會把它想成一個資料資料庫,對嗎?就像我說的,我正在寫,我正在拿那個PDF,我正在用那個Python指令碼把它變成這個CSV。
314
00:43:47,000 --> 00:44:00,00000:43:47,000 --> 00:44:00,000
And I can show you what that CSV might look like. I go to Ask Book and find the pages.我可以告訴你CSV是什麼樣子的。我去問書,找到這些頁面。
315
00:44:00,000 --> 00:44:11,000
And here’s an example. It’s really simple: literally title, “Page 1”, then content, and it just stuffs it with “Introduction, 5”. That’s the page number in the corner that it’s probably pulling, right? Chapter one.
316
00:44:11,000 --> 00:44:23,00000:44:11,000 --> 00:44:23,000
It’s just literally, it’s just that. And so you can make it way smarter, but I would think about it like, this is how you would produce the content in the first place if you were using it to be trained, right? You would never even need the script.它只是從字面上看,它只是這樣。所以你可以讓它變得更聰明,但我想的是,如果你用它來訓練,這就是你首先要製作的內容,對嗎?你甚至都不需要指令碼。
317
00:44:23,000 --> 00:44:36,000
You would create a Notion database or table, or Google Sheets or Airtable, and then you would just create the data in that format right away. So I would say it’s 80 to 90% a content challenge: getting the right data and getting it into the right format.
318
00:44:36,000 --> 00:44:42,000
And then really a small engineering challenge. The engineering stuff is pretty simple, in my view.
319
00:44:42,000 --> 00:44:53,00000:44:42,000 --> 00:44:53,000
Mo says, what text use cases do you think it wouldn’t be suited for? I’m curious about fiction. Is it, would it be useful if you were to be able to ask questions of the Harry Potter franchise or something like that?莫說,你認為它不適合什麼文字用例?我對小說很好奇。它,如果你要能問哈利波特系列的問題或類似的問題,它是否有用?
320
00:44:53,000 --> 00:45:02,00000:44:53,000 --> 00:45:02,000
My guess is asking questions, it wouldn’t be that interesting. That’s probably better for Google or Wikipedia or something, but generating might be interesting, right?我的猜測是問問題,它不會是那麼有趣。這可能更適合於Google或維基百科什麼的,但生成可能是有趣的,對嗎?
321
00:45:02,000 --> 00:45:13,000
So for example, if you were like, hey, I want to write a story. What if the embeddings were able to pull from the canon that J.K. approved? You could build a Harry Potter writer or something like that.
322
00:45:13,000 --> 00:45:23,000
But I think... that’s all right. “I meant to ask what it’s not suited for.” I’m probably too optimistic about this. I can’t see what it wouldn’t be good for. Maybe there are certain things that you can’t compress that much. Right.
323
00:45:23,000 --> 00:45:35,000
So for example, Gödel, Escher, Bach, or certain books like that: there’s just no compression algorithm. Ironically, Gödel, Escher, Bach is largely about information and compression and things.
324
00:45:35,000 --> 00:45:45,000
And by the way, I highly recommend reading that book if people are interested in this kind of stuff. But yeah, if there’s something that you can’t compress, for example an M. Night Shyamalan movie or something like that, right.
325
00:45:45,000 --> 00:45:55,000
Where the whole thing only works as a whole, or Fight Club or something like that. It wouldn’t work for that kind of stuff. And it’s just AI, right? So you can almost think about it like this: there are certain things that are just a higher-grade reading level.
326
00:45:55,000 --> 00:46:06,00000:45:55,000 --> 00:46:06,000
Why is that? Because it’s more complex, different words, like all these reasons. And I would assume a lot of those map probably to the order in which AI will be able to think quote unquote about it and answer questions.為什麼會這樣?因為它更複雜,不同的詞,像所有這些原因。而我認為很多這些可能對應到人工智慧將能夠思考的順序,引用不引用它,回答問題。
327
00:46:06,000 --> 00:46:19,000
Flynn asks, what are your next steps with Ask My Book? Oh, I can show you. So I have a little Notion that I use. Everyone probably knows I like Notion. And so I just have a little thing, my internal to-do list.
328
00:46:19,000 --> 00:46:28,000
So it’s pretty short. Basically, I want to obfuscate the IDs. Right now it’s like question one, two, three, four, right? So I want to change that so that people can’t see what other questions people are asking.
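One common way to avoid guessable “question one, two, three, four” links is to give each question a random, URL-safe token instead of a sequential number. This is a sketch of that general technique, not the approach used in the app; `new_question_id` and the 8-byte token size are assumptions.

```python
import secrets


def new_question_id(existing_ids, nbytes=8):
    """Return a random URL-safe ID that is not already in use."""
    while True:
        candidate = secrets.token_urlsafe(nbytes)
        # Collisions are extremely unlikely, but checking is cheap.
        if candidate not in existing_ids:
            return candidate


ids = set()
for _ in range(3):
    ids.add(new_question_id(ids))
print(sorted(ids))
```

With random tokens, knowing one question’s link tells a visitor nothing about the link to any other question.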
329
00:46:28,000 --> 00:46:37,000
I think one really interesting thing about this is that you can ask someone questions without them knowing that you’re asking them questions, which I think can help with imposter syndrome.
330
00:46:37,000 --> 00:46:44,00000:46:37,000 --> 00:46:44,000
Like it might be beneficial to be able to ask Peter Thiel a question without him judging you for being stupid or you thinking that he would.就像能夠問彼得-蒂爾一個問題而不被他判斷為愚蠢或你認為他會這樣做,這可能是有益的。
331
00:46:44,000 --> 00:46:51,000
Right. So I want to obfuscate the IDs so people don’t get a sense of the other kinds of questions people are asking. And then I want to do it for another author.
332
00:46:51,000 --> 00:46:58,00000:46:51,000 --> 00:46:58,000
I want to, I mentioned, I want to try training it on like tweets and other things and see if I can see if it scales and how fast I can do it the second time.我想,我提到,我想嘗試在像推特和其他東西上訓練它,看看我是否能看到它的規模,以及我可以在第二次做它的速度。
333
00:46:58,000 --> 00:47:05,000
And then also just get a sense of what’s different and what’s the same between the two. Once you go from one to N, it’s like going from making one car to making two cars, right?
334
00:47:05,000 --> 00:47:15,00000:47:05,000 --> 00:47:15,000
You have to start to think about the factory a little bit more. And then I’m going to meet with Penguin Random House, which is my publisher, and just get their thoughts on what they think about this, if it’s interesting to them at all.你必須開始考慮工廠的問題,多一點。然後我將與企鵝蘭登書屋會面,這是我的出版商,只是瞭解他們的想法,如果這對他們來說是有趣的。
335
00:47:15,000 --> 00:47:24,000
Would this be something you would embed in, like, an author’s website? Is it too much? Are you giving away the farm when you allow people to ask questions of a book?
336
00:47:24,000 --> 00:47:33,000
A lot of publishers are concerned about this sort of Amazon monopoly on books and digital books and audiobooks. So maybe there’s a new format that might be compelling to them, who knows?
337
00:47:33,000 --> 00:47:50,000
Oh, and then the last one is to try fine-tuning a model with a bunch of real answers. Oh, one thing that I didn’t talk about that I should: you can’t see this because you have to be logged in, but basically I see all the questions that people have asked that I haven’t yet answered myself.
338
00:47:50,000 --> 00:48:03,00000:47:50,000 --> 00:48:03,000
And so basically I want to go through and answer these kinds of questions. I built this like little dashboard and you can imagine similar to that CSV, if you have a similar CSV of like question, answer, question, answer, question, answer, question, answer.因此,基本上我想通過和回答這些種類的問題。我建立了這個類似於小儀表盤的東西,你可以想像類似於那個CSV,如果你有一個類似於CSV的問題、答案、問題、答案、問題、答案、問題、答案。
339
00:48:03,000 --> 00:48:18,000
Then I want to try fine-tuning a model based on that. So effectively, instead of having to stuff the prompt, where you’re going to be limited in how many of these you can include, you can think of it as training a custom model on hundreds of these.
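A sketch of turning question/answer pairs into a fine-tuning file. It assumes the JSONL layout with `prompt` and `completion` keys that GPT-3 fine-tuning used at the time of the stream; the “Q: ... A:” prompt wording, the leading space and trailing newline on completions, and the name `to_finetune_jsonl` are illustrative choices rather than anything confirmed on the stream.

```python
import json


def to_finetune_jsonl(pairs):
    """Turn (question, answer) pairs into JSONL with prompt/completion keys."""
    lines = []
    for question, answer in pairs:
        record = {
            # Hypothetical prompt format; any consistent pattern would do.
            "prompt": f"Q: {question.strip()}\nA:",
            "completion": " " + answer.strip() + "\n",
        }
        # json.dumps escapes newlines, so each record stays on one line.
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"


# Made-up pair, only to show the output shape.
print(to_finetune_jsonl([("What is X?", "Y.")]))
```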
340
00:48:18,000 --> 00:48:31,00000:48:18,000 --> 00:48:31,000
They recommend 500 or more, which is why I haven’t done it. Just going to take a while, but then you could do both. You could have, you could still have this and then you could, you could simplify and you could try different things in terms of where you want the intelligence to come from.他們建議500或更多,這就是為什麼我沒有做。只是要花點時間,但你可以同時做。你可以有,你仍然可以有這個,然後你可以,你可以簡化,你可以嘗試不同的東西,在你希望情報來自哪裡。
341
00:48:31,000 --> 00:48:41,00000:48:31,000 --> 00:48:41,000
And you can see here, I’ve actually trained a couple of different models. So there’s two that I fine tuned on 10 answers just to see. And it’s not good. It’s not, it just says weird stuff.你可以在這裡看到,我實際上已經訓練了幾個不同的模型。所以有兩個我在10個答案上進行了微調,只是為了看看。這不是很好。它不是,它只是說奇怪的東西。
342
00:48:41,000 --> 00:48:57,00000:48:41,000 --> 00:48:57,000
I’m like, Oh, that’s creepy. Cause I train on 10 answers just to do it and go through the code. And yeah, in theory, if you train it enough, you don’t need anything, right? You can just do Q or you don’t need as much because you, in theory, you’ve taken all of that context and put it into the model itself, right.我想,哦,那是令人毛骨悚然的。因為我在10個答案上訓練,只是為了做它,並通過程式碼。是的,在理論上,如果你訓練得夠多,你就不需要任何東西,對嗎?你可以只做Q,或者你不需要那麼多,因為你,在理論上,你已經採取了所有的背景,並把它放入模型本身,對吧。
343
00:48:57,000 --> 00:49:04,00000:48:57,000 --> 00:49:04,000
In the brain, like a very kind of like shallow layer of the brain in a sense. Right. So yeah, that, those are some of the next things that I want to do with Ask My Book.在大腦中,從某種意義上說,就像大腦的一個非常樣的淺層。對。所以,是的,那,那些是我想做的一些下一步的事情,與問我的書。
344
00:49:04,000 --> 00:49:15,00000:49:04,000 --> 00:49:15,000
So it’s a combination of better data, better prompt engineering, and then advancements just generally in GPT-4, et cetera. And then the tweaking the parameters. There’s all these kind of things you can tweak.因此,這是一個更好的資料的組合,更好的提示工程,然後進步只是一般在GPT-4,等等。然後調整參數。有所有這些種類的東西,你可以調整。
345
00:49:15,000 --> 00:49:24,00000:49:15,000 --> 00:49:24,000
There’s also a lot of, I think, magic that’s really simple heuristics. For example, I want the, I want it to be more conversational, which means that there should be longer sentences and shorter sentences.也有很多,我認為,神奇的是,真的很簡單的啟髮式方法。例如,我希望,我希望它更有對話性,這意味著應該有更長的句子和更短的句子。
346
00:49:24,000 --> 00:49:38,000
So that’s simple math, right? I can just pick a random number between 10 and 500, and so you can get smarter and smarter over time. So there’ll still be dumb parts, right? Quote, unquote dumb parts of me, just pretending to be smart, pretending to be AI.
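The “random number between 10 and 500” heuristic could look something like this, assuming the number is used as a per-answer length cap so that some answers come out short and others long. Whether the cap is in words or tokens is not stated on the stream, and `pick_max_length` is a hypothetical name.

```python
import random


def pick_max_length(rng, low=10, high=500):
    """Pick a random response-length cap, inclusive on both ends."""
    return rng.randint(low, high)


rng = random.Random(0)  # seeded only so the sketch is repeatable
lengths = [pick_max_length(rng) for _ in range(5)]
print(lengths)
```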
347
00:49:38,000 --> 00:49:48,00000:49:38,000 --> 00:49:48,000
But then isn’t that what AI is trying to do? Right. So it’s all just layers of, it’s all spectrum, right? Intelligence. So anyway, I think I answered everybody’s questions.但是,這不就是人工智慧要做的事嗎?對。所以這一切都只是層層的,都是光譜,對嗎?智慧。所以無論如何,我想我回答了大家的問題。
348
00:49:48,000 --> 00:49:55,00000:49:48,000 --> 00:49:55,000
Oh, there’s one more, which is how good is fine tuning? For images, it’s very good. For text, it doesn’t seem to be that valuable, but I think it’ll get there.哦,還有一個,那就是微調的效果如何?對於圖像,它非常好。對於文字,它似乎沒有那麼有價值,但我認為它會到達那裡。
349
00:49:55,000 --> 00:50:04,00000:49:55,000 --> 00:50:04,000
I think my plan at some point is to go through these, answer as many of these questions. Like, for example, I say the book is something like that, right? And I would hit answer and then it would train.我想我的計畫在某些時候是去通過這些,回答儘可能多的這些問題。比如說,我說這本書是這樣的,對嗎?然後我就會點選回答,然後它就會訓練。
350
00:50:04,000 --> 00:50:14,00000:50:04,000 --> 00:50:14,000
It would, right now it doesn’t actually do anything. It just stores that data. And then eventually I export that data, train a new model on it and hope at some point it gets good enough that I don’t need as many Qs and As and things.它將,現在它實際上沒有做任何事情。它只是儲存這些資料。然後最終我匯出這些資料,在上面訓練一個新的模型,並希望在某些時候它變得足夠好,我就不需要那麼多的Q和As之類的東西。
351
00:50:14,000 --> 00:50:22,00000:50:14,000 --> 00:50:22,000
And the reason this takes so long is because it’s badly written code. I’m sure I could make this like much better if an engineer wants to help make it better.這需要這麼長時間的原因是它的程式碼寫得很差。我相信如果有工程師願意幫助我把它寫得更好,我可以把它寫得更好。
352
00:50:22,000 --> 00:50:30,00000:50:22,000 --> 00:50:30,000
I don’t know if this will be a product. I don’t know. I’m just, I just want to learn. I want to focus on basically like I asked my own book, what should I do?我不知道這是否會成為一個產品。我不知道。我只是,我只是想學習。我想專注於基本上像我問自己的書,我應該做什麼?
353
00:50:30,000 --> 00:50:35,00000:50:30,000 --> 00:50:35,000
And it says to focus on solving a problem that you have. So that’s what I’m doing and we’ll see. We’ll see where things go.它說,專注於解決一個問題,你有。因此,這是我正在做的,我們會看到。我們將看到事情的發展。
354
00:50:35,000 --> 00:50:42,000
So hopefully this was useful. Everyone hopefully knows where to find me: @shl on Twitter. If you have questions, I’m happy to send you stuff, add you to code, or send copy-paste code to you.
355
00:50:42,000 --> 00:50:52,000
But I highly recommend you just copy. Just do exactly what Justin has done with the Pokemon thing, or with the Olympics example. Just try to do exactly that and then start messing around.
356
00:50:52,000 --> 00:51:13,00000:50:52,000 --> 00:51:13,000
But I’m excited. I think it’s gonna be an interesting few years. So hopefully that was useful and I’ll catch you guys later.但我很興奮。我認為這將是一個有趣的幾年。因此,希望這是很有用的,我以後會趕上你們。