Friday, November 29, 2024

11/29/24: Free discussion

 

 Machine Learning Study Group

Welcome! We meet from 4:00-4:45 p.m. Central Time. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list; we have no wish to send unneeded emails, of which we all certainly get too many. 
 Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu

140th meeting, Nov. 29, 2024

Table of Contents
* Agenda and minutes
* Transcript (if available)

Agenda and minutes
  • Open discussion today!

 

Transcript:

Fri, Nov 29, 2024  

3:18 - D.B.  
Hey, everyone. Hello. Hey, so I'm leaving for the airport. So I'm just letting you guys run it. Bye.

3:30 - D.D.  
Have a good trip.  

3:32 - J.K.  
D., did you have a good Thanksgiving, man?  

3:38 - Unidentified Speaker  
No, not really. I was sick.  

3:43 - J.K.  
Oh, dang. I'm sorry, man.  

3:47 - D.D.  
I didn't get any turkey.  

3:51 - J.K.  
Nothing.  

3:52 - Unidentified Speaker  
Dang.  

3:53 - D.D.  
That's bad. Yeah, I'm sorry. Yeah, my wife didn't even make me a plate.  

4:02 - J.K.  
She went. Oh man, yeah.  

4:04 - D.D.  
She said, "I didn't see anything there that you'd like."  

4:08 - J.K.  
I'm sorry, man. Yeah, it happened.  

4:11 - D.D.  
I'm feeling a lot better today.  

4:14 - J.K.  
Do you know whether you had just like a stomach bug, or no?  

4:19 - D.D.  
I mean, it could have been a cold. I don't know. I still have kind of a lump in my chest, but my throat isn't all itchy and scratchy anymore. But I don't know. Yeah. I didn't, you know, I didn't have a fever, so I wasn't too worried about it. Yeah.  

4:41 - J.K.  
I just isolated myself. Yeah. That's still not fun, man. I hope you can, I hope you can make up for it with like a big Christmas dinner or something like that.  

4:51 - D.D.  
We'll see. So did anybody prepare anything? I did not. I was unwell and I have two papers out there right now. So I'm really busy trying to get camera ready stuff and prepare presentations and all that.  

5:13 - J.K.  
I didn't exactly... I wasn't able to follow the guideline for using multiple agents to create something. But in the chat, I did share a prompt that I made this past week. I call it Super Chat OS. I specifically designed it for ChatGPT, but it'll work with any of them, and it's designed to give the user advanced functionality. You can paste it in at the beginning of a conversation. I also included a version of it that you can put in the custom instructions in ChatGPT so that it's persistent across every conversation. I'm pretty proud of it. You can generate a knowledge base that works for the rest of the conversation, and it puts different skills and different knowledge into the assistant. I think the coolest functionality of the system is what's called expert chat, which simulates a super user, someone who's really good at prompting, talking with the assistant about your topic. A lot of the time I don't know what questions to ask, or what prompts to write to learn more about something, and it's able to simulate a conversation with someone who's better than me at prompting. The other really cool simulation in that prompt is called groupthink. If you type "groupthink chain prompting," it'll create a panel of experts on whatever topic you're discussing, and then they will have a conversation about whatever you're trying to learn. So to me, I just wanted to create a prompt that made anybody into a ChatGPT super user. So definitely give me feedback. J. Yeah.  
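
The Super Chat OS prompt itself was shared in the meeting chat and is not reproduced in these notes. Purely as an illustrative sketch of the "groupthink" idea described above (the wording and helper below are assumptions, not J.K.'s actual prompt), a panel-of-experts instruction might be assembled like this in Python:

    # Hypothetical sketch only; the wording is not the actual Super Chat OS prompt.
    def groupthink_prompt(topic, experts):
        """Build a panel-of-experts ("groupthink") instruction for a chat assistant."""
        panel = ", ".join(experts)
        return (
            f"Convene a panel of experts on {topic}: {panel}. "
            "Have them discuss the topic in rounds, with each expert responding to the "
            "others' points, and end with a consensus summary plus open questions."
        )

    print(groupthink_prompt("prompt engineering",
                            ["an ML researcher", "a technical writer", "a QA engineer"]))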

7:49 - D.D.  
Did you say that you put this prompt in the chat? Yeah, I'll reshare. Yeah.  

7:55 - J.K.  
So if we weren't in the meeting, then yeah, we don't.  

7:59 - Multiple Speakers  
Oh, I gotcha. Okay. So there.  

8:02 - J.K.  
Yeah, I put it in again. It starts with the user guide on how to use it, and then there's the main prompt, and then it ends with the really token-conscious version, I think it's only about 1,500 characters, that you can put in for custom instructions. But yeah, we have all these discussions about things that we're exploring within ChatGPT, and I think this hopefully makes it possible for anyone to have really productive conversations with ChatGPT. That is super cool, J. I'm really excited to take a deeper look, a deeper dive into that, and try to hybridize my monolithic prompt approach with your multi-agent approach and see if we can come up with something there.  

9:02 - V.W.  
Thanks, man. I appreciate it.  

9:04 - J.K.  
I have a couple of slides I made for the meeting.  

9:08 - V.W.  
As I promised to share, they took me in a different direction than I intended to go. And when everybody's had a chance to say anything they need to say, I'd like to share them.  

9:26 - J.K.  
I'm ready. Yeah, I'm ready. I'm looking forward to seeing them.  

9:32 - V.W.  
Can you see my screen? Okay.  

9:36 - J.K.  
Yes. Okay.  

9:37 - V.W.  
Of all the opuses, this is the most minor of opuses, so it shouldn't really be called an opus at all. But last January, over a year ago, almost two years ago, I wanted to make an index of what tools were being launched on the scene in the following areas, text to text, text to code, text to image, text to audio, text to video, yada, yada, yada.  

10:02 - Unidentified Speaker  
And, you know, you can take the Cartesian product of those media and cross them with each other and generate this.  

10:09 - V.W.  
Here are things we can do with AI. And so earlier in the week, when I went to try to extend this list to support all the growth that I expected in the number of these tools, I thought it would have exploded. For example, for text to text, I had 24 as of that previous January. But it turned out not to be the case. It turned out that there had instead been a consolidation of the industry around these four, and if you have a fifth, I'm happy to include it. The thing that surprised me was that users have settled on using these things and they've settled into patterns. And this is similar to what happened in the auto industry, when there were something like 2,000 car makers in the year 1920, and then those consolidated to the big three in the 70s and 80s and so forth. So I just find it fascinating that this occurred, and that instead of me having to do 48 text-to-text transformers and review how they should be used and all their nuance, it has instead simplified itself. On to something that's closer to my heart, and that is text to code, because I really wanted to know which tools I should be choosing, given my limited time and limited attention. Where should I spend my time? I initially developed this slide without Gemini Advanced being in the lead line, and it showed Claude Artifacts as being really a high performer. But then I thought, well, I don't want to leave Gemini Advanced out. And then I ran into some trouble, because if you ask each chatbot how much the other chatbots cost to develop, and what their context lengths are, and what their training set sizes were, there is a disparity of answers, especially when it comes to development cost. I found that there were $50 billion, $5 billion, and $0.1 billion estimates for the cost of training ChatGPT 4.0. But then I was able to scale the calculations by the number of tokens that were being processed, the size of the training set, and I was able to come up with closer figures that are maybe within 20% if we're lucky. As we go across this graph, we find that JavaScript and Python, fortunately, are the programming languages that are producing the best results in terms of accuracy with regard to some sort of prompt that the user has. And things like Visual Basic, and even poor lonesome C, aren't faring quite as well, especially when we look to the off-brand code generators. So I thought, well, if I have to advise someone, what should they choose if they have limited time and limited resources to try to write code more efficiently? What should they do if they have a specific language preference? So these are all the major languages that are in play right now. I think TypeScript, which is T-script in almost the rightmost column, shows a 94% with GitHub Copilot. That indicates tight integration, and TypeScript was used to produce one of the most exciting demonstrations that I ever saw for an introduction to artificial intelligence, and that is the TensorFlow Playground. So I think TypeScript is still in the running, because it's just a superset of JavaScript maintained by Microsoft. So our family-friendly heroes JavaScript and Python, along with our other favorites Java, C++, and C#, are staying in the mix. And the trouble is that I have a lot of experience with using ChatGPT 1.0 and 4.0 and Claude Artifacts for programming, and some experience with Codium, but I haven't really had that rewarding of a journey with Gemini Advanced. 
And Gemini Advanced, if you ask it how many tokens it accepts in its context length, claims a million tokens, but that's only for a specially privileged subset of users. For the rank and file, like the rest of us, they claim a 128K-token limit. I don't really know if that's true either, because I've found that Gemini Advanced tends to poop out on me after a while; it'll promise all these things that it can do, but then it won't do them when you say, I'd like you to do all of those. So there's something still suspicious in the mix. And I wanted to do the image generators, which we see in this first line, and this, by the way, is also old and there has been some consolidation here, but I didn't have time; I actually took a Thanksgiving holiday. So that is the long and short of what I wanted to do. And are there any questions? Are you going to share these slides with us?  

15:23 - Unidentified Speaker  
Say what?  

15:23 - D.D.  
Are you going to share these slides with us?  

15:26 - V.W.  
Sure. Well, since the meeting was recorded, I think I just did. And if you need something more specific, I'd be happy to. I'd like that page right there.  

15:36 - D.D.  
Why don't you just screen grab that puppy or I'll do it.  

15:40 - V.W.  
I can probably do that.  

15:42 - R.S.  
It's been a long time. If you could summarize this like in one or two sentences, what would those sentences be?  

15:52 - V.W.  
The first sentence would be that any of us have this opportunity to drive a $12.5 billion Porsche around the programming block, and that we should probably be doing so with all resolve. There's been so much money spent on making code generation more accurate, more fulfilling of what we actually want, that I think we should all be in the habit of using it. So that's the first thing. The second sentence I would say is that the cost of developing these text-to-code generators has been the equivalent of paying 124,600 software engineers $100,000 for one year. And so, for people who are into financials, this gives us the cost we have to amortize, the newfound value we have to extract, in order to justify its development. And I think it portends an important criterion for AI, and that is that only the big organizations have the resources to actually train these LLMs at scale, and for any of us to presume that we can do that is fallacious. On the other hand, we know that with fine-tuning we can do transfer learning and get things going. But I think there's this line, and I've certainly encountered it, where the complexity of our problem would prefer that it be in the original training set and not as an add-on, say, in retrieval-augmented generation. So those would be my summary sentences.  
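
(A quick arithmetic check on those two figures: 124,600 engineers × $100,000 per year ≈ $12.46 billion, which matches the roughly $12.5 billion development cost quoted above.)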

17:28 - Y.i.P.  
Dr. W., this is Y.  

17:31 - V.W.  
Hi, Y.  

17:32 - Y.i.P.  
Thank you for sharing that. I did take a screenshot. I have a few questions. So what this is showing is text-to-code development. And I'm assuming you're using some research data to create this, rather than data from people who have actually tried to use it in reality. Or have you actually tested any of these components, or have any of your students tested any of these components, to prove that, for example, JavaScript is really at 95%? And if not, do you have any plans to do that?  

18:18 - V.W.  
Here's what I did, and my methodology was somewhat stilted, and I appreciate the question. I asked each of these LLMs what the statistics were for their performance in these language areas. I'm a little bit suspicious that Gemini Advanced may be inflating its figures, because when I asked Gemini Advanced alone what its development cost is, it said $33 billion. But when I fell back to the other four sources that I was using, they had much lower estimates of what it costs to deploy. Also, with OpenAI, I asked them the same question: give me the statistics, and when you give me the development cost, give it to me for the whole lineage of ChatGPT 3, 3.5, 4, and the derivative ones we're using now, because they really have to include that, since those generational tools were not restarted from scratch. So in terms of my personal experience, I have pretty strong experience with ChatGPT 4.0, Claude Artifacts, and Codium, and I tend to believe the figures for them. One thing I'd like to see done is for someone to take this table and produce a coding example that's more complex than Hello World, but less complex than a full-on deployable app, and actually ask, for their prompt, properly vetted through the J. quality-control pipeline, however long it is as long as it still fits in the context, how close a result they got when grading the usability of the code. When I presented last week, I talked about how many shots it took to get usable code. And I think this graph is partially a reflection of that: you can look not only at zero-shot accuracy, but at how long you are going to have to fool around with the LLM to get the  

20:18 - Y.i.P.  
actual code for what you're trying to do? Got it. So I'll tell you why I asked this question. As we speak, in my company we are in fact using, if I can say, not text-to-code necessarily, but even code-to-code. Yes. And meaning that...  

20:38 - V.W.  
Oh, rewrite this code for me and make it better.  

20:43 - Y.i.P.  
Exactly. Or we have peer reviews where we are actually doing integration testing or all kind of testing using the code.  

20:54 - V.W.  
And even for that, I'm not getting that high.  

20:58 - Y.i.P.  
I see 97% somewhere. You see it on Gemini Advanced, which so far is the most suspect of the lot. Yeah, and I have not used Gemini as yet, so I'm actually intrigued, and I'm definitely going to ask my team to use it. My son used Gemini today for math, but that's a different topic. But I am actually volunteering; I do have a couple of people who are doing this internally. And if you want, I'll choose a couple of software stacks also. I do see HTML, CSS, JavaScript as one category, if I'm not wrong; that is one category.  

21:47 - Multiple Speakers  
That is correct. Yeah.  

21:49 - Y.i.P.  
And we are using React.js on top, and then there is HTML, CSS, and JavaScript. So my team is actually working on that. And if you have any students, a couple of students, I know last week also we spoke about collaborating on another topic. But if you want, I am very much interested in choosing Python and JavaScript, those two areas, and testing at least first code-to-code and then text-to-code. Now, once we go through those two, because I'm actually building scripts, then we can go to the others, because we are also building an algorithm in Java, but that will start in March or April. But these three, I'm happy to actually go through and test with the real application that we are building. And I have challenged my team to actually use it, at least for testing or improving the code, at this point in time. So I just wanted to bring that to your attention, but this data is extremely helpful. And thank you for presenting. And I think we can take step two, if you're interested in this.  

23:05 - V.W.  
I'd really like to see the results of that, and I would invite you and your people to present it, because it'd be enormously helpful for those of us who have limited time to spend but have things we have to get done. For example, some of us rarely have to write an SQL interface for work that we're doing. It's a one-time job, a one-off, and we'd like to know which tool we should use to generate our SQL. So this kind of gives us a guideline that Claude Artifacts might be a good starting point for that. Fewer of us are doing Rust, Swift, and Ruby, but those are still important to consider for people who are trying to deploy apps on various platforms, whether they be servers or little computers or Macintoshes. I saw a demonstration this week of a pendulum, a two-arm pendulum, and a two-arm pendulum is famous for starting to become chaotic in its motion after just a few periods of oscillation. The person had taken all the cases for the initial values of the arm positions in the two-link pendulum, and they had made basically a spreadsheet with theta one and theta two, the starting angles, and then they had let the simulation run. They had noticed that values nearer to zero tend to stay regular for longer, but those that flip through a full two-pi-radian oscillation became chaotic quickly on the fringes, and eventually the chaos crept toward the center. Well, I had kind of visualized that someone could take this chart and zoom into it, and just show, for a simple example, whether it's true or not, and the degree to which it's true, in several areas. One is how many shots it took. The second is how good the graphics were. And the third, maybe, is how pretty the code was, subjectively speaking, where pretty is a measure of expandability, maintainability, and so forth.   
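
The double-pendulum experiment described above can be reproduced in a few dozen lines. The following is a minimal sketch (the standard equations of motion, not the presenter's actual code), sweeping the two starting angles and using sensitivity to a tiny perturbation as a rough chaos indicator:

    # Illustrative sketch: sweep double-pendulum starting angles and flag chaotic runs.
    import numpy as np

    G, L1, L2, M1, M2 = 9.81, 1.0, 1.0, 1.0, 1.0  # gravity, arm lengths, bob masses

    def derivs(s):
        """Time derivatives of [theta1, omega1, theta2, omega2] for a double pendulum."""
        t1, w1, t2, w2 = s
        d = t1 - t2
        den = 2 * M1 + M2 - M2 * np.cos(2 * d)
        dw1 = (-G * (2 * M1 + M2) * np.sin(t1)
               - M2 * G * np.sin(t1 - 2 * t2)
               - 2 * np.sin(d) * M2 * (w2 ** 2 * L2 + w1 ** 2 * L1 * np.cos(d))) / (L1 * den)
        dw2 = (2 * np.sin(d) * (w1 ** 2 * L1 * (M1 + M2)
               + G * (M1 + M2) * np.cos(t1)
               + w2 ** 2 * L2 * M2 * np.cos(d))) / (L2 * den)
        return np.array([w1, dw1, w2, dw2])

    def simulate(t1, t2, steps=4000, dt=0.005):
        """Integrate from rest at angles (t1, t2) with 4th-order Runge-Kutta; return final state."""
        s = np.array([t1, 0.0, t2, 0.0])
        for _ in range(steps):
            k1 = derivs(s)
            k2 = derivs(s + dt / 2 * k1)
            k3 = derivs(s + dt / 2 * k2)
            k4 = derivs(s + dt * k3)
            s = s + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        return s

    def looks_chaotic(t1, t2, eps=1e-6, threshold=0.5):
        """Chaos proxy: does a tiny change in theta1 produce a large final difference?"""
        a, b = simulate(t1, t2), simulate(t1 + eps, t2)
        return abs(a[0] - b[0]) > threshold

    # The "spreadsheet" sweep of starting angles the presenter described.
    for t1 in np.linspace(-np.pi, np.pi, 5):
        for t2 in np.linspace(-np.pi, np.pi, 5):
            print(f"theta1={t1:+.2f}  theta2={t2:+.2f}  chaotic={looks_chaotic(t1, t2)}")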

25:06 - Y.i.P.  
So, yeah, I would really invite that.   

25:08 - V.W.  
I think that all of us are pinging off of each other and getting ideas. And a great way to keep this synergy going is to just take turns. It's like, you know, you never want to be the best musician in the band. You always want to see somebody else do a great solo that, you know, really improves and provokes the band to great achievement. And so I'd like to see that for us too.   

25:32 - Y.i.P.  
I'll tell you in a couple of minutes some background on what my team is doing with these two, and we use SQL too. I forgot to say that we use Python and SQL alternately, depending on whether we are going to use AI libraries or not. If we are not using AI libraries, we build things in SQL. But I'll tell you a couple of things we are doing. One is that, before we even use it, we are actually firming up our coding standards and coding methodologies and coding structure. What I mean is, for me to say that my Python code is 97%, that's 97% of what, right? Exactly. So that is the first thing we are firming up, especially when it comes to HTML and JavaScript. And we are using GPTs, by the way, to build that. I missed the first presentation; I have some questions on the multi-agent thing, but I'll ask later. So that is the first thing we are doing, and perhaps when we meet the following weekend, or whenever we are meeting, I'll present that to you. That will be our yardstick, and then our measurement will be against that yardstick. But there will be a version one of those standard methodologies, and then there will be a version two, because we want to improve that yardstick as well, which will essentially become the Bible of, hey, this is the best thing and the most effective and efficient. So that will also become version one, or what you call the first test, I mean, second, or whatever.  

27:24 - V.W.  
So let me interrupt you, because you're so on track. Oak Ridge National Laboratory has a set of benchmarks, and every time a new machine, a new CPU, a new GPU comes out, they run the benchmark on the new device and they put it in the log and tell you its performance. It's become a very famous resource that's lasted years and years and goes way back to almost the dawn of time of numerical analysis. And we need a similar thing for evaluating the efficacy of these text-to-code and code-to-code generators. So if you were able to put something like that together, it could be one of those things that becomes additive over time. It becomes a product that can be rerun and rechecked when necessary, when new standards emerge, but that can form a baseline for what our expectations should be. And I've noticed that people who choose to go into the benchmarking activity tend to have long lifetimes in terms of their value to the computing society.  

28:27 - Y.i.P.  
It's something to keep in mind. Got it. The second question I actually have is, now that I told you the languages, when it comes to the model, in the interest of time, I'm assuming you would say... although I know Gemini is constantly advertising on my YouTube channel, on all Google channels; somehow I'm seeing Gemini advertisements nowadays.  

28:55 - V.W.  
But I was forced to download the Gemini app.  

28:58 - Y.i.P.  
I don't know whether you all know there's a new Gemini app on iStore.  

29:04 - Multiple Speakers  
Yeah. And I'll tell you something right on that question about apps: I see no reason to have a user PC-resident app when all that functionality is available in the browser.  

29:16 - V.W.  
Because when you're in the browser, you're already connected to a large number of potential mashup members for whatever you're doing, and it seems that coming out of that into the app is unnecessary overhead. But there's a proviso on that, and that is, if you're doing proprietary work, being in the app may confer on you a greater degree of privacy in the research that you're doing. So if proprietary work or patent work is important to your intellectual property work, there may be an advantage to using the app. But for me, you know, a pure academic, a token academic as it were, I like to stay in the browser for as much of the work as possible.  

29:59 - Y.i.P.  
Got it. Thank you for that. So now I'm trying to relate the previous topic to this topic. I think I had missed the previous gentleman's discussion on that topic too. When you say multi-agent, are we talking about these different models, and within those different models? Can I run code across all these models, which have different agents, at the same time? Is that what the previous point was? Three thoughts. I don't know if I'll recover them all.  

30:34 - V.W.  
One, I want J. to speak to this. Two, multi-agent is usually within the context of a single LLM. Three, A. N. just published yesterday that there is a new tool available from Stanford that can jump across multiple agents within the same context, provided you have the authorization tokens for that. So I hope that answers your question.  

30:56 - J.K.  
Yeah, I'm excited looking at this slide from a multi-agent perspective, because, correct me if I'm wrong, the accuracy ratings it's giving you are kind of for zero-shot, right? Like where I say I need this piece of code and then it produces it with that level of accuracy.  

31:17 - V.W.  
Yeah, that's the presumption.  

31:20 - J.K.  
Yeah, so I personally don't write code, but I've had situations where I've needed it, specifically Python, and what I do is I will put one agent into a ChatGPT conversation and say, you are the world-leading Python developer. It's really funny to see how much better responses get when you simply say "you're the world's leading" or "you're world-renowned." It is shocking how that changes the output. But in that same chat, I'll also put a QA agent or a project manager agent and have them work together. I mean, the thing that I like about looking at these numbers is, once you have multiple agents, one checking the other, one can even be a junior developer and the other a senior developer.

32:24 - Multiple Speakers  
All these numbers go to 100% whenever you consider the ability to have multiple agents checking one another and contributing.  
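
A minimal sketch of the pattern J.K. describes, two role-prompted agents checking each other, is below. It uses the OpenAI Python SDK; the model name, role wording, and task are illustrative assumptions rather than a prescription:

    # Sketch of a developer agent plus a QA agent reviewing its work (assumed model and roles).
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(system_role, user_msg):
        """One turn with an 'agent' defined purely by its system prompt."""
        resp = client.chat.completions.create(
            model="gpt-4o",  # illustrative choice
            messages=[{"role": "system", "content": system_role},
                      {"role": "user", "content": user_msg}],
        )
        return resp.choices[0].message.content

    task = "Write a Python function that parses an ISO-8601 date string into a datetime."
    draft = ask("You are the world's leading Python developer.", task)
    review = ask("You are a meticulous QA engineer. List bugs, edge cases, and style issues.",
                 "Review this code:\n" + draft)
    final = ask("You are the world's leading Python developer.",
                "Revise the code to address the review.\n\nCode:\n" + draft + "\n\nReview:\n" + review)
    print(final)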

32:33 - V.W.  
In fact, this slide was produced using that exact technique, because I got these figures and reintroduced them to multiple LLMs before I even approached the confidence that I had something presentable. And there's still a lot of wiggle room in these numbers, as we pointed out. And I'm so tickled, J., every time you say you don't program, because the thing that LLMs have enabled us to do is to program in English, and you're one of the best programmers in English that I happen to know. So every time you say you're not a programmer, it tickles me, because you're programming at the highest possible level of human achievement.  

33:13 - Multiple Speakers  
And I appreciate your humility.  

33:15 - J.K.  
It's fantastic. Well, I mean, the big takeaway from just looking at this graph is what I think we're going to see within the next couple of years. We all have a limited amount of time to learn things and apply them in our lives, and I think what we're going to see is the return on investment of learning how to manage a multi-agent team that has access to all of these languages. I, as a non-technical non-coder, benefit more from learning how to manage a team of coders, and basically being able to write in any of these languages, compared to trying to learn Python for the first time, which I've done in the past. It really gets to the point where we ask ourselves, is it worth learning a programming language from scratch? And I'm especially thinking about kids in coding boot camps and things like that. We are a generation on a precipice with respect to the question we just articulated.  

34:39 - V.W.  
Yeah, we have an avalanche of people who are literally having to make this... well, it is kind of a make-versus-buy decision. Do I utilize the expertise in the LLM, or do I try to go it on my own?   

34:54 - J.K.  
And it's ridiculous to go it on your own. Yeah, it's really going to get to the point where, I think, and this is uncomfortable for me to say, because I've accepted it and I've been humbled by ChatGPT: I've done marketing for over a decade, and I can sit down and have a 10-minute conversation with ChatGPT, and it's capable of doing more, and producing higher-quality marketing plans and assets and things like that, in 10 minutes than I could produce having 10 years of experience.  

35:29 - V.W.  
So I'm having the same experience in articulating ordinary correspondence that I want to clean up.  

35:34 - J.K.  
The ability to do sentiment analysis and then reflect the better sentiment in the work.  

35:39 - V.W.  
There's even an Apple commercial about somebody who writes a letter and they're all mad and stuff, and then they have Apple clean it up, and then it's all nice and it's much more effective, because you catch more ants with honey than with vinegar.  

35:55 - J.K.  
So yeah, I'm totally on board with that.  

35:58 - V.W.  
Here's where I think we're going to have a loss, though. This whole avalanche of people are going to just sit on top of the LLMs. But there was a generation that grew up being able, especially in rocket science, to do back-of-the-envelope feasibility calculations. So if somebody came to them with some wild harebrained scheme or theory, they could quickly, with their back-of-the-envelope mental calculations, do a feasibility analysis and say, I think that sounds specious, or I think that sounds possible. And so we lost the back-of-the-envelope people a generation ago. And now we have the back-of-the-envelope coders, who, because they've been through so much integration and errors and lost weekends for a semicolon, et cetera, have a real bare-metal view of what's possible and not possible. And so when we're doing complex things, sometimes when we're just living up at the mashup level, we can be invoking enormous amounts of unnecessary complexity to do relatively simple things. But then you can always say, well, if I can just throw another couple of GPUs at it, who cares? And that's where we are now. So it's going to be interesting to see, in a generation or two, programming-wise, if there's even still a notion of programmers, or whether programmers become blue-collar people like buggy-whip manufacturers were in the automotive revolution. I mean, that's a terrible thing to say for those of us who have spent our lives trying to become decent programmers, but we have to acknowledge that the possibility is there.  

37:30 - D.D.  
Think about the possibilities of the jobs out there, people that can come in and make things more efficient. Right.  

37:38 - V.W.  
Because we now know that, at least for 124,000 software engineers, we don't need their services any longer. And what is that going to mean? Because Google made a huge tactical error in that, just as things were taking off with their large language model work with Bard, they let a lot of their AI staff go, thinking they were not going to need as many programmers. And given that they were trying to deploy to Google Home and Google devices and various sorts of, you know, outreaches of these technologies, they kind of made the worst possible management decision, because they should have hired more people rather  

38:16 - Multiple Speakers  
than letting go, as a cost-cutting measure, of those really good people they had who were already trained, already had desks, already had computers.  

38:23 - V.W.  
So I think this premature loss of labor forces is, you know... and we've been through this before. We went through this in the early 2000s with the telco shakeup. You know, there were all these telecommunications companies trying to be first to give us that last quarter mile of fiber optic to the curb and all that. And then it shook down to the three or four major players that we all know and see their ads today, with an occasional intrusion by a movie star buying their own company, but you know what I'm saying.  

38:54 - J.K.  
I think what we're going to see is, I mean, we're talking about people's skills being deprecated on some level, but I think it's going to have a flattening effect, where, if I'm a Python developer, why would I try to work at a company when I could simply spin up a multi-agent team that complements every other aspect of my skills, and I'm basically my own dev shop? And I've got an answer for that.  

39:26 - V.W.  
I want to give it to you real quick, and then I don't want you to lose your train of thought, because I'll lose mine if I don't say this, and I apologize for that. It's all about the distribution channel, and if you don't have a distribution channel, you can be off in the corner playing your violin with the best symphony ever heard, but if nobody hears it, it won't matter. And that's the problem we're starting to see now: we've got individual developers who are doing superlative levels of development who don't have a distribution channel. They don't have a contract with a Universal or an MCA or somebody to distribute their work, and because of that, it dies on the vine. And so do they, because they can't make their rent. So I think that the distribution channel is going to become very important, and that's going to create a third problem. And that is, if you're the programmer who has a 100K-token or a million-token input for Gemini Advanced, and your competitors only get 128K-token context lengths for their inputs, the million-token guy is going to win out every time. And it seems to me that, the way things go, to him who has much, more will be given, and from him who has little, what little he has will be taken away, to quote the proverb. We're going to see this kind of aggregation where we'll see the super programmers and everybody else, and it'll become kind of a kingdom of fiefdoms; there'll be the serf class and the lord class. And, you know, fortunately these LLMs have been put out there, but if you look at the LLMs that died on the vine, a lot of them were individuals or people who got, say, less than $10 million in capitalization, and they ran for a while, but they just couldn't compete with a hundred million dollars or a billion dollars of capitalization. They just couldn't. Because it was a stroke of a pen.  

41:12 - J.K.  
Well, I like that. To your point about distribution channels, I'm kind of reminded of the really awesome presentation, I think it was some of F.'s students, who had worked on a tool that kind of auto-generates stories and videos and things like that. And, because they knew that I'd done marketing before, they said, "Well, now we need help marketing this." I was pretty point-blank. I was just like, you can build a multi-agent team of marketers. That's the thing. I'm working on a curriculum called Cyborg Thinking, and it's just this idea of AI as a cognitive extension of the user. The train of thought in the past has been: I've made this cool thing, now I need someone to help me get the word out.  

42:15 - V.W.  
And so that's what all the guys at Sundance are saying. Everybody shows up at the Sundance Festival, and then everybody's movie gets all these awards, and then maybe they show up on Netflix for a limited-time showing and are quickly swept into the dustbin of history. And so it can even matter not just that you have a distribution channel, but that your distribution channel is one that is positioned to write the big check, to give the thrust underneath your rocket to even get it to lift up off the ground, even though it's a mighty fine rocket. It's a really E. M., T. situation. And what I'm trying to do is say, I'm agreeing with you. And I think that the kind of hope you give people with multi-agent personal marketing is that they can break through this, but at some point it will become about capitalization. And programmers attempted to get capitalization by going the open source route, saying, I'm going to give my work away for free to everybody in the world, and they're going to become so dependent on the quality of code that I write that they can't help but come back to me. But instead the world has said, over the past decade, "thank you very much, next," to quote A. G. And so I'm really wanting the little guy to win here. I don't see it happening, and I want you to change my mind.  

43:36 - J.K.  
Well, to your point about the film industry, and to go back to what I talked about, flattening: it's going to get to the point where one person who has built a network of complementary agents is going to be more agile than a big company, even if the big company has similar access to tools. I mean, okay. You changed my mind and you're done.  

44:12 - V.W.  
You changed it.  

44:13 - Multiple Speakers  
And here's why: we have a precedent, and the precedent is YouTube influencers, where single individuals can attain a massive following and then monetize further work and make it sustainable.  

44:26 - V.W.  
And we've got many examples of successful scientific, technical, and entertainment YouTube influencers, so my hope is restored.  

44:37 - Multiple Speakers  
Thank you very much.  

44:40 - J.K.  
I'm very pessimistic about the way Disney is just consuming all this IP and then, what I would consider, under-delivering on these things. But I truly believe... I'm very optimistic and bullish about a lot of this stuff.  

But you yourself cited the example of, you know, Disney took Grimm's fairy tales, which were in the public domain, and then made proprietary intellectual property from them, which they've made money on since the 1930s.

45:14 - V.W.  
So I think you could do the same. I've also noticed in your work the tone of looking for the open source, looking for the Grimm's fairy tales that are in the public domain, and then building on top of them proprietary works that could be for-profit, if you chose to go that direction.  

45:39 - J.K.  
It's funny you say that. One of my side projects is, again, my background's in writing and I view everything that I do as an extension of writing. I'm actually placing bets on the most popular superheroes entering the public domain, and I'm building stories in parallel. I'm creating unique superhero characters so that when Batman and Superman and Spider-Man go into the public domain within our lifetimes...  

46:13 - Multiple Speakers  
You can eat them alive? Superman versus the mummy? Yeah, they'll show up. When these characters enter the public domain, right, I'll have characters and plots waiting.  

46:28 - J.K.  
But I really think, again, coming back to this image: we as educators, we as the people who are teaching other people, really need to be mindful about how we are equipping people, because it's going to get to that point. And, without going into too much detail, it's been a little bit frustrating trying to get buy-in into these ideas, that we need to be integrating large language model education in as many places as we can. Yeah.  

47:07 - V.W.  
But you just made the point better than I ever could have, in that you presented a video a week or so ago which began to actualize your ideas as consumable media content. And it was very motivational in that sense, because you'd also used AI to help you generate it, which was smart because it saved you time over stop-frame animation, Ray Harryhausen type stuff. So I think that we're poised to do what you're saying, but here's the deal. The YouTube influencer has to have the following diverse skill set. They have to have a core interest that they really care about. They have to have the research chops to find out the facts about it, to the point that they know more than the ordinary person, so that there's leverage, intellectual property leverage. But thirdly, and most importantly, and this is true for you and several others here, you have to have the communication skills to put your ideas down into formats that can stand alone by themselves, without you having to mind them being consumed by others, say YouTube videos or other kinds of content. And so when I ruminate over that: you have to be a Final Cut Pro editor, you have to be a Logic Pro sound analysis person to put a good soundtrack on. You have to have these skills that are outside the main thrust of your expertise, that enable your expertise to have a vehicle for play. And you can accomplish that by having multiple helpers, which is why most influencers eventually hire assistants or video editors or people to help them deploy. But initially, we have to be training students on a central core interest, and then on a set of communication skills that are absolutely essential to take square one in the game.  

48:54 - J.K.  
Yeah. I think there's a book called Range, and the subtitle is Why Generalists Triumph in a Specialized World. And again, it really drives me crazy how, as I've delved into academia, we're constantly encouraged to specialize, to niche down, and to become subject-matter experts in one thing. And it really ends up handicapping the entire academic. It's worse than that.  

49:31 - V.W.  
You actually ask incoming grad students to handicap themselves for the rest of their lives by becoming a domain expert in a single topic that is so specialized that, if technology moves on, and it always does, their skill set is rendered almost immediately useless. And I know of no better example than the NIH charter to have a bunch of PhDs study the activity of different enzymes. A given PhD will be an expert in the way a given molecular machine works, a given enzyme works, and then, once that's understood and elucidated for the public to consume, that person is basically used up, a  

50:16 - J.K.  
spent cartridge in the ammunition of technology progress. Yeah. I would go as far as to say it's almost predatory, especially in 2024 and moving forward. We really need to focus on the kind of skills you're talking about: how do you package your idea? I've been meeting with some former professors of mine from UCA as they're trying to update their writing curriculum to incorporate prompt engineering and things like that. And I've just said: does it matter? Does it really matter if the book that changes your life started out in a person's head and then was brought into the world almost with a surrogate, with AI acting as a surrogate?  

51:12 - V.W.  
Yeah, we kind of don't care about the origin story until way later, when we're trying to write the history of how we did it. In the right now, we don't really need the origin story, because we're standing on the shoulders of so many giants that we hardly have time to do the giant genealogy, although those things are done, should be done, and  

51:38 - J.K.  
are also in the dustbin of history and are occasionally useful to unearth and appreciate. I keep coming back to this: I haven't encountered a task or a deliverable that I can't do with multi-agent. I've tried, I've genuinely tried, to find things that are not possible, or where the output is worse than what an expert in the domain could do.  

52:09 - Y.i.P.  
J., I'm going to send you an example of a math problem, ACT problem, that my son was trying to solve.  

52:20 - J.K.  
I'll give you some homework.  

52:22 - Multiple Speakers  
I would love to try and do that, absolutely.  

52:27 - Y.i.P.  
Okay, so I have a couple of questions and follow-up items. Last week, I asked you a question, Dr. W., in the chat. Somebody said that there are some master's students, and the test of building a website or a user interface was a trial that somebody wanted to do. And I said, let's meet offline. Was it D.?  

52:58 - Multiple Speakers  
Yeah, it was D.  

53:02 - V.W.  
But yeah, D. is a great resource for that because he has the coterie of graduate students that would be eligible to do that for their capstone project or whatever.  

53:16 - Y.i.P.  
Here is my idea. I can use this chart and maybe one model. And when you say website or UI, essentially it is HTML and CSS with JavaScript embedded into it. And we are using React, and anything you build with what you call Artifacts will also use React on the front end.  

53:37 - V.W.  
And so you can just say, please use React to do my user interface, because I find it more extensible than yada, yada, yada.  

53:48 - Y.i.P.  
And I have a lot of documents that can actually help these students to build something and test something. So that's one; I'll reach out to D. I think I have his email as part of the invite. And then the second idea that I have, and J., this is for you if you're interested: when we are doing that, we can pick one model and try multi-agent as part of the test. But I would like to have an offline conversation. I saw the link to the document that you sent, and pardon my lack of knowledge and technicality in this; I would like to have a decoding session with you on the document. But if you're interested, we can choose a model and create multi-agent. And before I say multi-agent, I want to understand your definition of agent and multi-agent,  

54:48 - J.K.  
and all that and perhaps test that as part of the same project.  

54:54 - Y.i.P.  
So we can really combine three concepts as part of this and create a result from all three concepts. That is what I wanted to share before we hang up today. We spoke about something that can be made real, and we can present something where we hide the confidential details but still present the accuracy we got compared to this chart, having done multi-agent within one model.  

55:23 - V.W.  
Y., if some of your students could just take the upper nine entries in this table, for JScript, Python, and Java, for Gemini, ChatGPT, and Claude Artifacts, and just do a simple nine-way example where they give the same prompt... well, I guess they would give three different prompts to three different LLMs. Then you could demonstrate for us in the next week or so what you found, because then we could calibrate our expectations accordingly. And what's nice about that is it doesn't violate anybody's IP or require disclosing things they're not totally comfortable disclosing, but it does get us to that contributory place where we make progress. I remember there was a thing called the Silicon... the Santa Clara or Silicon Graphics kind of computer club, somebody can help me with the exact name, that was going on at the time that B.G. was beginning to have the opportunity to build the first version of DOS and print money from his operating system. And there came a time when they disbanded things, because they became so financially lucrative that they couldn't keep sharing openly. So I think if we're smart and unionize ourselves up front, rather than after the fact, we can figure out how to compartmentalize our sharing so that we produce the maximum benefit for each other with a minimum long-term detriment. So I think we need to be disciplined about that.  
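
As a starting point for the nine-way exercise proposed here, a harness might look like the sketch below. The generate() stub is hypothetical and would be wired to each vendor's own API; the task, file name, and grading columns are assumptions:

    # Sketch of the proposed 3-language x 3-model benchmark; generate() is a placeholder.
    import csv

    LANGUAGES = ["JavaScript", "Python", "Java"]
    MODELS = ["Gemini Advanced", "ChatGPT", "Claude Artifacts"]
    TASK = "Implement a function returning the n-th Fibonacci number, with unit tests."

    def generate(model, prompt):
        # Hypothetical stub: replace with a real call to the given model's API.
        return f"[code returned by {model}]"

    with open("text_to_code_benchmark.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["language", "model", "prompt", "output", "shots_needed", "grade"])
        for lang in LANGUAGES:
            for model in MODELS:
                prompt = f"In {lang}: {TASK}"
                # shots_needed and grade are filled in later by the human reviewer.
                writer.writerow([lang, model, prompt, generate(model, prompt), "", ""])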

56:48 - Multiple Speakers  
Absolutely. And my firm has a formal NDA with you.  

56:52 - Y.i.P.  
Similar to last time when you presented, you said that lady, right, she has to get approval from some people before sharing, and we decided, okay, we don't disclose that before she gets the approval. When I present, if I'm presenting something that is confidential to any of our clients, I'll say don't record; but because we have that formal agreement, I'll present to you all under the non-disclosure agreement. But I think this would be a very interesting test, which will start with a real requirement and go to a final output, and leverage what Dr. W. or J. or D. is trying to do; we can combine them. We could also write a joint paper on it.  

57:45 - Multiple Speakers  
We could put all our names on it. And then everybody gets the intellectual property value of having published that on archive or whoever wants to carry it.  

57:53 - V.W.  
And it becomes part of the corpus of progress that we have, you know, in watching how, in just a year and a half, we went from 24 companies to four or five. That's sort of scary, because I was looking at the five and ten million dollar investments that people made in the 20 companies that are no longer with us. And it made me kind of sad, because those people were certainly just as skilled and just as able to do great and creative things. But for one reason or another, they didn't survive the cull of natural selection. And I would like to think that we can structure things, as we have so far, to survive the cull of natural selection and just become an ongoing contributor to the art, so that, just like the benchmarkers always survive, if we benchmark these nine cases enumerated earlier, then we survive, because benchmarks survive. And the governance of the history of the art always survives in the face of all other natural selection forces. So yeah. Correct.  

58:59 - Y.i.P.  
So I will reach out to D., and maybe I'll create a one-pager or two-pager on what we are trying to do, and we can agree and decide to progress on that on the next call. I may not be able to...  

59:15 - V.W.  
I'm also excited about D.D.'s work, because he's going to the very core of language specifications and how we specify tasks for machines to accomplish, and how we measure whether or not the machine accomplished that, the metric by which we actually treat accuracy as an objective rather than subjective quantity. And so I'm kind of excited about the research that he's doing along those lines, because to me it seems very central to these core issues that we're trying to formalize, in a way that gives us robust performance metrics, so that if we give them to someone else, like Consumer Reports says, buy the Maytag washer and not the Kenmore washer, we can know we did the right thing by giving them that advice. Correct.  

59:59 - Y.i.P.  
So let me take a stab at creating that document, and I'll reach out to D. also to confirm his intentions and what he would like to do. And if everybody agrees, we'll kickstart that the following week. It may take a couple of weeks or three weeks, but maybe around Christmas or in the New Year we'll present you all something on those dimensions mentioned. So sorry, Dr. W., it may not be in two weeks that we present. I'll at least present what we will do and how we'll do it, and if we agree, in two or three weeks we'll present back with the comparison to this chart and what actually happens with what we do.  

1:00:47 - V.W.  
So it was the Silicon Valley Computer Club that found themselves in the very predicament that we find ourselves in. And so now we've got a model. Let's recap what we've got, because we've gotten some really good stuff today. J. has proved to us, at least informally, that the YouTube influencer is our great hope against a future where large corporations eat us all alive, like it was depicted in... what was the series? Was it iRobot? It wasn't iRobot. It was Mr. Robot. Yeah, that was the show about corporate greed.  

1:01:25 - Multiple Speakers  
Evil Corp. Evil Corp.  

1:01:28 - Unidentified Speaker  
Yeah.  

1:01:28 - V.W.  
Yeah.  

1:01:30 - J.K.  
I think, I mean, we're talking about how these companies have been consolidated down. My gut tells me that these tools are more like utilities. Having worked on prompt engineering for a long time, I really don't think that it's going to be... even if you condense this down to the top three, or the top one or two, obviously there are costs, and they obviously have some command of greater cost control once they have captured this field. But my gut tells me that once we have these tools in place, the expansion happens again on an individual level. We've gone from zero to one, in that anyone can build anything. And the question just becomes: what are you going to build? What are you going to build?  

1:02:37 - Multiple Speakers  
Yeah. I mean, B.G. said, speaking of the internet and the personal computer combined, we now have a 747 for the cost of a pizza.  

1:02:49 - V.W.  
So what are you going to do with this capability? And who are the people that we're going to see emerge that have the potential skills? You know, the skill sets that got us here, the J.v.N. who could do differential equations in his head, that was a different skill set from the one that took us to the next step of building PCs and giant pieces of software, but they were related. But now we're getting into different families of who will succeed, because the distribution of skills needs to be a little bit different. So it'll be interesting to see who survives the next culling, the next meteor impact, if you were a dinosaur, as it were.  

1:03:33 - J.K.  
My thing is just, again, studying curriculum design and education. I'm excited that regardless of your language, your location, your background, anyone can build anything. I mean, it used to be heavily siloed. You had to know people who knew people. And at the end of the day, that's not the case anymore. It's less the case.   

1:03:56 - V.W.  
And that's the case I was trying to make.  

1:04:00 - J.K.  
It's less the case.  

1:04:01 - V.W.  
And the question is, what heroes like yourself can we identify, like you and Y. and R. and D. and so forth, who will emerge to show us what that next possibility is, who will be the B.G., S.J., or even E.M. of this next use case, to give all the rest of us the inspiration? Because, you know, you used to tell kids, hey, someday you can be an astronaut. And then I was at JPL when the space shuttle exploded, and not as many people wanted to be astronauts or teachers anymore, because of what they saw. That's right.  

1:04:36 - D.D.  
They saw C.M. get blown to bits.  

1:04:39 - V.W.  
And it was the deepest heartbreak, especially for those of us who had dedicated ourselves to rocket science; it was the most horrible scenario we could possibly contrive, and there it was in front of our eyes. So then we had to say, well, you could be a doctor or a lawyer. So we have to be able to point people in the direction that we have to take care of the Earth. Earth is our home. We have to do a good job here before we start exporting ourselves to other planets, and make sure we're putting the best version of ourselves forward. So I'm looking for these use cases. And right now, we have these wonderful YouTube influencers in science and technology that are beginning to show us what's possible; Veritasium is a great example, or 3Blue1Brown.
These guys are one-person roadshows that are doing incredible things to engage us all in an ongoing way. And we have the forefront of quantum computing; what vistas will that open? Well, breaking all the credit card numbers isn't enough motivation for me to move forward. But the possibility of being able to do things in drug discovery with quantum mechanical simulations, that does hold great hope for a large number of people suffering from currently incurable diseases. So I'm just looking to collect those. It used to be robots, radio, and rockets; when you were growing up, robots, radio, and rockets could engage people, young adolescents, on those three ideas. Well, it's not that anymore. It's something different, and we need to identify what that is, with a clever slogan. We need a three-R slogan, like robots, radio, and rockets, or reading, writing, and arithmetic, that engages people to continue their education, so that they can enjoy what we enjoy: the opportunity to surf the leading edge for a while.

1:06:35 - D.D.
Well, guys, I've got to go get ready for dinner. All right. Go get ready for dinner, D.

1:06:41 - V.W.
Yeah. Yeah. It wasn't too scary today, was it?

1:06:44 - D.D.
No, it was great. Hey, we didn't have the... yeah, the fear thing.

1:06:49 - V.W.
We got closer. Look at four thirty-seven: we had a potential fear moment when we were talking about being wiped out by the big corporations. But J. has extricated us from that.

1:07:01 - Multiple Speakers
So I feel pretty good.

1:07:03 - J.K.
The corporations are the ones you have to worry about now.

1:07:08 - Multiple Speakers
We're all... We are the corporations you have to worry about. We are all capable of competing with them now.

1:07:17 - V.W.
We've met the enemy and they is us.

1:07:21 - J.K.
Again, we are more agile. That sounds a bit Pollyanna-ish, but we're going to get to the point where a single person with a team of AI can become a nation-state actor.

1:07:35 - V.W.
Yeah, man. I don't know.

1:07:37 - J.K.
Oh, gosh, it's OK. I just got scared.

1:07:41 - Multiple Speakers
Yeah, have a good... You guys have a great rest of your holidays, and it's been great talking to you.

1:07:49 - J.K.
Have a good one.

1:07:51 - Unidentified Speaker
Thanks.

1:07:51 - D.D.
Thanks, man.

1:07:52 - Unidentified Speaker
Thanks, J.

1:07:53 - J.K.
Thank you.


Friday, November 22, 2024

11/22/24: Demo of use of AI to create animations of physical phenomena

 Machine Learning Study Group

Welcome! We meet from 4:00-4:45 p.m. Central Time. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list; we have no wish to send unneeded emails, of which we all certainly get too many. 
 Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu

Agenda & Minutes (139th meeting, Nov. 22, 2024)

Table of Contents
* Agenda and minutes
* Transcript (if available)

Agenda and minutes
  • Announcements, updates, questions, etc.?
    1. Meet next Friday? Yes, but it will be short hands-on type session.
    2. A demo of the real time use of AI to create the Doppler effect interactive animation and perhaps other demos will be scheduled as soon as convenient for RM and VW. Great presentation! Thanks!
    3. Here is a tool the library is providing. Some people here thought it would be a good idea to try it live during a meeting, so we will do that soon. Maybe even today.

      Library trial of AI-driven product Primo Research Assistant

      The library is testing use of Primo Research Assistant, a generative AI-powered feature of Primo, the library's search tool. Primo Research Assistant takes natural-language queries and chooses academic resources from the library search to produce a brief answer summary and list of relevant resources. This video provides further detail about how the Assistant works.
      You can access Primo Research Assistant directly here, or, if you click "Search" below the search box on the library home page, you will see blue buttons for Research Assistant on the top navigation bar and far right of the Primo page that opens. You will be prompted to log in using your UALR credentials in order to use the Research Assistant.
       
    4. DB will try to find a masters student to do the project below. An important qualification for the student is to be able to attend these meetings weekly to update us on progress and get suggestions from all of us! 
      • Project description: Suppose a generative AI like ChatGPT or Claude.ai was used to write a book about a simply stated task, like "how to scramble an egg," "how to plant and care for a persimmon tree," "how to check and change the oil in your car," or any other question like that. Just ask the AI to provide a step-by-step guide, then ask it to expand on each step with substeps, then ask it to expand on each substep, continuing until you reach 100,000 words or whatever impressive target one might have. (A minimal automation sketch appears after the readings list below.)
    5. We could have a workshop session where we collectively decide what each followup prompt is.
    6. Thanks to DD for providing anonymized transcripts to add to the meeting minutes!
    7. The campus has assigned a group to participate in the AAC&U AI Institute's activity "AI Pedagogy in the Curriculum." IU is on it and may be able to provide updates when available. Maybe every month or so?
    8. Anything else anyone would like to bring up?
  • Here are the latest on readings and viewings
    • Next we will work through chapter 5: https://www.youtube.com/watch?v=wjZofJX0v4M. We got up to 15:50 a while ago, but since it was indeed a while ago, we started from the beginning and went to 15:50 again. Next time we do this video, we will go on from there. (When sharing the screen, we need to click the option to optimize for sharing a video.)
    • We can work through chapter 6: https://www.youtube.com/watch?v=eMlx5fFNoYc
    • We can work through chapter 7: https://www.youtube.com/watch?v=9-Jl0dxWQs8
    • Computer scientists win Nobel prize in physics! https://www.nobelprize.org/uploads/2024/10/popular-physicsprize2024-2.pdf got an evaluation of 5.0 for a detailed reading.
    • We can evaluate https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10718663 for reading & discussion.
    • Chapter 6 recommends material by Andrej Karpathy, https://www.youtube.com/@AndrejKarpathy/videos for learning more.
    • Chapter 6 recommends material by Chris Olah, https://www.youtube.com/results?search_query=chris+olah
    • Chapter 6 recommended https://www.youtube.com/c/VCubingX for relevant material, in particular https://www.youtube.com/watch?v=1il-s4mgNdI
    • Chapter 6 recommended Art of the Problem, in particular https://www.youtube.com/watch?v=OFS90-FX6pg
    • LLMs and the singularity: https://philpapers.org/go.pl?id=ISHLLM&u=https%3A%2F%2Fphilpapers.org%2Farchive%2FISHLLM.pdf (summarized at: https://poe.com/s/WuYyhuciNwlFuSR0SVEt). 6/7/24: vote was 4 3/7. We read the abstract. We could start it any time. We could even spend some time on this and some time on something else in the same meeting. 
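
A minimal sketch of how the book-writing experiment in item 4 might be automated, assuming the OpenAI Python client; the model name and prompts are placeholders, and any chat-capable LLM would do. Note that naively re-sending the whole draft each round will hit context limits quickly, which is itself part of what the project would measure.

# Hedged sketch of the iterative "expand a how-to into a book" experiment.
# Assumptions (not from these minutes): the OpenAI Python client, the model
# name, and the prompts are placeholders; any chat-capable LLM would work.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def ask(prompt):
    """Send a single prompt and return the model's reply text."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def expand_into_book(topic, target_words=100_000, max_rounds=10):
    """Ask for a step-by-step guide, then repeatedly expand every step."""
    draft = ask(f"Provide a step-by-step guide on {topic}.")
    for _ in range(max_rounds):  # each round re-sends the whole draft, so context limits bite fast
        if len(draft.split()) >= target_words:
            break
        draft = ask(
            "Expand each step below into detailed sub-steps. Keep all existing "
            "text and structure, and only add detail:\n\n" + draft
        )
    return draft

book = expand_into_book("how to plant and care for a persimmon tree")
print(len(book.split()), "words")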

Here is a picture made to match this page:

Transcript:

 

Friday, November 15, 2024

11/15/24: Discussion; video review

Machine Learning Study Group

Welcome! We meet from 4:00-4:45 p.m. Central Time. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list as we have no wish to send unneeded emails of which we all certainly get too many. 
 Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu

Agenda & Minutes (138th meeting, Nov. 15, 2024)

Table of Contents
* Agenda and minutes
* Transcript (if available)

Agenda and minutes
  • Announcements, updates, questions, etc.?
    • The campus has assigned a group to participate in the AAC&U AI Institute's activity "AI Pedagogy in the Curriculum." IU is on it and may be able to provide an update when available.
    • A demo of the real-time use of AI to create the Doppler effect interactive animation and perhaps other demos will be scheduled as soon as convenient for RM and VW. Next week we will likely have a demo of this process.
    • Here is a tool the library is providing. Anyone try it? Should we try it together in a meeting? (Yes, some attendees thought it would be a good idea.) Thoughts?

      Library trial of AI-driven product Primo Research Assistant

      Hello,
      The library is testing use of Primo Research Assistant, a generative AI-powered feature of Primo, the library's search tool. Primo Research Assistant takes natural-language queries and chooses academic resources from the library search to produce a brief answer summary and list of relevant resources. This video provides further detail about how the Assistant works.
      You can access Primo Research Assistant directly here, or, if you click "Search" below the search box on the library home page, you will see blue buttons for Research Assistant on the top navigation bar and far right of the Primo page that opens. You will be prompted to log in using your UALR credentials in order to use the Research Assistant.
      We value your feedback on this and other generative-AI resources. Please try it out, share it with anyone who might be interested, and let us know your thoughts here: Feedback form
      Feel free to reach out with any questions or concerns.
      --
      Bonnie Bennet | Discovery and Systems Manager
      University of Arkansas at Little Rock | Ottenheimer Library
      501.916.6563 | bbennet@ualr.edu | https://ualr.edu/library
       
    • Here is another event. Anyone go?  No, but there was a lot of discussion about the abstract below.

      Harness the Power of Generative AI for Good, not Evil
      Register now for the Ark-AHEAD Fall (Virtual) Workshop: Helping Students (and Faculty) Harness the Power of Generative AI for Good, not Evil


      Presenter: Liz McCarron, EdD, MBA, ACC, CALC
      Webinar Nov 14, 2024
      9:30-Noon

      Students quickly adopted Generative AI, but faculty have been slower to get on board. Worried about cheating, many schools banned the technology. But this can hurt neurodiverse students who have adopted GenAI at a higher rate than neurotypical peers. This session will help beginners learn what GenAI is and what it is not, what it can do and what it can’t. Attendees will gain a basic understanding of how ChatGPT works and its key features, capabilities, and limitations. Attendees will also experience creating and refining prompts. We will discuss the ethical implications of using GenAI and how to create assignments that help students use GenAI responsibly. Join us and get inspired to experiment with GenAI to help your students and yourself.

    • Suppose a generative AI like ChatGPT or Claude.ai was used to write a book about a simply stated task, like "how to scramble an egg," "how to plant and care for a persimmon tree," "how to check and change the oil in your car," or any other question like that. Just ask the AI to provide a step-by-step guide, then ask it to expand on each step with substeps, then ask it to expand on each substep, continuing until you reach 100,000 words or whatever impressive target one might have. 
Would this work, would the result be alright, or garbage, or what would it be like? Would it be reasonable to have a master's student do this as a master's project to see what happens? Should the master's student come to these meetings and provide weekly updates and get suggestions from all of us? Yes.
    • Can someone send me read.ai anonymized transcripts to include in the minutes? Yes, DD has kindly volunteered to do that.
    • Anything else anyone would like to bring up?
  • Here are the latest on readings and viewings
    • Next we will work through chapter 5: https://www.youtube.com/watch?v=wjZofJX0v4M. We got up to 15:50, but it has been a while, so we started from the beginning and went to 15:50 again. Next time we do this video, we will go on from there. (When sharing the screen, we need to click the option to optimize for sharing a video.)
    • We can work through chapter 6: https://www.youtube.com/watch?v=eMlx5fFNoYc
    • We can work through chapter 7: https://www.youtube.com/watch?v=9-Jl0dxWQs8
    • Computer scientists win Nobel prize in physics! https://www.nobelprize.org/uploads/2024/10/popular-physicsprize2024-2.pdf got an evaluation of 5.0 for a detailed reading.
    • We can evaluate https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10718663 for reading & discussion.
    • Chapter 6 recommends material by Andrej Karpathy, https://www.youtube.com/@AndrejKarpathy/videos for learning more.
    • Chapter 6 recommends material by Chris Olah, https://www.youtube.com/results?search_query=chris+olah
    • Chapter 6 recommended https://www.youtube.com/c/VCubingX for relevant material, in particular https://www.youtube.com/watch?v=1il-s4mgNdI
    • Chapter 6 recommended Art of the Problem, in particular https://www.youtube.com/watch?v=OFS90-FX6pg
    • LLMs and the singularity: https://philpapers.org/go.pl?id=ISHLLM&u=https%3A%2F%2Fphilpapers.org%2Farchive%2FISHLLM.pdf (summarized at: https://poe.com/s/WuYyhuciNwlFuSR0SVEt). 6/7/24: vote was 4 3/7. We read the abstract. We could start it any time. We could even spend some time on this and some time on something else in the same meeting. 

Transcript:

Fri, Nov 15, 2024

1:35 - Unidentified Speaker 
Well, R. M. gave an excellent talk yesterday at Baptist Health College to the Amateur Radio Emergency Services group.

1:41 - V. W.
And I credit the fact that she did so well with the dress rehearsal that you guys provided her here in the ML seminar. So thank you for that. And, uh, I wrote a recap of it that I'll send you. If you'd like to see it, it's a couple of pages.

1:57 - D. B.
Well, congratulations to her.

1:59 - V. W.
There was enough technical discussion at the end from all the radio experts at the meeting that there was just the proper amount of blood on the floor, you know, like defending dissertation proposals and defenses that will come down the road, but done in a very, say, home-field-advantage situation.

2:19 - Unidentified Speaker 
So, that's good.

2:39 - M. M. 
Are they in the distance?

3:46 - D. B. 
Well, I guess we should get started, even though some more people are signing in.

4:08 - D. B.
Let me share my screen.

4:16 - Unidentified Speaker 
Wrong screen.

4:34 - Unidentified Speaker 
So. Let's see where we are.

4:38 - D. B. 
So last time I. T. told us about this committee she's on as a representative of our STEM college, and I thought, you know, anytime she's here, I'll ask her for an update, but she's not here, so we won't get any update today. We're hoping for a demo of real-time use of AI to do things like creating the Doppler effect interactive animation we saw last time. Next week is the target to have some kind of little short presentation put together. Okay.

5:23 - Unidentified Speaker 
So next.

5:29 - V. W.
Dr. B., I sent you the revised transcript.

5:43 - D. D.
Oh, great.

5:48 - D. B.
I'm going to go ahead and let the read.ai join the meetings as long as somebody will send me the anonymized transcripts. And by anonymized, I mean replace names with initials or something like that. And the easy way to do that is to just paste the transcript into ChatGPT and say, please replace names with initials, something like that.
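
As an alternative to pasting the transcript into ChatGPT, a short script can make the same replacement deterministically; this is only a sketch, and the names, initials, and file name below are hypothetical placeholders rather than the group's actual roster.

# Hedged sketch: replace a known list of names with initials before posting.
# The roster and the file name are made-up placeholders.
import re

NAME_TO_INITIALS = {
    "Jane Doe": "J.D.",
    "Jane": "J.D.",
    "Richard Roe": "R.R.",
    "Richard": "R.R.",
}

def anonymize(text):
    """Replace each name with its initials, longest names first so full names win."""
    for name in sorted(NAME_TO_INITIALS, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(name) + r"\b", NAME_TO_INITIALS[name], text)
    return text

with open("transcript.txt", encoding="utf-8") as f:  # placeholder path
    print(anonymize(f.read()))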

6:37 - D. D. 
It's semi-easy. I had to keep telling it to go past the token limit. I hit the token limit twice. Huh.

6:46 - D. B.
Try Claude.ai, because Claude.ai at least used to have a long context window; you know, that was sort of its virtue.

6:57 - D. D. 
Let me see how I tried first though.

7:01 - D. B.
Okay. You can also go to meta.ai.

7:04 - D. D.
It allows you to use Llama 3 405B, which has a 128K context. I tried Gemini, and it just, over and over again, would stop in the middle and say, "and the text continues on." Even Gemini Advanced is trash compared to these other LLMs.

7:24 - Multiple Speakers 
And I may just not subscribe to it anymore because I'm not sure it's worth it.

7:30 - V. W.
For 20 bucks a month on Poe.com, I get access to all these bots, including the ones mentioned here today. And so I'm not sure Google Gemini Advanced is worth the money.

7:45 - J. K.
What are you using? The only.

7:47 - V. W.
Yeah, Poe.com right now is the deal on wheels.

7:52 - E. G.
Because I sign up for each one individually, like for ChatGPT.

7:57 - V. W.
Well, that'd be so expensive.

7:59 - J. K.
I'd be out of money. The only real innovative edge that I can see to Gemini is, someone approached me recently about auto transcription, like they'll upload a video, and that aspect of Gemini is really impressive. That's true. Yeah. If you have to do anything with video, Gemini is the one to go to. But other than that, I agree with you, V.

8:25 - V. W.
You know, also, I thought that the fact that Gemini has up-to-date web access means it's more continuous training than the others, which are lump-sum training. So there's an advantage to Gemini if you need current information?

8:39 - D. D.
Well, the other aspect is the API. I don't think you can get access to the API through Poe. I think you just get the chat program. But if you really want to get in there where you can access it with Python, I think there's other ways to access it. And I think currently you have access to the system prompt now in chat, but the API is such a powerful tool. I mean, I want to... It also depends on what kind of work you're doing; like for you and E., that API access might be essential.
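
For contrast with the chat interface, here is a minimal sketch of the kind of Python API access being described, assuming the OpenAI Python client; the model name, system prompt, and question are illustrative, and Poe and other providers expose their own, different APIs.

# Hedged sketch of direct API access with a system prompt; the model name and
# prompts are placeholder assumptions, not a specific recommendation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a terse assistant for Python questions."},
        {"role": "user", "content": "Show me how to read a CSV file with pandas."},
    ],
)
print(resp.choices[0].message.content)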

9:20 - V. W.
Yes. But for people who are programming using the LLM, like myself or R., we can come into the LLM and just get work done all day long. Yeah, I've seen that.

9:32 - D. D.
tokens and permissions and all that.

9:34 - V. W.
Yeah. It's amazing what you're doing. Yeah.

9:37 - D. B.
Well, um, D., if it doesn't want to anonymize the entire transcript, you can anonymize it, you know, in two parts or three parts or something. That's what I'm doing. That's right.

9:49 - D. D.
That's my next plan, but I got this week's done, and I'll just keep doing it.

9:58 - D. B.
Okay. Happy to contribute. Well, I appreciate it. Read.AI is involved in this meeting, and I'll just keep it going as long as we get the transcripts that I can post. What else? Okay, well, there's a... E.?

10:16 - E. G.
Yeah, I sent you a message. Is it po.com?

10:20 - Multiple Speakers 
Poe.com, like Edgar Allan.

10:22 - E. G.
Ah. Okay. All right, so the library is providing a tool, Primo Research Assistant. Anyone?

10:30 - D. B.
Well, I'll let you read all this and then we'll see if there's any thoughts about it.

10:41 - V. W.
And the reason for that response is R. just finished a 12-section broad literature survey on 10 specific aspects of her dissertation. And it's just, the results are unbelievable. I've even cataloged the statistics on the use of them. And so, yeah, I would say the library should save its money because everybody can access the stuff nearly for free. So Primo Research Assistant is not good, is what you're saying? I'm saying it may represent overhead for the library that doesn't benefit the institution, although perhaps for people who are in a loop of already going to the library first for all their literature searches, maybe that would be a good access point for them. But it seems redundant to me, given the network, the internet, and the personal computer.

11:41 - D. D.
I mean, Google is still a very powerful search engine, and it will return an adequate number of results, especially Google Scholar. Yeah. I mean, I get a lot from Google Scholar, but I haven't tried it. It might be a really good tool. I think we should test it. Yeah.

12:01 - A. B.
And I know too, like, I've gone to the library for certain articles. Cause like, um, if it's something where you need a, what do you call it, a login or something, an institutional login. Sometimes those are hidden behind some of these databases, like EBSCO and so forth. I don't know, if they use Primo Research Assistant, if it would connect through the library that way and give you access to more data.

12:30 - Multiple Speakers 
That's true, too.

12:31 - D. B.
Well, should we try this thing in one of our meetings, like in real time?

12:37 - D. D.
Yeah, that's a good idea. Yeah, I would vote for that. Any other thoughts on that?

12:44 - A. B.
I give it a 3.9.

12:46 - Unidentified Speaker 
OK. Yes, I agree.

12:47 - M. M.
It's a good idea. With the library, you can also ask for a loan or a copy. They actually have a form on the webpage for how you can request an article that you cannot access. But let's try this. Yeah. All right. Well, I'll schedule it soon, maybe even next time or the time after.

13:13 - D. B.
As soon as I can schedule it. We'll do that. OK. There was an event. I guess it was yesterday. Today is the 15th. This was the event. Anyone go or? Any thoughts on it? I did not go. Here's the blurb about it.

13:47 - M. M.
Can you tell us about this event, please?

13:51 - D. B.
Yeah, I don't know anything about it except that I was wondering if anybody went to it and could tell us. No.

14:02 - M. M.
I'm reading the description right now. Who is talking? It was the speaker over there.

14:12 - M. M.
So many experts in the area.

14:20 - Unidentified Speaker 
It's good.

14:22 - D. B.
My impression of this is, this is sort of the stuff we've been talking about for a couple of years now already, and I'm not sure. You'd have to look at the content, have to know what the content is, to know whether they presented anything that I didn't already know.

14:56 - V. W.
An issue that comes up for us is, does the part of the university that makes sure that people with disabilities have the correct accommodations, are they enabling the use of AI as accommodations for the neurodiverse, but then the neurotypical could come along and say, well, they're getting an unfair advantage over me by the ability to freely use AI without censorship. And then it goes to court and goes way up and it's ruled, okay, everybody can use it, leave us alone. So, you know, you can run the whole thing out in your head and see where it's going to go.

15:32 - D. B.
It's a pretty strong claim that neurodiverse students are using it more. If it's true, I wonder what's going on.

15:41 - J. K.
I think it's an excellent learning tool.

15:45 - V. W.
I mentor a neurodiverse student and I have firsthand witnessed the fact that it levels the playing field for rapid learning.

15:58 - J. K.
Yeah. I attended one of the university disability resource center presentations on technology accessibility, and I asked in that meeting how they view ChatGPT and generative AI, and they were a little skeptical. I think they're still trying to, kind of like V. mentioned, navigate how to recommend these tools. I did send them a prompt that is meant for students with dyslexia, where basically I trained the AI to be aware of the rules that make words difficult to read for people with dyslexia, and then the user can upload a body of text that's standard, and according to those rules, it will retain the meaning but completely change the wording to avoid those challenges. So they have that, but I know from being in that talk that they're still just kind of feeling out exactly how to endorse generative AI for students with learning disabilities and stuff like that. That is fantastic, J., that you wrote a tool that specifically does that.
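
A hedged sketch of the kind of prompt being described; the rewriting rules, client, and model below are illustrative assumptions, not the actual prompt that was shared.

# Hedged sketch of a dyslexia-aware rewriting prompt; the rules listed are
# illustrative assumptions, not the actual rules from the prompt discussed here.
from openai import OpenAI

client = OpenAI()

REWRITE_RULES = (
    "Rewrite the user's text so it keeps the same meaning but: "
    "prefer short, common, regularly spelled words; "
    "avoid visually similar word pairs such as 'form'/'from'; "
    "keep sentences under about 15 words; "
    "and avoid any words the user says they find difficult."
)

def simplify(text, difficult_words=None):
    """Rewrite text according to the rules, honoring user-reported difficult words."""
    user_msg = text
    if difficult_words:
        user_msg += "\n\nWords I find difficult: " + ", ".join(difficult_words)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": REWRITE_RULES},
            {"role": "user", "content": user_msg},
        ],
    )
    return resp.choices[0].message.content

print(simplify("Notwithstanding prior correspondence, remittance is obligatory forthwith."))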

17:14 - Multiple Speakers 
And offline, I'd be interested in learning the tricks that you use to create such a thing.

17:21 - V. W.
That's a novel thing that should be just living on a server and made available to any Euler student or anywhere.

17:29 - J. K. 
Yeah. Well, it's an application of the tool that plays to all its strengths. I mean, there's no reason, again, it's context awareness, make sure that no matter what the verbiage or phrasing or sentence length is, that a student can understand those things. And I'd even included small rules to say, like, if the user indicates that certain words or certain phrases are difficult, that it would accommodate those, just because, I mean, there's a pretty broad spectrum of dyslexia, but I will work to send that to you guys later today. It was a really fun project, and I think it's one that has a lot of merit just for people who need it.

18:14 - V. W.
And you know, not only for text output, but when the output can be spoken back to the student or read back aloud, instead of requiring the student to necessarily read it themselves, that can also circumvent certain disabilities that people carry. And so the fact that you can give your questions and receive the responses as spoken words, as opposed to text, is an advantage for the neurodiverse.

18:41 - J. K.
Yeah. I also think, I mean, I had a discussion not too long ago about how large language models will affect language acquisition in general. I was focused on kids, and the AI, just based on the tools we were discussing, seemed to think that it could cut down the time it takes to acquire language by, I think it was 40% or something, something ridiculous like that. Which, if you think about it, I mean, it's just capable of rewording or making minute adjustments based on the learner. And so we're just going to get to the point where, for most of these basic skills, an AI is so much more capable of speeding up learning and really helping people close gaps. It's going to be astounding, I think.

19:41 - Unidentified Speaker 
Correct.

19:41 - Multiple Speakers 
I'm using Grammarly turned on. Let me just have one second, V.

19:48 - M. M.
And J., also for people for second language learning.

19:53 - J. K.
Yes, absolutely.

19:54 - M. M.
And we have so many requests for kids learning foreign language or any.

20:01 - Multiple Speakers 
My daughter actually is working in this area and I have a lot of stuff in learning foreign languages. Yeah, absolutely.

20:11 - J. K.
We talk a lot about AI as an equalizer or just as a remover of barriers. And I think there's so many brilliant minds that just because English is the de facto for a lot of academia and stuff like that, I mean, it's going to really change the landscape of how people can collaborate regardless of what language they speak or what level they speak at.

20:41 - V. W.
It'll democratize academic participation to a larger cross-section of the world.

20:45 - J. K.
Absolutely, yeah.

20:46 - V. W.
You know, I have wondered to myself whether in 10 years of AI, I will be a babbling caveman saying, me want milk, or I would be more erudite and better able to express myself more fluently. I have wondered what impact it will have on me. But I know with Grammarly, over the last couple of years of using it to just correct typos and rewrite little idioms that aren't very portable, it's really improved my writing to the point that I make those mistakes, quote-unquote, of my own spoken voice less frequently while still retaining the spoken voice of me. And so I'm thinking that this tool is going to elevate us and not de-evolve us or primitivize us.

21:37 - J. K.
The other thing I would mention about just interlanguage conversation or communication is that Google Translate has always been a thing. Or, it hasn't always been a thing, but it predated a lot of this tech. And I've found, again, by using chain prompting and multi-agent setups, if you need something translated into a target language, it helps to have a two-step process where you provide the piece of text and say, while retaining the meaning of this, please remove any language that would make it difficult to translate into Spanish, Chinese, whatever. That's one of the things that it excels at: ironing out, ahead of translation, anything that doesn't translate well.
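
A minimal sketch of the two-step translation chain described above, assuming the OpenAI Python client and a placeholder model: the first call removes idioms that translate poorly, the second call translates the cleaned-up text.

# Hedged sketch of two-step chain prompting for translation; the client, model,
# and prompts are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def ask(prompt):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def translate(text, target_language):
    # Step 1: while retaining the meaning, strip idioms and wordplay that translate poorly.
    neutral = ask(
        f"While retaining the meaning, rewrite this so it contains no idioms, slang, "
        f"or wordplay that would be difficult to translate into {target_language}:\n\n{text}"
    )
    # Step 2: translate the cleaned-up version.
    return ask(f"Translate the following into {target_language}:\n\n{neutral}")

print(translate("That printer finally kicked the bucket, so I'm back to square one.", "Spanish"))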

22:31 - D. B.
So that's, that's been very successful for me. Any, did you have something that you wanted to mention? Somebody did.

22:45 - D. B.
I thought that was V.

22:47 - E. G.
Well, she's not here.

22:48 - V. W.
Her mic may be off. No, she's not on mute.

22:53 - M. M.
No, I'm here. I'm here. Just the camera is not working here. I don't know why. Oh. Yeah, I like mentioned the foreign languages.

23:03 - V. W.
Yes, it's exactly what we're talking about.

23:05 - D. B.
That's fantastic. I've asked ChatGPT to talk to me, but in its response, replace the most common words with their translations in another language, Spanish or something. And with a little pushing, it will do it. It doesn't do a great job, but it will do it. And you can, you know, sort of tune the amount of translation to what you're capable of comprehending, without turning your light reading into a heavy-duty study session.

23:39 - V. W.
and perfect Spanglish. Yeah.

23:42 - E. G.
Um, speaking as a neurodivergent person who's enjoyed the benefits of it with apparently very little of the drawback: one of the best ways of analyzing ChatGPT from a neurodivergent perspective is that we look at patterns, patterns that work and patterns that don't. We don't regurgitate information well. We have to understand what's going on. So yeah, we're a lagging adopter of ChatGPT. The use of ChatGPT or some other large language model, does that make us better? Truthfully, no. I think a lot of times coming into it, understanding the foundational pieces matters. Because when we look at things, you're able to look at one space, at a two-dimensional model; we're looking at the underlying piece. I can't understand something unless I know the underlying concepts involved. So when I look at something like ChatGPT, I didn't use it until maybe a year, year and a half ago, because I had to spend time learning all of the pieces that went into it to really understand how to approach it.

25:10 - J.K.
I would, I mean, I'm also neurodivergent and I use ChatGPT a lot. And I would just say, there's this universal idea that if you design for the outliers, if you design anything, whether it's physical accessibility or digital accessibility, it's the best design out there. I mean, I would say neurodivergent or neurotypical, we can only process so much information. And so I would just say, I believe this statement that neurodiverse students are using it at a higher rate, just because we have to. We have to be tool learners. But I would just add that we're ahead of the curve in a lot of ways. I don't think there is an advantage for neurodivergent students that doesn't exist for neurotypical people at the end of the day. Now, don't get me wrong.

26:13 - E.G.
Like V., Grammarly has been my savior because as I go to present information, I tend to be very succinct.

26:25 - Unidentified Speaker
And I will assume a lot of different pieces are in place, but Grammarly actually forces me into describing.

26:32 - E.G.
So that way they follow the continuity of my thought, because the way I present it is we go from A to B, B to C, C to D. A lot of times I'll go from A to D and skip the B and C, because that's not part of the equation; I'm trying to get us here. But I need to sometimes draw a map, kind of like the old days of MapQuest: take this turn at this point, printed out, and that's all I had.

27:08 - V.W.
That's what Grammarly does for me, right?

27:11 - D.B.
Alright, well. Let's see where we are.

27:14 - Unidentified Speaker
I see I.T. welcome. I thought if you had any updates on the campus effort.

27:21 - D.B.
that you introduced us to last time. I think we'd be glad to hear about it.

27:35 - Unidentified Speaker
He's still muted, you're still muted.

27:41 - D.B.
I don't know, maybe she's not even really there. Anyway, what I thought, you know, with something like this, as long as somebody involved is here, they could keep us updated. I have a question for you all. Supposing a generative AI, let's say ChatGPT, was used to write a book in the following way. You ask it a simple question, how to plant and care for a persimmon tree, how to check the oil in your car, and please give step-by-step instructions. Well, it'll do it, but it'll give you 10 steps to scrambling an egg or checking the oil in your car or planting a persimmon tree. And then you can just ask it to expand on each step with sub-steps, and it'll do that. And you could then ask it for each of these sub-steps, can you expand it in much more detail, look at all those alternatives and all the possibilities and it'll do it. And you could, I mean, what would happen if you continued until you got to a hundred thousand words?

28:53 - Multiple Speakers
There's actually a term for that now that's emerging. And the emerging term is AI slop.

28:58 - V.W.
Now, that said, on the extreme side, if you naively go in and ask ChatGPT how to check and change the oil in your car, you're going to get 10 steps that, if you follow them, your oil will be changed, and you probably wouldn't have done any lasting damage to your car, the threads on the filter, or anything like that. So I would say that that would be how I would delimit it.

29:23 - D.B.
So my question is, what would happen? Would it be a result that would be useful, or would something go wrong? And if it went wrong, what would go wrong? Anyway, so my thought was, would it be reasonable to have a master's student do that as a master's project to see what would happen? And in the process... And the title could be On the Creation of AI Slop en Masse.

29:48 - V.W.
I mean, the question is, with proper prompting, would it come up with something good or not?

29:54 - D.B.
I don't know. Well, then you have the question of just how many steps do you need?

30:00 - V.W.
Because someone who's a car mechanic type is not going to need as lucid or detailed an explanation as someone who's never changed the oil on anything before. So you have a point in that you need more sub-steps as your familiarity with that landscape decreases. So there's a contextualization for any user doing any task, from spinning CD wafers to changing the oil on their car. There's a set of steps that you can enumerate such that a person who can simply follow the instructions could accomplish it. All we're negotiating is the number of steps. And so if anybody tries to work out of their domain... we all have our domains at which we're facile, but if we try to work out of domain, we need instructions. Example: I have this Epson printer that has endless EcoTank ink. I never have to buy ink cartridges again. And it's been the best thing since sliced bread, until after a year of faithful service, the printhead quit working. And so I chatted, and it said, oh, you don't have to buy a new printer, you just have to dive deep in here and change this one thing. And then I did that and I got the answer, with the assistance also of a YouTube video. So even I, although I'm comfortable with mechanical things, didn't know that if you disassemble the printer in the wrong order, not only will you not get it back together, but you'll probably break something expensive in the process. So I see this as sort of, are we in band or out of band for the task being specified?

31:33 - A.B.
But again, these things are, I don't know, I feel like often they're more optimized to give you an answer that sounds right versus is right. And like, you know, if you go back and forth with them, you can convince them to give you a "right" answer. And I feel like with enough prompting, you can say, oh, that's wrong, and then eventually you can kind of arm-wrestle it into giving the answer that you want.

31:58 - V.W.
That comes up a lot when you're trying to write a program with specific objectives and you ask it to do it and it simply stops complying, because about 30 shots in, it loses context and can no longer remember what it was doing, like a person with senility or dementia. So you have to kind of coax it along. So sometimes you have to take the entire previous transcript, load it into another LLM, and get it to pick up where you left off, or figure out how to factor your task. But if your interfaces are complex in the programming that it was doing for you, you have to manually re-articulate all those interfaces, and the leverage that you were getting from the AI disappears. So for simple tasks, scrambling an egg, the persimmon tree, and changing oil, it works; but asking it to write a complex program, which we'll talk about next week, hopefully, there can be definite gotchas, because we have these zero- and three-shot successes that R. benefited from, and then we also have these 30-shot epic failures that never gave us the answer we wanted.

32:56 - J.K.
So I've actually experimented with composition on longer-form writings. And most people, I mean, you'll see a lot of AI slop in the Kindle market these days, and you can read it and pretty quickly identify what is strictly AI-generated. My personal theory is that whenever an author is writing a piece of fiction, we don't really talk about it very much, but you're kind of using different mental processes to write dialogue than you use to write internal dialogue or description. And so to get a quality, human-sounding piece of prose, I've found that instead of focusing on chapter by chapter, you do scene by scene, and you generate one layer at a time. I mean, we come back to chain prompting and decomposing tasks and things like that. But I like to have it generate dialogue between two characters, then the internal dialogue as they're saying those things, and then finally the exposition or narration. And when you layer it like that, you're, again, replicating the fact that a human author is having to shift gears in order to write those different components. So I think what you're describing, where you're gradually expanding or adding complexity to this writing, is entirely possible; it's just kind of counterintuitive, because you have to go in and approach it and say, how is each subcategory of this content different, and how do you apply the rules so it comes together as something that reads naturally.
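
A minimal sketch of the scene-by-scene layering described above, assuming the OpenAI Python client and a placeholder model: dialogue first, then internal monologue, then narration, each pass conditioned on the previous layer.

# Hedged sketch of layered scene generation: dialogue, then internal monologue,
# then exposition, one pass per layer. Client, model, and prompts are assumptions.
from openai import OpenAI

client = OpenAI()

def ask(prompt):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def write_scene(premise):
    dialogue = ask(f"Write only the spoken dialogue for this scene:\n{premise}")
    with_thoughts = ask(
        "Add the viewpoint character's internal thoughts between these lines of "
        f"dialogue, keeping the dialogue itself unchanged:\n{dialogue}"
    )
    return ask(
        "Weave narration and physical description around this dialogue and "
        f"internal monologue so it reads as finished prose:\n{with_thoughts}"
    )

print(write_scene("Two old friends argue about selling the family farm."))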

34:54 - V.W.
And one problem with that is it knows those rules if you bother to specify them in the prompt, but then you run out of the token or GPU allocation that's going to allow it to retain that big a state in memory, from which it could write the correct solution to your problem with its existing training. A lot of times you're hitting up against a gas-pump limit instead of the limit of the training of the model.

35:18 - J.K.
Well, and that's the beauty of going scene by scene, too, is just that you're never, if you optimize for the smallest token window or whatever is the smallest subcomponent of whatever you're trying to write, you're good. You're golden. It's going to be pristine, high quality stuff every time. It's just a little more complicated. I mean, we want it to be a shortcut where we put in a prompt and it spits out a book. But the fact is just that unless you optimize the process and say, I want the highest quality scene and I'm going to generate a dozen scenes for a chapter, you're going to get this very vanilla, odd-sounding fiction, unfortunately.

36:06 - V.W.
It reminds me a little bit of digital image compositing, or even audio compositing, where, with your notion of layers, you have the foreground, you know, the two actors, the over-the-shoulder camera shots and the close-up of their conversation; you also have the background, wherever they are on location; and then you have the middle ground of the extras and so forth that are circulating to make the scene appear real. I think the layering is a really good idea that applies to creation of novel content, as in not new novels but novel novels. And yeah, that's all.

36:43 - Multiple Speakers
I mean, well, anybody else who hasn't had a chance to contribute to this question?

36:48 - D.B.
Have anything they want to add? All right, well, I could look for, I mean, I'm sure I could find a master's student who's willing to do it. If I did find a master's student like that, and they were willing to come to these meetings and provide us weekly updates and take our suggestions, should I do it?

37:11 - A.B.
Yeah, I think it's really interesting.

37:14 - V.W.
Yeah, it is. It'd be good for them too because they could distinguish slop from craft.

37:22 - Unidentified Speaker
Okay.

37:22 - D.B.
Any other comments on that? I'll put a big yes if not. All right. Well, I'll see if I can find somebody, and then they'll have to be able to make these meetings or they won't be a suitable candidate for the project. Their main qualification is that they must be free on Fridays at 4 p.m.

37:47 - V.W.
Well, it could be seminar credit too, right?

37:50 - D.B.
No, the seminar is a different time actually, which is good because then they'd be more likely to be free at 4.

37:58 - V.W.
But I mean, this ML seminar has bordered on something that I have invited students of mine to come to and participate in. I think that this is becoming, it has moved from an informal seminar to actually a thing that we look forward to each week, being able to dive deep into the current milieu of AI.

38:19 - Multiple Speakers
I think attending these meetings could be done for some kind of credit.

38:23 - D.B.
I think it will be possible. I haven't looked into it. It deserves that, I think. All right. Let's see. What else?

38:32 - Unidentified Speaker
All right, well, we got about 10 minutes left.

38:40 - D.B.
I could start, we could go back and review that video. Let me do that.

38:53 - Unidentified Speaker
All right, I probably need to re-share with optimizing for video.

39:05 - D.B.
So let me see if I can do that.

39:14 - Unidentified Speaker
OK. Share. Optimize for video. Sharing.

39:20 - D.B.
Okay. Let me start this thing going.

39:25 - Unidentified Speaker
The initials GPT stand for Generative Pre-trained Transformer. Is that coming through okay? Yes.

39:35 - D.B.
Yes. All right. So that first word is straightforward enough. These are bots that generate new text.

39:46 - Unidentified Speaker
Pre-trained refers to how the model went through a process of learning from a massive amount of data, and the prefix insinuates that there's more room to fine-tune it on specific tasks with additional training. But the last word, that's the real key piece. A transformer is a specific kind of neural network, a machine learning model, and it's the core invention underlying the current boom in AI. What I want to do with this video and the following chapters is go through a visually driven explanation for what actually happens inside a transformer, we're going to follow the data that flows through it and go step by step. There are many different kinds of models that you can build using transformers. Some models take in audio and produce a transcript. This sentence comes from a model going the other way around, producing synthetic speech just from text. All those tools that took the world by storm in 2022 like Dolly and Midjourney that take in a text description and produce an image are based on transformers. Even if I can't quite to understand what a pie creature is supposed to be, I'm still blown away that this kind of thing is even remotely possible. And the original transformer, introduced in 2017 by Google, was invented for the specific use case of translating text from one language into another. But the variant that you and I will focus on, which is the type that underlies tools like ChatGPT, will be a model that's trained to take in a piece of text, maybe even with some surrounding images or sound accompanying it, and produce a prediction for what comes next in the passage. That prediction takes the form of a probability distribution over many different chunks of text that might follow. At first glance, you might think that predicting the next word feels like a very different goal from generating new text. But once you have a prediction model like this, a simple thing you could try to make it generate a longer piece of text is to give it an initial snippet to work with, have it take a random sample from the distribution it just generated, append that sample to the text, and then run the whole process again to make a new prediction based on all the new text, including what it just added. I don't know about you, but it really doesn't feel like this should actually work. In this animation, for example, I'm running GPT-2 on my laptop and having it repeatedly predict and sample the next chunk of text to generate a story based on the seed text. And the story just doesn't actually really make that much sense. But if I swap it out for API calls to GPT-3 instead, which is the same basic model just much bigger, suddenly, almost magically, we do get a sensible story, one that even seems to infer that a pi creature would live in a land of math and computation. This process here of repeated prediction and sampling is essentially what's happening when you interact with ChatGPT or any of these other large language models, and you see them producing one word at a time. In fact, one feature that I would very much enjoy is the ability to see the underlying distribution for each word that it chooses. Let's kick things off with a very high-level preview of how data flows through a transformer. We will spend much more time motivating and interpreting and expanding on the details of each step, but in broad strokes. When one of these chatbots generates a given word, here's what's going on under the hood. First, the input is broken up into a bunch of little pieces. 
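
The repeated predict-and-sample loop described above can be reproduced at home with GPT-2; a minimal sketch, assuming the Hugging Face transformers and torch packages are installed (the video's own code is not shown in this transcript).

# Hedged sketch of the predict-and-sample loop: GPT-2 produces a probability
# distribution over next tokens, we sample one, append it, and repeat.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

ids = tokenizer("Once upon a time, a pi creature", return_tensors="pt").input_ids

for _ in range(50):  # repeatedly predict a distribution and sample the next token
    with torch.no_grad():
        logits = model(ids).logits[0, -1]               # scores for every vocabulary token
    probs = torch.softmax(logits, dim=-1)               # probability distribution over the next token
    next_id = torch.multinomial(probs, num_samples=1)   # sample from that distribution
    ids = torch.cat([ids, next_id.unsqueeze(0)], dim=1) # append the sample and repeat

print(tokenizer.decode(ids[0]))
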
These pieces are called tokens, and in the case of text, these tend to be word or little pieces of words or other common character combinations. If images or sound are involved, then tokens could be little patches of that image or little chunks of that sound. Each one of these tokens is then associated with a vector, meaning some list of numbers, which is meant to somehow encode the meaning of that piece. If you think of these vectors as giving coordinates in some very high-dimensional space, words with similar meanings tend to land on vectors that are close to each other in that space. This sequence of vectors then passes through an operation that's known as an attention block, and this allows the vectors to talk to each other and pass information back and forth to update their values. For example, the meaning of the word model in the phrase a machine learning model is different from its meaning in the phrase a fashion model. The attention block is what's responsible for figuring out which words in the context are relevant to updating the meanings of which other words, and how exactly those meanings should be updated. And again, whenever I use the word meaning, this is somehow entirely encoded in the entries of those vectors. After that, these vectors pass through a different kind of operation and, depending on the source that you're reading, this will be referred to as a multi-layer perceptron or maybe a feed-forward layer, and here the vectors don't talk to each other, they all go through the same operation in parallel. And while this block is a little bit harder to interpret, later on we'll talk about how this step is a little bit like asking a long list of questions about each vector and then updating them based on the answers to those questions. All of the operations in both of these blocks look like a giant pile of matrix multiplications, and our primary job is going to be to understand how to read the underlying matrices. I'm glossing over some details about some normalization steps that happen in between, but this is after all a high-level preview. After that, the process essentially repeats. You go back and forth between attention blocks and multi-layered perceptron blocks, until at the very end, the hope is that all of the essential meaning of the passage has somehow been baked into the very last vector in the sequence. We then perform a certain operation on that last vector that produces a probability distribution over all possible tokens, all possible little chunks of text, that might come next. And like I said, once you have a tool that predicts what comes next given a snippet of text, you can feed it a little bit of seed text and have it repeatedly play this game of predicting what comes next. Sampling from the distribution, appending it, and then repeating over and over. Some of you in the know may remember how long before ChatGPT came into the scene, this is what early demos of GPT-3 looked like, you would have it autocomplete stories and essays based on an initial snippet. To make a tool like this into a chatbot, the easiest starting point is to have a little bit of text that establishes the setting of a user interacting with a helpful AI assistant, what you would call the system prompt, and then you would use the user's initial question or prompt as the first bit of dialogue, and then you have it start predicting what such a helpful AI assistant would say in response. 
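
A toy numpy sketch of the data flow just described: token vectors pass through an attention block, where the vectors exchange information, then a feed-forward block, where each vector is processed independently, and the final vector is turned into a probability distribution over the vocabulary. All sizes and weights here are tiny random stand-ins, not a trained model, and the normalization steps the narration glosses over are omitted too.

# Toy sketch of the described flow: embeddings -> attention -> feed-forward ->
# unembedding -> softmax. Random weights and tiny sizes, purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, d_ff, seq_len = 50, 16, 64, 5

W_E = rng.normal(size=(vocab_size, d_model))  # embedding matrix (one vector per vocabulary entry)
W_Q = rng.normal(size=(d_model, d_model))     # attention weights: queries, keys, values, output
W_K = rng.normal(size=(d_model, d_model))
W_V = rng.normal(size=(d_model, d_model))
W_O = rng.normal(size=(d_model, d_model))
W_1 = rng.normal(size=(d_model, d_ff))        # feed-forward (multi-layer perceptron) weights
W_2 = rng.normal(size=(d_ff, d_model))
W_U = rng.normal(size=(d_model, vocab_size))  # unembedding back to vocabulary scores

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

tokens = rng.integers(0, vocab_size, size=seq_len)  # pretend these came from a tokenizer
x = W_E[tokens]                                     # (seq_len, d_model) token vectors

# Attention block: the vectors "talk to each other" via weighted combinations.
Q, K, V = x @ W_Q, x @ W_K, x @ W_V
attn = softmax(Q @ K.T / np.sqrt(d_model))          # how much each position attends to each other
x = x + (attn @ V) @ W_O                            # residual update of every vector

# Feed-forward block: every vector goes through the same operation independently.
x = x + np.maximum(x @ W_1, 0) @ W_2                # ReLU MLP with a residual connection

# The last vector yields a probability distribution over possible next tokens.
next_token_probs = softmax(x[-1] @ W_U)
print(next_token_probs.shape, round(next_token_probs.sum(), 6))  # (50,) 1.0
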
There is more to say about an added step of training that's required to make this work well, but at a high level, this is the general idea. In this chapter, you and I are going to expand on the details of what happens at the very beginning of the network, at the very end of the network, and I also want to spend a lot of time reviewing some important bits of background knowledge, things that would have been second nature to any machine learning engineer by the time transformers came around. If you're comfortable with that background knowledge and a little impatient, you can probably feel free to skip to the next chapter, which is going to focus on the attention blocks, generally considered the heart of the transformer. After that, I want to talk more about these multi-layer perceptron blocks, how training works, and a number of other details that will have been skipped up to that point. For broader context, these videos are additions to a mini-series about deep learning. And it's okay if you haven't watched the previous ones, I think you can do it out of order. But before diving into transformers specifically, I do think it's worth making sure that we're on the same page about the basic premise and structure of deep learning. At the risk of stating the obvious, this is one approach to machine learning, which describes any model where you're using data to somehow determine how a model behaves. What I mean by that is, let's say you want a function that takes in an image and it produces a label describing or our example of predicting the next word given a passage of text, or any other task that seems to require some element of intuition and pattern recognition. We almost take this for granted these days, but the idea with machine learning is that rather than trying to explicitly define a procedure for how to do that task in code, which is what people would have done in the earliest days of AI, instead you set up a very flexible structure with tunable parameters, like a bunch of knobs and dials, and then somehow you use many examples of what the output should look like for a given input to tweak and tune the values of those parameters to mimic this behavior. For example, maybe the simplest form of machine learning is linear regression, where your inputs and your outputs are each single numbers, something like the square footage of a house and its price. And what you want is to find a line of best fit through this data, you know, to predict future house prices. That line is described by two continuous parameters, say the slope and the y-intercept, and the goal of linear regression is to determine those parameters to closely match the data. Needless to say, deep learning models get much more complicated. GPT-3, for example, has not two, but 175 billion parameters. But here's the thing, it's not a given that you can create some giant model with a huge number of parameters without it either grossly overfitting the training data, or being completely intractable to train. Deep learning describes a class of models that in the last couple decades have proven to scale remarkably well. What unifies them is that they all use the same training algorithm. It's called backpropagation. We talked about it in previous chapters. And the context that I want you to have as we go in is that in order for this training algorithm to work well at scale, these models have to follow a certain specific format. 
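
The linear regression example mentioned above, with its two parameters (slope and intercept) fit by least squares; the square-footage and price numbers are made up purely for illustration.

# Two-parameter linear regression (slope and intercept) fit by least squares.
# The square-footage and price data are invented for illustration.
import numpy as np

sqft = np.array([850, 1200, 1500, 1900, 2300], dtype=float)
price = np.array([150_000, 210_000, 255_000, 320_000, 380_000], dtype=float)

A = np.column_stack([sqft, np.ones_like(sqft)])          # columns: x and 1 (for the intercept)
(slope, intercept), *_ = np.linalg.lstsq(A, price, rcond=None)

print(f"price ~ {slope:.1f} * sqft + {intercept:.1f}")
print("predicted price for 1700 sqft:", slope * 1700 + intercept)
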
And if you know this format going in, it helps to explain many of the choices for how a transformer process is which otherwise run the risk of feeling kind of arbitrary. First, whatever kind of model you're making, the input has to be formatted as an array of real numbers. This could simply mean a list of numbers, it could be a two-dimensional array, or very often you deal with higher dimensional arrays, where the general term used is tensor. You often think of that input data as being progressively transformed into many distinct layers, where again, each layer is always structured as some kind of array of real numbers until you get to a final layer which you consider the output. For example the final layer in our text processing model is a list of numbers representing the probability distribution for all possible next tokens. In deep learning, these model parameters are almost always referred to as weights and this is because a key feature of these models is that the only way these parameters interact with the data being processed is through weighted sums. You also sprinkle some nonlinear functions throughout but they won't depend on parameters. Typically though, instead of seeing the weighted sums all naked and written out explicitly like this, you'll instead find them packaged together as various components in a matrix-vector product. It amounts to saying the same thing if you think back to how matrix-vector multiplication works, each component in the output looks like a weighted sum. It's just often conceptually cleaner for you and me to think about matrices that are filled with tunable parameters that transform the vectors that are drawn from the data being processed. For example, those 175 billion weights in GPT-3 are organized into just under 28,000 distinct matrices. Those matrices in turn fall into eight different categories, and what you and I are going to do is step through each one of those categories to understand what that type does. As we go through, I think it's kind of fun to reference the specific numbers from GPT-3 to count up exactly where those 175 billion come from Even if nowadays there are bigger and better models, this one has a certain charm as the first large-language model to really capture the world's attention outside of ML communities. Also, practically speaking, companies tend to keep much tighter lips around the specific numbers for more modern networks. I just want to set the scene going in that as you peek under the hood to see what happens inside a tool like ChatGPT, almost all of the actual computation looks like matrix-vector multiplication. There's a bit of a risk of getting lost in the sea of billions of numbers, but you should draw a very sharp distinction in your mind between the weights of the model, which I'll always color in blue or red, and the data being processed, which I'll always color in gray. The weights are the actual brains. They are the things learned during training, and they determine how it behaves. The data being processed simply encodes whatever specific input is fed into the model for a given run, like an example snippet of text. With all of that as foundation, let's dig into the first step of this text processing example, which is to break up the input into little chunks and turn those chunks into vectors. 
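
A tiny numpy check of the claim that each component of a matrix-vector product is just a weighted sum, with the matrix holding the tunable weights and the vector holding the data being processed; the numbers are arbitrary.

# Each entry of a matrix-vector product is a weighted sum of the vector's entries,
# with the matrix rows holding the weights. Arbitrary illustrative numbers.
import numpy as np

W = np.array([[2.0, -1.0, 0.5],   # the "weights" (tunable parameters)
              [0.0,  3.0, 1.0]])
x = np.array([4.0, 2.0, -2.0])    # the data being processed

print(W @ x)                                                 # [5. 4.]
print([sum(w * xi for w, xi in zip(row, x)) for row in W])   # the same weighted sums, written out
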
I mentioned how those chunks are called tokens, which might be pieces of words or punctuation, but every now and then in this chapter, and especially in the next one, I'd like to just pretend that it's broken more cleanly into words. Because we humans think in words, this'll just make it much easier to reference little examples and clear verify each step. The model has a predefined vocabulary, some list of all possible words, say 50,000 of them, and the first matrix that we'll encounter, known as the embedding matrix, has a single column for each one of these words. These columns are what determines what vector each word turns into in that first step. We label it WE, and like all the matrices we see, its values begin random, but they're going to be learned based on data. Turning words into vectors was common practice in machine learning long before transformers, but it's a little weird if you've never seen it before, and it sets the foundation for everything that follows, so let's take a moment to get familiar with it. We often call this embedding a word, which invites you to think of these vectors very geometrically, as points in some high-dimensional space. Visualizing a list of three numbers as coordinates for points in 3D space would be no problem, but word embeddings tend to be much, much higher dimensional. In GPT-3, they have 12,288 dimensions. And as you'll see, it matters to work in a space that has a lot of distinct directions. In the same way that you could take a two-dimensional slice through a 3D space and project all the points onto that slice, for the sake of animating word embeddings that a simple model is giving me, I'm going to do an analogous thing by choosing a three-dimensional slice through this very high-dimensional space and project the word vectors down onto that and displaying the results. The big idea here is that as a model tweaks and tunes its weights to determine how exactly words get embedded as vectors during training, it tends to settle on a set of embeddings where directions in the space have a kind of semantic meaning. For the simple word-to-vector model I'm running here, if I run a search for all the words whose embeddings are closest to that of tower, you'll notice how they all seem to give very similar tower-ish vibes. And if you want to pull up some Python and play along at home, this is the specific model that I'm using to make the animations. It's not a transformer, but it's enough to illustrate the idea that directions in the space can carry semantic meaning. A very classic example of this is how if you take the difference between the vectors for woman and man, something you would visualize as a little vector in the space connecting the tip of one to the tip of the other, it's very similar to the difference between king and queen. So let's say you didn't know the word for a female monarch, you could find it by taking king, adding this woman minus man direction, and searching for the embeddings closest to that point. At least, kind of. Despite this being a classic example for the model I'm playing with, the true embedding of queen is actually a little farther off than this would suggest, presumably because the way that queen is used in training data is not merely a feminine version of king. When I played around, family relations seemed to illustrate the idea much better. The point is, it looks like during training, the model found it advantageous to choose embeddings such that one direction in this space encodes gender information. 
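
The tower neighbors and the king - man + woman arithmetic can be tried at home with gensim's downloadable vectors; a hedged sketch, since the transcript does not name the exact word-to-vector model the narrator uses (glove-wiki-gigaword-50 is just a small stand-in that downloads on first use).

# Hedged sketch of the embedding arithmetic described above, using gensim's
# downloadable GloVe vectors as a stand-in for the video's unnamed model.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # small pre-trained vectors, downloaded on first use

# Words nearest to "tower" share tower-ish meaning.
print(vectors.most_similar("tower", topn=5))

# king + (woman - man) lands near "queen".
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=5))
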
Alright.

55:40 - D.B.
Okay, so that was review. Any comments or questions about that? All right. Well, we'll start from there next time we do this. That was a repeat, so I didn't stop it in the middle. Any last points before we adjourn?

56:07 - D.D.
It never gets less cool.

56:10 - D.B.
Okay. Good. I'm glad it was the truth.

56:14 - D.D.
That is the truth.

56:16 - D.B.
Okay. Have a good afternoon. Alright. Bye everyone.

56:20 - D.D.
Bye guys. Thanks for coming.