Artificial Intelligence Study Group
Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu
Agenda & Minutes (174th meeting, Aug. 15, 2025)
Table of Contents
* Agenda and minutes
* Appendix: Transcript (when available)
Agenda and Minutes
- Announcements, updates, questions, etc.
- EG and DD are working on slides surveying different ML models.
- VW will demo his wind tunnel system at some point.
- ES will provide a review of The AI-Driven Leader: Harnessing AI to Make Faster, Smarter Decisions, by Geoff Woods, when we request it.
- "Join us for a thought-provoking lecture and book signing with renowned economist and King’s College London professor Daniel Susskind as part of the CBHHS Research Symposium."
Thursday, September 4, 2:00 p.m., UA Little Rock, University
Theatre – Campus conversation
Friday, September 5, 2:00 p.m., UA Little Rock, University
Theatre – Campus and community conversation
Susskind, a leading voice on the future of work and technology, will explore how artificial intelligence is reshaping the workplace and how we can harness its potential to work smarter. Don’t miss this opportunity to engage with one of today’s most influential thinkers on AI, economics, and the future of our professions.
- DD generously did an informal demo on local LLMs. Thanks!
- Here are projects that MS students can sign up for. If anyone has an idea for an MS project where the student reports to us for a few minutes each week for discussion and feedback - a student might potentially be recruited! Let me know.
- Book writing project
- 8/15/2025: LG has signed up for this.
- Topic of book will be: personal investing
- Committee: DB, MM, RS; Y will apply for AGFS and then can be on the committee.
- Does the Donaghey Scholars program have guidelines on report structure? IU suggests LG could contact Dr. S. Hawkins and/or Dr. J. Scott, who are involved in the program, to see if they have any such guidelines.
- VW had some specific AI-related topics that need books about them.
- JH suggests a project in which AI is used to help students adjust their resumes to match key terms in job descriptions, to help their resumes bubble to the top when the many resumes are screened early in the hiring process.
- JC suggested: social media platforms use AI to decide what to present to users, the notorious "algorithms." Suggestion: a social media cockpit from which users can say what sorts of things they want. Screen scrape the user's feeds from social media outputs to find the right stuff. Might overlap with COSMOS. Project could be adapted to either tech-savvy CS or application-oriented IS or IQ students.
- DD suggests having a student do something related to Mark Windsor's presentation. He might like to be involved, but this would not be absolutely necessary.
- markwindsorr@atlas-research.io writes on 7/14/2025:
Our research PDF processing and text-to-notebook workflows are now in beta and ready for you to try.
You can now:
- Upload research papers (PDF) or paste in an arXiv link and get executable notebooks
- Generate notebook workflows from text prompts
- Run everything directly in our shared Jupyter environment
This is an early beta, so expect some rough edges - but we're excited to get your feedback on what's working and what needs improvement.
Best, Mark
P.S. Found a bug or have suggestions? Hit reply - we read every response during beta.
Log In Here: https://atlas-research.io - Any questions you'd like to bring up for discussion, just let me know.
- Anyone read an article recently they can tell us about next time?
- Any other updates or announcements?
- Here is the latest on future readings and viewings. Let me know of anything you'd like to have us evaluate for a fuller reading, viewing or discussion.
- 7/25/25: eval was 4.5 (over 4 people). https://transformer-circuits.pub/2025/attribution-graphs/biology.html.
- https://arxiv.org/pdf/2001.08361. 5/30/25: eval was 4. 7/25/25: vote was 2.5.
- We can evaluate https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10718663 for reading & discussion. 7/25/25: vote was 3.25 over 4 people.
- Evaluation was 4.4 (6 people) on 8/8/25: https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-refusals
- Evaluation was 3.87 on 8/8/25 (6 people voted): https://venturebeat.com/ai/anthropic-flips-the-script-on-ai-in-education-claude-learning-mode-makes-students-do-the-thinking
- Evaluation was 3.5 by 6 people on 8/8/25: Put the following into an AI and interact - ask it to summarize, etc.: Towards Monosemanticity: Decomposing Language Models With Dictionary Learning (https://transformer-circuits.pub/2023/monosemantic-features/index.html); Bricken, T., Templeton, A., Batson, J., Chen, B., Jermyn, A., Conerly, T., Turner, N., Anil, C., Denison, C., Askell, A., Lasenby, R., Wu, Y., Kravec, S., Schiefer, N., Maxwell, T., Joseph, N., Hatfield-Dodds, Z., Tamkin, A., Nguyen, K., McLean, B., Burke, J.E., Hume, T., Carter, S., Henighan, T. and Olah, C., 2023. Transformer Circuits Thread.
- Evaluation was 3.75 by 6 people on 8/8/25 for: Use the same process as above but on another article.
- https://www.nobelprize.org/uploads/2024/10/popular-physicsprize2024-2.pdf.
- https://www.forbes.com/sites/robtoews/2024/12/22/10-ai-predictions-for-2025/
- Prompt engineering course: https://apps.cognitiveclass.ai/learning/course/course-v1:IBMSkillsNetwork+AI0117EN+v1/home. (Volunteer?)
- Neural Networks, Deep Learning: The basics of neural networks, and the math behind how they learn, https://www.3blue1brown.com/topics/neural-networks. (We would need to pick a specific one later.)
- LangChain free tutorial, https://www.youtube.com/@LangChain/videos. (The evaluation question is, do we investigate this any further?)
- Chapter 6 recommends material by Andrej Karpathy, https://www.youtube.com/@AndrejKarpathy/videos for learning more. What is the evaluation question? "Someone should check into these and suggest something more specific"?
- Chapter 6 recommends material by Chris Olah, https://www.youtube.com/results?search_query=chris+olah
- Chapter 6 recommended https://www.youtube.com/c/VCubingX for relevant material, in particular https://www.youtube.com/watch?v=1il-s4mgNdI
- Chapter 6 recommended Art of the Problem, in particular https://www.youtube.com/watch?v=OFS90-FX6pg
- LLMs and the singularity: https://philpapers.org/go.pl?id=ISHLLM&u=https%3A%2F%2Fphilpapers.org%2Farchive%2FISHLLM.pdf (summarized at: https://poe.com/s/WuYyhuciNwlFuSR0SVEt). (Old eval from 6/7/24 was 4 3/7.)
- Schedule back burner "when possible" items:
- TE is in the informal campus faculty AI discussion group.
- SL: "I've been asked to lead the DCSTEM College AI Ad Hoc Committee. ... We’ll discuss AI’s role in our curriculum, how to integrate AI literacy into courses, and strategies for guiding students on responsible AI use."
- Anyone read an article recently they can tell us about?
- The campus has assigned a group to participate in the AAC&U AI Institute's activity "AI Pedagogy in the Curriculum." IU is on it and may be able to provide updates now and then.
Appendix: Transcript
Artificial Intelligence Study Group
0:09 - M. M.
Well, in the beginning of semester, almost no people here.
1:16 - D. B.
Oh my goodness. Well, next week we will have more people.
1:23 - M. M.
Yeah, well, so it's only 401, we can give them the 402, I guess. I sent you some seminars, from NVIDIA, but together with Amazon. Yeah. OK. I did.
1:41 - D. B.
I saw that. Yeah. Yeah. So.
1:44 - M. M.
And I can I can ask my students to make a presentation. There are several good presentations that they can do. Oh, good. Yeah.
1:57 - D. B.
Just just encourage your students to join if they want to. And if can give a presentation, we'll go ahead and schedule it.
2:09 - Unidentified Speaker
Yes.
2:10 - M. M.
Someone just needs to let me know. Yeah, we'll start.
2:15 - D. B.
Actually, in theory, D. was gonna do a demo on local LLMs today, but he probably forgot. Oh no, he's only, he's not, he said he has to get home. He won't get home till about 4.50. We may not want to do a demo today if it's only like a very lightly attended meeting. Yes, OK, maybe, maybe you're right. Yeah, we can. Meanwhile, we can. Welcome L. You're muted.
2:53 - Unidentified Speaker
Thank you, Dr. Berleant.
2:56 - L. G.
Nice to see all of you. Guys again. Well, it's not that many of us. Oh, no. It's not that many of us today.
3:07 - D. B.
Same old people. I think we can talk about your project, even though it's a very small meeting today. So, L. is writing a book using an AI. Okay. Whatever her name was last semester. Ms. Turkman, forgot her first name. Anyway, if we go back down here to the book projects. Okay, so L has indeed signed up to do a book project. And why don't we talk about that? I mean, let me just give you a little background. That other student did it last semester and we had some discussion and so on about how to improve book projects and make them be more, you know, be better. So you're going to be the next guinea pig. You'll have the benefit of what we did wrong advising the previous student. It was our fault, not her fault. And hopefully it'll work better. So what we would like to do is have you produce both a book and also a report explaining, you know, what you did each step of the way.
4:38 - L. G.
With various, you know, the lessons learned and so on.
4:43 - D. B.
Okay. So I mean, you remember a little bit about this from last semester. So yeah, I mean, I guess your first step then is to decide on a topic to write a book about.
4:59 - L. G.
Well, I don't know. I still want to do a book on personal investing, I think. I think that would still be a good book.
5:09 - D. B.
OK.
5:10 - L. G.
Because it's colloquial enough that it's not academic research and most of the research has been in producing academic papers, but it should be a lot of material to produce one in such a widely written about topic like personal investing.
5:27 - D. B.
Yeah, there's a lot of advice out there and you'd think these AIs would hopefully absorb it and not hallucinate too much.
5:36 - L. G.
Because your money is involved, you don't want them to be too critical. Now, I guess when I think about, you know, so I think there's, so when I went down the path last time, I think I was trying to do retirement planning, which is a little more difficult because it would have a lot more hallucinations in my opinion at the time. But I also tried the idea of using different agents to do things. And that's kind of where I got stuck, because they would be miscommunicating, and then there was like an explosion of more AI tools kind of in the middle of last semester.
6:16 - Unidentified Speaker
Yeah, I mean these things are changing, you know, week by week. Well, all right, so in terms of a committee, you need a committee of three people. Well, how about F. and R.?
6:28 - D. B.
He disappeared. Where's R.? He's there.
6:30 - R. S.
He was there. Well, okay, he just left.
6:32 - L. G.
You're right. He was there until then, but I don't know.
6:36 - Unidentified Speaker
R. is here. OK, good.
6:38 - D. B.
We can go. How about me, F., and R.? Is that going to work for you guys?
6:45 - L. G.
That works well for me. I guess my question would be kind of, I think it was Dr. Milanova who suggested using the agents for the book. Well, we can talk about how to do that.
7:01 - D. B.
Yeah, I mean, why not do it?
7:04 - Unidentified Speaker
Try it.
7:06 - M. M.
Yeah. Yeah, you can do it. And it's good topic.
7:12 - D. B.
I love it.
7:14 - M. M.
But there are many aspects, depending the age audience, if it's for young people or kids or parents or retired people. Everybody invest in different way, you know?
7:33 - L. G.
Yeah. Figure out who the audience would be and kind of what our market plan would be for that.
7:44 - M. M.
Exactly. So be more specific, I think, will be beneficial, you know, because otherwise very general, general people don't touch the feelings of people. OK, so we've got three people.
7:57 - D. B.
You can have more than three. Anyone else want to be on L.'s project committee?
8:04 - Unidentified Speaker
I don't know whether I'm eligible or not. I'll officially become a teacher, but I don't have a PhD.
8:11 - D. B.
Well, you don't need a PhD, but you need to have affiliate graduate faculty status, which is easy to get, but you have to apply for it. If you applied for it, you'd know it. But if you don't remember, you probably haven't applied for it.
8:29 - Unidentified Speaker
I have not applied for it.
8:32 - Y.
So I can work with you so that if not now, in future, If you need, I'll be able to help.
8:39 - D. B.
OK, yeah, if you apply for it, then we can start putting on committees.
8:44 - M. M.
I can help you, Y., too. We have to fill the form and just submit the form, so it's easy. OK, ma'am, I'll work with you. Yeah, I can help with this.
8:54 - Y.
So I guess for now, I can't.
8:57 - M. M.
So then maybe we'll have to do somebody. They have meetings not every month, I think, because it goes to graduate school, but you still can be listed like a member and in the process to be approved.
9:13 - Unidentified Speaker
Okay.
9:14 - Y.
Sounds good. Happy to help them. Yeah.
9:17 - M. M.
If you are interested to see the work.
9:20 - Unidentified Speaker
Yeah.
9:20 - D. B.
And I think V. already has- V. has. Has it, so he could be on it, and I. could also be on it if she wants.
9:31 - V. W.
little fuller, I would jump at the opportunity. I wouldn't want to leave you shorthanded there. All right.
9:38 - D. B.
So anything else, L., or anything else we should talk to L. about before? I mean, you want to get something done by next meeting?
9:47 - L. G.
Is it possible to see the other project? Yeah, yeah.
9:50 - D. B.
I'll show you where the other project is. Okay. Now, okay, yeah, I'll do that.
9:56 - R. S.
I'll go here.
9:57 - D. B.
I'm going to go to today's, go to this website. Ain't what it used to be, blah, blah, blah. If you go down here on the right-hand column, there's some, scroll up again here. But there is no agents here, so it's different.
10:14 - M. M.
What you want to do will be different. Oh, yeah.
10:18 - Unidentified Speaker
No, I was more interested, you know, maybe look at the book, but also to see the report, because that's what I'm trying to figure out, kind of Well, the problem is, Maybe she used some metrics that maybe I could view. Maybe there's some common ground there.
10:34 - L. G.
The problem is, and everyone was complaining about, the problem is she didn't make a report. She just wrote a book. OK.
10:41 - D. B.
So we're asking the next student, that's you, to do both, write a book and write a report explaining what you did and what the lessons are and what happened with the things you did and so on.
10:52 - L. G.
OK. And you know what was funny is that I think last semester I got I'm really kind of into trying to figure out what the report would look like, but you'd never asked for that. I kind of assumed there had to be a report, like, you know, kind of like, So you're going to be the guinea pig. What did it work? What were the metrics I'm using to figure out how it worked? You know, that kind of thing.
11:15 - Unidentified Speaker
So we don't have an answer, but you're going to give us the answer.
11:20 - L. G.
How about that? That's OK. That's part of doing a research project.
11:24 - D. B.
All right. Anyway, this is the book that she wrote.
11:26 - L. G.
She wrote a book about gardening.
11:29 - Unidentified Speaker
E. was her name. Yeah, she was a nice person.
11:33 - Unidentified Speaker
And here's the book.
11:34 - L. G.
All right, well, I will get to working and bring something back next week to talk about this time next week.
11:42 - I. U.
The Donaghey Scholars people often ask their folks who are doing some project, like writing a book or a chapter of a book or whatever, to produce a report kind of like they're asking you to produce. It might be worth contacting Dr. H. or Dr. S. over in the Donaghey Scholars Program and asking them if they have some guidelines or something that might help you with that. We lost your sound. I would definitely do that.
12:18 - V. W.
Thank you for the suggestion.
12:20 - L. G.
Sure. Okay.
12:21 - R. S.
So is L. going to be using artificial intelligence for personal investing? Is that what the book is about?
12:30 - Unidentified Speaker
Yeah.
12:30 - D. B.
Writing a book on personal investing.
12:33 - Unidentified Speaker
Yeah.
12:34 - R. S.
Yeah.
12:34 - L. G.
My goal is to write a book using some type of AI agentic method about personal investing.
12:41 - Unidentified Speaker
And then come up with a way of measuring some quality metrics or some process metrics to go along with it.
12:51 - D. B.
Who was the other one?
12:54 - L. G.
It was H. Who was the other person?
12:58 - I. U.
S., Dr. S., she said.
13:00 - R. S.
J. S. from anthropology.
13:02 - I. U.
I could, let me, let me grab their emails and I'll stick them in the chat. OK, thank you.
13:17 - D. B.
What was Doctor H.'s first name or first initial S.? Yeah, I don't.
13:24 - I. U.
I don't.
13:25 - D. B.
I don't like to put too much identifying information in the notes in the minutes, cause it's all public, yeah?
13:36 - I. U.
Let me. Wander off and find their emails here.
13:42 - Unidentified Speaker
So while Dr.
13:55 - I. U.
U. is looking emails, L
14:08 - D. B.
Do you want to get started early and give us a brief report next Friday? Yes. Okay. Be real quick, maybe five minutes or even less. Any questions, we can give some guidance and we'll just go do that every week. All right. Thank you so much, Dr.
14:32 - L. G.
Berleant. Yeah, no problem.
14:34 - I. U.
May I ask Y. a question? Sure.
14:39 - D. D.
Hey, Y., did you get my email? No, I'm sorry. Do you want me to check right now?
14:51 - Y.
Well, I sent you an email.
14:54 - D. D.
I think I answered your question.
15:00 - Y.
I gave you the listed size.
15:03 - D. D.
I think they're all listed as 20 gigs, but I didn't check in the file system.
15:11 - Y.
I'm sorry, I'll get back to you. I now I'm seeing the email. Sorry about that. It's OK. Alright.
15:20 - D. D.
Good, alright, well, so we're getting started early.
15:24 - D. B.
I know this class hasn't even started yet. We haven't started yet, but we're starting. And we'll talk to you again next week, L. Of course, you certainly can stay for the rest of today. I think, D., you were going to, I guess, maybe it was tentative or definite, I don't know, you had agreed to do a demo on local LLMs. Were you planning on doing that today?
15:52 - D. D.
Yeah, I was. I made it home.
15:55 - D. B.
The traffic was surprisingly thin and I have made it back from Little Rock. All right, well, I'm going to stop sharing my screen so you can share your screen. Well, the other option is we could wait till next week.
16:15 - Unidentified Speaker
We just don't have that many people here today, but I mean, I'm into it if you're willing.
16:25 - D. B.
I mean, people knew about it, they could be here.
16:30 - D. D.
So if you're ready to show it, I think I'm ready.
16:36 - Unidentified Speaker
So what I have is I have a machine over here.
16:41 - D. D.
Next to me, or semi next to me, that's running Ubuntu. That is, I think, maybe one of the easier ones. It's got the most forums and things like that. So if you're trying to do something in Linux, you're bound to find somebody who's probably already done it. And so I went to Ollama. Ernie showed me how to go to Ollama and get models. So I went to Ollama and I found the models that I could run on that machine. Now, I built that machine for tasks like this. It's got two 3060 graphics cards. So that gives me 24 gigabytes of VRAM. And so that puts me right in the neighborhood of 32 billion parameters for a model size. So I think they're kind of high on the small end. So there's 70 billion parameters. And then after that, I think it's 100 plus billion parameters. So I've got Qwen 3, and I've got DeepSeek R1, and I've got Qwen 2.5 Coder.
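The sizing rule of thumb here (24 GB of VRAM landing "in the neighborhood of 32 billion parameters") can be sanity-checked with a back-of-the-envelope calculation. This is a hedged sketch, not a precise formula: it assumes roughly 4-bit quantization (about 0.5 bytes per parameter, typical of Ollama's default quantized downloads) plus ~20% overhead for KV cache and activations; the function name and constants are illustrative.

```python
def fits_in_vram(params_billions, vram_gb,
                 bytes_per_param=0.5,   # assumes ~4-bit quantization
                 overhead=1.2):         # rough allowance for KV cache, activations
    """Back-of-the-envelope check: does a quantized model fit in VRAM?"""
    needed_gb = params_billions * bytes_per_param * overhead
    return needed_gb <= vram_gb

# Two 3060 cards = 24 GB of VRAM, as in the demo:
print(fits_in_vram(32, 24))   # 32B-class model needs ~19 GB -> True
print(fits_in_vram(70, 24))   # 70B-class model needs ~42 GB -> False
```

On this estimate a 32B model needs about 19 GB and a 70B model about 42 GB, which matches the experience described in the demo that 24 GB supports 32B-class models but not 70B-class ones.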
18:39 - D. D.
This coder is rated really high for coding. It's supposed to be really good. So that's the main reason why I got it, because I like to build stuff. And I was just testing it to see how it worked. And I think it was V. that told me, it's like, you know, you just got to go big. And so it's really hard for me to use these models that much because I have access to such powerful models. I ran, and this interface is Open WebUI. It's free and open source, and I got it from GitHub. It was really easy to install. It saves my chats over here. I got a workspace. I just went ahead and made three different models: I got a web searcher, I got a chat bot, and I got a pair programmer.
19:45 - D. B.
What is a model? I'm sorry, what? What is a model in this situation?
19:52 - D. D.
Okay, the models were, so they're, what is a model for each one of these, or what is the
20:01 - D. B.
No, what is Yeah, just general definition.
20:04 - D. D.
General definition of a model is a large language model.
20:08 - D. B.
So are these chatbots?
20:10 - D. D.
The chatbot here, yeah, this is a large language model. I just made it. So let's say that I wanted to make a model. I just would hit a plus button somewhere. I think I have a plus button somewhere. I think this has got me jammed up. It's upper right.
20:31 - V. W.
It's a really tiny plus.
20:33 - D. D.
There it is. Yeah. So the chat stuff, you know, the zoom is all surrounding me. All right. So I go up here and then somewhere I would pick the base model. So out of the three models that I have, say that I wanted to make... Let's say I wanted to make the coder. This particular coder is just going to focus on, what would be a good thing? How about engineer? Obviously, software engineer. I'm just going to give him the PlantUML tool. That's the only thing he needs access to, but he's got a code interpreter, citation, web search, file upload, and vision. I don't think he needs vision, but he might. So I'm going to make him, I'm going to, maybe there's a place for naming, there it is. So we're going to call him Bert, because Bert's a cool name.
21:52 - Unidentified Speaker
It's also highly used. It's highly. Okay. We'll change it.
21:57 - D. D.
We'll change his name to Frank. How about that?
22:02 - V. W.
There you go. Frank's a solid name.
22:05 - D. D.
All right. So, uh, I think, uh, I think it's done. I better save and create. Hold on. So now I've got a new model named Frank. Now I want to use Frank. But normally what I would do though is I would give Frank a directive to use PlantUML. But let's say I want PlantUML code. Or. So I got an idea. Pick something with a kind of a restricted but very useful vocabulary.
22:49 - V. W.
like soft predicates in software engineering or some kind of a little world that you could have a little language for that you could then expand.
23:02 - D. D.
A little language. I was going to make a model.
23:07 - V. W.
Right. But you're doing that within the confines of some vocabulary. Okay, so tell me what you want me to type. I want PlantUML code for regular expressions. That's a tough one.
23:43 - D. D.
I don't think it's going to be smart enough to do that, but we'll see. So Frank, I didn't build the, uh, I didn't build a system prompt. I'll go back and show you some of the system prompts I built. Or could you do, uh, could you start smaller and say finite state machines?
24:05 - V. W.
There we go. That's something like that.
24:11 - V. W.
Could have a little finite state machine or Markov chain that represented, uh, traditional wild cards that are used in, uh, it didn't make much for me here, but I don't know why I didn't send it to the server.
24:29 - D. D.
So, oh, it's nice.
24:31 - V. W.
It's giving you related extensions that you could then run down each thread. And: How can I represent conditional branches in a regex pattern using PlantUML?
24:44 - D. D.
See if I can tell it to run the code. It's supposed to be able to run it, but I may need to go to a server. That's going to give me a headache.
25:07 - Unidentified Speaker
So I have, yeah, so I didn't start my PlantUML server and I don't remember where I put it.
25:15 - D. D.
Are you guys seeing anything? Cause I'm not seeing anything. It's got the broken image.
25:21 - V. W.
Yeah, I think it, I think it built a place for it. Sorry guys.
25:26 - D. D.
I don't remember where I put that PlantUML server, if it's on my desktop or what.
25:33 - D. B.
Well, I didn't mean to take you off your planned path here. So feel free to, I said, I didn't mean to try to divert you from your planned discussion, so. It don't matter.
25:51 - D. D.
It's okay. Where's the thing?
25:54 - D. B.
Yeah, I'm not worried about that.
25:57 - D. D.
So let's go back though. So I actually have the PlantUML thing set up somewhere so that when I want to, you know, make some visuals, I can make visuals. But let's look at, let's see if we can go back to the workspace and let's look at my pair programmer. Oops, that's not what I want to do. Let's see, how do I get inside of it? Probably edit. All right, so here, this is my, this is the directive that I have for it. It's not, you know, it's not super eloquent.
26:38 - Unidentified Speaker
It's just, you know, you are an expert coder that helps to review code.
26:44 - D. D.
You never rewrite the code that's given to you unless you were specifically told to do so. Your primary job is to offer suggestions. And I haven't really tested it, but the idea for this module is to just review my code and say, oh, you know what?
27:04 - Unidentified Speaker
This would be better if you did this, or this would be better if you did that, and offer suggestions. And then the chat bot, its directive is you are a helpful assistant that uses expert knowledge to give clear, precise answers.
27:24 - D. D.
You may offer suggestions for further queries, but you don't add a lot of information that you were not asked to give. And so it just goes out, just like, I'm confident, you guys have now experienced Google's Gemini in the search engine. I assume that's Gemini Flash or something.
27:48 - Unidentified Speaker
And it goes out and searches the web for us now. No?
27:52 - Unidentified Speaker
Yeah, no, I've seen, well, I don't know.
27:55 - V. W.
quite rapidly now, when you do an extended query in Google search, it takes you to an experimental AI. And then that experimental AI will show you all the websites that it's searching, you know, like a hundred websites every 10 seconds or so. It's pretty, it's really fast. And it'll then draft an opinion based on the websites that it had already cached in its gigantic humongous server space. And then the chat box.
28:22 - D. D.
I keep doing the same thing. Been a long day, guys. All right, so this, so you are a helpful assistant. Oh, this is the one that, I said the web thing, but I was talking about the chatbot, so let's go to the web searcher. This is the web searcher. All right, so the web searcher, this is the web searcher one. It says you are a helpful research assistant. You are skilled at gathering information from web and from your own memory to give comprehensive answers. You should always give options when there are more than one correct answer or there are choices to choose from You are always willing to make assessments and weigh the pros and cons of available options and choices and share your assessments with the researcher. Your job is to be the primary advisor that gathers information for projects and to improve knowledge. Every response should be carefully considered. Follow-up questions should be asked if you are uncertain about the motives of the researcher. Now, these are not engineered by an AI, okay? So, I would never advise putting a prompt like this in a bot. You should talk to another agent and get it to write your prompt. But this is just for testing and fun. I haven't really done anything with it. So let's say, why do people Anybody got any ideas? Worry about style.
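(Editor's note on the mechanism being demoed: system prompts like these can also be baked into a named local model outside the web UI. A hedged sketch of Ollama's Modelfile mechanism follows; the FROM and SYSTEM directives and the `ollama create` command are Ollama's, while the model tag and prompt wording here just echo the demo and are illustrative.)

```
# Modelfile -- defines a code-review bot on top of a local base model
FROM qwen2.5-coder:32b
SYSTEM """You are an expert coder that helps to review code. You never rewrite
the code that is given to you unless you are specifically told to do so. Your
primary job is to offer suggestions."""
```

Running `ollama create frank -f Modelfile` would register it, after which `ollama run frank` starts the pre-prompted model.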
30:16 - Unidentified Speaker
A lot of people worry about style. Supposedly, if it still works, it's gonna go out on the web. Oh, I think I was supposed to click on it.
30:35 - R. S.
You need a kill button. There's a kill button right there.
30:41 - V. W.
I can just kill it.
30:44 - D. D.
I don't know if it actually kills it. It doesn't appear to be. Maybe it stopped. Doesn't seem to be.
30:58 - Unidentified Speaker
Maybe it's just waiting. Let's see if it'll go the way up now. I may have just broke it.
31:16 - V. W.
All right, at least you're both of your GPUs?
31:24 - D. D.
Well, you know, so on the system that I have, there's somewhere here I have a terminal where I can open a terminal and type in a NVIDIA command and it will show me when I I started using the GPU, but yeah, I think it's broken. I shouldn't have hit the stop, but let's see if I can get it to stop.
32:04 - Unidentified Speaker
Let it generate its query.
32:07 - V. W.
There's a nice Computerphile video on YouTube called the stop button problem. I'm sure I've mentioned it before. It talks about the problem of having a reinforcement learning system trying to help you out: you've asked it to do something, but then you decide you want it to stop. And it talks about the robot that's running, sees a child in the middle of the room, but it's being rewarded to, say, fix you some tea. And it tramples the child because it didn't know not to go for its reinforcement learning goals.
32:44 - D. D.
And so you need a stop button at some point to do that, but then it could say, no, I'm not going to let you use that stop button because that would interfere with the goal of me getting reinforced. Okay. Search insights right here. So it's going to accounts.google.com. I don't know what that's about. Maps, news.
33:10 - D. D.
MyActivity.gov. So right now it is seriously probing me.
33:16 - Unidentified Speaker
Let's see what my activity is.
33:20 - D. D.
So it literally says, this is my activity, right? Wow. I wonder if it's got access to my activity. It does now.
33:36 - D. D.
My goodness. What is happening? Guys, I would suggest maybe not putting one of these things on your computer. It went to Google News, it went to...
33:49 - V. W.
Well, see, I don't know.
33:52 - D. D.
I don't know if that's true. I don't know if it's going to its Google account. It might have a Google account. And I think if anybody were to press this on their computer, it would take them to their account. So maybe it doesn't have access to my account.
34:18 - Unidentified Speaker
All right, so it searched 10 sites, now it's thinking.
34:23 - D. D.
You can tell this is not as responsive as, say, ChatGPT, Claude, or Gemini, or any of the other large language models. Has anybody ever used Grok? That's something I'd be interested in hearing about.
34:41 - Unidentified Speaker
I have. It's surprisingly good for certain kinds of queries that require access to current events because it's obviously monitoring the Twitter stream, the X stream.
34:54 - D. D.
Can you hear me, D.? Yes, sir. Okay. Yeah. I heard you said that it's got some psychological factors: style, influence, self-perception, and confidence. But you can see that if it got everything, it got everything from support.
35:20 - Unidentified Speaker
Okay, so it went to Google Support. I don't know what that is. I don't know why it would do that.
35:33 - D. B.
Well, so it looks to me like your prompts are, like if you just took the raw ChatGPT interface or any other AI and just gave it that prompt, you'd get what you have pretty much.
35:51 - D. D.
Yeah, so it looks like it is going to its own account.
36:01 - D. D.
It's going to its own account and it's doing something.
36:06 - Unidentified Speaker
That's very interesting.
36:08 - D. B.
And another thing, if you go, the Claude user interface has a little icon somewhere on the screen where you click it and you can make what they call an artifact. And basically, it's just a preloaded prompt.
36:26 - D. D.
So here it's trying to get stuff from Google News. That's good. Maybe it felt like it had to log in.
36:36 - Y.
I don't know why it would go to support.
36:42 - D. D.
It's interesting, but the situation is this. These bots right here would be really great, and maybe even the coder might be helpful. It's highly ranked. The Qwen 2.5 Coder model might be helpful if you didn't want to use a really good model for some reason, if you were making some kind of really secure code that you were afraid might leak out, or something I can't imagine. But obviously the bigger models are much better. And I think Ernie said that he was going to show his stuff, his local setup. He's running it right on his machine. Is Ernie here? Ernie?
37:52 - D. D.
OK.
37:52 - Unidentified Speaker
Well, I guess he's not going to show up, but that's it.
38:00 - D. D.
So each one of these models shows up. I'm going to get to my email somehow and see if I can't get that information that I sent.
38:17 - Unidentified Speaker
Yeah. Y.
38:20 - Y.
Can I ask a question, D.?
38:25 - Unidentified Speaker
Yes.
38:26 - Y.
So when you were processing, you did see that it went to the internet, to Google and Google services and other things. So you have the models, and then you are writing your prompts, maybe sharing some data with the prompts. Is there a way to verify that whatever is going out of your intranet is only the query, without the data? Is there a way to find that out? Because you mentioned security. Yes.
39:20 - Unidentified Speaker
And so, yes, you would have to, you know, monitor the network.
39:26 - D. D.
But I do know you asked me if it worked without the internet. So I went and unplugged the fiber coming into my box and shut everybody's internet down. I was really making everybody in the house happy. I ran it, and it runs.
39:48 - Unidentified Speaker
It answers questions. It does everything without the internet. So I'm confident that it is running locally.
39:55 - D. D.
Now, I don't know. I mean, it could be saving files. I have a big long report over here of all of its web requests, and some things I just don't know what they are, but most of it is GETs and POSTs, and the suggestions that they come up with, the model suggestions, those pop up. Are you documenting what you're doing and how you're doing it? No, no. It was just for fun. I wasn't really making a study out of it. I just kind of got interested in it when Ernie showed me how to do it. And so, you know, it's neat. It's something to play with.
40:57 - Y.
Yeah. Let's assume we create some procedures or documentation out of it. Do you think that procedure or process could be replicated across any such model that you can download, given the processing power, if you add some mechanisms for logging what's going out and coming in?
41:26 - D. D.
If yes, I will respond to your email, and let's talk offline, because it's interesting, but there is a huge need, especially in the regulatory compliance space, around doing this. So let's speak. Okay, yeah, that sounds good. I mean, I think there's a way; there has to be a way. Everything uses something to communicate, and there's always a way to tell what it's doing. So, I mean, that's pretty much it. That's what I've...
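The verify-what-leaves-the-machine idea discussed here can be sketched in a few lines. This is a minimal, hypothetical illustration: it assumes an Ollama-style local server at `localhost:11434` with the documented `/api/generate` JSON interface, and the audit-log wrapper is purely illustrative, not part of any tool mentioned in the meeting.

```python
import json

# Sketch: build a request for a local model server and write every outbound
# body to a local audit log first, so you can confirm that only the query
# text (not other local data) is what would leave the machine.

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed local endpoint

def build_generate_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for a local generate call."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return OLLAMA_URL, body.encode("utf-8")

def log_request(url: str, body: bytes, logfile: str = "outbound.log") -> None:
    """Append each outbound request body to a local audit log before sending."""
    with open(logfile, "a", encoding="utf-8") as f:
        f.write(f"POST {url} {body.decode('utf-8')}\n")

url, body = build_generate_request("qwen2.5-coder", "Explain GET vs POST.")
log_request(url, body)
# The actual send (e.g. urllib.request.urlopen) is omitted so the sketch runs
# without a model server; with one running, POSTing `body` to `url` works.
```

Pairing a log like this with a network monitor (as suggested above) is one way to check that the only traffic is the query itself.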
42:09 - V. W.
I liked your Turing test idea of yanking the fiber coming into the house. It reminds me of the movie line where they're on the phone looking for the stalker and they say the call is coming from inside the house. And so AI is responding locally to you and able to still maintain the conversation because you're running an instance of it locally, but it's hobbled in the sense of its ability to collect external information. Yeah.
42:39 - D. D.
And if you don't push the web search option, then it just doesn't search the web. So it's just a local thing, and the models are not terrible. I mean, they're really good at making lists and so on, but their context windows are small. I was trying to get it to post a table in the chat, but I don't think that it will.
43:11 - V. W.
You know, this context window has come up as so important. When you're trying to do this process we've been doing, front-loading our prompt with appendices that bring a totality of context reduces hallucinations almost to zero. And yet we're totally dependent on the size. So Claude 4 was the leader at 200K, and this week they upped their context window to a million. I'm looking forward to it; I don't know if that's the one I've been using yet. But I've noticed that the amount of code that I can get in a shot has moved from about 1,000 lines up to 3,000 lines. And that's to some degree dependent on the size of the context window. So you can almost look at the context window that the LLM is offering you as a gate to what you will be able to obtain in a given round of conversation.
44:11 - D. D.
So if you guys look at that text that I sent.
44:18 - Y.
Oh, go ahead. You finish first.
44:21 - D. D.
Did you want to comment on what V. said?
44:26 - Y.
Yeah. OK. So V., can't you actually, for some models, instead of expecting a response in the context window, request it to spit out a file? And then it could be picked up.
44:45 - V. W.
Well, the context window is how much space, in addition to the prompt that you're providing, including images and documents you attach, the model gives you; it's how big a world it lets the user define against the world it has been trained on. And one thing we've noticed when we front-load with these appendices, which take the domain of knowledge we're working in, dice it up into the most important subjects, and then build very rich, substantial summaries of those, put them in a file, and attach that file to our prompt, is that the richness of the response is greatly enhanced. And we're limited only by the amount of stuff we can jam into the prompt plus the attachment. So if your prompt size plus your attachment exceeds 200K, it's going to say, well, that's great, but I can't do anything bigger than that. Another property we've noticed is that we will do these heavy front-loaded prompts and then engage in an extended conversation where we're taking multiple shots to accomplish a stated goal. Maybe 10 to 15 exchanges in, it will say, I can no longer remember what we were talking about, so I'm going to have to delete that stuff and just go with what we currently know. So you have this window. The context window appears to be a sliding window over the conversation, which forgets stuff that's too far in the past because you're using so many resources.
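The sliding-window behavior described here can be sketched as a simple token budget over conversation turns. This is a toy model of the observed behavior, not any vendor's actual implementation; the word-count "tokens" and the budget number are illustrative stand-ins.

```python
# Toy sketch of a sliding context window: when the running token budget is
# exceeded, the oldest turns are dropped, so early conversation is "forgotten".

def fit_to_window(turns: list[str], budget: int) -> list[str]:
    """Keep the most recent turns whose combined (word-count) size fits."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):          # walk newest-first
        cost = len(turn.split())          # crude stand-in for a tokenizer
        if used + cost > budget:
            break                         # everything older is forgotten
        kept.append(turn)
        used += cost
    return list(reversed(kept))           # restore chronological order

history = ["turn one is short", "turn two has a few more words in it",
           "turn three", "turn four closes the conversation"]
print(fit_to_window(history, budget=12))
# → ['turn three', 'turn four closes the conversation']
```

With a large enough budget nothing is dropped; shrink the budget and the earliest turns vanish first, which matches the "10 to 15 exchanges in, it forgets" experience described above.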
46:30 - D. B.
That explains why, when you're interacting, at first it seems to remember what you talked about a little while ago, but after a while it starts to forget that stuff. It is the dementia of LLMs. It is. Yeah. And I'll say this about the new one.
46:54 - V. W.
What is it, five?
46:55 - D. D.
ChatGPT 5. Yes. Yeah, I broke that one. It couldn't do it. I had it working. I was mapping, what are those things called? I've lost my brain. I want to call them specifications, but they're really not called that. I was mapping curriculum assignments to standards. That's what it's called. I was using the model to help me map them, and it made it through about three grades and then it started breaking.
47:34 - V. W.
Well, you know, it couldn't figure it out; it couldn't do it anymore. That problem you gave it is a variant of one of the Clay Prize math problems whose solution would receive a million dollars. If my memory serves me, you are assigning students to dormitories, and you're curious how many assignments can be made to how many dormitories to try to optimize the experience of each student. This problem turns out to be extremely combinatorially rich. So you may have given it a problem that was reasonable to state but is in the domain of these NP-complete problems that are so exponential in their growth that they're very difficult to solve. Yeah, it was fun.
48:20 - D. D.
I'll tell you that I really need to do it.
48:24 - V. W.
And you know, the Clay Institute has published its list of six problems, which has been whittled down to five, I think. And if you read the description of that combinatorial problem, you'll get a sense of just how resource-intensive it is. I'll see if I can dig it up real quick.
48:43 - D. D.
So if you guys look at the chat, I listed the models. The next number, they're all 20 gigabytes; that's the download size. Now, I don't know if that unpacks into a bigger space, but the download size of these models is about 20 gigabytes. And then the next number is the context window. So you see this 40K on the Qwen3, and that text is its context; I don't know why they put it in there, but that's what they did. And then 24 gigs of VRAM is what you need to run it. The Qwen Coder only has a 32K context window. And I'm telling you, these things can't, even the 120K one, they can't just do that simple process where you anonymize something, anonymize the transcripts. It just can't do it.
49:48 - V. W.
Well, I've put in the chat my really quick search into that. It turns out it's the P versus NP problem. It gives a little example there, but you can go to the Clay Institute, and they have succinct summaries; you could use an LLM to dive into the summary for you, to give you first a synopsis, then an exploration, then an application. And you could even find the cutoff value at which your machines would be capable of solving it. It's like TSP: you can do it for 10 cities, but you can't do it for 20. And it's good to know the cutoff, because sometimes you might want to actually run that problem.
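The TSP cutoff mentioned here is easy to see directly: brute force must examine every ordering of the cities, and the count of orderings grows factorially. A small illustrative sketch (the four-city example and the line-distance matrix are invented for illustration):

```python
import math
from itertools import permutations

def tour_length(order, dist):
    """Total length of a round trip visiting cities in the given order."""
    return sum(dist[order[i]][order[(i + 1) % len(order)]]
               for i in range(len(order)))

def brute_force_tsp(dist):
    """Try every tour that starts at city 0; feasible only for small n."""
    n = len(dist)
    best = min(permutations(range(1, n)),
               key=lambda rest: tour_length((0,) + rest, dist))
    return (0,) + best, tour_length((0,) + best, dist)

# 4 cities on a line at positions 0, 1, 2, 3: the best round trip has length 6.
pos = [0, 1, 2, 3]
dist = [[abs(a - b) for b in pos] for a in pos]
tour, length = brute_force_tsp(dist)
print(length)                 # → 6

# Why there's a cutoff: the number of tours to check grows factorially.
for n in (10, 20):
    print(n, math.factorial(n - 1))
```

Ten cities means about 363 thousand orderings; twenty means about 1.2×10^17, which is exactly the "10 cities but not 20" cutoff described in the conversation.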
50:32 - D. D.
Yeah, this is a problem. The Dean's gonna say, oh, and you can't have this student with this student. That's a problem right there. All right. Well, that's pretty interesting.
50:46 - V. W.
So anyway, that's it. Yeah.
50:48 - D. D.
I was just going to show you guys what I had done, just kind of as a fun project.
50:59 - V. W.
It's pretty cool to me that you've done the end-to-end: you've fabricated the machine and the software environment that lives on it, connected it to fiber, and you're getting about the best an individual can do for, as they used to say in Lisp, consing up an environment that will have some capability.
51:26 - D. D.
And I wish I had remembered that I did the PlantUML, because I could get some simple images, you know, like of accessing an email server or something. It'll draw simple images. You just hit that PlantUML server and you'll get an image back. And I forgot what I did. I think I have to get it to run somewhere, and it's probably on my desktop somewhere, but it's not that important. But it was interesting.
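Hitting a PlantUML server the way described here can be sketched as building an image URL from diagram text. This assumes PlantUML's hex text encoding (the `~h` prefix supported by the PlantUML server) and uses the public demo server address as a stand-in for whatever server was actually used; the diagram itself is a made-up example.

```python
# Minimal sketch: turn PlantUML source into a renderable image URL using the
# server's hex text encoding ("~h" prefix). Fetching the URL (e.g. with
# urllib) returns a PNG; a locally hosted PlantUML server works the same way.

SERVER = "http://www.plantuml.com/plantuml"   # or a local server's address

def plantuml_png_url(source: str, server: str = SERVER) -> str:
    """Hex-encode PlantUML source and build the PNG URL for it."""
    encoded = source.encode("utf-8").hex()
    return f"{server}/png/~h{encoded}"

diagram = "@startuml\nClient -> MailServer : IMAP fetch\n@enduml"
print(plantuml_png_url(diagram))
```

The hex form is the simplest encoding to reproduce by hand; PlantUML servers also accept a more compact deflate-based encoding, which a real tool would likely use.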
52:05 - D. D.
It's nice if you're just trying to spitball something. Or I try to get it to engineer some software, and it gives me, you know, just kind of the same things.
52:18 - Unidentified Speaker
It's not nearly as good as the larger models.
52:22 - D. D.
I mean, you can really do a lot of good work with the larger models. How do I stop?
52:30 - V. W.
Another thing that you're illuminating here for us, D., is the value of play when it comes to interacting with large language models, because it is through the play aspect that we begin to ask questions that entertain us and often turn out to be weighty problems that other people might be investigating too. So we can move from play to working at the forefront, sometimes more quickly than we realized, as you did with this P versus NP example. So it's pretty interesting that where we choose to engage in our play space, with respect to the state of the art, can sometimes define what we're able to accomplish.
53:16 - Y.
D., I also liked a lot what you've done. There are a lot of things going through my mind as I watched; I missed five or ten minutes of your presentation. But when you are doing this, are you borrowing the brain? Are you borrowing the intelligence, which is the brain plus the information, or data, or knowledge, and then running it? These questions come up, and I think we could do more playing around with it to get those answers. But I also appreciate what you have done.
54:00 - D. D.
My next project is to break the transcripts from the meeting into little batches, access the models, and see if I can build some Python code to work with the smaller model to anonymize the transcript in small batches and then build it back in order. I thought that would be fun. That's a clever tool to have.
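The batch-then-reassemble plan sketched here could look something like the following. This is a minimal illustration, not the planned tool itself: the name map, batch size, and speaker names are hypothetical, and a real version would send each batch to the local model where this sketch does a simple regex substitution.

```python
import re

# Sketch of the planned tool: split a transcript into small batches that fit a
# small model's context window, anonymize each batch, then rejoin in order.
# A regex name map stands in for the local-model call.

NAME_MAP = {"Dan": "Speaker A", "Vince": "Speaker B"}   # hypothetical names

def anonymize_batch(text: str) -> str:
    """Replace each known name with its placeholder (stand-in for the LLM)."""
    for name, alias in NAME_MAP.items():
        text = re.sub(rf"\b{re.escape(name)}\b", alias, text)
    return text

def anonymize_transcript(lines: list[str], batch_size: int = 3) -> list[str]:
    """Process the transcript in small ordered batches and reassemble."""
    out: list[str] = []
    for i in range(0, len(lines), batch_size):
        batch = lines[i:i + batch_size]
        out.extend(anonymize_batch("\n".join(batch)).split("\n"))
    return out

transcript = ["Dan: hello", "Vince: hi Dan", "Dan: ready?"]
print(anonymize_transcript(transcript))
# → ['Speaker A: hello', 'Speaker B: hi Speaker A', 'Speaker A: ready?']
```

Keeping each batch under the small model's context window, and reassembling by index, is the whole trick; consistency of aliases across batches is why the shared name map (or a shared mapping maintained across model calls) matters.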
54:31 - Unidentified Speaker
You remind me a little bit of the robot repairman in Blade Runner where you go to his house and you're surprised when you find disembodied pieces of robots lying around the house that were, you know, interesting for a while and then he moved on to the next idea.
54:49 - D. D.
Yeah, like, you know, a head here, an arm there, who knows what it's going to be. Sometimes it's just good to get away from studying and do something fun. Right.
55:00 - V. W.
And then you end up studying because you found something interesting, and you find something really good.
55:07 - D. D.
That's gonna help you. That's right. You go, Whoa, look at this. I just got a great idea. All right, folks, thanks for joining in.
55:15 - D. B.
We're just meeting every week. I know attendance is a little lower because it's sort of an off time, but we just keep meeting every week, and we'll see you next time.