Friday, August 8, 2025

8/8/25: Evaluate some readings

Artificial Intelligence Study Group

Welcome! We meet from 4:00-4:45 p.m. Central Time on Fridays. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list; we have no wish to send unneeded emails, of which we all certainly get too many.
Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu

Agenda & Minutes (173rd meeting, Aug. 8, 2025)

Table of Contents
* Agenda and minutes
* Appendix: Transcript (when available)

Agenda and Minutes
  • Announcements, updates, questions, etc. as time allows: none.
  • Next time, at about 4:15 (when he gets home!), DD has generously agreed to do a demo on local LLMs.
  • EG and DD are working on slides surveying different ML models.
  • VW will demo his wind tunnel system at some point. 
  • ES will provide a review of The AI-Driven Leader: Harnessing AI to Make Faster, Smarter Decisions, by Geoff Woods, when we request it.
  • Join us for a thought-provoking lecture and book signing with renowned economist and King’s College London professor Daniel Susskind as part of the CBHHS Research Symposium.
    Thursday, September 4, 2:00 p.m., UA Little Rock, University Theatre – Campus conversation    
    Friday, September 5, 2:00 p.m., UA Little Rock, University Theatre – Campus and community conversation 

Susskind, a leading voice on the future of work and technology, will explore how artificial intelligence is reshaping the workplace and how we can harness its potential to work smarter. Don’t miss this opportunity to engage with one of today’s most influential thinkers on AI, economics, and the future of our professions.



  • If anyone has an idea for an MS project where the student reports to us for a few minutes each week for discussion and feedback, a student could likely be recruited! Let me know.
    • JH suggests a project in which AI is used to help students adjust their resumes to match key terms in job descriptions, to help their resumes bubble to the top when the many resumes are screened early in the hiring process.
    • JC suggested: social media platforms use AI to decide what to present to users, the notorious "algorithms." Suggestion: a social media cockpit from which users can say what sorts of things they want. Screen-scrape the user's feeds from social media outputs to find the right stuff. Might overlap with COSMOS. The project could be adapted to either tech-savvy CS or application-oriented IS or IQ students.
    • We discussed book projects but those aren't the only possibilities.
      • VW had some specific AI-related topics that need books about them.  
    • DD suggests having a student do something related to Mark Windsor's presentation. He might like to be involved, but this would not be absolutely necessary.
      • markwindsorr@atlas-research.io writes on 7/14/2025:
        Our research PDF processing and text-to-notebook workflows are now in beta and ready for you to try.
        You can now:
        - Upload research papers (PDF) or paste in an arXiv link and get executable notebooks
        - Generate notebook workflows from text prompts
        - Run everything directly in our shared Jupyter environment
        This is an early beta, so expect some rough edges - but we're excited to get your feedback on what's working and what needs improvement.
        Best, Mark
        P.S. Found a bug or have suggestions? Hit reply - we read every response during beta.
        Log In Here: https://atlas-research.io
  • Any questions you'd like to bring up for discussion, just let me know.
  • Anyone read an article recently they can tell us about next time?
  • Any other updates or announcements?
  • Here is the latest on future readings and viewings
    • Let me know of anything you'd like to have us evaluate for a fuller reading.
    • 7/25/25: eval was 4.5 (over 4 people). https://transformer-circuits.pub/2025/attribution-graphs/biology.html.
    • https://arxiv.org/pdf/2001.08361. 5/30/25: eval was 4. 7/25/25: vote was 2.5.
    • We can evaluate https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10718663 for reading & discussion. 7/25/25: vote was 3.25 over 4 people.
    • Evaluation was 4.4 (6 people) on 8/8/25: https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-refusals
    • Evaluation was 3.87 on 8/8/25 (6 people voted): https://venturebeat.com/ai/anthropic-flips-the-script-on-ai-in-education-claude-learning-mode-makes-students-do-the-thinking
    • Evaluation was 3.5 by 6 people on 8/8/25: Put the following into an AI and interact - ask it to summarize, etc.: Towards Monosemanticity: Decomposing Language Models With Dictionary Learning  (https://transformer-circuits.pub/2023/monosemantic-features/index.html); Bricken, T., Templeton, A., Batson, J., Chen, B., Jermyn, A., Conerly, T., Turner, N., Anil, C., Denison, C., Askell, A., Lasenby, R., Wu, Y., Kravec, S., Schiefer, N., Maxwell, T., Joseph, N., Hatfield-Dodds, Z., Tamkin, A., Nguyen, K., McLean, B., Burke, J.E., Hume, T., Carter, S., Henighan, T. and Olah, C., 2023. Transformer Circuits Thread.
    • Evaluation was 3.75 by 6 people on 8/8/25 for: Use the same process as above but on another article.
    • https://www.nobelprize.org/uploads/2024/10/popular-physicsprize2024-2.pdf once got an evaluation of 5.0 for a detailed reading.
    • https://www.forbes.com/sites/robtoews/2024/12/22/10-ai-predictions-for-2025/
    • Prompt engineering course:
      https://apps.cognitiveclass.ai/learning/course/course-v1:IBMSkillsNetwork+AI0117EN+v1/home
    • Neural Networks, Deep Learning: The basics of neural networks, and the math behind how they learn, https://www.3blue1brown.com/topics/neural-networks
    • LangChain free tutorial, https://www.youtube.com/@LangChain/videos
    • Chapter 6 recommends material by Andrej Karpathy, https://www.youtube.com/@AndrejKarpathy/videos for learning more.
    • Chapter 6 recommends material by Chris Olah, https://www.youtube.com/results?search_query=chris+olah
    • Chapter 6 recommended https://www.youtube.com/c/VCubingX for relevant material, in particular https://www.youtube.com/watch?v=1il-s4mgNdI
    • Chapter 6 recommended Art of the Problem, in particular https://www.youtube.com/watch?v=OFS90-FX6pg
    • LLMs and the singularity: https://philpapers.org/go.pl?id=ISHLLM&u=https%3A%2F%2Fphilpapers.org%2Farchive%2FISHLLM.pdf (summarized at: https://poe.com/s/WuYyhuciNwlFuSR0SVEt). 6/7/24: vote was 4 3/7. We read the abstract. We could start it any time. We could even spend some time on this and some time on something else in the same meeting.
  • Schedule back burner "when possible" items:
    • TE is in the informal campus faculty AI discussion group. SL: "I've been asked to lead the DCSTEM College AI Ad Hoc Committee. ... We’ll discuss AI’s role in our curriculum, how to integrate AI literacy into courses, and strategies for guiding students on responsible AI use."
    • Anyone read an article recently they can tell us about?
    • If anyone else has a project they would like to help supervise, let me know.
    • (2/14/25) An ad hoc group, organized by ES, is forming on campus for people to discuss AI and the teaching of diverse subjects. It would be interesting to hear from someone in that group at some point to see what people are thinking and doing regarding AIs and their teaching activities.
    • The campus has assigned a group to participate in the AAC&U AI Institute's activity "AI Pedagogy in the Curriculum." IU is on it and may be able to provide updates now and then. 
Appendix: Transcript 

Artificial Intelligence Study Group



Friday, August 1, 2025

8/1/25: DD on prompting AIs to anonymize transcripts of these meetings

Artificial Intelligence Study Group

Welcome! We meet from 4:00-4:45 p.m. Central Time on Fridays. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list; we have no wish to send unneeded emails, of which we all certainly get too many.
Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu

Agenda & Minutes (172nd meeting, Aug. 1, 2025)

Table of Contents
* Agenda and minutes
* Appendix: Transcript (when available)

Agenda and Minutes
  • DD demoed how to use AIs in a sample use case: generating anonymized transcripts of these meetings. Discussion was animated, and he tried various experiments at the urging of the attendees. Hands-on demos of prompts and discussions about them seem to be a good topic for meetings.
 The meeting ended here. 
  • Announcements, updates, questions, etc. as time allows: none.
  • DD has generously agreed to do a demo on local LLMs. He can go whenever it works out.
  • EG and DD are working on slides surveying different ML models.
  • VW will demo his wind tunnel system soon. 
  • ES will provide a review of The AI-Driven Leader: Harnessing AI to Make Faster, Smarter Decisions, by Geoff Woods, when we request it.
  • If anyone has an idea for an MS project where the student reports to us for a few minutes each week for discussion and feedback, a student could likely be recruited! Let me know.
    • JH suggests a project in which AI is used to help students adjust their resumes to match key terms in job descriptions, to help their resumes bubble to the top when the many resumes are screened early in the hiring process.
    • JC suggested: social media platforms use AI to decide what to present to users, the notorious "algorithms." Suggestion: a social media cockpit from which users can say what sorts of things they want. Screen-scrape the user's feeds from social media outputs to find the right stuff. Might overlap with COSMOS. The project could be adapted to either tech-savvy CS or application-oriented IS or IQ students.
    • We discussed book projects but those aren't the only possibilities.
      • VW had some specific AI-related topics that need books about them.  
    • DD suggests having a student do something related to Mark Windsor's presentation. He might like to be involved, but this would not be absolutely necessary.
      • markwindsorr@atlas-research.io writes on 7/14/2025:
        Our research PDF processing and text-to-notebook workflows are now in beta and ready for you to try.
        You can now:
        - Upload research papers (PDF) or paste in an arXiv link and get executable notebooks
        - Generate notebook workflows from text prompts
        - Run everything directly in our shared Jupyter environment
        This is an early beta, so expect some rough edges - but we're excited to get your feedback on what's working and what needs improvement.
        Best, Mark
        P.S. Found a bug or have suggestions? Hit reply - we read every response during beta.
        Log In Here: https://atlas-research.io
  • Any questions you'd like to bring up for discussion, just let me know.
  • Anyone read an article recently they can tell us about next time?
  • Any other updates or announcements?
  • Here is the latest on future readings and viewings
    • Let me know of anything you'd like to have us evaluate for a fuller reading.
    • 7/25/25: eval was 4.5 (over 4 people). https://transformer-circuits.pub/2025/attribution-graphs/biology.html.
    • https://arxiv.org/pdf/2001.08361. 5/30/25: eval was 4. 7/25/25: vote was 2.5.
    • We can evaluate https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10718663 for reading & discussion. 7/25/25: vote was 3.25 over 4 people.
    • popular-physicsprize2024-2.pdf once got an evaluation of 5.0 for a detailed reading.
    • https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-refusals
    • https://venturebeat.com/ai/anthropic-flips-the-script-on-ai-in-education-claude-learning-mode-makes-students-do-the-thinking
    • https://transformer-circuits.pub/2025/attribution-graphs/methods.html
      (Biology of Large Language Models)
    • https://www.forbes.com/sites/robtoews/2024/12/22/10-ai-predictions-for-2025/
    • Prompt engineering course:
      https://apps.cognitiveclass.ai/learning/course/course-v1:IBMSkillsNetwork+AI0117EN+v1/home
    • Neural Networks, Deep Learning: The basics of neural networks, and the math behind how they learn, https://www.3blue1brown.com/topics/neural-networks
    • LangChain free tutorial, https://www.youtube.com/@LangChain/videos
    • Chapter 6 recommends material by Andrej Karpathy, https://www.youtube.com/@AndrejKarpathy/videos for learning more.
    • Chapter 6 recommends material by Chris Olah, https://www.youtube.com/results?search_query=chris+olah
    • Chapter 6 recommended https://www.youtube.com/c/VCubingX for relevant material, in particular https://www.youtube.com/watch?v=1il-s4mgNdI
    • Chapter 6 recommended Art of the Problem, in particular https://www.youtube.com/watch?v=OFS90-FX6pg
    • LLMs and the singularity: https://philpapers.org/go.pl?id=ISHLLM&u=https%3A%2F%2Fphilpapers.org%2Farchive%2FISHLLM.pdf (summarized at: https://poe.com/s/WuYyhuciNwlFuSR0SVEt). 6/7/24: vote was 4 3/7. We read the abstract. We could start it any time. We could even spend some time on this and some time on something else in the same meeting.
  • Schedule back burner "when possible" items:
    • TE is in the informal campus faculty AI discussion group. SL: "I've been asked to lead the DCSTEM College AI Ad Hoc Committee. ... We’ll discuss AI’s role in our curriculum, how to integrate AI literacy into courses, and strategies for guiding students on responsible AI use."
    • Anyone read an article recently they can tell us about?
    • If anyone else has a project they would like to help supervise, let me know.
    • (2/14/25) An ad hoc group, organized by ES, is forming on campus for people to discuss AI and the teaching of diverse subjects. It would be interesting to hear from someone in that group at some point to see what people are thinking and doing regarding AIs and their teaching activities.
    • The campus has assigned a group to participate in the AAC&U AI Institute's activity "AI Pedagogy in the Curriculum." IU is on it and may be able to provide updates now and then. 
Appendix: Transcript 
Artificial Intelligence Study Group
Fri, Aug 1, 2025

0:01 - R. S.
you.

0:54 - D. B.
Hi everyone.

0:55 - Unidentified Speaker
Hello.

0:55 - R. S.
Hey, A.

1:06 - D. B.
A, are you there? Oh, hey, yes, sir.

1:09 - A. B.
How you doing? Hi. Yeah, I got back this afternoon.

1:13 - D. B.
That's why I canceled all the meetings until this one. I figured I was sure to be back by.

1:20 - A. B.
Yeah. No, all good. All good. I've had a good trip.

1:24 - D. B.
Yeah, we took a trip to Baxter State Park in Maine.

1:28 - A. B.
It was fantastic. Yeah, it was really good.

1:31 - D. B.
But I got pretty tired. I'm getting pretty old.

1:35 - A. B.
Yeah.

1:36 - Unidentified Speaker
Awesome.

1:37 - D. B.
Glad you had a good trip. All right. Yeah. OK. Well, we'll give another minute. Hello, V.

1:54 - V. K. (CARTI)
Good afternoon. Hi.

1:58 - D. B.
Well, if we have another small group today, which it was looking like it might be, we'll just continue to do our readings of abstracts and things to assess, evaluate our next reading. So I guess we can go ahead and get started.

2:24 - V. K. (CARTI)
Have a good weekend.

2:27 - Unidentified Speaker
Hello?

2:28 - D. B.
Did someone say something?

2:30 - Unidentified Speaker
All right. Anyone have any announcements, updates, or questions?

2:36 - D. B.
Then D. has agreed to do a demo on how he generates the transcripts of these meetings. Turns out he was not well, not feeling well last week. But he's not here now, so we'll just do it when he comes back, whenever he's available. And E. also says that they're still working on the slides and they still want to do this. So again, when possible, we'll do it. And V. is happy to demo his wind tunnel system, I think, anytime. But he's not here now, so we'll put that on the queue as well. And finally, E. S., who's a professor in the psychology department, is still willing to meet with us and give us her opinion of this book. And she's trying to schedule when to do it. And she said, well, you know, whenever we want her to do it, she'll do it. So I just have to schedule a time.

3:46 - D. D.
Hello, D. Hello. Hi.

3:48 - D. B.
And again, I'm planning on, we got a bunch of master's projects available, so when the semester starts, I'll send out an email to the master's students and see if anyone wants to do any of these. An essential prerequisite is that they're free to meet with us on Fridays at four. If they're not willing to do that, they can find another project.

4:16 - A. B.
What else? Yeah, go ahead. I would like to, at some point, as I get closer to defense this semester, maybe do a dry run, if that's possible here. Absolutely.

4:25 - D. B.
We've done that many times, and I recommend it. In fact, your defense probably should be one of the meetings here, if your committee members can make it. Oh, OK. Yeah.

4:35 - Unidentified Speaker
If they can't, we'll do a rehearsal, and then your defense will be some other time.

4:40 - D. B.
But I always try, whenever there's an AI-related defense or something like that that I'm involved with, I always say, well, how about Friday for, you know, doesn't always work, but you know. Okay, well, D., do you feel like telling us about your, your interaction process with? Yeah, yeah, I could do that.

5:04 - D. D.
Yeah, that'd be great.

5:06 - D. B.
We'll go ahead and do that, then. All right.

5:10 - D. D.
I'm gonna go ahead and unshare the screen so you can share and go right ahead. Find the share button. Yes, that's the share. I don't have anything to share just yet, so I'm just going to share my whole screen. Yeah, I don't pick it in my computer. No picking. All right.

5:43 - D. B.
Can I make a comment from the peanut gallery about your, no, I'm just kidding.

5:51 - D. D.
Go ahead. So I got the, so I go to somewhere like, so what we're talking about, anybody that has no idea what's going on. So the AI, Read AI, has a transcript. It can identify who's speaking, what they're saying, and it builds a full transcript of the entire meeting. And you can download it as a text document. And so what I'm doing is I'm taking that transcript and I'm bringing it to the AI and having the AI anonymize it so that, you know, it just puts our initials instead of our names, and then anybody can just read the transcript or whatever, and they really don't know who was there.

6:50 - D. B.
As a footnote, D. sends me the anonymized transcripts and I add them to the minutes, so you can go to the website and check the minutes and see what the transcript was, but it's publicly viewable to the world. So it's anonymized.

7:06 - D. D.
So I think it's this one I go to. So I go here and I log in to the API. Oh there's something in my way I can't see though. Oh there it is.

7:27 - Unidentified Speaker
Okay. And this This is the chat GPT API, and this is the dashboard.

7:36 - D. D.
I go in here and I have to find the prompt, and you can see it's not super. I think it's called clear names.

7:57 - D. B.
So as a point of information, these prompts are all, you've saved a bunch of your old prompts so you can reuse them?

8:05 - D. D.
Yeah, some of these are for my research. Some of these are just stuff that I practiced with. Let's see if I can find this thing. Oh, it's here.

8:17 - R. R.
So do you actually do some prompts to get this code? Is that what you're doing?

8:23 - D. D.
I'm sorry, say that again, sir.

8:27 - R. R.
No, don't call me sir, just call me R.

8:32 - D. D.
R, okay. Yeah.

8:34 - R. R.
Do you use prompts and do you create code based on the prompts? Is that what you're doing?

8:43 - D. D.
Well, I mean, I could do it through like a Python wrapper and there's other ways to besides Python. But I could access the API through API key and just have a script or something. But no, I just do it. It's kind of like another chat interface. Over here is where I have the system properties.
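
What D. D. describes here, done as a script rather than through the dashboard, might look like the sketch below. The model name, prompt wording, and token limit are assumptions pieced together from the rest of the discussion, not his actual saved "clear names" prompt:

```python
# Sketch of the anonymization workflow as a script instead of the
# dashboard UI. The prompt text, model choice, and limits here are
# illustrative assumptions, not the exact saved prompt from the demo.

ANONYMIZE_PROMPT = (
    "Replace every personal name in the transcript below with the "
    "person's initials. Do not reword or summarize anything else."
)

def build_anonymize_request(transcript: str) -> dict:
    """Assemble the chat-completion parameters discussed in the meeting:
    temperature 0 so the model does not reword what anyone said, and a
    large max_tokens so the whole transcript fits in one response."""
    return {
        "model": "gpt-4o",           # chosen for its large context window
        "temperature": 0,            # no "creativity": copy, don't rewrite
        "max_tokens": 16384,         # the 16,384 limit mentioned later
        "messages": [
            {"role": "system", "content": ANONYMIZE_PROMPT},
            {"role": "user", "content": transcript},
        ],
    }

# With the official openai package and an API key in the environment,
# the call would look like:
#   from openai import OpenAI
#   client = OpenAI()
#   reply = client.chat.completions.create(**build_anonymize_request(text))
#   print(reply.choices[0].message.content)
```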

9:07 - Unidentified Speaker
Yeah. All right. I'm sorry.

9:09 - R. R.
This is the system prompt. I'll just read it.

9:12 - D. D.
I don't know if you guys can read that well. Yeah, yeah.

9:16 - R. R.
You can read it well?

9:18 - Unidentified Speaker
Everybody OK with it? Yeah. OK. I just want to read better, bigger.

9:23 - D. D.
I'm sorry, what?

9:24 - R. R.
I just made my screen bigger. OK, yeah. And so. This chat GPT 4 0. I think is the one I finally decided on because, OK, so that, you know, this is an hour long meeting.

9:44 - D. D.
Approximately, I think it's 45 minutes, some weeks and an hour, some weeks. But there can be a lot of communication. And so the text can get long. So you need a model with a very large context window. And I would like to say I have tested this on smaller models that I run locally on a Linux machine next to me. And it can't handle the job. So it probably can if I get creative about breaking it up you know, maybe send it to the thing in multiple prompts. But what I have to do is I have to make sure that I've got it where it'll save the temperature at zero. And this is the V4 of this prompt. That means this is the fourth version. I have three other versions of this same thing, but this is the best But it will not save the tokens. And so I'm going to turn that all the way up.
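
The "breaking it up" idea mentioned here, splitting a transcript that exceeds a small model's context window into pieces and sending each in its own prompt, can be sketched as follows. The four-characters-per-token ratio is a rough rule of thumb, not an exact tokenizer:

```python
# Minimal sketch of chunking a transcript to fit a limited context
# window: split on line boundaries so no chunk exceeds an approximate
# token budget, then each chunk can be anonymized in its own prompt.

def chunk_transcript(text: str, max_tokens: int = 4096) -> list[str]:
    budget = max_tokens * 4          # rough characters-per-token estimate
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > budget:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Joining the chunks back together reproduces the original text, so nothing is lost in the split.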

10:54 - D. B.
I have a question, D. Can you show us the previous versions and maybe explain what went wrong with them? OK. I mean, if you don't mind.

11:07 - D. D.
I don't mind. Thank you.

11:09 - R. R.
That's a good question. Thank you.

11:12 - D. D.
I think people are always wondering how to make good prompts. Okay, so, all right, let me see what I got here. So version one is an older version of ChatGPT, and it was saved with a temperature of one. And this right here, of course, could be adjusted, but it only goes up to 4,096, which is not enough. And that temperature kills it. And the reason it kills it is because the model is a generating model. And if you give it, the temperature is its creativity. Okay. So what it likes to do, and sometimes it doesn't do it much, very subtle, but it'll change what was recorded in the transcript. It will reword what somebody said to try to say it better for them.

12:11 - R. R.
Okay. Which I don't like that.

12:13 - D. D.
I think that that's a bad stir of data. If I take the data and I change it and say that somebody said something that they didn't. So I've learned that I have to change the temperature. So this version was a bust, right? So next version, let's see what I did.

12:34 - D. B.
How do you change the temperature?

12:36 - D. D.
Okay, so it's just in the settings.

12:39 - D. B.
see this little thing right here? Yeah.

12:42 - D. D.
It's just right here next to the model. You change the model, change the temperature.

12:47 - Unidentified Speaker
And the temperature, this is version 2, the temperature is all the way down and saved at 0.

12:54 - D. D.
So this one will work. And then this can go up to the 4096. But what happens is it'll only go through like a third of the transcript. And then I'll have to type in there, please finish. And then I'll do another third. Well, that token, I thought, that token limit looks different than your last version. I thought the last version had a lot more token input. Well, that was version four. Version one had the same token limit.

13:36 - A. B.
I'll go back. Sorry.

13:38 - D. D.
No, no, no, it's fine. So this right here is version one. The temperature was saved. And this right here goes to 496 or 4,096. And then the same with version two. And I think that I got the temperature right, but I still didn't have the context window that I wanted. Then I got to version 3, and I think version 3 is actually similar to version 4. Let me just check really quick.

14:13 - Unidentified Speaker
I thought I saw a much bigger number in the token.

14:18 - A. B.
You did.

14:18 - Unidentified Speaker
It might be here. Yeah, that's it.

14:21 - A. B.
This one right here will go all the way.

14:26 - D. D.
Let me see, this is my default. So let me see if there's a difference. 16, 3, 8, 4. OK, so what happened here was when I got to version 3 and I saw that I got the temperature safe where I wanted and I had the context window, I saved it. But then whenever I went to use version 4, the context window return back, I had to come and manually do it. So that's why I'm on version four. But this is the best I can do right here is this model. But every time I started, I have to do the manual tokens. And then I have to find a transcript.

15:20 - Unidentified Speaker
Let me see if Find the transcript here. Here we go. Here's one, a raw transcript. Oh, look, I have all these others open.

15:39 - D. D.
And you can see that this tells you who's in the room. Here's Dr. M. I guess this is the one I wasn't in. I couldn't come last week. I had a medical problem. Oh, no, I was here. So this must be the week before that. So you can see it's got everybody's name in it. That was it here in the mail. I'm going to select this and copy it. Just copy the text. And then I'm just going to paste it in the chat. Now, if everything goes right, I wish I could get rid of this.

16:37 - Unidentified Speaker
OK, there it's gone. This right here. And I didn't say that.

16:45 - D. D.
I bet you the recording, I didn't say that.

16:47 - Unidentified Speaker
But that's what the transcript says. I think Read AI frequently attributes the wrong person. Yeah.

16:55 - D. D.
But, uh, yeah. So, yeah, I can, if you can imagine though, if that, if the one AI makes mistakes and then this other AI decides to change it, it can get, it could get pretty convoluted, but so, um, Let's see if I can run this. Theoretically, if everything's set up right, this will run the entire transcript. The first time this happened, that I was able to run the entire transcript was when I sent the transcripts earlier this week. I think it was earlier this week. That was the first time that I've got it to do the whole thing. One, because this context window is so large. It's a good model. And the temperature's turned down low. And so when the temperature's all the way at zero, it doesn't try to create anything. It only does what it's told.

18:00 - D. B.
I noticed that also there was, like people were talking about, like we talk about G. H., it'll change it to GH, which is what you told it to do.

18:13 - Unidentified Speaker
Yeah, that's what I told it today. That's right. This is R.

18:19 - D. D.
Can I ask you some dumb questions?

18:22 - R. R.
Are you talking to me? Yes. Yes, you sure can. Okay. All right. Temperature tokens and I saw another dimension there. I think I understand what tokens are in this context, but could you explain? Yeah, please. Thank you. Yeah. Explain what those dimensions are. Temperature, tokens, top piece, store logs.

18:53 - D. D.
I didn't hear the last part.

18:57 - R. R.
The store logs, the last one.

19:01 - Unidentified Speaker
Yeah.

19:03 - D. D.
the store logs are some part of the API. I don't know what they are exactly, but I mean, I could find that out probably. Okay, so the temperature. The temperature is described as the model's creativity. So the, It's not done yet. Whenever the temperature is high, I'll demonstrate that. As soon as it's done and we get finished with this, I'll just have it do it again, except for with the temperature turned up.

19:47 - D. B.
I'd like to describe temperature in a more under the hood way. If you think of generative AI as picking the most likely next word. If the temperature is zero, it will, in fact, pick the most likely next word. If temperature is higher than zero, it will, with some small or large, with some probability, it'll pick a word that's not the most likely, but maybe the second most likely, or the third most likely, or the 10th most likely, with declining probability.
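
D. B.'s "under the hood" picture can be written out as a toy calculation: models divide the raw next-word scores (logits) by the temperature before converting them to probabilities, so temperature 0 collapses onto the single most likely word and a high temperature flattens the choice. A minimal illustration, not the actual API internals:

```python
import math

# Toy temperature-scaled softmax: low temperature concentrates
# probability on the most likely word, high temperature flattens it.

def next_word_probs(logits: list[float], temperature: float) -> list[float]:
    if temperature <= 0:             # T = 0: always pick the argmax
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)                  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With logits [2, 1, 0], temperature 0.1 puts essentially all the mass on the first word, while temperature 100 makes the three nearly equally likely.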

20:18 - Unidentified Speaker
I think that you're describing top.

20:21 - D. B.
Oh. OK. So temperature is.

20:24 - D. D.
I do. I think that you're describing top.

20:29 - D. B.
So luckily, we live in the world of AI now.

20:35 - D. D.
And so I can go, what is the temperature for an A, let's go A, large language model? Oh, looks like somebody is a bad typist or spelled wrong. Maybe both, who knows. So: in the context of large language models, the temperature parameter controls the randomness of the model's output. That sounds like what you're saying. This parameter influences how predictable or creative the model's output is. A low temperature results in more deterministic and predictable outputs, where the model favors the most probable words. This is useful for tasks requiring accuracy, such as summarization or translation. A medium temperature provides a balance between randomness and predictability and is often used as a default for generating text. A high temperature increases randomness and... All right, so let's go back.

21:47 - D. B.
I have a question about the temperature. If a student wants to use an AI to hand in their homework, do their homework, if they make the temperature high or higher, then would an AI detector be less able to detect that it was written by an AI?

22:12 - D. D.
The only thing that I know to do is to form a couple of hypothesis and null hypothesis and do a bunch of tests. We'll get a sample of your students and say, OK, now you don't do your homework, but you do your homework.

22:35 - D. B.
All right, what is time?

22:36 - D. D.
It makes sense what you're saying.

22:40 - A. B.
because it would be a more, higher temperature would produce a more randomized output. Therefore, I mean, those detectors that are trained on, you know, finding, I guess, language that sounds more deterministic probably wouldn't catch that, right? Yeah, how else would it be able to detect AI?

23:01 - D. D.
So the top P, in the context of large language models: top P, also known as nucleus sampling, is a setting that determines which tokens, words, or subword units the model considers when generating a response. It is a way to control the randomness and diversity of the generated text. Here's how top P works. Probability distribution: when a large language model generates, it first calculates the probability of each possible next token in its vocabulary. And so it doesn't sound a terribly lot different than temperature, does it? Cumulative probability threshold: you set a top-P value between 0 and 1, or null, which defaults to 1. This value represents a cumulative probability threshold. Selecting the nucleus: the model sorts the tokens by their probability in descending order and takes the smallest set of high-probability tokens whose combined cumulative probability is at least equal to the top-P threshold. This set of tokens is referred to as the nucleus. Sampling: the large language model then samples the next token only from this selected nucleus of tokens.
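
The nucleus-sampling procedure read out here can be sketched in a few lines: sort candidates by probability, keep the smallest prefix whose cumulative probability reaches p, and renormalize so the model only ever samples from that nucleus. `top_p_filter` is an illustrative helper, not part of any API:

```python
# Toy nucleus (top-p) filtering: zero out every token outside the
# smallest high-probability set whose cumulative mass reaches p,
# then renormalize the survivors.

def top_p_filter(probs: list[float], p: float) -> list[float]:
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:          # nucleus reached
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0
            for i in range(len(probs))]
```

For example, with probabilities [0.5, 0.3, 0.15, 0.05] and p = 0.75, only the first two tokens survive; the low-probability tail (the "Chinese word" case discussed below) can never be drawn.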

24:31 - D. B.
So I'm going to jump in. So if you set the temperature too high, it gives you some really weird, senseless, gibberish results. That one model that we tried last time, yes. I don't know about this model. But if you set the top P to disregard those really low-probability next words, then it will be, you know, it could be fairly random because it's a high temperature, but it won't do something like give a Chinese word, or a mishmash of words together or something like that, because there's too low a probability. Yeah.

25:15 - A. B.
It's like the temperature is telling you how surprising the results can be, like how surprising the results you're, you're allowing it to be. But then the top P is more like, well, how, how far down in the likelihood are you allowing the model to go?

25:31 - D. D.
Yeah. So really with a top P closer to one, it's actually giving it a little more creative output options. Right.

25:40 - Unidentified Speaker
Yeah.

25:40 - A. B.
So he's onto something here.

25:42 - D. D.
Let's go back to our thing. Now let me see if I can see if this is a good place to go.

25:53 - A. B.
So then, to D.'s point, if you had a high temperature and a high top P, that would probably not spike up in a detector or something, right?

26:06 - D. D.
So these are the logs from when I've been working on these transcripts here. So you can see that it's got... so I imagine that if I uncheck that box... and I took away your picture. Was it R that asked the question originally about the settings? Yep. Yeah. And so I think this just shows, see where I've been typing in "please finish, please finish, please finish." Those were different times, maybe the same day, maybe not the same day, where I was asking the model to finish.

26:50 - A. B.
When you're doing it, lower token input.

26:52 - D. D.
along here and I found the Eureka moment, right? When it'll do the whole thing. That's with the higher token input setting, right?

27:03 - A. B.
Yeah, that's with that.

27:04 - Unidentified Speaker
Then I turned my tokens back down. Now, let's turn the temperature up.

27:10 - D. D.
Now, my clipboard may still have this in it. It does. So we'll just turn the temperature all the way up. The top P is all the way up. The token count, we don't necessarily need the whole token count; I think it was suggesting somewhere around 2,000, so we'll go down to there. We don't need the whole thing done. But let's see what it gives us here. This is with the temperature up, the top P up, and a kind of low token count. Now we're starting to get symbols that are maybe Chinese, maybe.

27:55 - Unidentified Speaker
You got a mix.

27:58 - D. D.
I see Korean, I see. Maybe, yeah, maybe Korean. Which that would be more, that would be more.

28:10 - A. B.
Expanding out the top P would probably lead to more of that stuff, right? You have like different languages.

28:17 - Unidentified Speaker
Well, we're at max, I think, on top.

28:20 - A. B.
That's what I'm saying, like the fact that we did that, that's probably a result more of the top P that brings in these other languages. I think so, yeah. Allowing...

28:31 - D. D.
I'm going to count that as a second to Dr. B.'s theory. Well, you could try the same high temperature, but with a lower top P. Yeah. Might turn down the token count, because this thing is getting long.

28:48 - A. B.
Look, it put an emoji in there.

28:51 - D. D.
And it would be interesting to see if any of these words make sense. You know, this might actually be, you know, witty. But I can't rate it.

29:08 - D. B.
I wonder if a space counts as a token because it seems like sometimes it's deleting spaces.

29:17 - A. B.
I know the timestamps are gone.

29:20 - D. D.
It's totally gotten rid of those. I found it hard to believe that it would even do this. I didn't see that one. There's a mailbox right there. "Order something coulda Dr. Trotter long miss contact her savory point where services or Zitzer dent"... yeah, it's just, you know, maybe the AI knows what it's saying, or maybe it's producing very low-probability words frequently. Yeah, very low probability, that is a true story. Yeah, I think that's it. I think we're going to have to go to a version 5 and turn that top P down. So let's turn it all the way down, just to the extreme. Let's leave it at this; it's a little bit lower than what I had. And we'll turn the temperature all the way back up, paste in the transcript, and see what it does now.

30:25 - A. B.
I don't see any other language. Well, I think it's only picking the top probability word now.

30:29 - D. B.
Right now that the top P is set down, exactly.

30:32 - A. B.
You could do the same thing with a temperature of 0.

30:36 - D. B.
Well, it wasn't the temperature. I was curious.

30:40 - A. B.
On the last go, we had a very high top P, and we saw a bunch of different language references and stuff. And that's why I was wondering if that was contributing to it. So then we dropped the top P down.

30:54 - D. D.
Let's see if we can get this.

30:57 - Unidentified Speaker
Yeah.

30:57 - D. B.
All right.

30:58 - D. D.
Let's take a look at this.

31:01 - D. B.
Does anyone want to ask D. to try some other configuration there?

31:07 - D. D.
Okay. This is the one we're looking at right here. So this is identical: "conference room," speaker two, "hi everyone," unidentified, "hello," Dr. M., "hello everyone," "hello everybody, hello, happy Friday." All right, looks like it's holding up. V. got changed. Look at that.

31:45 - D. B.
I mean, it's because it's... That's why I have to, yeah.

31:50 - D. D.
That's why I have to, yeah.

31:52 - D. B.
If you set the top P to, you know, maybe 0.8 or something, then it'll only pick the top-probability possibilities. If you have a high temperature and a top P that's not too close to one, it'll pick a lot of random but reasonable words. So this is kind of a double whammy; I mean, there are two ways of... Okay, so can I turn the top P down? It seems like you...

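The "double whammy" described above, temperature scaling followed by the top-P cutoff, can be sketched together. The logits are invented for illustration; the last token stands in for a stray foreign-language token with near-zero probability:

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0):
    """Apply temperature scaling first, then the top-P cutoff, then sample.
    A high temperature flattens the distribution, but a top_p below 1
    still discards the near-zero-probability tail, so the output stays
    random without dipping into junk tokens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest high-probability set reaching the top_p threshold.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return random.choices(kept, weights=[probs[i] for i in kept])[0]

# Made-up logits; token 4 plays the stray low-probability "foreign" token.
logits = [3.0, 2.5, 2.0, 1.5, -4.0]
```

With `temperature=2.0` and `top_p=0.95`, the sampler stays random over the four plausible tokens but token 4 never survives the cutoff; with `top_p=1.0`, token 4 occasionally sneaks in, which matches the gibberish seen in the demo.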
32:33 - D. D.
You'd get the same result if you turn the temperature all the way down and set the top P to 0.8, right?

32:45 - D. B.
All right, so it's at top P 0.8.

32:51 - D. D.
Alright, hold on, I need to...

33:38 - A. B.
Hello. Hello. Hi. Hello. Hello.

33:43 - D. D.
Yep. Everybody happy. All right. Can you go to that?

33:50 - A. B.
The 0:43 timestamp with Dr. B. Yep, I was just trying to find a longer one to see if there's more of a discrepancy with the one at 0:43. Those things are pretty close. All right, make the top P 0.9, 0.95... or do 0.95.

34:24 - D. B.
If you do 0.95, that'll keep out all those foreign-language words and stuff. Probably even 0.99 would get rid of most of that stuff.

34:40 - D. D.
Yeah, it looks like that it's holding.

34:45 - D. D.
At Tom.

34:47 - D. B.
All right, try 0.9... 0.99. I think it's pretty sensitive.

34:58 - Unidentified Speaker
Yeah, it's good enough, whatever, 0.8, 0.98, whatever. Whoa.

35:08 - D. D.
Where did this go?

35:13 - Unidentified Speaker
Control V.

35:18 - A. B.
I didn't cut it.

35:29 - D. D.
That's good enough. I don't know. I'm just guessing.

35:39 - D. B.
Your guess is as good as mine. Let's try it.

35:41 - D. D.
Let's see if we can get 90.

36:07 - D. B.
So 0.43, against what we just applied. Yeah, it's a pretty similar thing. And it was at 1.00 that we ran it and we got, like, the foreign words. That's so interesting.

36:35 - D. D.
Yeah.

36:36 - A. B.
It's like, I want, we have to do 99 now.

36:41 - D. D.
I mean, let's look at two. This is a pretty, this is a pretty good long one right here.

36:52 - Unidentified Speaker
2:45.

36:52 - Unidentified Speaker
Let's look at 2:45. All right. Okay. I found the video.

36:57 - D. D.
We'll get it. Let me go back and show you. Well, it deleted a space between "today" and the period... oh, it deleted a couple of... it's deleting a lot of spaces.

37:20 - D. B.
It's deleting all the spaces after the periods. I don't think it did that before. Or some of the spaces. Yeah, so it's gone down to the one-space rule instead of two spaces.

37:43 - D. D.
I think they changed that. It used to be, yeah, it used to be you're supposed to put two spaces after and then somebody finally came along and said, stop it, it's crazy.

38:07 - A. B.
The Oxford comma people.

38:10 - Unidentified Speaker
We're done.

38:11 - D. D.
We're going to quit wasting all this space. We've got to save those bits. All right, so we're going to do the top P. Just one little tweak. Temperature all the way up.

38:41 - Unidentified Speaker
Hey, everybody.

38:43 - D. D.
Do you guys want to see the large language models that I have through Ollama? We have time. We're almost out of time.

38:57 - Unidentified Speaker
Maybe another time I could show you.

39:03 - A. B.
Sorry, is this something where you did some one-shot retraining up front?

39:11 - D. D.
No, I just downloaded models in Ollama on another computer that I can access from here.

39:18 - A. B.
Oh, okay.

39:19 - D. D.
Local large language models that I can access through a chat UI similar to this one.

39:26 - A. B.
Oh, through Ollama, gotcha. I don't know if y'all have seen that.

39:37 - A. B.
Yeah, I mean, it's interesting. It's pretty responsive. Again, I would like to demonstrate it sometime, whenever Dr. B. can let me; I'll show you all what I did.

39:51 - D. D.
It's very exciting. Kind of, uh, well, really it was E. He kind of pushed me in this direction to do it.

40:00 - A. B.
And then I just went ahead and did it.

40:04 - D. B.
Now, what is it?

40:06 - D. D.
Tell me about it. So I downloaded... I have three models, three large language models. I think they're 30-something-billion-parameter models. One of them is a coder, and the other ones are just chat.

40:34 - Unidentified Speaker
But I have the models I'm running.

40:39 - D. D.
I put them on a Linux machine because I have two 3060 graphics cards, and that gets me 24 gigs of RAM. And I picked the models that can... I mean VRAM, I'm sorry, not regular RAM; I have a lot more regular RAM. But with the VRAM, I have enough to run, I think, 30-something-billion-parameter models. So I have three, and I think I have one of the DeepSeek models, and then a Qwen coder and a Qwen chat.

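The back-of-the-envelope VRAM math behind this setup can be sketched as follows. The bytes-per-parameter figures and the 20% overhead factor are rough rule-of-thumb assumptions, not measured values:

```python
def vram_gb(params_billions, bytes_per_param, overhead=1.2):
    """Rough VRAM estimate: parameter count times bytes per parameter,
    plus ~20% headroom for the KV cache and activations. A loose rule
    of thumb only; real usage varies by runtime and context length."""
    return params_billions * bytes_per_param * overhead

# A ~32B-parameter model, roughly the size class mentioned above:
fp16 = vram_gb(32, 2.0)  # 16-bit weights: ~77 GB, far over a 24 GB budget
q4 = vram_gb(32, 0.5)    # 4-bit quantized: ~19 GB, fits two 12 GB 3060s
```

This is why 30-something-billion-parameter models are about the ceiling for a 24 GB VRAM rig: only the heavily quantized versions fit, while the full-precision weights would need several times that.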
41:21 - D. B.
Yeah, just let me know when you can do it, and we'll schedule it.

41:26 - D. D.
I'm ready.

41:26 - D. B.
Maybe we can not do it next week, just because we did something similar this week.

41:33 - Unidentified Speaker
OK.

41:33 - D. D.
Maybe the week after. Just tell me when, and everything's set up and ready to go.

41:39 - D. B.
OK, what should I call this demo? Give it a title or something.

41:44 - D. D.
Local Large Language Models? Are you talking about the future demo? Yeah, the one you've just been talking about. Yeah. Uh, just, you know, local large language models.

41:57 - D. B.
They're just local.

41:59 - Unidentified Speaker
Yeah.

41:59 - A. B.
So on this topic, circling back here: no big changes as far as, like, words from other languages or anything like that. So this is that 0.99; it's the one... the one top P. Yeah, that's fascinating. Can we try it one more time at 1.00, just to see if that's what it was?

42:29 - Unidentified Speaker
Yeah.

42:29 - A. B.
See if it wasn't like some glitch. Yeah, right.

42:32 - D. D.
Like, it's so random, just for that one thing. And I imagine the temperature probably has a play in it somewhere. I mean, my directive is pretty tight, though, right? You know, it's a setting that they have... oh yeah, I've got to redo this. Okay, temperature all the way up, top P is set at one.

43:27 - Unidentified Speaker
Give it the push through.

43:35 - D. D.
It's one. 1.00. That's interesting.

43:39 - Unidentified Speaker
So there must be in that one percentage point or whatnot, that must be where it all drops off, and there's just a bunch of junk.

43:49 - A. B.
I guess that's, hmm. Can the temperature be above 2?

43:53 - D. B.
I mean, I know the slider bar only goes up to 2, but maybe using an API it can go higher.

44:03 - D. D.
I don't think so. I mean, I think probably if you put it higher, it would probably default to whatever default is.

44:15 - A. B.
Yeah, that's really fascinating on the top P, though.

44:21 - Unidentified Speaker
I see.

44:22 - D. D.
High temperature is two is what they're showing here.

44:27 - Unidentified Speaker
I'm sure that that's it. I mean, I don't I bet it doesn't.

44:39 - D. D.
I mean, I'm sure there is a way though.

44:47 - A. B.
But yeah, I don't know.

44:52 - D. D.
I mean, I would not have thought in a million years that... did I close that down? Oh, I didn't, okay. I would not have thought that just one hundredth on the top P would make the difference, no matter what my temperature was.

45:14 - A. B.
I would have never thought that.

45:17 - D. B.
Actually, I mean, I'm going to claim that it's not very good; they haven't figured out this interface very well, because for it to demonstrate this well... now we're bringing in this direct prompt. This instruction is supposed to be followed, right? Yeah. So it should check periodically again to make sure that it's doing that.

45:55 - D. D.
It's supposed to make no other comments.

46:00 - Unidentified Speaker
It's supposed to just do this one thing.

46:05 - D. D.
You can't honestly rule out that the system prompt might have something to do with it, some factor. I don't know if we have time, but if we've got time, I say we put the temperature at one.

46:36 - D. B.
Put the top P at 0.5.

46:40 - D. D.
This probably isn't super scientific. I'm just going to delete that, and I'll delete this again, though it probably still remembers what happened previously. I don't know, but let's just see. Is there anything notable here?

47:07 - A. B.
Seems like a pretty normal. It does look normal.

47:20 - Unidentified Speaker
When it stops, I'll check it. Yeah.

47:30 - A. B.
I was kind of curious. I know we're kind of out of time, but on the high top P piece: if you still selected the highest top P and then zero temperature, I wonder if it would still do the same thing. I'm thinking it would, right? As far as the references to the other languages and symbols and stuff, because that seems to be the factor that's going and grabbing all of those. You want to do a high top P and a zero temperature?

48:04 - D. D.
Right. Yeah.

48:05 - A. B.
That's the way I do it.

48:07 - D. D.
Oh, I thought you had a high temperature the last time we ran it.

48:13 - A. B.
Okay. You want to do a zero top and a high temperature?

48:17 - D. D.
No, no.

48:18 - A. B.
I was thinking the other way. I thought you had it high temperature and high top P the last time.

48:27 - D. D.
last time that we ran it, yeah, that was the one that was super confusing. If the top P is one and the temperature is high, it just gives us garbage. But I haven't really been changing the top P, so... what is this one, zero top P? He says the top P is zero.

49:00 - D. B.
Okay, so what is this right here?

49:08 - D. D.
"We 34"... that's just "three four." "V.'s not..." "V.'s not doing this thing today." Oh, "V.'s not doing this thing." Yeah, that's actually pretty good. Oh, we had a computer crash, remember? Yeah. So it really just had everything to do with the top P the whole time. It's interesting. Yeah. And so, in order to compensate for the top P being at one, you have to turn the temperature all the way down to zero. But wait... okay. So we did do that.

50:07 - A. B.
So remind me what this output is set at. This is with both in the middle.

50:16 - D. D.
Okay. So which one do you want to do? So you said that we did temperature at zero and top P at one. Yeah, that's what I did; that's when I turned them that way, right there. And that didn't result in any odd symbols. The first time, when I went through one time, there were just some slight changes, and that was with the temperature at one. OK.

50:48 - A. B.
And so that's when I started changing the temperature to zero.

50:54 - D. D.
I saw just... I mean, some subtle changes that really didn't change a lot. And then we discovered that if you turn the temperature all the way up, it just gives you some kind of, you know, gibberish.

51:14 - A. B.
Yeah. And, uh, God makes sense. Okay. Yeah.

51:18 - D. D.
And so... but really, in order to offset the top P of one, the temperature has to be very low. And it was really just the 1.00 on the top P where we saw it. I mean, we'd have to really check it, because like I said, when the temperature is one, I saw where it just kind of... you wouldn't even notice.

51:53 - Unidentified Speaker
I mean, if you just, you know, weren't looking at it. No, this has been really helpful.

51:59 - Unidentified Speaker
Like I said, I've never got under the hood and tested in this way.

52:04 - D. D.
So I have to, um, I learned something today. Yeah.

52:08 - A. B.
Is this... do you just have a normal account, or do you have, like, the $20 paid account with, um, ChatGPT?

52:14 - D. D.
So how this works, all right, so this is not normal, no.

52:18 - A. B.
Let me see if I can get back to that.

52:24 - D. D.
So this right here... when I get to here, this is $20 a month or whatever the subscription is, right? But when you have this subscription, you can put money in the cloud, so to speak. And I think I lose money every year. Or I lost money the first year: I put $20 on there, and I did a lot of stuff. I mean, I fine-tuned some models, like three or four models I've done, all kinds of experiments and stuff. And I think they kept like $12 or $13 and made me pay another amount, $20 or something, after a year. So it's not very expensive to use the API.

53:19 - A. B.
So even if you have a paid account, you still have to pay a little bit extra to get the API.

53:31 - D. D.
I've heard now... so, oh, I got an email from P. saying that P. now has the API, but I don't know exactly where it is; I haven't found it. It's one of the things I was going to do. I think with my subscription to P., I was paying whatever their subscription was, and I thought it was just a great deal, and then they lowered it. They actually sent me an email saying you can lower your subscription to $8. And so I have access to all this stuff and tons of models, and it's like $8 a month. I mean, AI right now is the candy of the internet. I would really like to have subscriptions to all of them. These right here... this one right here, this is a really good model. And this is a free plan.

54:44 - A. B.
It's really good.

54:48 - D. D.
And I think G. is pretty good. This is a free plan. I don't know up to how much, but I just get them and test them out. Just random topics or whatever. They seem pretty good. Yeah, I use the A. one a bit too.

55:22 - A. B.
I need to do some more with G. I did this one right here.

55:30 - D. D.
This was talking about, like, a teacher grading app. I was at the hospital or the hall, whatever they call it... the nursing home. I was at the nursing home visiting my dad, and he just kind of sits there and hangs out. So I just went on here and constructed this thing with it. And I was just amazed. This is kind of the high-level engineering part of building an app, but I got the makings of the documentation to get started. I think, with enough time, I could build this whole app. And then, once I have it engineered, I could just go into the pair programming. I've read a book where the guy suggests that you use something like Copilot and then something like ChatGPT: if Copilot gives you something that you don't understand, you run it through ChatGPT, and ChatGPT tells you what that code is doing. You can still be the overlord of what's happening, but you have somebody that's actually writing the code for you and somebody that's actually explaining the code to you. You're just directing and managing the operation. It's way more efficient. It requires access to two large language models, and at least one coder. And I think they're saying that they actually have models that are just built for coding that might be better than ChatGPT. But I guess I'm done with that presentation. I will gladly do another one at any time.

57:48 - D. B.
do another one. You know, I actually don't know if I'll be available next week. I'm traveling back to Little Rock around then, but I'll let you know if it is canceled. But anyway, instead of doing two in a row...

58:07 - D. D.
We'll put a break in the middle somehow.

58:10 - D. B.
Okay, definitely happy to do the next one too. I think these kinds of practical, hands-on demos are going to be popular, because people, even experts, are trying to figure out how best to use this stuff.

58:23 - D. D.
All right, well, thanks. Did we lose R?

58:28 - A. B.
I think so.

58:29 - D. B.
Oh, that other R. That is the other R.

58:33 - D. D.
Yeah, I see you there. I see you there. Your mic is off if you're talking to us.

58:40 - D. B.
It's on. OK, folks. Well, yeah, thanks. Thanks to D., and we'll see you all next time, hopefully next week.

58:49 - A. B.
All right, guys. Take care. Have a good weekend. Bye.
 


Friday, July 25, 2025

7/25/25: Evaluate some potential readings

Artificial Intelligence Study Group

Welcome! We meet from 4:00-4:45 p.m. Central Time on Fridays. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list as we have no wish to send unneeded emails of which we all certainly get too many. 
Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu

Agenda & Minutes (171st meeting, July 25, 2025)

Table of Contents
* Agenda and minutes
* Appendix: Transcript (when available)

Agenda and Minutes
  • Announcements, updates, questions, etc. as time allows: none.
  • DD has generously agreed to do a demo on generating the transcripts of these meetings. Here is one of the problems he encountered and can discuss:
    • When I [...] went to ChatGPT [I] discovered it changed models and I had to import my prompts. The model settings were lost and the new model's context window was too short. I changed to an older model and the model made up new entries for the transcript. I adjusted the temperature and got it figured out. It has been an interesting week... 
  • EG and DD are working on slides surveying different ML models.
  • VW will demo his wind tunnel system soon. 
  • If anyone has an idea for an MS project where the student reports to us for a few minutes each week for discussion and feedback - a student could likely be recruited! Let me know
    • JH suggests a project in which AI is used to help students adjust their resumes to match key terms in job descriptions, to help their resumes bubble to the top when the many resumes are screened early in the hiring process.
    • JC suggested: social media are using AI to decide what to present to them, the notorious "algorithms." Suggestion: a social media cockpit from which users can say what sorts of things they want. Screen scrape the user's feeds from social media outputs to find the right stuff. Might overlap with COSMOS. Project could be adapted to either tech-savvy CS or application-oriented IS or IQ students.
    • We discussed book projects but those aren't the only possibilities.
      • VW had some specific AI-related topics that need books about them.  
    • DD suggests having a student do something related to Mark Windsor's presentation. He might like to be involved, but this would not be absolutely necessary.
      • markwindsorr@atlas-research.io writes on 7/14/2025:
        Our research PDF processing and text-to-notebook workflows are now in beta and ready for you to try.
        You can now:
        - Upload research papers (PDF) or paste in an arXiv link and get executable notebooks
        - Generate notebook workflows from text prompts
        - Run everything directly in our shared Jupyter environment
        This is an early beta, so expect some rough edges - but we're excited to get your feedback on what's working and what needs improvement.
        Best, Mark
        P.S. Found a bug or have suggestions? Hit reply - we read every response during beta.
        Log In Here: https://atlas-research.io
  • Any questions you'd like to bring up for discussion, just let me know.
  • Anyone read an article recently they can tell us about next time?
  • Any other updates or announcements?
  • Hoping for a summary/review of the book at some point from [ebsherwin@ualr], who wrote: 
    Greetings all, 
      In a recent session on working with AI, Dr brian Berry (VP Research and Dean of GradSchool) recommended this book:
      The AI-Driven Leader: Harnessing AI to Make Faster, Smarter Decisions
    by Geoff Woods
      I just bought it based on his recommendation and if anyone is interested will gladly meet to talk about the book. Nothing "heavy duty" just an accountability group.
       If you have read the book already and if the group forms, you are welcome to join the discussion.
      I'll wait till Monday morning before I start reading -- so if you do not see this message immediately, do reach out!
       Best,
  • Here is the latest on future readings and viewings
    • Let me know of anything you'd like to have us evaluate for a fuller reading.
    • 7/25/25: eval was 4.5 (over 4 people). https://transformer-circuits.pub/2025/attribution-graphs/biology.html.
    • https://arxiv.org/pdf/2001.08361. 5/30/25: eval was 4. 7/25/25: vote was 2.5.
    • We can evaluate https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10718663 for reading & discussion. 7/25/25: vote was 3.25 over 4 people.
The meeting ended here. 
  • Schedule back burner "when possible" items:
    • TE is in the informal campus faculty AI discussion group. SL: "I've been asked to lead the DCSTEM College AI Ad Hoc Committee. ... We’ll discuss AI’s role in our curriculum, how to integrate AI literacy into courses, and strategies for guiding students on responsible AI use."
    • Anyone read an article recently they can tell us about?
    • If anyone else has a project they would like to help supervise, let me know.
    • (2/14/25) An ad hoc group is forming on campus for people to discuss AI and teaching of diverse subjects by ES. It would be interesting to hear from someone in that group at some point to see what people are thinking and doing regarding AIs and their teaching activities.
    • The campus has assigned a group to participate in the AAC&U AI Institute's activity "AI Pedagogy in the Curriculum." IU is on it and may be able to provide updates now and then. 
Appendix: Transcript 

Artificial Intelligence Study Group
Fri, Jul 25, 2025

0:32 - Unidentified Speaker
Oh.

1:05 - D. B.
All right, well it's 5.01. We'll give folks another minute because I think D. was tentatively going to present something today, but things have been sort of relaxed in terms of the schedule, so he may actually not even show up. I don't know. We'll give him another minute. OK, well, I'm just going to assume that we have a small group. And one of the nice things about the design of these meetings is that there's always a fallback, which will be to do some readings and or video viewings. And today we can check a few abstracts or whatever and kind of evaluate different next readings. So we'll see about D. next time or some other future time. E., you and D. are gonna work on some slides surveying different ML models. Is that kind of on the back burner? Should I move that from sort of definite plans to sort of future or what do you think?

2:56 - E. G.
Well, it's still in the plans, but it's in the future. We've already got the slides; I've given him some verbiage. My goal was to help D. along the path of better understanding models: which models to use and when. Okay. All right.

3:12 - D. B.
Well, then I'll leave it on this list, which means we'll check back on it at each meeting until either it happens or we decide to move it into the back-burner category. Similarly, V. has offered to demo his wind tunnel system, but he's not here. And as always, if anyone has ideas for MS projects, I'll add them to the list here. We've got a list of a few. And when the semester starts and I start getting students who need projects, I'll suggest that they come to a meeting, and we'll talk about the projects with them and see if they want to do something. If anyone has any other master's projects they want to propose, just let me know, and I'll add them to the list.

4:02 - Unidentified Speaker
OK.

4:03 - J. C.
I have an idea. Yeah. I don't know whether you want it now or write it up and send it to you.

4:12 - D. B.
Why don't you give me a hint, and then you can write it up later.

4:18 - J. C.
OK. I was at a meeting at Cosmos earlier this week, talking about social media. So my thought was that all the social media companies use AI, use algorithms, to decide what to present to you. And they decide for their purposes. And so every day I see people frustrated by what Facebook is feeding them, or some other medium, and it crossed my mind: what if you turned it around? What if you had a social media cockpit and used AI to get only the feeds that you wanted? Facebook would be out there, the AI would look at it, and if you say you don't want political things, you only want things from your friends, or you only want things on three topics, that's what you get. And you'd get it from whatever social media you subscribe to. Maybe it's Facebook and LinkedIn, and you get the feeds you want from those. And I think it would be... I love the idea.

5:37 - E. G.
I think that would be wonderful, but they would have to make their feeds available via an API; otherwise you're building screen scrapers that would have to walk the pages.

5:51 - E. G.
I've built screen scrapers before, and they are an ugly way to do it.

5:58 - J. C.
Yeah, and there are APIs, but they change. Not saying it would be easy, but I've been trying to work with Copilot on screen scrapers and have had occasional success. So maybe a new world on screen scrapers too with AI assistance. And that might be, I mean, that would be a related topic that'd be fairly narrow. You know, like, can you use AI to help you write a book? Can you use AI to help you screen scrape, to help you go through page after page?

6:42 - D. B.
So screen scrape social media outputs to find the right stuff.

6:48 - J. C.
But I think if I think of it on my machine, it would have my sign on to Facebook, my sign on to LinkedIn. And so it could do anything that I would do on Facebook or LinkedIn and just throw away the junk. You know, not not present me with my my favorite this last week is for some reason, I am getting push up bra ads for older women. And then the other thing I've been getting, I can't remember all the jargon, but high performance, oil oil-fed, screw-driven air compressors. And it's always got this whole set of keywords. You know, it's not just air compressors, it's this lengthier subset of air compressors. And somehow, between that and push-up bras, that's defining me this week.

8:01 - D. B.
Okay. Well, I think...

8:03 - Unidentified Speaker
Go ahead.

8:04 - D. B.
Yeah, I tried to take a couple of notes here, so feel free to send me an update.

8:14 - J. C.
But it might make a good overlap between the Cosmos people and other efforts.

8:22 - E. G.
OK, I just asked Claude and OpenAI, and they said they can't help with scraping Facebook or LinkedIn directly, as this would violate their terms of service and could raise privacy and legal issues.

8:41 - E. G.
There are things you can do.

8:43 - D. B.
For example, if you don't want an actual working system, you can develop a concept where you paste in the pages and show that it can, in principle, scrape them properly, or something like that. There are a lot of variations that would make this a project: not necessarily a marketable product, but an investigation.

9:05 - J. C.
In the long run, somebody could develop an AI. Again, you're not going to be looking at anything that I don't get to look at. It's going to look at my Facebook page, my Facebook feed. It's not taking every page in Facebook and using it somehow to train an AI or, you know, figure out how to blow up motorboats or bicycles or something.

9:39 - D. B.
Does this sound to you like something that would be most suitable for a tech-savvy grad student, like a computer science student, or maybe somebody who's more application-oriented, like an information science student? I think you could do it either way.

10:04 - J. C.
First off, I've been using PowerShell, which I know nothing about, but it seems to be pretty handy, and Copilot has successfully coached me through using it. So I think you have to be less tech-savvy in some areas these days. You can focus on the application and get the code or the right tool suggestion from elsewhere. OK.

10:41 - D. B.
Well, one thing that we could do is, if I start getting students who want to do projects, I can suggest they come to these meetings, and we can show them this list and try to explain and see.

10:59 - J. C.
I think that sort of marketplace of ideas would be useful and interesting. Yeah, we can see what they want to do.

11:08 - D. B.
We are probably going to have fewer grad students going forward than we've had in the past, but hopefully some of them will be wanting to do projects. And when they come to me asking for projects, I always sort of don't know what to say. I'll say, are you free at 4 PM on Fridays? That'll be my first question. OK. Anything else on this MS project activity? All right. Let me just save the page so we don't lose that. Any other questions? If you ever have any questions you want to bring up for general discussion, just let me know and I'll add them to the list. E. S., the psychology professor, notified me that she finished reading the book. So at some point, I'll get her here to give us a review. That was this book.

12:26 - Unidentified Speaker
All right, well, we're sort of in between.

12:28 - D. B.
We finished all those videos, and we're sort of figuring out what to do next. So I thought we could go through the process of reading a few abstracts, or maybe a few clips from different videos, or both, and evaluating them. So here's one of them. Honestly, I don't even remember what this is, but I figure we'll just take a look at the first paragraph of the abstract, talk about it, and vote on it. So I'm going to bring it up here. And let's see what this is.

13:12 - Unidentified Speaker
Here is a paragraph.

13:15 - D. B.
And let's just read from here to here, and then we can decide. We'll just see what we think of it. Any comments or thoughts?

14:08 - Unidentified Speaker
I think we are not that far off.

14:14 - E. G.
I mean, if you take a look at human, and I don't mean to go against anybody's senses, but human evolution over the hundreds of thousands of years, we're seeing the same thing in large language models in a far more compressed time frame.

14:44 - D. B.
Yeah, considering the human brain or animal brains for that matter, I mean, this is happening fast.

14:54 - E. G.
I mean, look where AI was five years ago. Large language models. Granted, we had feed-forward neural nets. We had recurrent neural nets. Now, with large language models, if you take a look at it, it's neural nets in multidimensional layers with intermediary pieces for aggregators.

15:23 - D. B.
All right. Any other thoughts on this? Do you want to evaluate it with a kind of vote, or do you want to read another paragraph?

15:39 - E. G.
All right, we'll read another paragraph.

15:51 - D. B.
Sheesh, I can't highlight anymore.

15:58 - Unidentified Speaker
One more time.

16:01 - Unidentified Speaker
Didn't work.

16:02 - Unidentified Speaker
All right.

16:04 - Unidentified Speaker
This is the paragraph.

16:08 - D. B.
Oh, there it is.

16:11 - D. B.
All right. We'll read that.

16:22 - Unidentified Speaker
Comments or questions?

16:36 - J. C.
What tool would you use? I mean, there have been tools in computer science that let you monitor what code in a system actually got executed and how often. Well, I don't think it's tools, but it's paradigms.

16:58 - E. G.
When computers first came out, they had one processor that could do one thing at a time. Then the advent of parallel processing occurred: a unit with multiple processors on it, so it could run things concurrently. We now have AI chips that have tens of thousands of processors. I think those are the tools. Now, in biology, we're not governing biology, we're monitoring it. In AI, we're not only monitoring it, we're governing it.

17:47 - Unidentified Speaker
All right.

17:47 - E. G.
So I think this would be a great paper to read.

17:54 - D. B.
OK. I'm just pleased by the site.

18:00 - J. C.
And am I right that the chips are analog for some parts of AI processes?

18:13 - D. B.
Chips are what?

18:15 - J. C.
That they do analog computing.

18:19 - Unidentified Speaker
Digital.

18:19 - E. G.
Analog would be wavelength. Digital is digital.

18:28 - D. B.
They use these things called ReLU, rectified linear, what does ReLU stand for? Rectified linear something, to make decisions. But they do it.

18:50 - J. C.
But for instance, Greg, processing? Is that not analog?

18:54 - D. B.
I mean, I think at the basic level, it's using digital circuits. But actually, you know, like we saw with these videos, when you're doing these calculations and finding what the probability is of a given word, it's effectively an analog number, right? I mean, in the sense of any number along a scale. I'm just curious.
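For the question left open above: ReLU stands for "rectified linear unit." It is computed digitally, but as the discussion notes, it operates on continuous-valued numbers. A minimal sketch:

```python
def relu(x: float) -> float:
    """Rectified linear unit: pass positive inputs through unchanged,
    clamp negative inputs to zero."""
    return max(0.0, x)

# Negative inputs are "rectified" away; positive ones pass through.
print([relu(v) for v in (-2.0, 0.0, 3.5)])  # [0.0, 0.0, 3.5]
```

The inputs and outputs are floating-point values, which is the sense in which the computation is "effectively analog" even though the underlying circuits are digital.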

19:26 - J. C.
I'm old enough that when I took Fortran and then took a computer science course that was everything about computers taught then, we used analog computers as one of our units of work.

19:44 - E. G.
It was the same with me. I worked on, my first computer had vacuum tubes.

19:52 - Unidentified Speaker
Yeah.

19:52 - D. B.
I mean, neurons in a brain are sort of somewhat analog in some ways.

19:59 - J. C.
Yeah. And I guess I had an understanding that, maybe not at the heart of large language models, but related to new chips and stuff, voice and images and other things were partially analog processing, maybe just to digitize them, you know, for input or for output. Yeah.

20:28 - D. B.
All right.

20:29 - D. B.
Any other thoughts anyone wants to mention before we evaluate it? All right. Well, here's how we've done it in the past. So, you know, the one-to-five-star thing for rating Amazon products or rating courses in a university, typically. So we're going to say that one means you definitely don't want to read this together, five means you definitely do want to read it together, three means you don't know, and then two and four are sort of leaning one way or the other. So if you can go to your chat window and just...

21:11 - R. S.
What's the title again of this paper?

21:14 - D. B.
Hang on, let me just... all right, let's see what the title is.

21:22 - Unidentified Speaker
Okay, and when was this paper published?

21:26 - D. B.
I don't know. Well, it's listed as 2025.

21:33 - E. G.
But it's on Claude 3.5, so it's only going to be at most a couple of years old.

21:42 - J. C.
Well, what does the Haiku mean in that, Claude 3.5 Haiku?

21:47 - D. B.
Isn't that one of the... so they have multiple models, and I think maybe one of the leaner models is called Haiku.

22:01 - J. C.
Yeah.

22:01 - J. C.
Okay.

22:02 - D. B.
Oh, it says it's Anthropic's lightweight production model. So it's the more efficient but less powerful model, perhaps. All right, well, yeah. So go ahead in the chat, just put in your number from 1 to 5, and we'll go from there. OK, we've got one vote.

22:39 - J. C.
Now, maybe the biology is changing on that side too, because we're dinking around with that to modify people to avoid birth defects and so forth. D., are you voting on this or not?

23:00 - R. S.
Are you voting on this or not?

23:04 - D. B.
You want me to?

23:06 - R. S.
Yeah.

23:07 - R. S.
Because we have less people today. Yeah.

23:10 - D. B.
I'll give it a four.

23:12 - Unidentified Speaker
So that's an average.

23:13 - D. B.
We've got two fives and two fours. That's an average of 4.5. I'll mention A. is here. I just jumped in to see you guys because I've missed so many meetings and wanted to know what you were up to.

23:28 - H. J.
Oh, yeah, yeah.

23:29 - D. B.
We just read three paragraphs of an article, and we were voting on whether we want to read it in more detail.

23:37 - H. J.
Yeah, so I think that's, I think I don't even deserve to vote.

23:41 - D. B.
All right, well, we'll do another article and you can vote on that one. No, it's okay. I mean, that's the next plan anyway, since today is kind of an evaluating-multiple-possibilities day, as it turns out. All right, well, let's then go to the next article. And that's this one. We looked at this back in May, but AI is moving with the speed of light, so I think we should probably reevaluate. Maybe we can average the two evaluations or something like that. So let's go here. And this looks like an arXiv preprint, written by a bunch of people in the AI field, at OpenAI, called Scaling Laws for Neural Language Models. Any questions or thoughts or comments on the title?

25:25 - R. S.
What are scaling laws? The size of the neural network?

25:31 - D. B.
Maybe, I don't know. It's changing the size of something. The abstract has a little bit more in it.

25:45 - Unidentified Speaker
All right, let's take a look.

25:48 - Unidentified Speaker
I want to scroll down a little bit, yeah.

25:53 - Unidentified Speaker
Is this a newer article?

25:57 - H. J.
What year is this one from? 2020.

26:03 - Unidentified Speaker
2020.

26:04 - D. B.
Because this has been a really big topic. I mean, in computing, it's always a question, especially in AI. Even 40 years ago, AI scaling was always a key research question. Anyway, let's take a look. Let's read through the abstract, and then we'll see if there are any discussions or questions. All right, this is dense. I'm thinking we should go through it sentence by sentence. But before we do, does anyone have any questions or comments on the whole thing? All right, let's go through it one sentence at a time. All right, let's start with that sentence. Comments or questions? Well, I'm baffled: what is cross-entropy loss? Does anybody know? Anyone up for looking it up? Anybody?

28:01 - E. G.
Log loss is a measure used in machine learning to evaluate performance of classification models.

28:08 - D. B.
What is it again?

28:10 - E. G.
It's also known as log loss. It's a measure used in machine learning to evaluate the performance of a classification model by quantifying the difference between the prediction probabilities and the actual labels. Sounds like it's a version of R-squared.

28:32 - D. B.
Yeah, so it's just a measure of how well the thing classifies, does classification.

28:39 - D. B.
Predicted versus the actual.

28:41 - D. B.
All right, so we don't really need to know what it is technically; it's just a measure of how well the classifier works.
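The log-loss definition read out above fits in a few lines. The three-class probability vectors below are made up purely for illustration:

```python
import math

def cross_entropy(predicted: list[float], actual_index: int) -> float:
    """Cross-entropy (log) loss for a single example: the negative log of
    the probability the model assigned to the true class. Lower is better."""
    return -math.log(predicted[actual_index])

# A confident, correct prediction scores a low loss...
print(round(cross_entropy([0.1, 0.8, 0.1], 1), 4))  # 0.2231
# ...while assigning the true class low probability scores a high loss.
print(round(cross_entropy([0.8, 0.1, 0.1], 1), 4))  # 2.3026
```

So, as said in the discussion, it quantifies the gap between the predicted probabilities and the actual labels: the worse the classifier's probability on the true class, the larger the number.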

28:52 - Unidentified Speaker
OK.

28:53 - D. B.
Do you all know D. R.?

28:57 - H. J.
Sorry, do you all know D. R.?

29:02 - Unidentified Speaker
No, I don't.

29:04 - H. J.
Yeah, he's a PhD candidate at UA Little Rock, and he's been working on, and even published, and now is getting ready to launch a product that basically does this. It analyzes your AI tools to see whether or not they have a lot of entropy, to see if they're still targeted for high-quality output and low error rates. So I just think this stuff is moving so fast. I don't know how. You guys are trying to figure out if you want these to be topics for this group?

29:58 - D. B.
Whether we want to read this paper in more detail, yeah.

30:02 - H. J.
Yeah. I think that reading anything that is older, like this, is not going to be very useful. But I'm not the technical person who would be reading these things anyway. But I do think that suggests that D. R. would be a great person to get in here to talk about his work on compliance, because what he's doing is, you know, he's taking away the black-boxness of AI programming, specifically in the area of regulation and compliance.

30:40 - D. B.
Uh-huh. That'd be great if he wants to present his project. Yeah. I can certainly tell him about it.

30:48 - H. J.
I think it'd be good. Yeah. Do you mind doing that?

30:52 - D. B.
I was going to send him an email.

30:55 - H. J.
I'll send him an email right now.

30:58 - D. B.
OK. Sounds great.

30:59 - H. J.
Yeah. I don't know that I would be helpful in evaluating these more technical discussions.

31:05 - H. J.
Well, I mean, the evaluation.

31:07 - D. B.
is of whether to read them in more detail. And to the degree that the audience has people in it, you know, like you, who are not technical, it would be perfectly legitimate to say you're not interested in it.

31:26 - E. G.
Yeah, okay. I think the only benefit is, like Moore's law, to see whether or not what they postulated came to fruition. Are we actually seeing that type of performance and convergence?

31:42 - D. B.
Yeah, I think the question of how much AI is improving over time is really interesting. This is not a paper about that, but maybe it is, as the independent variable when you're talking about how fast the field is improving?

32:08 - E. G.
Well, that, and we'll get to that sentence, but that's in the last sentence, where it talks about optimally compute-efficient training: training very large models on relatively modest data and stopping significantly before convergence, by identifying where they're not getting a large accuracy increase from continued training. That goes back to a couple of sentences before, where you're overfitting, and you're now not improving the model but degrading it.
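The paper's central claim can be sketched as a simple power law in model size. The constants below (N_c on the order of 8.8e13, alpha_N around 0.076) are the approximate values the paper reports for the non-embedding-parameter term; treat them as illustrative rather than exact:

```python
def loss_from_model_size(n_params: float,
                         n_c: float = 8.8e13,
                         alpha_n: float = 0.076) -> float:
    """Predicted test loss as a power law in non-embedding parameter count:
    L(N) = (N_c / N) ** alpha_N. Loss falls smoothly as the model grows,
    with no plateau over many orders of magnitude."""
    return (n_c / n_params) ** alpha_n

# Each 100x increase in parameters buys a steady multiplicative loss reduction.
for n in (1e6, 1e8, 1e10):
    print(f"N={n:.0e}  predicted loss {loss_from_model_size(n):.2f}")
```

This is the "scaling law" in the title: a smooth, predictable relationship between scale and loss, which is what makes the compute-optimal stopping argument in the abstract possible.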

32:54 - D. B.
Yeah, right. Any other thoughts related to this particular sentence? All right, well, let's read the next one.

33:20 - Unidentified Speaker
Any comments?

33:35 - J. C.
We are reading this one?

33:38 - D. B.
Yes.

33:39 - Unidentified Speaker
Okay.

33:45 - D. B.
All right, let's look at the next one. Any comments or questions on this one?

34:15 - J. C.
I'm not seeing a new one.

34:19 - D. B.
Oh, I just, I'm highlighting the sentence.

34:22 - J. C.
Oh, just the sentences.

34:24 - J. C.
I see.

34:25 - D. B.
Yeah. All right, to me, this is suggesting, you know, you can look at how increasing the model size slows down the training speed, or how overfitting changes as the model size increases or the data increases. I mean, you more ordinarily think of a model as being better if it's bigger, right? But then it's also more subject to overfitting. Any other thoughts or comments? All right, next sentence.

35:32 - Unidentified Speaker
Questions or comments on that one?

35:44 - E. G.
I think it falls in the "yes, of course" category. Identifying the relationship for a fixed budget, yes. I mean, that's like doing a study on whether teenage boys spend a lot of time thinking about girls or sports.

36:13 - D. B.
It's a classic computing problem, how you optimize: you trade off one thing against another based on fixed resources, right? Memory versus time in algorithms, or electric power versus some measure of computing performance for battery-operated devices. All right, let's try the last sentence. Any questions or comments? Is this saying that the bigger the model, the less data you need? Well, OK. That's what I'm reading.

37:45 - E. G.
Yeah.

37:46 - D. B.
Any other thoughts before we evaluate it? If not, go ahead and type in your number. Again, one means you definitely don't want to read this paper together, five means you definitely do, three means you don't lean either way, and two and four are in between.

38:11 - R. S.
Can you show us the title of the paper again, D.?

38:29 - Unidentified Speaker
All right.

38:35 - D. B.
Two votes? Surely we can get more than two votes. All right. M., you're more than welcome to vote, but it's not required. All right, I'm going to give it a four. So we got, let's see, two, one, two, three, four. That's an average of 2.5. That'd be pretty harsh, a 2.5. All right. Okay, well, I think we have time to probably do one more. Let's try this one. This is another paper, in the IEEE digital library. Don't know which one it was.

39:57 - J. C.
Oh, this is by my What is the date of publication of this?

40:10 - R. S.
What year is this?

40:14 - D. B.
I'm not sure. It's on the bottom there.

40:20 - R. S.
October 2024. Yeah.

40:23 - D. B.
Several months old.

40:26 - R. S.
Can I see the title again?

40:34 - D. B.
Let's see how big I can get it so we can read it. I'm going to go full screen here. Okay, here's the paragraph. It's in PDF, so it's harder to highlight. Let's read it and then talk about it. Any comments or questions? Shall we read another little bit before we decide? All right, well, if nobody wants to read any more of it, we'll go ahead and evaluate it now.

42:23 - J. C.
I'd be interested in the next paragraph.

42:27 - Unidentified Speaker
OK.

42:28 - D. B.
All right, well, I think we can probably read to here, which would be a question and part of an answer. So let's start with... I can fit this all on one page.

42:53 - Unidentified Speaker
Okay.

42:54 - D. B.
Let's start with the first one. Comments or questions? I mean, to me, isn't this what intelligent, maybe not particularly benign, but intelligent human actors do all the time? I believe so.

43:57 - E. G.
In fact, there's been arguments that a lot of the AI systems, LLMs, have safeguards to prevent this sort of activity.

44:12 - Unidentified Speaker
All right.

44:14 - D. B.
Any other comments? All right, well, then let's go ahead and... sorry, I'm just trying to make it bigger. There we go. Comments or questions?

45:11 - Unidentified Speaker
All right.

45:13 - D. B.
We'll finish his answer to the first question. Anything?

45:48 - Unidentified Speaker
That's interesting.

45:49 - Unidentified Speaker
Yeah.

45:50 - J. C.
Let's see how much more there is here.

45:56 - D. B.
OK, so another paragraph, but it's a half paragraph on each page, so we'll have to live with that. All right. Comments? Yeah. Sorry, I didn't understand. You were muffled when you said you were meeting somebody or... Yeah, I was on the phone during my Zoom meeting right now.

46:48 - Unidentified Speaker
One of my papers got accepted. I was on the phone with the editor.

46:56 - D. B.
All right. Any comments on that? All right. Let's read the rest of the paragraph.

47:04 - Unidentified Speaker
Any comments?

47:08 - D. B.
Well, my comment is that basically we're talking about what happens when the AIs get smarter than people, and this is one thing, you know, only one of many things, that would become sort of problematic. All right, well, let's go ahead and evaluate it. Again, it's one to five. Five means you definitely want to read the whole thing. One means you definitely don't. Hello? R., any input?

48:27 - Unidentified Speaker
Give it a four.

48:31 - D. B.
All right, we've got two threes and a four. R., are you going to vote? All right, well, I think we're at a good stopping point. And maybe next week, or next time we have time to do this, we'll read a few more and then pick the best one for more detailed reading. Any last thoughts before we adjourn?

49:45 - Unidentified Speaker
D., I did vote.

49:47 - R. S.
Oh, what did you vote?

49:49 - D. B.
What was your vote? Three. So it's 3.25.

49:53 - R. S.
I had my audio off, so I guess I said it multiple times.

50:01 - D. B.
OK. OK. Anything else anyone wants to bring up before we adjourn? I hope you have better attendance next week.

50:14 - R. S.
Yeah.

50:15 - D. B.
Well, sometimes when you have a small group, you can tune things to people's interests better, and it gives us a little more influence on what we read next, right? Okay. All right. Thank you.

50:30 - R. S.
All right.

50:31 - D. B.
Take care, everyone. Have a good weekend.