Friday, March 14, 2025

3/14/25: Finish chapter 6 video

 Artificial Intelligence Study Group

Welcome! We meet from 4:00-4:45 p.m. Central Time. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list; we have no wish to send unneeded emails, of which we all certainly get too many. 
Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu

Agenda & Minutes  (154th meeting, March 14, 2025)

Table of Contents
* Agenda and minutes
* Appendix 1: Syllabus of new proposed 4000/5000 level applied AI course
* Appendix 2: Transcript

Agenda and minutes
  • Announcements, updates, questions, presentations, etc. as time allows
    • Today: video viewing, etc.
    • NVIDIA conference online next week. It's free! MM can send a link on request. 
    • Fri. March 21: DD will informally present. His topic will be NLP requirements analysis and the age of AI.
    • Fri. April 18: YP will informally present his draft AI course outline and welcomes comment. See Appendix 1 below.
    • TE is on the informal faculty AI discussion group.
    • News: SL writes: "I’m excited to share that I’ve been asked to lead the DCSTEM College AI Ad Hoc Committee. A call for department representatives went out a few months ago, but since so much time has passed, we're going to start fresh. Nick S[...] and I will be co-leading this initiative on behalf of the DCSTEM College.

      This committee will bring together faculty to collaborate on AI initiatives in teaching, learning, and research. Ideally, each department will have 1–2 representatives who are familiar with their department’s instructors and willing to serve as a liaison between them and the committee. These representatives will help share knowledge, facilitate discussions, and explore AI’s impact on our curriculum.

      If you’re interested, please join our first informational/organizational meeting on Tuesday, March 18, at 3:00 p.m. on Zoom. Formal membership can be adjusted after this meeting.

      Sign up here: [Google Form]
      Once you submit the form, you’ll receive a calendar invite with the Zoom link.

      We’ll discuss AI’s role in our curriculum, how to integrate AI literacy into courses, and strategies for guiding students on responsible AI use. Your input is invaluable, and we look forward to working together to shape AI’s role in our college.

      Hope to see you there!

      Sandra & Nick

  • Recall the master's project that some students are doing and need our suggestions about:
    1. Suppose a generative AI like ChatGPT or Claude.ai was used to write a book or content-focused website about a simply stated task, like "how to scramble an egg," "how to plant and care for a persimmon tree," "how to check and change the oil in your car," or any other question like that. Interact with an AI to collaboratively write a book or an informationally near-equivalent website about it!
      • ET: Growing vegetables from seeds. (2/21/25)
        • Working on proposal (+ project)
        • Making AI images for the book (inc. cover)
          • ChatGPT's image generation has gotten much better: you can select part of an image and verbally ask it to regenerate or remove content. Prompts are still a challenge.
          • TOC is together. (We're back to the book goal!)
        • Found an online course on prompt engineering
        • Gemini: writes well compared to ChatGPT
  • We finished the Chapter 6 video, https://www.youtube.com/watch?v=eMlx5fFNoYc. We decided to run through this one again.
  • Schedule back burner "when possible" items:
    • If anyone else has a project they would like to help supervise, let me know.
    • (2/14/25) An ad hoc group, organized by ES, is forming on campus for people to discuss AI and the teaching of diverse subjects. It would be interesting to hear from someone in that group at some point to see what people are thinking and doing regarding AIs and their teaching activities.
    • The campus has assigned a group to participate in the AAC&U AI Institute's activity "AI Pedagogy in the Curriculum." IU is on it and may be able to provide updates now and then.
  • Here is the latest on future readings and viewings

Appendix 1: New proposed 4000/5000 level applied AI course

In today's AI-driven world, professionals across all levels—graduate, undergraduate, and PhD students—must develop a comprehensive understanding of AI technologies, business applications, and governance frameworks to remain competitive. The Applied AI for Functional Leaders course is designed to bridge the gap between AI innovation and responsible implementation, equipping students with technical skills in AI development, strategic business insights, and expertise in governance, compliance, and risk management.

 

With industries increasingly relying on AI for decision-making, automation, and innovation, graduates with AI proficiency are in high demand across finance, healthcare, retail, cybersecurity, and beyond. This course offers hands-on training with real-world AI tools (Azure AI, ChatGPT, LangChain, TensorFlow), enabling students to develop AI solutions while understanding the ethical and regulatory landscape (NIST AI Risk Framework, EU AI Act).

 

Why This Course Matters for Students:

• Future-Proof Career Skills – Gain expertise in AI, ML, and Generative AI to stay relevant in a rapidly evolving job market.
• Business & Strategy Integration – Learn how to apply AI for business growth, decision-making, and competitive advantage.
• Governance & Ethics – Understand AI regulations, ethical AI implementation, and risk management frameworks.
• Hands-on Experience – Work on real-world AI projects using top industry tools (Azure AI, ChatGPT, Python, LangChain).

Why UALR Should Adopt This Course Now:

• Industry Demand – AI-skilled professionals are a necessity across sectors, and universities must adapt their curricula.
• Cutting-Edge Curriculum – A balanced mix of technology, business strategy, and governance makes this course unique.
• Reputation & Enrollment Growth – Offering a governance-focused AI course positions UALR as a leader in AI education.
• Cross-Disciplinary Impact – AI knowledge benefits students in business, healthcare, finance, cybersecurity, and STEM fields.

By implementing this course, UALR can produce graduates ready to lead in the AI era, making them highly sought after by top employers while ensuring AI is developed and used responsibly and ethically in business and society.


Applied AI (6 + 8 Weeks Course, 2 Hours/Week)

This is a 5-month Applied Artificial Intelligence course outline tailored for techno-functional, functional, or technical leaders, integrating technical foundations, business use cases, and governance frameworks.

The course can be split into a 6-week certification plus an additional for-credit course with an actual use case.

 

I have also leveraged insights from leading universities such as Purdue’s Applied Generative AI Specialization and UT Austin’s AI & ML Executive Program.




 

Balance: 1/3 Technology | 1/3 Business Use Cases | 1/3 Governance, Compliance & AI Resistance




 

Module 1: Foundations of AI and Business Alignment (Weeks 1-4)

• Technology: AI fundamentals, Machine Learning, Deep Learning
• Business: Industry Use Cases, AI for Competitive Advantage
• Governance: AI Frameworks, Risk Management, Compliance

• Week 1: Introduction to AI for Business and Leadership
  • Overview of AI capabilities (ML, DL, Generative AI)
  • Business impact: AI-driven innovation in finance, healthcare, and retail
  • Introduction to AI governance frameworks (NIST, EU AI Act)
• Week 2: AI Lifecycle and Implementation Strategy
  • AI model development, deployment, and monitoring
  • Case study: AI adoption in enterprise settings
  • AI governance structures and risk mitigation strategies
• Week 3: Key AI Technologies and Tools
  • Supervised vs. Unsupervised Learning
  • Python, Jupyter Notebooks, and cloud-based AI tools (Azure AI Studio, AWS SageMaker)
  • Governance focus: AI compliance and regulatory challenges
• Week 4: AI for Business Growth and Market Leadership
  • AI-driven automation and decision-making
  • Case study: AI-powered business analysis and forecasting
  • Compliance focus: Ethical AI and responsible AI adoption





 

Module 2: AI in Business Functions (Weeks 5-8)

• Technology: NLP, Computer Vision, Reinforcement Learning
• Business: AI in business functions - Marketing, HR, Finance
• Governance: Bias Mitigation, Explainability, AI Trust

• Week 5: Natural Language Processing (NLP) & AI in Customer Experience
  • Sentiment analysis, text classification, and chatbots
  • Business case: AI in customer service (chatbots, virtual assistants)
  • Governance focus: Privacy and data security concerns (GDPR, CCPA)
• Week 6: AI for Operational Efficiency
  • Business use cases: AI for fraud detection, surveillance, manufacturing automation
  • Compliance focus: AI security and adversarial attacks
• Week 7: Reinforcement Learning & AI in Decision-Making
  • Autonomous systems, robotics, and self-learning models
  • Business case: AI-driven investment strategies and risk assessment
  • Resistance focus: Overcoming corporate fear of AI adoption
• Week 8: AI in Marketing, HR, and Business Optimization
  • AI-driven personalization, recommendation engines
  • Business case: AI in recruitment, talent management
  • Compliance focus: AI bias mitigation and fairness in hiring




 

Module 3: AI Governance, Compliance & Ethics (Weeks 9-12)

• Technology: Secure AI Systems, Explainability
• Business: Regulatory Compliance, AI Risk Management
• Governance: Responsible AI, Transparency, Algorithm Audits

• Week 9: AI Governance Frameworks & Global Regulations
  • NIST AI Risk Management, ISO/IEC 23894, EU AI Act
  • Industry-specific regulations (HIPAA for healthcare AI, SEC for AI in finance)
  • AI governance tools (audit logs, explainability reports)
• Week 10: AI Explainability & Bias Management
  • Interpretable AI techniques
  • Case study: Bias in AI hiring systems and credit risk models
  • Business responsibility in AI model transparency
• Week 11: AI Security, Privacy, and Risk Management
  • Secure AI model deployment strategies
  • Governance: AI trust frameworks (e.g., IBM AI Fairness 360)
  • Case study: Managing AI risks in cloud-based solutions
• Week 12: AI Resistance and Corporate Change Management
  • Strategies for AI adoption in enterprises
  • Business case: AI integration in legacy systems
  • Ethics: Impact of AI on jobs, social responsibility, and legal liabilities




 

Module 4: AI Strategy, Implementation, and Future Trends (Weeks 13-16)

• Technology: AI Product Development
• Business: AI Implementation, Enterprise AI Strategy
• Governance: AI Regulatory Compliance & Future Legislation

• Week 13: Overview of AI Deployment and Scalability
  • Deploying AI models on cloud (Azure AI Studio, AWS, GCP)
  • Business case: Scaling AI solutions in enterprise environments
  • Compliance: AI model monitoring, drift detection
• Week 14: AI for Competitive Advantage & Industry-Specific Applications
  • AI in industry: e.g., supply chain, autonomous vehicles, healthcare diagnostics
  • Case study: AI-driven drug discovery and logistics optimization
  • Compliance: AI liability and regulatory accountability
• Week 15: AI Governance and Responsible Innovation
  • Innovating with AI: e.g., financial services (algorithmic trading, fraud detection)
  • Ethics: Ensuring fairness and avoiding discrimination in AI models
  • Risk assessment frameworks for enterprise AI adoption
• Week 16: The Future of AI: Trends, Risks & Opportunities
  • Generative AI (DALL-E, ChatGPT, LangChain applications)
  • AI and Web3, decentralized AI governance
  • Case study: AI-powered governance in blockchain ecosystems




 

Module 5: Capstone Project & Final Presentations (Weeks 17-20; capstone process starts in Week 7/8)

• Technology: Hands-on AI Application Development
• Business: AI Use Case in Industry
• Governance: Compliance Strategy & Ethical AI

• Weeks 17-19: AI Capstone Project
  • Develop an AI-driven business solution with governance compliance
  • AI application areas: Business analytics, customer engagement, fraud detection
  • Report: Governance strategy and AI risk mitigation plan
• Week 20: Final Project Presentations & Certification
  • Peer review and feedback
  • Industry guest panel discussion on AI’s role in future business strategies
  • Course completion certification

Tools & Technologies Covered:

• AI Development: Python, TensorFlow, PyTorch, Scikit-learn, GenAI models
• Cloud AI Platforms: Azure AI Studio, AWS AI Services, GCP Vertex AI
• NLP & Generative AI: ChatGPT, DALL-E, LangChain, BERT, Stable Diffusion
• AI Governance & Risk: SHAP, LIME, AI fairness toolkits

Appendix 2: Transcript

 
3:19 - D. B.
Well, I guess we can go ahead and get started. Welcome everyone. A couple of announcements. Go ahead.

3:34 - M. M.
We have, all of us, I think that are interested about Nvidia main conference next week 17 to 21 everything that is online sessions and main talk keynote speech talks sessions online are free. Everything is free. Just register, please online Okay If you need the link, I can And I prepare one, but our colleague Yogit is not here. I start stress and AI kind of topic. It's not exactly about the mental health, but it's kind of related with this people are losing jobs.

4:35 - D. B.
People needs to change jobs.

4:39 - M. M.
They need to learn new stuff. Today, even there was an article about the teenagers, how they are kind of confused, I will say. Not all of them going to depression, but probably a lot of confusion. How can they manage? Do they need to learn? Do they need education? So there are many discussions in this topic. Preparing the stuff for you guys about the stress in AI.

5:12 - D. B.
Okay, yeah, I'd like to hear more about it.

5:16 - M. M.
Yeah, and books, there are some recent books. And I think that if you have young people in home, it's an obvious problem how they will accept the education, future education, and what kind of what they think, and I don't know if you... What do you think? It's an interesting topic, D., say, Ernest.

5:42 - E. G.
I think, depending upon the individual, because neurodivergent people found it difficult, well, at least in my experience, found it difficult to learn in a classroom setting. But in a virtual setting, I was I was able to understand. So when a topic is brought up, like with D. B., first thing I did, I didn't understand it. I pulled it up on another screen and I was able to, because we like to understand from a foundational point of view to make connections, abstractions. And I think, yes, it makes a huge difference.

6:28 - D. B.
All right. So next Friday, D. B. will informally present on natural language processing requirements analysis and the age of AI. Thank you, D. And then following month, Y. will informally present a draft of his AI course outline. So he's arranged with our chair to teach a course, a 4,000, 5,000 level course in applied AI. He doesn't have a background in computer science or information science, so it'll be an applied course. He does have an MBA, I understand. Anyway, the draft of his AI course outline is below in Appendix 1. If you want to check it out, feel free. I put it there because otherwise I'd lose. Keep it on hand for when he presents it. And then when he's done presenting it, I will remove it from Appendix 1. Otherwise, I've got to save it in a directory and hunt it down when I need it. I don't want to deal with that. One of the computer science faculty, S. L., is on a college AI ad hoc committee. So I'm hoping at some point I can maybe ask her to join us and just tell us what the committee is going to do. So now the department College has a committee and I believe the campus has a committee and there's a cross campus multi college committee which I. U. who's here I think is is on and there's a faculty there's a informal faculty discussion group run by L. S., which tie. Ty, are you in that group? Have you managed to join it?

8:20 - Unidentified Speaker
Ty, your mic is off. Sorry about that.

8:24 - D. B.
I couldn't find my mute button.

8:27 - T. E.
Yeah, I'm in the group, but I have not attended a session at this point. So I think that one was on the 20th, maybe. 21st, something like that.

8:42 - D. B.
OK, good. I'm going to make a note that. So this is not like our group, but this group is for like, how do you teach? How do you use AI in teaching or how does it impact teaching? It's like L. S. is in the psychology department. He's not a techie. But anyway, for all of your information, Ty is considering using this group as a way to help leverage his potential dissertation topic on using Socratic questioning for pedagogical purposes using an AI as Socrates, essentially. I'm hoping we can get some good data from people actually using this technique in their courses and so on, and it'll provide data to analyze and so on. And so, Ty, I might also ask you to kind of update us every now and then on what's going on in the group, just so we know what's, have our finger on the pulse of things.

10:01 - R. S.
OK. Has T. E. formulated his dissertation committee yet?

10:04 - T. E.
Well, my goal This semester was the candidacy exam. So I'm actually kind of working on that right now before the end of the semester.

10:16 - R. S.
I hope you remember that we were successful in getting a paper published in the proceedings of the WMCI and also selected for their JSCI journal. Yeah. With you as lead author.

10:34 - T. E.
That's right.

10:35 - D. B.
Yeah.

10:36 - R. S.
So I'm just saying that, you know, if, if you were interested in having me on your thesis committee, we possibly could generate more publications.

10:48 - T. E.
I appreciate that.

10:50 - D. B.
Thank you very much. All right. Well, thanks. Um, okay. Um, there are some masters students, um, that were, that signed up to do a project. In which they were using an AI to write a book or the informational website equivalent of a book. And of the three that we kind of started out with, I'm going to just start deleting stuff because I haven't heard from Lamont recently. And there's another guy that we also kind of dropped out, but E. T. is here. Hello, E. T. And do you want to give us a quick, what your status is, if you have any questions, how's it going?

11:36 - E. T.
Sure. Hello. So this week, mostly I was working on my proposal, but I did spend some time on my project as well. The exciting thing I have for my project is that I started generating pictures, like for the cover of the book and added to the book. And I realized that ChatGPT has improved its images a lot. Like I was amazed actually with the picture quality, they were very realistic. And they also added another feature where like, like Photoshop, so it shows it introduced a tool at first and it you can select part of the image and ask ChatGPT to regenerate it or remove some stuff from that area. So I tried a few, a couple of things.

12:36 - Unidentified Speaker
One was very successful. The other one was, so on the picture, the tomatoes on the garden was kind of laying on the ground.

12:44 - E. T.
And I asked ChatGPT to add a tomato plant and put the tomatoes on the tomato plant instead of just laying on the ground. It completely removed all tomatoes instead of adding so I'm planning to improve my prompts to get exactly what I want on the picture so that was kind of exciting to use those picture generative prompts. Another thing I got my table of content together so I'm kind of trying to get my book together and Yes, that was pretty much it.

13:27 - Multiple Speakers
OK.

13:28 - E. T.
And are you going to be doing a book or a website? I turned back to book, I'm sorry.

13:42 - D. B.
Okay. Okay, that's fine. Okay. I'm just gonna streamline things Yogi, I want to, I'm sorry, I just come back from Washington DC.

13:58 - M. M.
Yogi, I was thinking that we have a meeting, but I don't remember when. We were supposed to reschedule.

14:10 - Y.’s iPhone
I said Friday was not possible for me, but we'll reschedule. On the calendar for next week.

14:20 - M. M.
For next week, okay. I just mentioned that we have research in mental health, different for teenagers, for adults, for many of depression stuff, but I also mentioned in the beginning my interest is about stress and AI.

14:38 - Unidentified Speaker
Yeah.

14:38 - M. M.
How this works, maybe it's not exactly going to dementia, but it's very stressful to change and change tools and change jobs and stuff like this. So we need to- We'll catch up.

14:52 – Y.’s iPhone
Yes. Yes, ma'am. Yes.

14:54 - M. M.
I just worry if I miss the meeting, but I'm okay. No, no, no.

14:59 – Y.’s iPhone
I didn't miss. Okay, thank you. All right.

15:03 - D. B.
Oh, just getting back to E. T.'s project for a moment. R. S., you mentioned wanting to be on committee. So I'm hoping you'll be on E. T.'s master's committee. I guess we could talk about that offline. But anyway, it's definitely an option. OK.

15:21 - R. S.
Well, I'm willing to serve. OK.

15:23 - D. B.
And E. T., there are other people in this group who might also be on your committee. You need three people, including me. You can pick whoever you want. But for example, Dr. M. M. might be available, and Dr. R. is probably probably on the list of possibilities and so on. And I think.

15:46 - Unidentified Speaker
I think that would be a great resource too, is.

15:50 - Multiple Speakers
Definitely, I don't think V. has affiliate graduate faculty status, but he could get it if he wanted to be on your on the committee.

16:00 - M. M.
He has it, then he could do it.

16:03 - D. B.
Doctor Doctor K. is also on this call. He could. He could do it. Actually graduate faculty status.

16:10 - M. M.
Yeah, we have outside people. So because people like me and Dan, we're really very busy.

16:17 - D. B.
Yeah.

16:18 - R. S.
I'm just, I also want to reiterate that we are able to generate conference proceeding publications for, I've had 100% batting average, as I said to Dan.

16:30 - D. B.
Okay, yeah, that's a good possibility. Okay, what else? Does anyone else have anything before before we go to the video? I actually have a question.

16:41 - E. T.
Sure. So I was wondering if there is any other online self-paced prompt engineering courses that I can, I mean, that you guys can suggest me to finish.

16:53 - Multiple Speakers
Besides this one, which you found? Yes.

16:56 - E. T.
Yes, I already finished that one. It really helped me. I'm using NOVA system still, which there is a discussion Continuity expert and and they're experts from different fields which AI finds and he kind of creates a discussion and each expert gives their idea and finally summarize the experts thoughts

17:21 - M. M.
and process I give you always the free code for From engineering from NVIDIA and you can compare and give us your opinion which courses you like it better. Actually, do we have the code? I don't. Then do you want to share with everybody again the code or should I? I sent to you. I can, I can send again. That's right, I do have a question for you.

17:56 - E. G.
Are you bumping up against Because I I've been trying to play with it. See how to approach. And I've been coming up against word caps.

18:11 - E. T.
I've tried again. It was this week. I mostly focused on generating pictures. To be honest, I tried that these weeks and the most words that I got from one prompt.

18:28 - Unidentified Speaker
was 4,000 something like that. It never passed 5,000.

18:34 - E. G.
OK, because what I ended up doing is building kind of like a A feedback loop, so what I did is I had to create a Almost like, A biography or bibliography. Then I had it pass in. Each as a topic, but I would give it the rest as history. So it would have that as previous history and then have it write out its own topic.

19:17 - Unidentified Speaker
So then I was able to start having 30, 40,000 word I'm out. That's impressive. I'll definitely try that.
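
The feedback-loop approach described above can be sketched as an iterative prompting scheme: generate an outline first, then generate one topic at a time while passing everything produced so far back in as history. This is only a rough illustration of the idea; generate(prompt, history) is a hypothetical placeholder for whatever chat-model call is actually used, not a real API.

    # Rough sketch of the feedback-loop idea: build an outline, then generate one
    # section per topic, feeding all prior output back in as history so later
    # sections stay consistent with earlier ones and the total length can grow
    # well past a single response's word cap.
    # generate(prompt, history) is a hypothetical placeholder for a chat-model call.

    def write_long_document(subject, generate):
        outline = generate(
            f"Write a detailed chapter outline for a book about {subject}.",
            history=[])
        topics = [line.strip() for line in outline.splitlines() if line.strip()]

        history = [outline]   # running record of everything generated so far
        sections = []
        for topic in topics:
            prompt = f"Using the material so far as context, write the next section on: {topic}"
            section = generate(prompt, history=history)
            sections.append(section)
            history.append(section)
        return "\n\n".join(sections)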

19:30 - D. B.
And also, yeah, if you want that lead to that other course, just check with Dr. M. M. if she didn't get it right in the moment.

19:46 - Unidentified Speaker
Sure.

19:47 - E. T.
I would love to. OK, good.

19:50 - Unidentified Speaker
OK.

19:52 - D. B.
have anything else they want to bring up?

19:58 - Unidentified Speaker
Because if not, we're going to go to the video.

20:05 - D. B.
As usual, I have to juggle the screen. And we're up to, let's see, minute time 19 And it's been a while, maybe we should start from the beginning. Maybe not, I just don't know. I really could start from the beginning of this little chapter here at 1822. Okay, so let me just adjust the sound level.

20:42 - D. B.
Screen. I'm going to just do a sound check.

20:48 - Unidentified Speaker
Do you all get that? The sound come out OK? No.

20:55 - R. S.
It's kind of faint.

20:57 - D. B.
Well, I'm not sure what to do, but let's see. Let me see.

21:07 - Unidentified Speaker
E. T., please send me your email.

21:14 - M. M.
Just send me email. I cannot find you here in the system.

21:27 - D. B.
Right.

21:28 - Unidentified Speaker
How about now? Is that better? Anybody? Everything described so far is what people would call a self-attention head. I just, I just.

21:44 - D. B.
It's light, but I can barely hear it.

21:49 - D. D.
Yeah, same. What?

21:51 - D. B.
Can barely hear it. Yeah, it's light, but you can.

21:57 - E. G.
and you can hear it. OK, well, I don't know.

22:02 - Multiple Speakers
I turned up. Maybe I can pull out a headset.

22:06 - D. B.
I'm just kind of guessing. Can you hear me now? Can you hear me OK right now?

22:14 - Unidentified Speaker
I can. Well, I don't know.

22:17 - D. B.
I don't know what to do. I got my sounds all. And it can be heard, so it's not like you, it's just, try it again. I can turn the volume up.

22:31 - D. D.
All right.

22:32 - D. B.
I'm going to, I'm going to exit this YouTube and just try one more time, but I'm not hopeful. We'll try it. You You won't believe how old this couple is.

23:16 - Unidentified Speaker
Now I got ads to deal with.

23:21 - D. D.
Well, that's coming through loud and clear.

23:26 - D. B.
Screening the overall value map to be a low-rank transformation, turning back to the parameter count, all four of these matrices have the same size.

23:58 - Unidentified Speaker
And adding them all up, we get about 6.3 million parameters for one attention head. As a quick side note, to be a little more accurate, everything described so far is what people would call a self-attention head to distinguish it from a variation that comes up in other models that's called cross-attention.

24:15 - D. B.
This isn't relevant to our GPT example, but if you're curious, cross-attention involves models that process two distinct types of data, like text in one language and text in another language that's part of an ongoing generation of a translation, or maybe audio input and an ongoing transcription. A cross-attention head looks almost identical. The only difference is that the key and query maps act on different datasets. In a model doing translation, for example, the keys might come from one language while the queries come from another, and the attention pattern could describe which words from one language correspond to which words in another. And in this setting, there would typically be no masking, since there's not really any notion of later tokens affecting earlier ones. Staying focused on self-attention, though, if you understood everything so far, and if we were to stop here, you would come away with the essence of what attention really is. All that's really left to us is to lay out the sense in which you do this many, many different times. In our central example, we focused on adjectives updating. OK, any comments or questions so far. This has to do with deep, deep learning, yes?
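
For reference, the key, query, and value maps being described combine in the standard scaled dot-product attention formula from the Transformer literature; in a masked self-attention head the softmax is additionally restricted so each token attends only to itself and earlier tokens, while in a cross-attention head the queries and the keys/values come from two different input sequences:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V, \qquad Q = XW_Q, \quad K = XW_K, \quad V = XW_V$$

where X holds the token embeddings and d_k is the dimension of the key/query space.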

25:34 - R. S.
Dan?

25:35 - Unidentified Speaker
Yes, of course, of course.

25:50 - Unidentified Speaker
Can you hear anything now? Hello, can you hear me? I hear you.

25:55 - R. S.
Yeah, we can hear you.

25:57 - Unidentified Speaker
OK.

25:58 - D. B.
Yeah, I don't know what's, I'm sorry about the sound. If it's not usable, we'll just, we could do something else.

26:07 - D. D.
It's working.

26:08 - Unidentified Speaker
Better?

26:08 - D. D.
Yeah, it started out kind of light and then I turned my volume up and then it got really, really loud. I turned it down, it was at normal volume.

26:21 - D. B.
I could hear it just fine. OK, so that has something to do with me unplugging my headset.

26:29 - Unidentified Speaker
But can you hear me now? I can hear you. Loud and clear.

26:34 - R. S.
All right, well, I don't know how to, I mean, I got all these settings I can change.

26:41 - D. B.
I can take my headset, unplug it, plug it back in. Go to the Zoom sound and select a microphone. I don't know. I really don't know what to do. All right, well, you may have to turn your volume way up. Let's try it again. Let's try it some more. But of course, there are lots of different ways that context can influence the meaning of a word. Word. If the words they crashed thee preceded the word car, it has implications for the shape and the structure of that car. Associations might be less grammatical. If the word wizard is anywhere in the same passage as Harry, it suggests that this might be referring to Harry Potter, whereas if instead the words Queen, Sussex, and William were in that passage, then perhaps the embedding of Harry should instead be updated to refer to the prince.

27:37 - Unidentified Speaker
For every different type of contextual updating that you might imagine, the parameters of these key and query matrices would be different to capture the different attention patterns.

27:50 - D. B.
And the parameters of our value map would be different based on what should be added to the embeddings. All right, any comments, questions? Is the audio getting any better? No? Fine for me.

28:08 - Multiple Speakers
OK.

28:08 - D. B.
And in practice, the true behavior of these maps is much more difficult to interpret, where the weights are set to do whatever the model needs them to do to best accomplish its goal of predicting the next token. As I said before, everything we described is a single head of attention.

28:26 - Unidentified Speaker
And a full attention block inside a transformer consists of what's called multi-headed attention, where you run a lot of these operations in parallel, each with its own distinct key query and value maps. Any comments or questions? GPT-3, for example, uses 96 attention heads inside each block.

28:46 - D. B.
Considering that each one is already a bit confusing, it's certainly a lot to hold in your head. Just to spell it all out very explicitly, this means you have 96 distinct key and query matrices producing 96 distinct attention patterns. Each head has its own distinct value matrices used to produce 96 sequences of value vectors. These are all added together using the corresponding attention patterns as weights. What this means is that for each position in the context, each token, every one of these heads produces a proposed change to be added to the embedding in that position. So what you do is you sum together all of those proposed changes, one for each head, and you add the result to the original embedding of that position. This entire sum here would be one slice of what's outputted from this multi-headed attention block, a single one of those refined embeddings that pops out the other end of it. Again, this is a lot to think about, so don't worry at all if it takes some time to sink in. The overall idea is that by running many distinct heads in parallel, you're giving the model the capacity to learn many distinct ways that context changes meaning. Pulling up our running tally for parameter count, with 96 heads, each including its own variation of these four matrices, each block of multi-headed attention ends up with around 600 million parameters. There's one added, slightly annoying thing that I should really mention for any of you who go on to read more about transformers. You remember how I said that the value map is factored out into these two distinct matrices, which I labeled as the value down and the value up matrices. The way that I framed things would suggest that you see this pair of matrices inside each attention head, and you could absolutely implement it this way. That would be a valid design. But the way that you see this written in papers, and the way that it's implemented in practice, looks a little different. All of these value-up matrices for each head appear stapled together in one giant matrix that we call the output matrix, associated with the entire multi-headed attention block. And when you see people referring to the value matrix for a given attention head, they're typically only referring to this first step, the one that I was labeling as the value down projection into the smaller space. For the curious among you, I've left an on-screen note about it. It's one of those details that runs the risk of distracting from the main conceptual points, but I do want to call it out just so that you know if you read about this in other sources. Setting aside...
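
As a quick check on the numbers quoted in the video, the per-head, per-block, and total attention parameter counts can be reproduced from GPT-3's published dimensions. This is a minimal sketch assuming an embedding dimension of 12,288, 128-dimensional key/query/value spaces, 96 heads per block, and 96 blocks; it covers only the attention matrices, not the rest of the model.

    # Minimal sketch: reproducing the attention parameter counts quoted in the video,
    # assuming GPT-3's published dimensions. Attention matrices only.
    d_model = 12288    # embedding (residual stream) dimension
    d_head = 128       # key/query/value dimension per head
    n_heads = 96       # attention heads per block
    n_layers = 96      # transformer blocks

    # Each head has four d_model x d_head maps: key, query, value-down, value-up.
    params_per_head = 4 * d_model * d_head          # 6,291,456  (~6.3 million)
    params_per_block = n_heads * params_per_head    # 603,979,776 (~600 million)
    total_attention = n_layers * params_per_block   # 57,982,058,496 (just under 58 billion)

    print(f"{params_per_head:,} per head, {params_per_block:,} per block, {total_attention:,} total")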

31:26 - E. G.
I have a question. Dr. M. M., you may be able to help me more understand this. This is all occurring in parallel, but if each word in context supplies information to the other heads, how can they all operate in parallel? They would have to operate in sequence, so that way it can provide the information to the other heads.

32:01 - M. M.
If I understand the question, but if it's a parallel, this word participates many times in different matrices, the same word. So, obviously, if the word participates many times, this will give some idea about how important is this world. Because it's working in parallel, but all of them, they are participating many times. It's not only one time.

32:36 - E. G.
But in his analogy, the Harry Potter and the Prince Harry, the context of the terms provided valuable information to the other heads.

32:49 - Multiple Speakers
Sorry.

32:50 - E. G.
How would that parallel functionality occur when the context of some previous terms provided detail for subsequent heads?

33:03 - M. M.
But remember that we have already embedding that already cluster the sentiment. Remember that this attention is coming after that. OK.

33:20 - E. G.
I don't think it's coming before.

33:23 - M. M.
It's coming after when you're ready. Cluster initially cluster. The vector space is already created. OK, OK, that makes sense at that point.

33:35 - E. G.
Yeah, if the vector space is created, I didn't. Yeah, and all they're going through is basically all of the the vectors. That show up with the context provided, and it's going through giving a weighting at that point.

33:56 - M. M.
Exactly. So remember, the first step is embedding. So we already have this similarity of actors.

34:03 - E. G.
Now it makes sense.

34:05 - Unidentified Speaker
Yeah.

34:06 - M. M.
Now we are actually just comparing with the Curie for his question, who is the prince, or just comparing with the question by answer. System, but the embedding is done, the clustering is done.

34:23 - E. G.
So at that point, you'd have things like Harry Potter, Prince William, World War One. So you could basically have that context already done. Then at that point, it goes through the multithreading in parallel to figure out, oh, World War One, you mean?

34:47 - M. M.
This prints. Exactly. Embedding is done. Don't worry.

34:52 - Unidentified Speaker
OK.

34:53 - Multiple Speakers
Yeah. OK.

34:54 - D. B.
By the way, I increased my own sound volume by going to the Zoom microphone icon and pulling up the audio settings at the bottom. And that fixed it. I'm sharing my screen. I don't think you can see.

35:17 - Unidentified Speaker
Can you see the pop down menu with the highlighted audio settings?

35:22 - D. B.
No. No.

35:23 - Unidentified Speaker
OK.

35:23 - D. D.
Well, anyway, if you go to the microphone, it made a difference. It made a difference when I fixed it? Yeah. Yeah. It's been running really good.

35:35 - Unidentified Speaker
OK.

35:35 - D. B.
Well, that was kind of a maze. Oh, OK. I'm thinking we got another three minutes in this video, but we might want to backtrack because I got kind of lost at the beginning, plus the sound wasn't good. What do you all want to do? You want to do the last three minutes or go back to where we started? Any opinions? We may want to run through the whole video again at some point. We can do that. We'll continue here. Setting aside all the technical nuances, in the preview from the last chapter, we saw how data flowing through a transformer doesn't just flow through a single attention block. For one thing, it also goes through these other operations called multilayer perceptrons. We'll talk more about those in the next chapter, and then it repeatedly goes through many, many copies of both of these operations. What this means is that after a given word imbibes some of its context, there are many more chances for this more nuanced embedding to be influenced by its more nuanced surroundings. The further down the network you go, with each embedding taking in more and more meaning from all the other embeddings, which themselves are getting more and more nuanced, the hope is that there's the capacity to encode higher-level and more abstract ideas about a given input beyond just descriptors and grammatical structure. So let's see, what was the key term he kept talking about? Embeddings. I forgot what an embedding was exactly. Creating this vector.

37:20 - Multiple Speakers
And the first step before all of this attention, they have to create these vectors.

37:28 - M. M.
And the embedding Not only, they have also positioning embedding, position of the word in the sentence. Not only they have a number for the word embedding, but also they have a position of the word in the sentence. But creating this vector is the embedding.
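
A minimal sketch of the point being made here: each input vector is a word embedding plus a positional encoding, so the same word at different positions enters the attention layers as a different vector. Sinusoidal encodings from the original Transformer paper are used below purely for illustration; the vocabulary, dimensions, and random embedding table are made-up stand-ins, and many models learn the positional vectors instead.

    # Minimal sketch: input vector = word embedding + positional encoding.
    # Vocabulary, dimensions, and the random embedding table are illustrative only.
    import numpy as np

    def positional_encoding(position, d_model):
        # Sinusoidal encoding from the original Transformer paper.
        pe = np.zeros(d_model)
        for i in range(0, d_model, 2):
            angle = position / (10000 ** (i / d_model))
            pe[i] = np.sin(angle)
            if i + 1 < d_model:
                pe[i + 1] = np.cos(angle)
        return pe

    d_model = 8
    rng = np.random.default_rng(0)
    vocab = {"two": 0, "roads": 1, "diverged": 2}
    embeddings = rng.normal(size=(len(vocab), d_model))   # learned in a real model

    sentence = ["two", "roads", "diverged"]
    inputs = [embeddings[vocab[word]] + positional_encoding(pos, d_model)
              for pos, word in enumerate(sentence)]       # one vector per token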

37:50 - D. B.
So what's shown in this graph here on the screen?

37:55 - M. M.
How the sentence can change the position So, for example, one less traveler bears or whatever, they use the vectors that represent the words and create a new position.

38:13 - E. G.
So, in this case, one was the term and it looked at the context around it, a less traveler, So less traveled means road, movement, and took is a phrase, so symbolizing choice. So one at this point isn't a number, it's symbolizing a choice.

38:40 - D. B.
OK, so that's how the context is modifying the word one to mean not a number but a choice about a number.

38:53 - M. M.
influence another vector of two roads, influence this one, and when you do the operations with these vectors, you receive something like an output that they show.

39:09 - D. B.
Different, different.

39:10 - M. M.
Like they show you what was the king and queen and, you know, the vector representation. Of the words.

39:21 - D. B.
From way back in a previous video.

39:25 - Unidentified Speaker
Yeah.

39:25 - M. M.
All right.

39:26 - D. B.
Any other questions or anything anyone want to talk about? All right, we'll continue. Things like sentiment and tone and whether it's a poem and what underlying scientific truths are relevant to the piece and things like that. Turning back one more time to our scorekeeping, GPT-3 96 distinct layers, so the total number of key query and value parameters is multiplied by another 96, which brings the total sum to just under 58 billion distinct parameters devoted to all of the attention heads. Any comments on that? I understand that people are talking about how scaling is running out of steam, and 57 billion, you know, 57 trillion might not do much better than 57 billion, And DeepSeek, the Chinese recent AI is supposedly much more efficient and lean compared to American ones. And it could be that AI algorithm development is going to sort of lead to lower computations rather than higher scaling, like DeepSeek.

40:50 - D. D.
Yeah, that's a big if, if DeepSeek didn't just pull the information right off of ChatGPT4.

41:02 - D. B.
Well, it may have leveraged ChatGPT4, but that's OK, in a sense, because it's doing what it's doing with less computation.

41:17 - D. D.
Well, but it's riding on the shirt tail of the millions and millions of dollars and hours and hours of computation that ChatGPT4 has already done.

41:31 - M. M.
So in essence, it's really, it's really extra.

41:35 - D. D.
It's, it's just a little bit more that's marginally less than ChatGPT, four. It's not as good.

41:45 - Multiple Speakers
I wouldn't even say it's marginally less. I liken it to basically you're copying off the guy next to you's test sheet.

41:57 - E. G.
No, that was what I understood, how DeepSeek was able to get there so quickly with such minimal hardware is basically they pulled what, and reverse engineered basically what ChatGPT had done. Yeah, that's it.

42:15 - D. B.
Yeah, I just, you know, before we sort of take this as a way to vilify DeepSeek, I'd like to point out that ChatGPT is cribbing off all the content on the web that people have laboriously created by hand and are not compensated for. Well, that's different.

42:37 - Multiple Speakers
It's true, but it's different.

42:40 - E. G.
They actually use that to build the vectors. So so they actually took that information and computationally built out all of the vectors, all of the layers, all of the perceptrons. But and then got the answers, then DeepSeek says, Oh, so those are the vectors and the answers. You don't need to calculate.

43:08 - D. D.
If they hadn't, if they hadn't went out there and got all the stuff off the web, then ChatGPT would be, you know, good at maybe whatever free stories they could have gotten. You wouldn't be able to go and get coding and expert opinion. You know what? You wouldn't have a whole of different expert chats that you could do. So without all that data, then what is it, garbage in, garbage out. If you don't have some quality data to put in, we wouldn't have a large language model that was worth using.

43:56 - Unidentified Speaker
All right, we'll continue.

43:59 - D. B.
That is a lot, to be sure, but it's only about a third of the 175 billion that are in the network in total. So even though attention gets all of the attention, the majority of parameters come from the blocks sitting in between these steps. In the next chapter, you and I will talk more about those other blocks, and also a lot more about the training process. A big part of the story for the success of the attention mechanism is not so much any specific kind of behavior that it enables, but the fact that it's parallelizable, meaning that you can run a huge number of computations in a short time using GPUs. Given that one of the big lessons about deep learning in the last decade or two has been that scale alone seems to give huge qualitative improvements in model performance, there's a huge advantage to parallelizable architectures that let you do this. Any comments?

44:47 - D. D.
Well is that kind of what we were just talking about that there's kind of this tipping point where you you scale up so far than getting more data doesn't seem to help. And I think that's more of a training method problem. I think logically more data should help. It's just, you know, they have to work with it. And I think that's a lot of the expense is experiment. You know, when they go, oh, well, we tried this and now it's not as good as it was before. And there goes, you know, 250,000. And then they're back in there, you know trying a different way so that they can scale it I Think they'll be able to train off the top and that's you know kind of I Still think we're in our infancy

45:41 - E. G.
here what we're doing today is not what we're going to be doing tomorrow This the parallelization is a lot of times a brute force I mean, it's what we used to do to crack encodings, ciphers. And to me, this is nothing more than trying to crack a cipher in how to find the relationship between what I've requested and what I want.

46:13 - D. D.
And yeah, the relationship between the training and the data size and Yeah, that's right. It's there. Yeah. And I agree. I think in five years when we come back and look at this, we're going to go, oh, wow, can you believe that? You know, it'll be a whole lot throughout this video. And they're already they're already making the transformer better.

46:40 - Unidentified Speaker
Yeah.

46:41 - E. G.
Already figuring out how to do things. I mean, it's look at the transformers of what a year ago, three years ago.

46:55 - D. B.
All right, let's finish up. If you want to learn more about this stuff, I've left lots of links in the description. In particular, anything produced by Andrej Karpathy or Chris Ola tend to be pure gold. In this video, I wanted to just jump into attention in its current form, but if you're curious about more of the history for how we got here and how you might reinvent this idea for yourself, my friend Vivek just put up a couple videos giving a lot more of that motivation. Also, Britt Cruz from the channel The Art of the Problem has a really nice video about the history of large language models. I think we should go back. Redo this one before we go on to the next? Well, I mean go back to the videos that they were showing.

47:38 - D. D.
Oh, yeah. BuildChatGPT from scratch. I think almost every video that he suggested we should vote on.

47:46 - D. B.
OK. I can make a mention of it. I like watching it.

47:51 - D. D.
Again, um, because every time I watch it, it seems like I get a little bit better understanding I fully agree. Yeah, so if you want to go back and watch it again, absolutely I feel like this one.

48:07 - D. B.
I could definitely get more out of if I watch it again Any other any other uh opinions on that question Because if not, I would just say, uh, uh we decided to run through this one again, especially since I kind of forgot the first 19 minutes since it's been several weeks. All right, so I say- I'm up next week, right?

48:41 - D. D.
Yes. Okay. Looking forward for your presentation, D., Yeah, I think I'm I think so I'll probably have I don't know exactly how long my presentation will be for my paper But it's probably like three minutes or something. It'll be very short. So I'm gonna bill I'm gonna build a you know, a lengthy presentation and then cut it off and use it for my Use I guess in three months three or four months I present for my paper Present In the

49:14 - D. B.
conference, he's accepted in the conference.

49:17 - M. M.
Oh, OK.

49:18 - D. B.
So you're going to three minutes next week?

49:21 - Multiple Speakers
Three minutes? No.

49:23 - D. D.
I'm going to build a nice, robust presentation, and then I'll chop it down to three minutes later this summer. Oh, later, OK. Yeah, yeah. Why three minutes?

49:35 - D. B.
Shouldn't I shoot for about 30 minutes?

49:38 - D. D.
That's what I was thinking.

49:41 - D. B.
I don't have any, I mean, I just, you know, you want to get your message across. That's all, whether it takes five minutes or 35 minutes is fine with me.

49:55 - Multiple Speakers
Okay.

49:56 - D. B.
Well, I'll, I'll, I'll shoot for 20 minutes then.

50:00 - D. D.
Okay. And you want to get some feedback too, right?

50:05 - D. B.
Right. Sure. All right. Well, uh, I guess.

50:12 - M. M.
E. T., I sent you the link in the code.

50:24 - D. D.
I buy prompt engineering courses on Udemy. They vary. It kind of depends on what you're after. But I mean, there's a lot of free stuff out there on the internet. I would just sit down one night and just say, hey, I'm going to work on, I'm going to learn about prompt engineering for two hours. And I would just read all the free stuff that you could get, just treat it as a brainstorming exercise.

51:02 - M. M.
Yeah, you too. YouTube has a lot of stuff.

51:07 - Multiple Speakers
This is YouTube.

51:09 - D. B.
Yeah, it's correct.

51:11 - D. D.
Or as banded to teach you and let me hear that man can make a robust prompt. Hands down, yeah.

51:22 - E. G.
Well, he uses a prompt to make a prompt to make a prompt.

51:30 - D. D.
Yeah. That is not the That's a whole nother level right there. But Doctor Warren V. W., you may I I had him in classes. Or you know with him, we were both students, but you might you might have met him.

52:15 - E. G.
I don't know. But uh. One of the things that I found really useful is in the prompt engineering.

52:24 - T. E.
Maintaining a context.

52:25 - E. G.
Because, uh, I go through and I elicit information, I maintain a context, a history. It's almost like a search history. So as that context continues, it knows it's refining its answers based on answers it's already given or prompts that you originally gave it.

52:52 - D. D.
You're writing a book, right? Yes.

52:55 - E. T.
And so you have an outline of what you want the book to be, right?

53:02 - Unidentified Speaker
Yes.

53:03 - E. T.
Did you get the AI to help you write the outline?

53:08 - Multiple Speakers
Yes.

53:09 - Unidentified Speaker
OK.

53:09 - Multiple Speakers
Is it not possible just to go through and just get the AI to write each section and then maybe go back And so, like, let's say you get a summary of the previous section and use that as you build

53:28 - E. T.
to the next section so you don't filter out information, you know, so it can kind of build on what's been happening.

53:36 - Multiple Speakers
That's an idea. I haven't tried to write a book, so you probably know more about it than I do at this point.

53:45 - E. T.
So here's an exercise we can try.

53:48 - Multiple Speakers
we can ask ChatGPT to tutor us in prompt engineering using the Socratic method. I wonder how much, I guess the question is, how much does ChatGPT know about prompt engineering? And I guess it would know, in a sense, know whatever it can digest, whatever it can digest from all the stuff that's been written about it on the web. Yeah, I just don't know how current it is.

54:21 - E. T.
They still have that hard shut-off date, don't they?

54:26 - Multiple Speakers
I don't know.

54:27 - E. T.
For GPT, another model might be better. Gemini, because Gemini supposedly is integrated with the Google search engine, which tries to keep up to date. Retrieval-oriented generation. All right, folks.

54:44 - Multiple Speakers
Thanks for joining in and we'll see you same time, same place next week.

54:50 - E. T.
Thanks, guys.

54:51 - Multiple Speakers
Thank you. Thank you. Wonderful. Take care.

54:54 - E. T.
Take care.


 

 

3:19 - D. B.

Well, I guess we can go ahead and get started. Welcome everyone. A couple of announcements. Go ahead.

 

3:34 - M. M.

We have, all of us, I think that are interested about Nvidia main conference next week 17 to 21 everything that is online sessions and main talk Kino speech talks sessions online are free. Everything is free. Just register, please online Okay If you need the link, I can And I prepare one, but our colleague Yogit is not here. I start stress and AI kind of topic. It's not exactly about the mental health, but it's kind of related with this people are losing jobs.

 

4:35 - D. B.

People needs to change jobs.

 

4:39 - M. M.

They need to learn new stuff. Today, even there was an article about the teenagers, how they are kind of confused, I will say. Not all of them going to depression, but probably a lot of confusion. How can they manage? Do they need to learn? Do they need education? So there are many discussions in this topic. Preparing the stuff for you guys about the stress in AI.

 

5:12 - D. B.

Okay, yeah, I'd like to hear more about it.

 

5:16 - M. M.

Yeah, and books, there are some recent books. And I think that if you have young people in home, it's an obvious problem how they will accept the education, future education, and what kind of what they think, and I don't know if you... What do you think? It's an interesting topic, D., say, Ernest.

 

5:42 - E. G.

I think, depending upon the individual, because neurodivergent people found it difficult, well, at least in my experience, found it difficult to learn in a classroom setting. But in a virtual setting, I was I was able to understand. So when a topic is brought up, like with D. B., first thing I did, I didn't understand it. I pulled it up on another screen and I was able to, because we like to understand from a foundational point of view to make connections, abstractions. And I think, yes, it makes a huge difference.

 

6:28 - D. B.

All right. So next Friday, D. B. will informally present on natural language processing requirements analysis and the age of AI. Thank you, D. And then following month, Y. will informally present a draft of his AI course outline. So he's arranged with our chair to teach a course, a 4,000, 5,000 level course in applied AI. He doesn't have a background in computer science or information science, so it'll be an applied course. He does have an MBA, I understand. Anyway, the draft of his AI course outline is below in Appendix 1. If you want to check it out, feel free. I put it there because otherwise I'd lose. Keep it on hand for when he presents it. And then when he's done presenting it, I will remove it from Appendix 1. Otherwise, I've got to save it in a directory and hunt it down when I need it. I don't want to deal with that. One of the computer science faculty, S. L., is on a college AI ad hoc committee. So I'm hoping at some point I can maybe ask her to join us and just tell us what the committee is going to do. So now the department College has a committee and I believe the campus has a committee and there's a cross campus multi college committee which I. U. who's here I think is is on and there's a faculty there's a informal faculty discussion group run by L. S., which tie. Ty, are you in that group? Have you managed to join it?

 

8:20 - Unidentified Speaker

Ty, your mic is off. Sorry about that.

 

8:24 - D. B.

I couldn't find my mute button.

 

8:27 - T. E.

Yeah, I'm in the group, but I have not attended a session at this point. So I think that one was on the 20th, maybe. 21st, something like that.

 

8:42 - D. B.

OK, good. I'm going to make a note that. So this is not like our group, but this group is for like, how do you teach? How do you use AI in teaching or how does it impact teaching? It's like L. S. is in the psychology department. He's not a techie. But anyway, for all of your information, Ty is considering using this group as a way to help leverage his potential dissertation topic on using Socratic questioning for pedagogical purposes using an AI as Socrates, essentially. I'm hoping we can get some good data from people actually using this technique in their courses and so on, and it'll provide data to analyze and so on. And so, Ty, I might also ask you to kind of update us every now and then on what's going on in the group, just so we know what's, have our finger on the pulse of things.

 

10:01 - R. S.

OK. Has T. E. formulated his dissertation committee yet?

 

10:04 - T. E.

Well, my goal This semester was the candidacy exam. So I'm actually kind of working on that right now before the end of the semester.

 

10:16 - R. S.

I hope you remember that we were successful in getting a paper published in the proceedings of the WMCI and also selected for their JSCI journal. Yeah. With you as lead author.

 

10:34 - T. E.

That's right.

 

10:35 - D. B.

Yeah.

 

10:36 - R. S.

So I'm just saying that, you know, if, if you were interested in having me on your thesis committee, we possibly could generate more publications.

 

10:48 - T. E.

I appreciate that.

 

10:50 - D. B.

Thank you very much. All right. Well, thanks. Um, okay. Um, there are some masters students, um, that were, that signed up to do a project. In which they were using an AI to write a book or the informational website equivalent of a book. And of the three that we kind of started out with, I'm going to just start deleting stuff because I haven't heard from Lamont recently. And there's another guy that we also kind of dropped out, but E. T. is here. Hello, E. T. And do you want to give us a quick, what your status is, if you have any questions, how's it going?

 

11:36 - E. T.

Sure. Hello. So this week I was mostly working on my proposal, but I did spend some time on my project as well. The exciting thing I have for my project is that I started generating pictures, like for the cover of the book, and added them to the book. And I realized that ChatGPT has improved its images a lot. I was amazed, actually, with the picture quality; they were very realistic. And they also added another feature, like Photoshop: it introduced a selection tool, so you can select part of the image and ask ChatGPT to regenerate it or remove some stuff from that area. So I tried a couple of things.

 

12:36 - Unidentified Speaker

One was very successful. The other one was not: in the picture, the tomatoes in the garden were kind of lying on the ground.

 

12:44 - E. T.

And I asked ChatGPT to add a tomato plant and put the tomatoes on the tomato plant instead of just lying on the ground. It completely removed all the tomatoes instead of adding any, so I'm planning to improve my prompts to get exactly what I want in the picture. So that was kind of exciting, to use those picture-generation prompts. Another thing: I got my table of contents together, so I'm kind of trying to get my book together. And yes, that was pretty much it.

 

13:27 - Multiple Speakers

OK.

 

13:28 - E. T.

And are you going to be doing a book or a website? I turned back to the book, I'm sorry.

 

13:42 - D. B.

Okay, that's fine. I'm just gonna streamline things. Yogi, I want to... I'm sorry, I just came back from Washington, DC.

 

13:58 - M. M.

Yogi, I was thinking that we have a meeting, but I don't remember when. We were supposed to reschedule.

 

14:10 - Y.’s iPhone

I said Friday was not possible for me, but we'll reschedule. I'll put it on the calendar for next week.

 

14:20 - M. M.

For next week, okay. I just mentioned that we have research in mental health, different studies for teenagers and for adults, a lot of depression-related work, but I also mentioned at the beginning that my interest is in stress and AI.

 

14:38 - Unidentified Speaker

Yeah.

 

14:38 - M. M.

How this works, maybe it's not exactly related to dementia, but it's very stressful to change, and change tools, and change jobs, and things like this. So we need to- We'll catch up.

 

14:52 – Y.’s iPhone

Yes. Yes, ma'am. Yes.

 

14:54 - M. M.

I just worried that I'd missed the meeting, but I'm okay. No, no, no.

 

14:59 – Y.’s iPhone

I didn't miss it. Okay, thank you. All right.

 

15:03 - D. B.

Oh, just getting back to E. T.'s project for a moment. R. S., you mentioned wanting to be on a committee, so I'm hoping you'll be on E. T.'s master's committee. I guess we could talk about that offline, but anyway, it's definitely an option. OK.

 

15:21 - R. S.

Well, I'm willing to serve. OK.

 

15:23 - D. B.

And E. T., there are other people in this group who might also be on your committee. You need three people, including me. You can pick whoever you want. But for example, Dr. M. M. might be available, and Dr. R. is probably on the list of possibilities, and so on. And I think-

 

15:46 - Unidentified Speaker

I think that would be a great resource too.

 

15:50 - Multiple Speakers

Definitely. I don't think V. has affiliate graduate faculty status, but he could get it if he wanted to be on your committee.

 

16:00 - M. M.

If he has it, then he could do it.

 

16:03 - D. B.

Dr. K. is also on this call. He could do it, actually, if he has graduate faculty status.

 

16:10 - M. M.

Yeah, we have outside people, because people like me and Dan are really very busy.

 

16:17 - D. B.

Yeah.

 

16:18 - R. S.

I also want to reiterate that we are able to generate conference proceedings publications. I've had a 100% batting average, as I said to Dan.

 

16:30 - D. B.

Okay, yeah, that's a good possibility. Okay, what else? Does anyone else have anything before we go to the video? I actually have a question.

 

16:41 - E. T.

Sure. So I was wondering if there are any other online self-paced prompt engineering courses that you guys can suggest for me to take.

 

16:53 - Multiple Speakers

Besides this one, which you found? Yes.

 

16:56 - E. T.

Yes, I already finished that one. It really helped me. I'm still using the NOVA system, which has a Discussion Continuity Expert and experts from different fields that the AI finds. It kind of creates a discussion, each expert gives their idea, and finally it summarizes the experts' thoughts.

 

17:21 - M. M.

In the process, I always give you the free code for prompt engineering from NVIDIA, so you can compare and give us your opinion about which course you like better. Actually, do we have the code? I don't. Then do you want to share the code with everybody again, or should I? I sent it to you. I can send it again. That's right, I do have a question for you.

 

17:56 - E. G.

Are you bumping up against word caps? Because I've been trying to play with it, to see how to approach it, and I've been coming up against word caps.

 

18:11 - E. T.

I haven't tried again this week; I mostly focused on generating pictures. To be honest, I tried that in previous weeks, and the most words that I got from one prompt

 

18:28 - Unidentified Speaker

was 4,000, something like that. It never passed 5,000.

 

18:34 - E. G.

OK, because what I ended up doing is building kind of like a feedback loop. What I did is I had it create almost like a biography or bibliography. Then I had it pass in each entry as a topic, but I would give it the rest as history. So it would have that as previous history and then have it write out its own topic.

 

19:17 - Unidentified Speaker

So then I was able to start getting 30,000 to 40,000 words out. That's impressive. I'll definitely try that.
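
For reference, here is a minimal sketch of the kind of feedback loop E. G. describes: generate a long document chunk by chunk, feeding earlier chunks back in as history so no single response has to exceed the word cap. It assumes the OpenAI Python client; the model name, topics, and history limit are placeholders, not anything specified in the meeting.

# Sketch of the chunk-by-chunk "feedback loop" approach for long-form output.
# Assumes the openai package; model name and topics are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

topics = [
    "Choosing seeds and planning the garden",
    "Starting seeds indoors",
    "Transplanting and ongoing care",
]

history = []        # accumulated text of earlier sections
book_sections = []

for topic in topics:
    context = "\n\n".join(history)[-8000:]   # keep only recent history to stay within limits
    prompt = (
        "You are helping write a gardening book.\n"
        f"Here is what has been written so far:\n{context}\n\n"
        f"Write the next section on: {topic}. Do not repeat earlier material."
    )
    response = client.chat.completions.create(
        model="gpt-4o",                       # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    section = response.choices[0].message.content
    book_sections.append(section)
    history.append(section)                   # feed the new section back in as history

print(f"Generated {sum(len(s.split()) for s in book_sections)} words total")

Each pass writes one section with the accumulated history as context, so the total output can run to tens of thousands of words even though any single response stays within the model's per-reply limit.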

 

19:30 - D. B.

And also, yeah, if you want the lead to that other course, just check with Dr. M. M. if she doesn't get it to you right away.

 

19:46 - Unidentified Speaker

Sure.

 

19:47 - E. T.

I would love to. OK, good.

 

19:50 - Unidentified Speaker

OK.

 

19:52 - D. B.

Does anyone have anything else they want to bring up?

 

19:58 - Unidentified Speaker

Because if not, we're going to go to the video.

 

20:05 - D. B.

As usual, I have to juggle the screen. And we're up to, let's see, minute 19. And it's been a while; maybe we should start from the beginning. Maybe not, I just don't know. I really could start from the beginning of this little chapter here at 18:22. Okay, so let me just adjust the sound level.

 

20:42 - D. B.

Screen. I'm going to just do a sound check.

 

20:48 - Unidentified Speaker

Do you all get that? Did the sound come out OK? No.

 

20:55 - R. S.

It's kind of faint.

 

20:57 - D. B.

Well, I'm not sure what to do, but let's see. Let me see.

 

21:07 - Unidentified Speaker

E. T., please send me your email.

 

21:14 - M. M.

Just send me email. I cannot find you here in the system.

 

21:27 - D. B.

Right.

 

21:28 - Unidentified Speaker

How about now? Is that better? Anybody? Everything described so far is what people would call a self-attention head. I just, I just.

 

21:44 - D. B.

It's light, but I can barely hear it.

 

21:49 - D. D.

Yeah, same. What?

 

21:51 - D. B.

Can barely hear it. Yeah, it's light, but you can.

 

21:57 - E. G.

and you can hear it. OK, well, I don't know.

 

22:02 - Multiple Speakers

I turned it up. Maybe I can pull out a headset.

 

22:06 - D. B.

I'm just kind of guessing. Can you hear me now? Can you hear me OK right now?

 

22:14 - Unidentified Speaker

I can. Well, I don't know.

 

22:17 - D. B.

I don't know what to do. I've got my sound settings all the way up. And it can be heard, so it's not like it's you. Let's just try it again. I can turn the volume up.

 

22:31 - D. D.

All right.

 

22:32 - D. B.

I'm going to exit this YouTube and just try one more time, but I'm not hopeful. We'll try it. You won't believe how old this couple is.

 

23:16 - Unidentified Speaker

Now I got ads to deal with.

 

23:21 - D. D.

Well, that's coming through loud and clear.

 

23:26 - D. B.

Constraining the overall value map to be a low-rank transformation. Turning back to the parameter count, all four of these matrices have the same size.

 

23:58 - Unidentified Speaker

And adding them all up, we get about 6.3 million parameters for one attention head. As a quick side note, to be a little more accurate, everything described so far is what people would call a self-attention head to distinguish it from a variation that comes up in other models that's called cross-attention.

 

24:15 - D. B.

This isn't relevant to our GPT example, but if you're curious, cross-attention involves models that process two distinct types of data, like text in one language and text in another language that's part of an ongoing generation of a translation, or maybe audio input and an ongoing transcription. A cross-attention head looks almost identical. The only difference is that the key and query maps act on different datasets. In a model doing translation, for example, the keys might come from one language while the queries come from another, and the attention pattern could describe which words from one language correspond to which words in another. And in this setting, there would typically be no masking, since there's not really any notion of later tokens affecting earlier ones. Staying focused on self-attention, though, if you understood everything so far, and if we were to stop here, you would come away with the essence of what attention really is. All that's really left to us is to lay out the sense in which you do this many, many different times. In our central example, we focused on adjectives updating. OK, any comments or questions so far. This has to do with deep, deep learning, yes?
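
A rough numpy sketch of the self-attention versus cross-attention distinction described above. It is only an illustration of the idea (one head, toy dimensions, random weights), not the implementation used in GPT-3 or in any real translation model.

import numpy as np

def attention_head(queries_from, keys_values_from, Wq, Wk, Wv, causal_mask=False):
    # One attention head, written generically: for self-attention both inputs are the
    # same sequence of embeddings; for cross-attention the queries come from one
    # sequence (e.g. the translation being generated) and the keys/values from another
    # (e.g. the source-language tokens). Toy sketch only.
    Q = queries_from @ Wq                  # (n_q, d_head)
    K = keys_values_from @ Wk              # (n_kv, d_head)
    V = keys_values_from @ Wv              # (n_kv, d_head)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    if causal_mask:                        # GPT-style self-attention:
        n = scores.shape[0]                # position i may only attend to positions j <= i
        scores = np.where(np.tril(np.ones((n, n), dtype=bool)), scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions
    return weights @ V   # one head's output per query position, before any output projection

d_model, d_head = 16, 4
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))

english = rng.normal(size=(5, d_model))    # stand-in token embeddings
french = rng.normal(size=(7, d_model))

self_out = attention_head(english, english, Wq, Wk, Wv, causal_mask=True)  # self-attention
cross_out = attention_head(french, english, Wq, Wk, Wv)                    # cross-attention, no mask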

 

25:34 - R. S.

Dan?

 

25:35 - Unidentified Speaker

Yes, of course, of course.

 

25:50 - Unidentified Speaker

Can you hear anything now? Hello, can you hear me? I hear you.

 

25:55 - R. S.

Yeah, we can hear you.

 

25:57 - Unidentified Speaker

OK.

 

25:58 - D. B.

Yeah, I don't know what's, I'm sorry about the sound. If it's not usable, we'll just, we could do something else.

 

26:07 - D. D.

It's working.

 

26:08 - Unidentified Speaker

Better?

 

26:08 - D. D.

Yeah, it started out kind of light, and then I turned my volume up and it got really, really loud. I turned it down, and it was at normal volume.

 

26:21 - D. B.

I could hear it just fine. OK, so that has something to do with me unplugging my headset.

 

26:29 - Unidentified Speaker

But can you hear me now? I can hear you. Loud and clear.

 

26:34 - R. S.

All right, well, I don't know how to, I mean, I got all these settings I can change.

 

26:41 - D. B.

I can take my headset, unplug it, plug it back in. Go to the Zoom sound and select a microphone. I don't know. I really don't know what to do. All right, well, you may have to turn your volume way up. Let's try it again. Let's try it some more. But of course, there are lots of different ways that context can influence the meaning of a word. If the words "they crashed the" preceded the word "car," it has implications for the shape and the structure of that car. Associations might be less grammatical. If the word "wizard" is anywhere in the same passage as "Harry," it suggests that this might be referring to Harry Potter, whereas if instead the words "Queen," "Sussex," and "William" were in that passage, then perhaps the embedding of "Harry" should instead be updated to refer to the prince.

 

27:37 - Unidentified Speaker

For every different type of contextual updating that you might imagine, the parameters of these key and query matrices would be different to capture the different attention patterns.

 

27:50 - D. B.

And the parameters of our value map would be different based on what should be added to the embeddings. All right, any comments, questions? Is the audio getting any better? No? Fine for me.

 

28:08 - Multiple Speakers

OK.

 

28:08 - D. B.

And in practice, the true behavior of these maps is much more difficult to interpret, where the weights are set to do whatever the model needs them to do to best accomplish its goal of predicting the next token. As I said before, everything we described is a single head of attention.

 

28:26 - Unidentified Speaker

And a full attention block inside a transformer consists of what's called multi-headed attention, where you run a lot of these operations in parallel, each with its own distinct key, query, and value maps. Any comments or questions? GPT-3, for example, uses 96 attention heads inside each block.

 

28:46 - D. B.

Considering that each one is already a bit confusing, it's certainly a lot to hold in your head. Just to spell it all out very explicitly, this means you have 96 distinct key and query matrices producing 96 distinct attention patterns. Each head has its own distinct value matrices used to produce 96 sequences of value vectors. These are all added together using the corresponding attention patterns as weights. What this means is that for each position in the context, each token, every one of these heads produces a proposed change to be added to the embedding in that position. So what you do is you sum together all of those proposed changes, one for each head, and you add the result to the original embedding of that position. This entire sum here would be one slice of what's outputted from this multi-headed attention block, a single one of those refined embeddings that pops out the other end of it. Again, this is a lot to think about, so don't worry at all if it takes some time to sink in. The overall idea is that by running many distinct heads in parallel, you're giving the model the capacity to learn many distinct ways that context changes meaning. Pulling up our running tally for parameter count, with 96 heads, each including its own variation of these four matrices, each block of multi-headed attention ends up with around 600 million parameters. There's one added, slightly annoying thing that I should really mention for any of you who go on to read more about transformers. You remember how I said that the value map is factored out into these two distinct matrices, which I labeled as the value down and the value up matrices. The way that I framed things would suggest that you see this pair of matrices inside each attention head, and you could absolutely implement it this way. That would be a valid design. But the way that you see this written in papers, and the way that it's implemented in practice, looks a little different. All of these value-up matrices for each head appear stapled together in one giant matrix that we call the output matrix, associated with the entire multi-headed attention block. And when you see people referring to the value matrix for a given attention head, they're typically only referring to this first step, the one that I was labeling as the value down projection into the smaller space. For the curious among you, I've left an on-screen note about it. It's one of those details that runs the risk of distracting from the main conceptual points, but I do want to call it out just so that you know if you read about this in other sources. Setting aside...
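
A scaled-down numpy sketch of what the narration describes: several heads run over the same embeddings, each produces a proposed change through its own key, query, value-down, and value-up matrices, and the proposals are summed into the original embeddings. Head count and dimensions here are toy values, not GPT-3's, and the random weights stand in for learned parameters.

import numpy as np

rng = np.random.default_rng(1)
n_tokens, d_model, d_head, n_heads = 8, 64, 16, 4   # toy sizes; GPT-3 uses 96 heads

X = rng.normal(size=(n_tokens, d_model))            # embeddings entering the block

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

delta = np.zeros_like(X)
for _ in range(n_heads):
    Wq, Wk = rng.normal(size=(2, d_model, d_head))
    Wv_down = rng.normal(size=(d_model, d_head))     # "value down" projection
    Wv_up = rng.normal(size=(d_head, d_model))       # rows of the shared "output matrix"

    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d_head)
    mask = np.tril(np.ones((n_tokens, n_tokens), dtype=bool))
    pattern = softmax(np.where(mask, scores, -np.inf))   # this head's attention pattern

    delta += pattern @ (X @ Wv_down) @ Wv_up         # this head's proposed change

X_out = X + delta   # refined embeddings leaving the multi-headed attention block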

 

31:26 - E. G.

I have a question. Dr. M. M., you may be able to help me more understand this. This is all occurring in parallel, but if each word in context supplies information to the other heads, how can they all operate in parallel? They would have to operate in sequence, so that way it can provide the information to the other heads.

 

32:01 - M. M.

If I understand the question: it is parallel, but this word participates many times in different matrices, the same word. So, obviously, if the word participates many times, this will give some idea about how important this word is. It's working in parallel, but all of them are participating many times. It's not only one time.

 

32:36 - E. G.

But in his analogy, the Harry Potter and the Prince Harry, the context of the terms provided valuable information to the other heads.

 

32:49 - Multiple Speakers

Sorry.

 

32:50 - E. G.

How would that parallel functionality occur when the context of some previous terms provided detail for subsequent heads?

 

33:03 - M. M.

But remember that we already have embeddings that already cluster the sentiment. Remember that this attention comes after that. OK.

 

33:20 - E. G.

I don't think it's coming before.

 

33:23 - M. M.

It's coming after; you already cluster initially. The vector space is already created. OK, OK, that makes sense at that point.

 

33:35 - E. G.

Yeah, if the vector space is created... Yeah, and all they're going through is basically all of the vectors that show up with the context provided, and it's going through giving a weighting at that point.

 

33:56 - M. M.

Exactly. So remember, the first step is embedding. So we already have this similarity of vectors.

 

34:03 - E. G.

Now it makes sense.

 

34:05 - Unidentified Speaker

Yeah.

 

34:06 - M. M.

Now we are actually just comparing with the query, for his question, who is the prince, or just comparing the question with the answer. But the embedding is done, the clustering is done.

 

34:23 - E. G.

So at that point, you'd have things like Harry Potter, Prince William, World War One. So you could basically have that context already done. Then at that point, it goes through the multithreading in parallel to figure out, oh, World War One, you mean?

 

34:47 - M. M.

This prince. Exactly. Embedding is done. Don't worry.

 

34:52 - Unidentified Speaker

OK.

 

34:53 - Multiple Speakers

Yeah. OK.

 

34:54 - D. B.

By the way, I increased my own sound volume by going to the Zoom microphone icon and pulling up the audio settings at the bottom. And that fixed it. I'm sharing my screen. I don't think you can see.

 

35:17 - Unidentified Speaker

Can you see the pop down menu with the highlighted audio settings?

 

35:22 - D. B.

No. No.

 

35:23 - Unidentified Speaker

OK.

 

35:23 - D. D.

Well, anyway, if you go to the microphone, it made a difference. It made a difference when I fixed it? Yeah. Yeah. It's been running really good.

 

35:35 - Unidentified Speaker

OK.

 

35:35 - D. B.

Well, that was kind of a maze. Oh, OK. I'm thinking we got another three minutes in this video, but we might want to backtrack because I got kind of lost at the beginning, plus the sound wasn't good. What do you all want to do? You want to do the last three minutes or go back to where we started? Any opinions? We may want to run through the whole video again at some point. We can do that. We'll continue here. Setting aside all the technical nuances, in the preview from the last chapter, we saw how data flowing through a transformer doesn't just flow through a single attention block. For one thing, it also goes through these other operations called multilayer perceptrons. We'll talk more about those in the next chapter, and then it repeatedly goes through many, many copies of both of these operations. What this means is that after a given word imbibes some of its context, there are many more chances for this more nuanced embedding to be influenced by its more nuanced surroundings. The further down the network you go, with each embedding taking in more and more meaning from all the other embeddings, which themselves are getting more and more nuanced, the hope is that there's the capacity to encode higher-level and more abstract ideas about a given input beyond just descriptors and grammatical structure. So let's see, what was the key term he kept talking about? Embeddings. I forgot what an embedding was exactly. Creating this vector.

 

37:20 - Multiple Speakers

And the first step before all of this attention, they have to create these vectors.

 

37:28 - M. M.

And the embedding: not only do they have the word embedding, they also have positional embedding, the position of the word in the sentence. Not only do they have a vector for the word embedding, but they also have the position of the word in the sentence. But creating this vector is the embedding.
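
A tiny sketch of what M. M. is describing: before any attention, each token is turned into a vector by adding a learned word embedding and a learned positional embedding. The vocabulary, dimensions, and random lookup tables here are made up purely for illustration.

import numpy as np

rng = np.random.default_rng(2)
vocab = {"two": 0, "roads": 1, "diverged": 2, "one": 3, "less": 4, "traveled": 5}
d_model, max_len = 8, 16

word_embed = rng.normal(size=(len(vocab), d_model))   # stand-in for learned token embeddings
pos_embed = rng.normal(size=(max_len, d_model))       # stand-in for learned positional embeddings

tokens = ["one", "less", "traveled"]
ids = [vocab[t] for t in tokens]

# the vector handed to the attention layers: word embedding plus position embedding
X = np.stack([word_embed[i] + pos_embed[p] for p, i in enumerate(ids)])
print(X.shape)   # (3, 8): one d_model-dimensional vector per token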

 

37:50 - D. B.

So what's shown in this graph here on the screen?

 

37:55 - M. M.

How the sentence can change the position. So, for example, "one less traveled by," or whatever: they use the vectors that represent the words and create a new position.

 

38:13 - E. G.

So, in this case, "one" was the term and it looked at the context around it, "less traveled." So "less traveled" means road, movement, and "took" is in the phrase, so it's symbolizing choice. So "one" at this point isn't a number, it's symbolizing a choice.

 

38:40 - D. B.

OK, so that's how the context is modifying the word one to mean not a number but a choice about a number.

 

38:53 - M. M.

Another vector, of "two roads," influences this one, and when you do the operations with these vectors, you receive something like the output that they show.

 

39:09 - D. B.

Different, different.

 

39:10 - M. M.

Like they showed you with the king and queen, you know, the vector representation of the words.

 

39:21 - D. B.

From way back in a previous video.

 

39:25 - Unidentified Speaker

Yeah.

 

39:25 - M. M.

All right.

 

39:26 - D. B.

Any other questions, or anything anyone wants to talk about? All right, we'll continue. Things like sentiment and tone and whether it's a poem and what underlying scientific truths are relevant to the piece and things like that. Turning back one more time to our scorekeeping, GPT-3 includes 96 distinct layers, so the total number of key, query, and value parameters is multiplied by another 96, which brings the total sum to just under 58 billion distinct parameters devoted to all of the attention heads. Any comments on that? I understand that people are talking about how scaling is running out of steam, and 57 trillion, you know, might not do much better than 57 billion. And DeepSeek, the recent Chinese AI, is supposedly much more efficient and lean compared to the American ones. And it could be that AI algorithm development is going to lead to lower computation rather than higher scaling, like DeepSeek.
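
A quick back-of-the-envelope check of the tally quoted from the video, using the GPT-3 figures the series cites (embedding dimension 12,288 and key/query/value head size 128, with 96 heads per block and 96 layers):

d_model, d_head = 12_288, 128      # GPT-3 embedding size and per-head key/query space size
n_heads, n_layers = 96, 96

per_head = 4 * d_model * d_head    # key, query, value-down, and value-up matrices
per_block = per_head * n_heads
total = per_block * n_layers

print(f"{per_head:,} per head")       # 6,291,456: about 6.3 million
print(f"{per_block:,} per block")     # 603,979,776: around 600 million
print(f"{total:,} attention total")   # 57,982,058,496: just under 58 billion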

 

40:50 - D. D.

Yeah, that's a big if, if DeepSeek didn't just pull the information right off of ChatGPT4.

 

41:02 - D. B.

Well, it may have leveraged ChatGPT4, but that's OK, in a sense, because it's doing what it's doing with less computation.

 

41:17 - D. D.

Well, but it's riding on the shirttails of the millions and millions of dollars and the hours and hours of computation that ChatGPT-4 has already done.

 

41:31 - M. M.

So in essence, it's really, it's really extra.

 

41:35 - D. D.

It's just a little bit more; it's marginally less than ChatGPT-4. It's not as good.

 

41:45 - Multiple Speakers

I wouldn't even say it's marginally less. I liken it to basically copying off the test sheet of the guy next to you.

 

41:57 - E. G.

No, that was what I understood: how DeepSeek was able to get there so quickly with such minimal hardware is that they basically pulled and reverse-engineered what ChatGPT had done. Yeah, that's it.

 

42:15 - D. B.

Yeah, I just, you know, before we sort of take this as a way to vilify DeepSeek, I'd like to point out that ChatGPT is cribbing off all the content on the web that people have laboriously created by hand and are not compensated for. Well, that's different.

 

42:37 - Multiple Speakers

It's true, but it's different.

 

42:40 - E. G.

They actually used that to build the vectors. So they actually took that information and computationally built out all of the vectors, all of the layers, all of the perceptrons, and then got the answers. Then DeepSeek says, oh, so those are the vectors and the answers; you don't need to calculate.

 

43:08 - D. D.

If they hadn't gone out there and gotten all the stuff off the web, then ChatGPT would be, you know, good at maybe whatever free stories they could have gotten. You wouldn't be able to go and get coding and expert opinion. You wouldn't have a whole range of different expert chats that you could do. So without all that data, then, what is it, garbage in, garbage out. If you don't have some quality data to put in, we wouldn't have a large language model that was worth using.

 

43:56 - Unidentified Speaker

All right, we'll continue.

 

43:59 - D. B.

That is a lot, to be sure, but it's only about a third of the 175 billion that are in the network in total. So even though attention gets all of the attention, the majority of parameters come from the blocks sitting in between these steps. In the next chapter, you and I will talk more about those other blocks, and also a lot more about the training process. A big part of the story for the success of the attention mechanism is not so much any specific kind of behavior that it enables, but the fact that it's parallelizable, meaning that you can run a huge number of computations in a short time using GPUs. Given that one of the big lessons about deep learning in the last decade or two has been that scale alone seems to give huge qualitative improvements in model performance, there's a huge advantage to parallelizable architectures that let you do this. Any comments?

 

44:47 - D. D.

Well, is that kind of what we were just talking about, that there's kind of this tipping point where you scale up so far that getting more data doesn't seem to help? And I think that's more of a training-method problem. I think logically more data should help; it's just, you know, they have to work with it. And I think a lot of the expense is experimentation. You know, when they go, oh, well, we tried this and now it's not as good as it was before, and there goes, you know, 250,000. And then they're back in there, you know, trying a different way so that they can scale it. I think they'll be able to train off the top, and, you know, I still think we're in our infancy

 

45:41 - E. G.

here. What we're doing today is not what we're going to be doing tomorrow. The parallelization is a lot of times brute force. I mean, it's what we used to do to crack encodings, ciphers. And to me, this is nothing more than trying to crack a cipher: how to find the relationship between what I've requested and what I want.

 

46:13 - D. D.

And yeah, the relationship between the training and the data size... Yeah, that's right. It's there. Yeah. And I agree. I think in five years, when we come back and look at this, we're going to go, oh, wow, can you believe that? You know, it'll be a whole lot... And they're already making the transformer better.

 

46:40 - Unidentified Speaker

Yeah.

 

46:41 - E. G.

They're already figuring out how to do things. I mean, look at the transformers of a year ago, three years ago.

 

46:55 - D. B.

All right, let's finish up. If you want to learn more about this stuff, I've left lots of links in the description. In particular, anything produced by Andrej Karpathy or Chris Olah tends to be pure gold. In this video, I wanted to just jump into attention in its current form, but if you're curious about more of the history for how we got here and how you might reinvent this idea for yourself, my friend Vivek just put up a couple of videos giving a lot more of that motivation. Also, Brit Cruise from the channel The Art of the Problem has a really nice video about the history of large language models. I think we should go back. Redo this one before we go on to the next? Well, I mean go back to the videos that they were showing.

 

47:38 - D. D.

Oh, yeah. Build ChatGPT from scratch. I think we should vote on almost every video that he suggested.

 

47:46 - D. B.

OK. I can make a mention of it. I like watching it.

 

47:51 - D. D.

Again, because every time I watch it, it seems like I get a little bit better understanding. I fully agree. Yeah, so if you want to go back and watch it again, absolutely. I feel like this one

 

48:07 - D. B.

I could definitely get more out of if I watch it again. Any other opinions on that question? Because if not, I would just say we decided to run through this one again, especially since I kind of forgot the first 19 minutes, since it's been several weeks. All right, so I say- I'm up next week, right?

 

48:41 - D. D.

Yes. Okay. Looking forward to your presentation, D. Yeah, I think so. I don't know exactly how long my presentation for my paper will be, but it's probably like three minutes or something. It'll be very short. So I'm gonna build a lengthy presentation and then cut it down and use it, I guess, in three or four months, when I present my paper.

 

49:14 - D. B.

In the conference; he's been accepted at the conference.

 

49:17 - M. M.

Oh, OK.

 

49:18 - D. B.

So you're going to do three minutes next week?

 

49:21 - Multiple Speakers

Three minutes? No.

 

49:23 - D. D.

I'm going to build a nice, robust presentation, and then I'll chop it down to three minutes later this summer. Oh, later, OK. Yeah, yeah. Why three minutes?

 

49:35 - D. B.

Shouldn't I shoot for about 30 minutes?

 

49:38 - D. D.

That's what I was thinking.

 

49:41 - D. B.

I don't have any, I mean, I just, you know, you want to get your message across. That's all, whether it takes five minutes or 35 minutes is fine with me.

 

49:55 - Multiple Speakers

Okay.

 

49:56 - D. B.

Well, I'll shoot for 20 minutes, then.

 

50:00 - D. D.

Okay. And you want to get some feedback too, right?

 

50:05 - D. B.

Right. Sure. All right. Well, uh, I guess.

 

50:12 - M. M.

E. T., I sent you the link and the code.

 

50:24 - D. D.

I buy prompt engineering courses on Udemy. They vary. It kind of depends on what you're after. But I mean, there's a lot of free stuff out there on the internet. I would just sit down one night and just say, hey, I'm going to work on, I'm going to learn about prompt engineering for two hours. And I would just read all the free stuff that you could get, just treat it as a brainstorming exercise.

 

51:02 - M. M.

Yeah, YouTube. YouTube has a lot of stuff.

 

51:07 - Multiple Speakers

This is YouTube.

 

51:09 - D. B.

Yeah, it's correct.

 

51:11 - D. D.

Or ask him to teach you, and let me tell you, that man can make a robust prompt. Hands down, yeah.

 

51:22 - E. G.

Well, he uses a prompt to make a prompt to make a prompt.

 

51:30 - D. D.

Yeah. That's a whole other level right there. But Doctor Warren V. W., you may... I had him in classes. Or, you know, we were both students with him, but you might have met him.

 

52:15 - E. G.

I don't know. But one of the things that I found really useful in prompt engineering is

 

52:24 - T. E.

Maintaining a context.

 

52:25 - E. G.

Because I go through and I elicit information, and I maintain a context, a history. It's almost like a search history. So as that context continues, it's refining its answers based on answers it's already given or prompts that you originally gave it.
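
A minimal sketch of the maintain-a-context pattern E. G. describes, keeping a running message history so each new answer is conditioned on the earlier prompts and answers. It assumes the OpenAI Python client; the model name and the example questions are placeholders.

# Sketch of maintaining a running context ("history") across prompts.
# Assumes the openai package; model name and questions are placeholders.
from openai import OpenAI

client = OpenAI()

# The running history acts like the "search history" mentioned above:
# every new request carries the earlier prompts and answers along with it.
messages = [{"role": "system", "content": "You are helping plan a book about growing vegetables."}]

def ask(question: str) -> str:
    messages.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)  # placeholder model
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})  # keep the context growing
    return answer

ask("Draft a one-paragraph introduction about growing vegetables from seed.")
ask("Now refine that paragraph for complete beginners.")  # refined using the earlier answer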

 

52:52 - D. D.

You're writing a book, right? Yes.

 

52:55 - E. T.

And so you have an outline of what you want the book to be, right?

 

53:02 - Unidentified Speaker

Yes.

 

53:03 - E. T.

Did you get the AI to help you write the outline?

 

53:08 - Multiple Speakers

Yes.

 

53:09 - Unidentified Speaker

OK.

 

53:09 - Multiple Speakers

Is it not possible just to go through and get the AI to write each section, and then maybe go back? And so, like, let's say you get a summary of the previous section and use that as you build

 

53:28 - E. T.

to the next section so you don't filter out information, you know, so it can kind of build on what's been happening.

 

53:36 - Multiple Speakers

That's an idea. I haven't tried to write a book, so you probably know more about it than I do at this point.

 

53:45 - E. T.

So here's an exercise we can try.

 

53:48 - Multiple Speakers

We can ask ChatGPT to tutor us in prompt engineering using the Socratic method. I guess the question is, how much does ChatGPT know about prompt engineering? And I guess it would know, in a sense, whatever it can digest from all the stuff that's been written about it on the web. Yeah, I just don't know how current it is.

 

54:21 - E. T.

They still have that hard cutoff date, don't they?

 

54:26 - Multiple Speakers

I don't know.

 

54:27 - E. T.

For GPT, another model might be better: Gemini, because Gemini supposedly is integrated with the Google search engine, which tries to keep up to date. Retrieval-augmented generation. All right, folks.

 

54:44 - Multiple Speakers

Thanks for joining in and we'll see you same time, same place next week.

 

54:50 - E. T.

Thanks, guys.

 

54:51 - Multiple Speakers

Thank you. Thank you. Wonderful. Take care.

 

54:54 - E. T.

Take care.

 

