Machine Learning Study Group
Welcome! We meet from 4:00-4:45 p.m. CT. Anyone can join. Feel free to attend any or all sessions, or ask to be removed from the invite list; we have no wish to send unneeded emails, of which we all certainly get too many.
Contacts: jdberleant@ualr.edu and mgmilanova@ualr.edu
Agenda & Minutes
95th Meeting, Jan. 5, 2024
- Today
- Announcements, questions, etc.?
- Readings: We are reading
- Sparks of Artificial General Intelligence: Early experiments with GPT-4 (https://arxiv.org/abs/2303.12712).
- We already examined Figures 1.1 and 1.2. Next we will continue with the partial paragraph at the bottom of p. 4.
- We read chapter 1 part 4: https://huggingface.co/learn/nlp-course/chapter1/4 up to "Unfortunately, training a model, especially a large one, requires a large amount of data."
- Here is the chat record for today:
- VW to Everyone 4:05 PM
“Orchestrating RAG: Retrieval, Canopy, & Pinecone,” on Jan 9 at 12:00pm ET via Zoom. Register now.
Hi L V,
Join us for a technical deep dive into Pinecone's open-source RAG framework: Canopy. With Canopy, you can build and launch production GenAI apps quickly and easily. By taking care of chunking, vectorization, LLM orchestration, and prompt engineering, Canopy abstracts away the heavy lifting of building RAG pipelines, leaving you with the energy to focus on what's important: building your end product. And since it's completely open source, you can extend and customize Canopy to meet any use case.
We will end with a live demo of Canopy, building a RAG pipeline around an excerpt of Arxiv.org articles from our HuggingFace datasets.
There will be time for plenty of questions, so read up on Canopy and come prepared to have a great time!
https://bit.ly/4aIO8uV
VW to Everyone 4:19 PM
https://techcrunch.com/2024/01/05/5-steps-to-ensure-startups-successfully-deploy-llms
I posit this is based on the fallacy of the ma and pa LLM.
Auto-regressive Models:
Definition: Auto-regressive models are a type of neural network architecture used primarily for sequence generation tasks. They predict the next element in a sequence based on the previous elements.
Key Characteristics:
Sequential Prediction: They operate in a sequential manner, predicting one part of the output at a time.
Conditional Probability: Each output element's prediction is conditioned on the previously generated elements.
Use Cases: Common in language modeling (like GPT series), where each word is predicted based on the preceding text.
Example: GPT (Generative Pre-trained Transformer) models are auto-regressive in nature, generating text one word at a time based on all the previous words.
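Below is a minimal sketch of auto-regressive decoding, assuming the Hugging Face transformers library with GPT-2 ("gpt2") as an illustrative model and simple greedy selection; it shows how each new token is predicted conditioned on all tokens generated so far.

```python
# Minimal auto-regressive generation sketch (illustrative model and settings).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "Machine learning study groups are"
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Generate 10 tokens one at a time: each prediction is conditioned
# on everything generated so far.
for _ in range(10):
    with torch.no_grad():
        logits = model(input_ids).logits           # (1, seq_len, vocab_size)
    next_id = logits[:, -1, :].argmax(dim=-1)      # greedy pick of the next token
    input_ids = torch.cat([input_ids, next_id.unsqueeze(0)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

In practice one would usually call model.generate() with a sampling strategy; the explicit loop above just makes the token-by-token conditioning visible.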
VW to Everyone 4:23 PM
Auto-encoders:
Definition: Auto-encoders are unsupervised learning models used primarily for data compression and feature learning.
Key Characteristics:
Encoder-Decoder Structure: They consist of two main parts - an encoder that compresses the input data into a latent-space representation, and a decoder that reconstructs the input data from this representation.
Dimensionality Reduction: Often used for reducing the dimensionality of data while preserving its essential features.
Use Cases: Common in image processing, anomaly detection, and sometimes in natural language processing for representation learning.
Example: Variational Auto-encoders (VAEs) are a type of auto-encoder often used in generative tasks.
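A minimal auto-encoder sketch in PyTorch follows, assuming 784-dimensional inputs (e.g., flattened 28x28 images) compressed to a 32-dimensional latent code; the layer sizes and the random stand-in batch are illustrative choices, not anything from the meeting notes.

```python
# Minimal auto-encoder sketch: encoder compresses, decoder reconstructs.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),               # latent-space representation
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),  # reconstruct the input
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(64, 784)          # stand-in batch of flattened "images"
for _ in range(5):               # a few unsupervised training steps
    recon = model(x)
    loss = loss_fn(recon, x)     # reconstruction error; no labels needed
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"reconstruction loss: {loss.item():.4f}")
```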
Sequence-to-Sequence (Seq2Seq) Models:
Definition: Sequence-to-sequence models are a type of neural network architecture designed for transforming one sequence into another, commonly used in tasks where both the input and output are sequences.
Key Characteristics:
Encoder-Decoder Architecture: Like auto-encoders, seq2seq models have an encoder and a decoder, but here the decoder generates a new output sequence rather than reconstructing the input.
Alignment and Translation: They can align parts of the input sequence with the output sequence, making them ideal for tasks like machine translation.
Use Cases: Prominent in machine translation, speech recognition, and text summarization.
Example: The Transformer model, when used in a seq2seq (encoder-decoder) configuration, is common in machine translation tasks; the original Transformer paper, "Attention Is All You Need," introduced it for exactly that purpose. (Google's BERT, by contrast, uses only the encoder.)
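A minimal sketch of an encoder-decoder (seq2seq) Transformer applied to machine translation follows, assuming the Hugging Face pipeline API with "t5-small" as an illustrative model choice.

```python
# Minimal seq2seq translation sketch (illustrative model choice).
from transformers import pipeline

translator = pipeline("translation_en_to_fr", model="t5-small")

# The encoder reads the whole English input sequence; the decoder then
# generates the French output sequence token by token.
result = translator("The study group meets every Friday afternoon.")
print(result[0]["translation_text"])
```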