

LangChain for Generative AI Pipelines

Code and slides to accompany the online webinar series by Data For Science: https://data4sci.com/langchain

LangChain is a widely used framework for developing Large Language Model (LLM) based applications. It provides a wide range of Lego-like components that streamline the integration of LLM functionality into working pipelines without requiring in-depth expertise in ML.

In this course, you will get an in-depth view of the structure of LangChain and its various components, and learn how to apply these components to Information Retrieval and the development of chatbots. The course also provides an overview of the pros and cons of LLMs from OpenAI, HuggingFace, and Anthropic, as well as a primer on Prompt Engineering, so that you can make the best possible use of the capabilities that LangChain puts at your fingertips.

Schedule

1. Generative AI

  • Overview of Generative Models
  • Comparison of GPT to other LLMs
  • Text to Image Models

2. LangChain

  • LangChain structure
  • Understanding Chains
  • Exploring Agents
  • Using tools to interact with the world

3. Information Processing

  • Understanding Text Summarization
  • Information Extraction Applications
  • Developing a Question Answering System

4. Chatbots

  • Information Retrieval and Vectors
  • Retrievers in LangChain
  • Implementing a simple Chatbot
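To make the idea of vector-based retrieval concrete, here is a minimal sketch in plain Python. It is not the LangChain retriever API and it is not taken from the notebooks: the `embed`, `cosine`, and `retrieve` helpers are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


docs = [
    "LangChain chains compose prompts and models",
    "Vector stores index document embeddings",
    "Northwind is a sample sales database",
]
top = retrieve("how do vector stores index embeddings", docs, k=1)
print(top)
```

A chatbot built on retrieval would pass the retrieved text to the LLM as context; a real pipeline would replace `embed` with a learned embedding model and the sorted list with a vector store.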

5. Prompt Engineering

  • Overview of Prompt Engineering Techniques
  • Comparison of Zero-Shot and Few-Shot Prompting
  • Understanding Chain of Thought prompts
  • Developing Tree of Thought prompts
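The difference between these prompting styles is easiest to see in the prompt text itself. The sketch below builds the three kinds of prompt as plain strings; the function names and the `Input:`/`Output:` layout are illustrative assumptions, not a format prescribed by the course or by LangChain.

```python
def zero_shot_prompt(task: str, query: str) -> str:
    """Zero-shot: the instruction and the query, with no worked examples."""
    return f"{task}\n\nInput: {query}\nOutput:"


def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Few-shot: the same instruction preceded by worked input/output pairs."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{shots}\n\nInput: {query}\nOutput:"


def chain_of_thought_prompt(task: str, query: str) -> str:
    """Chain of thought: ask the model to reason before answering."""
    return f"{task}\n\nInput: {query}\nLet's think step by step."


task = "Classify the sentiment of the input as positive or negative."
examples = [("I loved this webinar", "positive"), ("The audio kept cutting out", "negative")]

zero = zero_shot_prompt(task, "The slides were very clear")
few = few_shot_prompt(task, examples, "The slides were very clear")
cot = chain_of_thought_prompt(task, "The slides were very clear")
print(few)
```

Each string would then be sent to whichever LLM is in use; only the prompt changes, not the model call.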

6. LangGraph

  • Chains vs Graphs
  • State
  • Cycles
  • Tool Calling Agents
  • Streaming
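As a rough illustration of how a graph differs from a linear chain, the toy executor below keeps a shared state, lets a router send control back to an earlier node (a cycle), and yields each intermediate state as it is produced (streaming). It is plain Python written for this README, not the LangGraph API; `stream_graph`, `increment`, `route`, and the `END` sentinel are hypothetical names.

```python
from typing import Callable, Iterator

State = dict
Node = Callable[[State], State]

END = "END"  # sentinel node name that stops execution (assumption for this sketch)


def increment(state: State) -> State:
    """A node: returns a new state with the counter advanced by one."""
    return {**state, "count": state["count"] + 1}


def route(state: State) -> str:
    """A router: loop back to 'increment' until the target is reached."""
    return "increment" if state["count"] < state["target"] else END


def stream_graph(
    nodes: dict[str, Node],
    router: Callable[[State], str],
    start: str,
    state: State,
    max_steps: int = 100,
) -> Iterator[tuple[str, State]]:
    """Run nodes until the router returns END, yielding (node name, state) per step."""
    current = start
    for _ in range(max_steps):
        state = nodes[current](state)
        yield current, state
        current = router(state)
        if current == END:
            return
    raise RuntimeError("step limit reached before the graph finished")


steps = list(stream_graph({"increment": increment}, route, "increment", {"count": 0, "target": 3}))
for name, snapshot in steps:
    print(name, snapshot)
```

A chain would run each step exactly once in a fixed order; here the number of steps depends on the state, which is the property that agent loops rely on.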

Repository Contents

Notebooks (webinar progression)

  • 1. Generative Models.ipynb - direct OpenAI usage, HuggingFace pipelines, and model basics
  • 2. LangChain.ipynb - LCEL chains, prompts, tools, SQL, and message history
  • 3. Information Processing.ipynb - summarization, extraction, and text splitting workflows
  • 4. ChatBots.ipynb - retrieval, vector stores, and chatbot pipeline construction
  • 5. Prompt Engineering.ipynb - zero-shot, few-shot, and chain-of-thought prompting
  • 6. LangGraph.ipynb - graph-based agents, state, cycles, and streaming

Slides

  • slides/LangChain.pdf

Data and assets

  • data/Northwind_small.sqlite
  • data/trump.csv
  • data/pg43548-h.zip
  • images/ (generated output images)
  • d4sci.mplstyle (custom notebook plotting style)

Environment Setup

Requirements

  • Python >=3.13
  • uv (recommended) or pip

Install dependencies (uv)

uv sync

Install dependencies (pip fallback)

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run notebooks (from project root)

jupyter lab

Core Dependencies

Key libraries used in this repo include:

  • langchain, langchain-core, langchain-community
  • langchain-openai, langchain-anthropic, langchain-huggingface
  • langgraph
  • chromadb, sentence-transformers
  • transformers, torch
  • jupyter, pandas, matplotlib
  • duckduckgo-search

Author

Bruno Gonçalves

Data For Science, Inc.

Web: www.data4sci.com
Twitter/X: @bgoncalves
LinkedIn: @bmtgoncalves
Email: info@data4sci.com
Schedule a Call: https://data4sci.com/call