ConfBot

ConfBot crawls conference paper metadata, optionally generates keywords with an LLM, and visualizes the collected data in a Streamlit dashboard.

The project now uses uv for dependency management and Playwright for browser automation.

Features

Crawl paper titles, authors, and abstracts from conference program pages
Save normalized metadata into a CSV file
Generate keywords with an LLM-based pipeline
Explore results in a Streamlit analysis dashboard
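As a rough illustration of the "normalized metadata into a CSV file" step, the sketch below collapses stray whitespace and joins authors into a single column before writing. The column names, the "; " author separator, and the helper names normalize and write_csv are assumptions made for this example; the actual schema used by crawler.py may differ.

```python
import csv
import io

# Hypothetical column layout; the real CSV schema may differ.
FIELDS = ["title", "authors", "abstract"]


def normalize(record):
    """Collapse runs of whitespace and join authors into one string."""
    def squash(text):
        return " ".join(str(text).split())

    authors = record.get("authors", [])
    if isinstance(authors, str):
        authors = [authors]
    return {
        "title": squash(record.get("title", "")),
        "authors": "; ".join(squash(a) for a in authors),
        "abstract": squash(record.get("abstract", "")),
    }


def write_csv(records, handle):
    """Write normalized records to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        writer.writerow(normalize(record))


# Usage: write to an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_csv(
    [{"title": "  A  Paper\n", "authors": ["X  Y", "Z"], "abstract": "text"}],
    buffer,
)
print(buffer.getvalue())
```

Writing to a file instead would only need open("papers.csv", "w", newline="", encoding="utf-8") in place of the buffer; the file name here is also just a placeholder.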

Requirements

Python 3.11+
uv

Setup

uv sync
uv run playwright install chromium

If keyword generation depends on environment variables, create a .env file before running the keyword pipeline.

Example variables:

API_KEY
BASE_URL
MODEL
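A minimal sketch of how these variables could be checked before starting the keyword pipeline. It parses KEY=VALUE lines with the standard library only and reports which of the three names are unset. The helper names parse_env_lines and missing_settings and the sample values are hypothetical, and the project itself may load .env differently (for example, through a dedicated library).

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env, required=("API_KEY", "BASE_URL", "MODEL")):
    """Return the required names that are absent or empty."""
    return [name for name in required if not env.get(name)]


# Usage with placeholder values; a real .env would be read from disk.
sample = [
    "# LLM settings",
    "API_KEY=sk-placeholder",
    "BASE_URL=https://example.invalid/v1",
    "",
    "MODEL=",
]
settings = parse_env_lines(sample)
print(missing_settings(settings))  # prints ['MODEL']
```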

Usage

Run the crawler and keyword pipeline:

uv run python main.py

Run only the crawler for a specific track:

uv run python main.py --urls "https://conf.researchr.org/track/fse-2025/fse-2025-research-papers" --no-keyword
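For readers curious how flags like these are typically wired up, here is a sketch using argparse. Only the flag names --urls and --no-keyword come from the command above; accepting several URLs, the help strings, and the build_parser helper are assumptions, and the real main.py may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser for the two flags shown in the usage example."""
    parser = argparse.ArgumentParser(
        description="Crawl conference tracks and optionally generate keywords."
    )
    # Assumption: one or more track URLs may be passed after --urls.
    parser.add_argument("--urls", nargs="+", default=[],
                        help="track pages to crawl")
    # store_true makes the flag default to False when it is omitted.
    parser.add_argument("--no-keyword", action="store_true",
                        help="skip the keyword generation step")
    return parser


# Usage: parse the same arguments as the command above.
args = build_parser().parse_args([
    "--urls",
    "https://conf.researchr.org/track/fse-2025/fse-2025-research-papers",
    "--no-keyword",
])
print(args.urls, args.no_keyword)
```

argparse converts the dash in --no-keyword to an underscore, so the value is read as args.no_keyword.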

Launch the dashboard:

uv run streamlit run app.py

Validation

Byte-compile the main modules to catch syntax errors:

uv run python -m py_compile crawler.py test_playwright.py main.py

Run the Playwright smoke test:

uv run python test_playwright.py
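A smoke test of this kind usually just confirms that Playwright can start a browser and load a trivial page. The sketch below shows one way to do that with Playwright's synchronous API; the function name chromium_launches and the inline HTML are illustrative, and the contents of test_playwright.py are not shown here, so this is not a copy of it.

```python
def chromium_launches() -> bool:
    """Return True if Playwright can start headless Chromium and read a page title."""
    try:
        from playwright.sync_api import sync_playwright
    except ImportError:
        # Playwright is not installed in this environment.
        return False
    try:
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            # Load inline HTML so the check needs no network access.
            page.set_content("<title>ok</title>")
            title = page.title()
            browser.close()
        return title == "ok"
    except Exception:
        # Typically a missing browser executable or a sandbox restriction.
        return False


print("Chromium OK" if chromium_launches() else "Chromium unavailable")
```

Returning False instead of raising keeps the check usable in scripts; the original error message is more helpful when diagnosing a failed install, so a real test may prefer to let it propagate.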

If Playwright reports that the browser executable is missing, run:

uv run playwright install chromium

Project Files

crawler.py - Playwright-based crawler
main.py - CLI entrypoint for crawling and keyword generation
genkw.py - keyword generation logic
analysis.py - analysis helpers
app.py - Streamlit dashboard

Disclaimer

This project is intended strictly for academic research and educational use. You are responsible for complying with the target website's Terms of Service and robots.txt policy. The developer is not liable for any misuse or legal issues caused by using this code.
