Twitter(X) Profile Bio ICP Classifier - Bio Keywords Extractor

Analyze Twitter (X) profile bios to classify users automatically by keyword matching and intent signals. This project turns unstructured bio text into actionable audience segments, which makes bio keyword classification practical at scale. It is aimed at teams building ideal customer profiles (ICPs), researching audiences, or qualifying leads.

Created by Bitbash, built to showcase our approach to Scraping and Automation!

Introduction

This project analyzes public Twitter (X) profile bios and classifies profiles by matching bio text against customizable keyword groups. It solves the problem of manually reviewing bios by turning free-text descriptions into structured, searchable signals. It is designed for marketers, founders, analysts, and community managers who need fast and consistent audience segmentation.

Bio-Driven Audience Classification

Parses profile bios from structured profile datasets
Matches bios against configurable keyword categories
Assigns one or more matched keywords per profile
Flags profiles with no matches as Unclassified
Outputs clean, structured results ready for analysis
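
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names `classify_bio` and `classify_profiles`, the case-insensitive substring match, and the choice to store the `Unclassified` label inside `matched_keywords` are all assumptions. The first two bios come from the example output in this README; the third profile is made up.

```python
from typing import Iterable

UNCLASSIFIED = "Unclassified"  # assumed label for profiles with no match


def classify_bio(bio: str, keywords: Iterable[str]) -> list[str]:
    """Return every keyword found in the bio, in keyword-list order."""
    text = bio.casefold()
    matches = [kw for kw in keywords if kw.casefold() in text]
    return matches or [UNCLASSIFIED]


def classify_profiles(profiles: list[dict], keywords: list[str]) -> list[dict]:
    """Attach a matched_keywords list to a copy of each profile record."""
    return [
        {**profile, "matched_keywords": classify_bio(profile.get("bio") or "", keywords)}
        for profile in profiles
    ]


profiles = [
    {"name": "Spencer Walden", "bio": "I build things: Building software to help founders"},
    {"name": "Yash Desai", "bio": "Founder @ShiprDev | Simplifying SaaS Development"},
    {"name": "Example User", "bio": "Coffee, hiking, and photography"},  # hypothetical profile
]

for row in classify_profiles(profiles, ["saas", "founder"]):
    print(row["name"], row["matched_keywords"])
```

Substring matching is what lets `founder` match `founders` in the first bio, but it can also match inside unrelated words, so keyword lists should be chosen with that in mind.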

Features

Feature	Description
Keyword Group Matching	Match bios against multiple keyword categories such as SaaS, Marketing, or Developers.
Multi-Label Classification	Assign more than one keyword when a bio matches multiple categories.
Custom Taxonomies	Define and adjust keyword groups to fit any ICP or audience model.
Structured Output	Produces normalized JSON records for easy downstream processing.
Noise Reduction	Filters irrelevant bios by clearly marking unclassified profiles.

What Data This Scraper Extracts

Field Name	Field Description
name	Display name of the Twitter (X) profile.
bio	Raw biography text written by the user.
followers_count	Total number of followers for the profile.
following_count	Total number of accounts the profile follows.
profile_url	Direct URL to the Twitter (X) profile.
matched_keywords	List of keywords detected in the bio text.
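
For downstream code, the fields in the table can be written down as a typed record. The sketch below is one possible way to do that; `ProfileRecord` and `missing_fields` are hypothetical names, and the Python types are inferred from the example output rather than taken from the project.

```python
from typing import TypedDict


class ProfileRecord(TypedDict):
    """One classified profile, mirroring the documented output fields."""
    name: str
    bio: str
    followers_count: int
    following_count: int
    profile_url: str
    matched_keywords: list[str]


def missing_fields(record: dict) -> list[str]:
    """List the documented fields that are absent from a record."""
    return [field for field in ProfileRecord.__annotations__ if field not in record]


# A partial record, to show what the check reports.
print(missing_fields({"name": "Spencer Walden", "bio": "I build things"}))
```

A check like this is useful before analysis, since a record missing `matched_keywords` would otherwise fail later and less clearly.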

Example Output

[
  {
    "name": "Spencer Walden",
    "bio": "I build things: Building software to help founders",
    "followers_count": 795,
    "following_count": 378,
    "profile_url": "https://x.com/Swaldy",
    "matched_keywords": [
      "founder"
    ]
  },
  {
    "name": "Yash Desai 🚀 Shipr.Dev",
    "bio": "Founder @ShiprDev | Simplifying SaaS Development",
    "followers_count": 2201,
    "following_count": 2139,
    "profile_url": "https://x.com/yhdesai",
    "matched_keywords": [
      "saas",
      "founder"
    ]
  }
]
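
Because the output is plain JSON, segment counts and filters need only the standard library. The following sketch assumes the output shape shown above; the helper `with_all_keywords` is a hypothetical name, and the records are copied from the example and trimmed to the fields used here.

```python
import json
from collections import Counter

# Records copied from the example output, trimmed to the fields used here.
SAMPLE_OUTPUT = """
[
  {"name": "Spencer Walden", "profile_url": "https://x.com/Swaldy",
   "matched_keywords": ["founder"]},
  {"name": "Yash Desai 🚀 Shipr.Dev", "profile_url": "https://x.com/yhdesai",
   "matched_keywords": ["saas", "founder"]}
]
"""

records = json.loads(SAMPLE_OUTPUT)

# How many profiles carry each keyword.
keyword_counts = Counter(kw for rec in records for kw in rec["matched_keywords"])


def with_all_keywords(rows, required):
    """Keep only the rows whose matched_keywords include every required keyword."""
    wanted = set(required)
    return [row for row in rows if wanted <= set(row["matched_keywords"])]


saas_founders = with_all_keywords(records, ["saas", "founder"])
print(keyword_counts)
print([row["profile_url"] for row in saas_founders])
```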

Directory Structure Tree

Twitter(X) Profile Bio ICP Classifier - Bio Keywords Extractor/
├── src/
│   ├── runner.py
│   ├── classifier/
│   │   ├── keyword_matcher.py
│   │   └── normalizer.py
│   ├── io/
│   │   ├── dataset_loader.py
│   │   └── output_writer.py
│   └── config/
│       └── keywords.example.json
├── data/
│   ├── input.sample.json
│   └── output.sample.json
├── requirements.txt
└── README.md

Use Cases

Marketing teams use it to segment Twitter audiences, so they can run highly targeted outreach campaigns.
Founders use it to identify peers and potential partners, so they can build stronger networks faster.
Growth analysts use it to classify large user datasets, so they can understand niche positioning trends.
Community managers use it to tag members by role, so they can personalize engagement and content.
Lead generation teams use it to qualify prospects, so they focus only on high-intent profiles.

FAQs

Can I customize the keyword categories? Yes. Keyword groups are fully configurable, allowing you to define your own labels and matching terms based on your ICP or market.
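
As an illustration of what a custom taxonomy might look like, the sketch below maps each label to a list of matching terms. The config shape is an assumption; the real `keywords.example.json` may be structured differently, and `load_groups` and `match_groups` are hypothetical helper names.

```python
import json

# Hypothetical config shape: label -> list of terms that trigger it.
CONFIG_TEXT = """
{
  "Founder": ["founder", "co-founder", "ceo"],
  "SaaS": ["saas", "b2b software"],
  "Developers": ["developer", "engineer"]
}
"""


def load_groups(text: str) -> dict[str, list[str]]:
    """Parse the config and lowercase every term for case-insensitive matching."""
    raw = json.loads(text)
    return {label: [term.casefold() for term in terms] for label, terms in raw.items()}


def match_groups(bio: str, groups: dict[str, list[str]]) -> list[str]:
    """Return every label that has at least one term present in the bio."""
    text = bio.casefold()
    return [label for label, terms in groups.items() if any(term in text for term in terms)]


groups = load_groups(CONFIG_TEXT)
print(match_groups("Founder @ShiprDev | Simplifying SaaS Development", groups))
```

Adapting the classifier to a different ICP then means editing the JSON, with no code changes.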

What happens if a bio matches multiple categories? The profile will be assigned all matching keywords, enabling multi-label classification for richer analysis.

How are unclassified profiles handled? Profiles with no keyword matches are clearly labeled as Unclassified, making it easy to filter or review them separately.

Is this suitable for large datasets? Yes. The classification logic is lightweight and designed to scale efficiently across large profile datasets.

Performance Benchmarks and Results

Primary Metric: Processes thousands of profiles per minute with keyword matching latency measured in milliseconds per bio.
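
Throughput depends on hardware, keyword count, and bio length, so it is worth measuring on your own data. The sketch below shows one way to time a simple substring matcher; `measure_throughput` is a hypothetical helper, the bios are synthetic, and the matcher is a stand-in rather than the project's implementation.

```python
import time


def measure_throughput(bios: list[str], keywords: list[str], repeats: int = 5) -> float:
    """Return bios classified per second, using the fastest of several runs."""
    lowered = [kw.casefold() for kw in keywords]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for bio in bios:
            text = bio.casefold()
            _ = [kw for kw in lowered if kw in text]  # the work being timed
        best = min(best, time.perf_counter() - start)
    return len(bios) / best if best > 0 else float("inf")


# Synthetic input: the same bio repeated, for timing only.
synthetic_bios = ["Founder @ShiprDev | Simplifying SaaS Development"] * 10_000
rate = measure_throughput(synthetic_bios, ["saas", "founder", "marketing"])
print(f"{rate:,.0f} bios per second")
```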

Reliability Metric: Consistent classification results with stable matching behavior across repeated runs.

Efficiency Metric: Low memory footprint due to simple text normalization and dictionary-based matching.
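
To make "simple text normalization and dictionary-based matching" concrete, here is a minimal sketch under stated assumptions: bios are Unicode-normalized, case-folded, and split into word tokens, and each token is looked up in a dictionary that maps keyword variants to a label. The names `normalize`, `match_tokens`, and `KEYWORD_TO_LABEL` are hypothetical, and the project's `normalizer.py` may work differently.

```python
import re
import unicodedata

# Hypothetical lookup table: keyword variant -> label.
KEYWORD_TO_LABEL = {
    "saas": "saas",
    "founder": "founder",
    "founders": "founder",
}


def normalize(bio: str) -> list[str]:
    """Apply NFKC normalization and case folding, then split into word tokens."""
    text = unicodedata.normalize("NFKC", bio).casefold()
    return re.findall(r"\w+", text)


def match_tokens(bio: str, lookup: dict[str, str]) -> list[str]:
    """Look up each token once; return distinct labels in order of appearance."""
    labels: list[str] = []
    for token in normalize(bio):
        label = lookup.get(token)
        if label and label not in labels:
            labels.append(label)
    return labels


print(match_tokens("Founder @ShiprDev | Simplifying SaaS Development", KEYWORD_TO_LABEL))
print(match_tokens("Confounder analysis in epidemiology", KEYWORD_TO_LABEL))
```

Token lookup costs one dictionary access per word and avoids false hits inside longer words such as "confounder", at the price of listing plural or inflected variants explicitly. Labels come out in bio order here, which differs from the keyword-list order seen in the example output.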

Quality Metric: High precision for clearly defined keywords, producing clean and interpretable classification outputs suitable for ICP modeling.

"Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time."

Nathan Pennington
Marketer
★★★★★

"Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on."

Eliza
SEO Affiliate Expert
★★★★★

"Exceptional results, clear communication, and flawless delivery.
Bitbash nailed it."

Syed
Digital Strategist
★★★★★
