@@ -18,16 +18,21 @@ FunStrings is a comprehensive Python package that provides a wide range of funct
   - [Text Analysis Functions](#text-analysis-functions)
   - [String Transformation Functions](#string-transformation-functions)
   - [Pattern-based Functions](#pattern-based-functions)
+  - [Data Cleaning Functions](#data-cleaning-functions)
+  - [Text Analysis Helpers](#text-analysis-helpers)
+  - [ML/NLP Preprocessing](#mlnlp-preprocessing)
+  - [Validation Functions](#validation-functions)
 - [Installation](#installation)
 - [Quick Start](#quick-start)
 - [Documentation](#documentation)
 - [For Students and Educators](#for-students-and-educators)
 - [Contributing](#contributing)
 - [License](#license)
+- [Connect](#connect)
 
 ## Features
 
-FunStrings includes 24 utility functions organized into four categories:
+FunStrings includes 44 utility functions organized into eight categories:
 
 ### Basic String Operations
 - **Reverse String:** Return the reversed string
@@ -60,6 +65,33 @@ FunStrings includes 24 utility functions organized into four categories:
 - **Mask Sensitive:** Mask all but last n chars with '*'
 - **Find Repeated Words:** Find all repeated words in text
 
+### Data Cleaning Functions
+- **Remove HTML Tags:** Strip all HTML tags from text
+- **Remove Emojis:** Remove emojis from text
+- **Remove Special Characters:** Keep only letters and numbers
+- **Expand Contractions:** Convert "don't" → "do not"
+- **Correct Whitespace:** Remove weird spaces, tabs, newlines
+
+### Text Analysis Helpers
+- **Unique Words:** Return list of unique words
+- **Most Common Word:** Return most frequent word
+- **Sentence Count:** Number of sentences in text
+- **Average Sentence Length:** Average words per sentence
+- **Character Ratio:** Uppercase/lowercase/number ratio
+
+### ML/NLP Preprocessing
+- **Generate N-grams:** Generate list of n-grams
+- **Strip Accents:** Remove accents (café → cafe)
+- **Lemmatize Text:** Reduce words to base form
+- **Is ASCII:** Check if text only contains ASCII
+
+### Validation Functions
+- **Is Valid Email:** Validate if a string is a proper email
+- **Is Valid URL:** Validate if a string is a proper URL
+- **Is Valid IP:** Check if string is a valid IP address
+- **Is Valid Date:** Check if a string matches a date format
+- **Contains Special Characters:** Check if special symbols are present
+
 ## Installation
 
 You can install FunStrings directly from PyPI:
@@ -96,6 +128,23 @@ print(funstrings.snake_to_camel(snake)) # helloWorldExample
 # Pattern-based
 text_with_emails = "Contact us at info@example.com or support@example.org"
 print(funstrings.extract_emails(text_with_emails)) # ['info@example.com', 'support@example.org']
+
+# Data cleaning
+html_text = "<p>Hello <b>World</b></p>"
+print(funstrings.remove_html_tags(html_text)) # Hello World
+print(funstrings.expand_contractions("I don't know")) # I do not know
+
+# Text analysis helpers
+print(funstrings.sentence_count("Hello! How are you? I'm fine.")) # 3
+print(funstrings.most_common_word("hello world hello python")) # hello
+
+# ML/NLP preprocessing
+print(funstrings.generate_ngrams("hello", 2)) # ['he', 'el', 'll', 'lo']
+print(funstrings.strip_accents("café")) # cafe
+
+# Validation
+print(funstrings.is_valid_email("user@example.com")) # True
+print(funstrings.is_valid_url("https://example.com")) # True
 ```
 
 ## Documentation
@@ -126,3 +175,8 @@ Please make sure to update tests as appropriate.
 ## License
 
 This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+## Connect
+
+- GitHub: [nilkanth02](https://github.com/nilkanth02/)
+- LinkedIn: [Nilkanth Ahire](https://www.linkedin.com/in/nilkanthahire)
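
The ML/NLP preprocessing examples this diff adds to Quick Start (`generate_ngrams` and `strip_accents`) can be approximated with the standard library alone. The following is a minimal sketch, not FunStrings' actual implementation: the function bodies are assumptions, chosen only to reproduce the outputs the README shows (`['he', 'el', 'll', 'lo']` and `cafe`). In particular, treating n-grams as character-level is inferred from that example output.

```python
from __future__ import annotations

import unicodedata


def generate_ngrams(text: str, n: int) -> list[str]:
    """Return every contiguous character n-gram of `text`.

    Assumed behaviour: character-level n-grams, matching the README example.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # When n exceeds len(text) the range is empty, so the result is [].
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def strip_accents(text: str) -> str:
    """Drop combining accent marks from `text` (assumed behaviour).

    NFD normalization splits an accented letter into a base letter plus
    combining marks; the marks are then filtered out.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(generate_ngrams("hello", 2))  # ['he', 'el', 'll', 'lo']
print(strip_accents("café"))  # cafe
```

Decomposing with NFD before filtering only removes marks that Unicode models as combining characters; letters such as `ø` or `ß` have no such decomposition and pass through unchanged.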