I develop tools and methods for computational text analysis, focusing on corpus linguistics, rhetorical analysis, and statistical approaches to language variation. My work ranges from desktop applications for researchers to R and Python packages for text analysis workflows.
Development Packages (GitHub): mda.biber • quanteda.extras • vnc • ngramr.plus
Desktop & Web Applications for Corpus Analysis and Concordancing
- DocuScope CA Desktop - Standalone desktop application combining part-of-speech tagging with DocuScope rhetorical analysis
- DocuScope CA Online - Web-based version for corpus analysis with frequency tables, KWIC, and comparative analysis
Features: Corpus processing, frequency analysis, keyword-in-context tables, corpus comparison, advanced plotting
- mda.biber - Multi-Dimensional Analysis (MDA) for linguistic variation across genres and registers
- pseudobibeR - Extract 67 lexicogrammatical features from parsed text data for register analysis
- quanteda.extras - Extended corpus functions for keyness, dispersion, and collocational analysis
- vnc - Variability-Based Neighbor Clustering for data-driven periodization in historical linguistics
- ngramr.plus - Extract frequency data from Google Books Ngram datasets across multiple English varieties
- spell.replacer - Fast probabilistic spelling correction based on COCA frequency data
- docuscospacy - spaCy models trained on the DocuScope and CLAWS7 tagsets for rhetorical analysis
- pybiber - Python implementation of Biber's linguistic feature extraction
- google_ngrams - Process Google Ngram data with Variability-Based Neighbor Clustering
- moodswing - Sentiment trajectory analysis
- en_docusco_spacy - Custom spaCy model trained on the DocuScope and CLAWS7 tagsets, powering the DocuScope applications above
- HAP-E: Human-AI Parallel Corpus - Parallel corpus of human and AI-generated texts for comparative analysis
- HAP-E Mini - Smaller version of the Human-AI parallel corpus for quick testing
Browse all resources: browndw on Hugging Face
- cmu.textstat - R package for Carnegie Mellon's Special Topics in Statistics & Data Science course
- textstat_docs - Comprehensive documentation and tutorials for statistical text analysis
- cmu-textstat-docs - Course documentation and lab materials
- DocuScope Documentation - Comprehensive guides for DocuScope rhetorical analysis tools
- Presentations - Conference presentations and workshop materials
My work centers on developing computational methods for:
- Multi-Dimensional Analysis - Statistical approaches to linguistic variation
- Corpus Linguistics - Tools for large-scale text analysis and comparison
- Rhetorical Analysis - Computational approaches to discourse analysis
- Register & Genre Analysis - Automated classification of text types
- Historical Linguistics - Quantitative approaches to language change
- DeLuca, L. S., Reinhart, A., Weinberg, G., Laudenbach, M., Miller, S., & Brown, D. W. (2025). Developing Students' Statistical Expertise Through Writing in the Age of AI. Journal of Statistics and Data Science Education, 1-13. https://doi.org/10.1080/26939169.2025.2497547
- Reinhart, A., Markey, B., Laudenbach, M., Pantusen, K., Yurko, R., Weinberg, G., & Brown, D. W. (2025). Do LLMs write like humans? Variation in grammatical and rhetorical styles. Proceedings of the National Academy of Sciences, 122(8), e2422455122. https://doi.org/10.1073/pnas.2422455122
- Markey, B., Brown, D. W., Laudenbach, M., & Kohler, A. (2024). Dense and disconnected: Analyzing the sedimented style of ChatGPT-generated text at scale. Written Communication, 41(4), 571-600. https://doi.org/10.1177/07410883241263528
- Laudenbach, M., Brown, D. W., Guo, Z., Ishizaki, S., Reinhart, A., & Weinberg, G. (2024). Visualizing formative feedback in statistics writing: An exploratory study of student motivation using DocuScope Write & Audit. Assessing Writing, 60, 100830. https://doi.org/10.1016/j.asw.2024.100830
- Brown, D. W. (2024). Dictionaries, Language Ideologies, and Language Attitudes. In E. Finegan & M. Adams (Eds.), The Cambridge Handbook of the Dictionary (pp. 277-300). Cambridge University Press. https://doi.org/10.1017/9781108864435.015
- Ask me about: Corpus linguistics, text analysis methods, R/Python for linguistics
- Collaborate on: Open-source text analysis tools, corpus linguistics research
- Teaching: Statistical methods for text analysis, computational linguistics


