Select columns and click 'Apply' to unite text columns
Configure options and click 'Apply' to tokenize texts
Select stopwords and click 'Apply' to remove common words
Configure settings and click 'Apply' to detect multi-words
Select n-grams and click 'Apply' to compound multi-words
Click 'Process' to create a document-feature matrix
Click 'Apply' to run spaCy linguistic analysis, or 'Skip' to use standard tokenization.
First, click 'Apply' on the Word Forms tab to run spaCy analysis. Then configure and click 'Apply' here to view POS tags.
First, run POS tagging on the Word Forms tab. Then select morphological features in the sidebar and click 'Analyze Morphology' to extract features.
Select Case, Mood, or Aspect checkboxes from the sidebar to display additional morphological features.
Click 'Apply' on the Word Forms tab to run spaCy analysis and extract dependency parsing data.
Height of the plot
Select terms and a continuous variable, then click 'Plot Terms' to analyze frequency trends
Select a grouping variable and click 'Calculate' to compare word usage between categories.
Select terms in the sidebar and click 'Analyze' to visualize their dispersion across documents.
Dispersion Metrics
Load data and process documents in the 1. Setup tab first
Dimensions of the plot
Configure settings and click 'Plot Network' to visualize word co-occurrence
Dimensions of the plot
Configure settings and click 'Plot Network' to visualize word correlation
Process documents in the 1. Setup tab first
Configure settings and click 'Calculate' to begin analysis
Upload data to enable Comparative Analysis
Run Comparative Analysis from the sidebar to compare categories and identify unique content, gaps, and cross-category opportunities.
Comparative Analysis Results
Cross-category similarity heatmap comparing reference documents against other categories.
Reference items with low similarity to all comparison categories (distinctive content).
Comparison category items not well covered by the reference category (content gaps).
Items with moderate similarity: potential for cross-category learning or transfer.
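The three result groups above can be illustrated with a small sketch. It assumes items are represented as numeric vectors compared by cosine similarity, and it uses made-up cutoffs (low = 0.3, high = 0.7) and a single comparison set. The function names, thresholds, and example vectors are hypothetical and are not taken from the application.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compare_categories(reference, comparison, low=0.3, high=0.7):
    """Group items by similarity; `low` and `high` are assumed cutoffs, not app defaults."""
    sim = {
        r: {c: cosine(rv, cv) for c, cv in comparison.items()}
        for r, rv in reference.items()
    }
    # Reference items with low similarity to every comparison item
    distinctive = [r for r in reference if max(sim[r].values()) < low]
    # Comparison items that no reference item covers well
    gaps = [c for c in comparison if max(sim[r][c] for r in reference) < low]
    # Pairs in the moderate band
    moderate = [
        (r, c) for r in reference for c in comparison if low <= sim[r][c] < high
    ]
    return sim, distinctive, gaps, moderate


# Made-up vectors purely for illustration
reference = {"ref_a": [1, 0, 0, 0], "ref_b": [0, 1, 0, 0], "ref_c": [0, 0, 0, 1]}
comparison = {"cmp_x": [0, 0, 1, 0], "cmp_y": [1, 1, 1, 0]}

sim, distinctive, gaps, moderate = compare_categories(reference, comparison)
print(distinctive)  # ['ref_c']
print(gaps)         # ['cmp_x']
print(moderate)     # [('ref_a', 'cmp_y'), ('ref_b', 'cmp_y')]
```

The nested `sim` dictionary corresponds to the values a similarity heatmap would display.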
Process documents in the 1. Setup tab first
Enter a search query and click 'Search' to see results
Dimensions of the plots
Overall Score = Coherence(z) + Exclusivity(z) - Residual(z) + Heldout(z)
Coherence: How interpretable topics are based on co-occurring words
Exclusivity: How distinctive topics are from each other
Residual: Model fit to data (lower is better)
Heldout: Model's ability to generalize to new data
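As a rough illustration of the Overall Score formula above, the sketch below standardizes each metric across the candidate K values and combines them with the same signs. It assumes the population standard deviation is used for the z-scores; the application may standardize differently. The metric values are invented for the example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and unit population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All candidates tie on this metric, so it contributes nothing
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def overall_scores(coherence, exclusivity, residual, heldout):
    """Coherence(z) + Exclusivity(z) - Residual(z) + Heldout(z), one score per candidate K."""
    zc, ze, zr, zh = (
        z_scores(m) for m in (coherence, exclusivity, residual, heldout)
    )
    return [c + e - r + h for c, e, r, h in zip(zc, ze, zr, zh)]


# Hypothetical metric values for K = 5, 10, 15 (made up for illustration)
candidate_k = [5, 10, 15]
scores = overall_scores(
    coherence=[-60.0, -50.0, -70.0],
    exclusivity=[9.0, 9.5, 9.2],
    residual=[2.0, 1.5, 1.8],
    heldout=[-7.0, -6.5, -6.8],
)
best_k = candidate_k[scores.index(max(scores))]
print(best_k)  # 10
```

Residual is subtracted because lower values indicate better fit, so after standardization a low residual raises the overall score.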
Configure the K range and click 'Search K' to find the optimal number of topics
Configure settings and click 'Run Model' to discover topics
Hybrid model combines STM probabilistic topics with semantic embeddings
STM Metrics: Based on structural topic modeling
Coherence: How interpretable topics are
Exclusivity: How distinctive topics are
Heldout Likelihood: Generalization performance
Configure the K range and click 'Search K' to find the optimal number of topics
Dimensions of the plot
Run 'Search K', then click 'Display' to view word-topic distributions
Run model, then click 'Display' to view topic keywords
Run 'Search K', then click 'Display' to view word-topic distributions
Run topic modeling first, then configure settings in the sidebar and click 'Generate Content' to create content from your topics.
Dimensions of the plot
Complete the Word-Topic tab to view document-topic distributions
Complete the Word-Topic tab to view document-topic distributions
Complete the Word-Topic tab to view document-topic distributions enhanced with semantic embeddings
Run the topic model and select a topic to view representative quotes
Click 'Estimate' to generate effect estimates
Dimensions of the plot
Estimate effects, select a categorical covariate, then click 'Display' to visualize topic prevalence by category
Dimensions of the plot
Estimate effects, select a continuous covariate, then click 'Display' to visualize topic prevalence trends