Select columns and click 'Apply' to unite text columns
Configure options and click 'Apply' to tokenize texts
Select stopwords and click 'Apply' to remove common words
Configure settings and click 'Apply' to detect multi-words
Select n-grams and click 'Apply' to compound multi-words
Click 'Process' to create the document-feature matrix
Click 'Apply' to run spaCy linguistic analysis, or 'Skip' to use standard tokenization.
First, click 'Apply' on the Word Forms tab to run spaCy analysis. Then configure and click 'Apply' here to view POS tags.
Click 'Apply' on the Word Forms tab to run spaCy analysis and extract named entities.
Click 'Apply' on the Word Forms tab to run spaCy analysis and extract dependency parsing data.
Height of the plot
Select terms and a continuous variable, then click 'Plot Terms' to analyze frequency trends
Embedding Generation
Generate embeddings for advanced semantic analyses (Document Similarity, Search, Clustering).
Load data and process documents in the 1. Setup tab first
Process documents in the 1. Setup tab first
Configure settings and click 'Calculate' to begin analysis
Process documents in the 1. Setup tab first
Enter a search query and click 'Search' to see results
Dimensions of the plot
Configure settings and click 'Plot Network' to visualize word co-occurrence
Dimensions of the plot
Configure settings and click 'Plot Network' to visualize word correlation
Click 'Reduce Dimensionality' to generate the visualization
Explore Groups
Top Terms
Sample Documents
Click 'Reduce Dimensionality' then optionally 'Apply Clustering' to create document groups
Label Generation
Click 'Reduce Dimensionality' then 'Apply Clustering' to create groups for labeling
Dimensions of the plots
Overall Score = Coherence(z) + Exclusivity(z) - Residual(z) + Heldout(z)
Coherence: How interpretable topics are based on co-occurring words
Exclusivity: How distinctive topics are from each other
Residual: How much variation the model leaves unexplained (lower is better)
Heldout: Model's ability to generalize to new data
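The Overall Score above can be sketched as follows. This is a minimal illustration, not the application's actual implementation: it assumes that "(z)" means each metric is standardized across the candidate K values, and it uses the population standard deviation. The function names and the metric values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All candidates tie on this metric, so it contributes nothing.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def overall_scores(coherence, exclusivity, residual, heldout):
    """Coherence(z) + Exclusivity(z) - Residual(z) + Heldout(z), one score per K."""
    zc, ze, zr, zh = (
        z_scores(m) for m in (coherence, exclusivity, residual, heldout)
    )
    # Residual is subtracted because lower residuals are better.
    return [c + e - r + h for c, e, r, h in zip(zc, ze, zr, zh)]


# Hypothetical metric values for K = 5, 10, 15 (illustrative numbers only).
ks = [5, 10, 15]
scores = overall_scores(
    coherence=[-80.0, -70.0, -90.0],
    exclusivity=[9.0, 9.5, 9.2],
    residual=[1.8, 1.5, 1.6],
    heldout=[-7.2, -7.0, -7.1],
)
best_k = ks[scores.index(max(scores))]
print(best_k)
```

Standardizing first keeps any single metric from dominating the sum just because it is measured on a larger scale.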
Configure the K range and click 'Search K' to find the optimal number of topics
Configure settings and click 'Run Model' to discover topics
The hybrid model combines STM probabilistic topics with semantic embeddings
STM Metrics: Based on statistical topic modeling
Coherence: How interpretable topics are
Exclusivity: How distinctive topics are
Heldout Likelihood: Generalization performance
Configure the K range and click 'Search K' to find the optimal number of topics
Dimensions of the plot
Run 'Search K', then click 'Display' to view word-topic distributions
Run the model, then click 'Display' to view topic keywords
Run 'Search K', then click 'Display' to view word-topic distributions
Dimensions of the plot
Complete the Word-Topic tab to view document-topic distributions
Complete the Word-Topic tab to view document-topic distributions
Complete the Word-Topic tab to view document-topic distributions enhanced with semantic embeddings
Run the topic model and select a topic to view representative quotes
Click 'Estimate' to generate effect estimates
Dimensions of the plot
Estimate effects, select a categorical covariate, then click 'Display' to visualize topic prevalence by category
Dimensions of the plot
Estimate effects, select a continuous covariate, then click 'Display' to visualize topic prevalence trends