Phase 4

Analyze

Data → Visualizations

You gathered the data. Now make it speak. MuggsOfViz lets you upload your dataset, ask questions about it in plain English, and create publication-ready charts for your final paper.

📐 Open MuggsOfViz
📚 Browse Chart Library
How It Works
0 Data Import

Upload a CSV, browse the Course Library, or upload a codebook. Select variables, preview your data, and scan for patterns with one click.

1 Clean Data

Recode text responses to numbers. Likert scales, ordinal ranges, and multi-select columns are auto-detected so you can start analyzing faster.

2 Analyze

Review each variable with descriptive stats, then build charts from 18 chart types. Run thematic analysis on open-ended text responses using AI-generated codebooks.

3 Curate & Share

Pick a color theme, customize fonts and labels, pin your best charts to a dashboard, and publish a shareable link or export as PNG.

🔎 Pattern Scanning

Rule-based engine detects skewness, outliers, tight clustering, and cross-variable correlations across your entire dataset.

🧠 AI Guidance

After scanning, an AI model interprets your data patterns in plain English and suggests what to investigate next.

📝 Thematic Analysis

Code open-ended survey responses into themes. AI generates an initial codebook, then you tag each response and create a new variable.

📊 18 Chart Types

Bar, grouped bar, stacked bar, radar, pie, box plot, funnel, scatter, line, heatmap, treemap, sunburst, stacked area, Sankey, parallel coordinates, choropleth map, pictorial bar, and gauge.

🎨 Theming & Export

Five built-in color palettes plus custom colors, font selection, and toggles for legends and gridlines. Export pinned charts as PNG or generate a shareable dashboard link.

📋 Codebook Support

Upload a CSV or TXT codebook to pre-fill variable descriptions. Descriptions appear throughout the tool so you always know what each column measures.

See It In Action
CFU 23: How To — Data Sovereignty Data Sovereignty at New Harmony High

0 Data Import
📋
NHH Wave 1 Survey
Pre-loaded for your account
410 observations · 12 variables · ✓ Codebook attached
Codebook descriptions will appear throughout the tool so you always know what each variable measures.
  • Your NHH Wave 1 survey dataset is pre-loaded when you open the tool
  • The codebook attaches automatically, providing question text and response options for every variable
  • You can also upload your own CSV or browse the Course Library for other datasets
0 Select Variables
Parish
Age Group ordinal
Transportation Mode
Safety Rating
Timestamp
Household Income
Neighborhood Feelings
6 of 12 selected
  • Check the boxes next to the variables you want to analyze
  • Each variable shows its type (ordinal, nominal, likert) from the codebook
  • Uncheck variables you don't need — you can always add them back later
  • Click "Scan for Patterns" to run an automated analysis of your selections
0 Scan for Patterns
Sweep Findings — Age Group
Roughly uniform distribution across 5 categories (no single group dominates).
No missing values — 410 of 410 responses present.
Modal category: 25–34 (28% of responses).
  • The pattern scanner runs entirely in your browser — no data leaves your computer
  • It checks for skewness, outliers, clustering, and correlations across all selected variables
  • Blue dots = distribution insights, green dots = data quality checks
  • Click "Get AI Guidance" for a plain-English explanation of what the patterns mean
1 Clean Data
Age Group ✓ Auto-detected
Response    Code
18–24       1
25–34       2
35–44       3
45–54       4
55+         5
  • Survey responses are text ("Strongly Agree", "18-24") but statistics need numbers
  • The tool auto-detects Likert scales and ordinal ranges and suggests numeric codes
  • Review each mapping and accept or customize it
  • All charts downstream will use your recoded values
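
As a rough illustration of how auto-detection of ordinal ranges could work, the sketch below orders labels such as "18–24" or "55+" by their lower bound and proposes codes 1, 2, 3, and so on. The function name `suggestOrdinalCodes` and the matching rule are assumptions for illustration, not the tool's actual detector.

```javascript
// Hypothetical helper: propose numeric codes for range labels like "18–24" or "55+".
// Returns null when any label does not look like a numeric range.
function suggestOrdinalCodes(labels) {
  // Accepts an en dash (\u2013) or a plain hyphen between the bounds, or a trailing "+".
  const rangePattern = /^(\d+)\s*(?:[\u2013-]\s*\d+|\+)$/;
  const parsed = [];
  for (const label of new Set(labels)) {
    const match = rangePattern.exec(label.trim());
    if (!match) return null;
    parsed.push({ label, lower: Number(match[1]) });
  }
  // Order categories by their lower bound, then number them from 1.
  parsed.sort((a, b) => a.lower - b.lower);
  const codes = {};
  parsed.forEach((entry, index) => {
    codes[entry.label] = index + 1;
  });
  return codes;
}
```

Passing the five Age Group labels in any order yields the same mapping shown in the table above, and a column such as Transportation Mode returns null so it can fall through to another rule.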
2 Univariate Review
Age Group
N: 410 · Unique: 5 · Mode: 25–34 · Mode %: 28%
18–24: 72
25–34: 115
35–44: 98
45–54: 78
55+: 47
  • Every selected variable gets a summary card and a mini chart: count, mean, and median for numeric variables; count, unique values, and mode for categorical ones
  • Confirm which variables to carry forward to chart building
  • This step helps you spot unexpected distributions before building charts
2 Build a Visual
Primary Variable: Age Group
Choose a chart type:
▮ Bar
◉ Pie
▽ Funnel
▦ Treemap
★ Bar suggested for category frequencies
  • Select a variable and choose from 18 chart types
  • The tool suggests compatible chart types based on your variable's data
  • Bar charts work best for category frequencies; scatter plots for two numeric variables
2 Your Chart
Age Group Distribution
18–24: 72
25–34: 115
35–44: 98
45–54: 78
55+: 47
25–34: 115 (28%)
  • Charts render instantly with interactive tooltips
  • Click the pin button to save a chart to your dashboard
  • Build multiple charts and compare them side by side
3 Curate & Share
NHH Survey Analysis
📷 Export PNG 🔗 Publish
Theme: Celtic | Gold | Ocean | Forest | Mardi Gras
Age Group Distribution
18–24: 72
25–34: 115
35–44: 98
45–54: 78
55+: 47
  • Choose from 5 color themes or create custom colors
  • Export charts as PNG images for your research paper
  • Generate a shareable link so your teacher and classmates can view your dashboard
  • Everything auto-saves — come back anytime and pick up where you left off
Check for Understanding
What Is Bivariate Analysis?
Univariate: one variable
Bivariate: two variables together
  • “Bi” means two — bivariate analysis examines the relationship between two variables
  • Instead of “what does this variable look like?” you ask “how do these two variables relate?”
  • Example: Age Group alone tells you who responded. Age Group × Transportation tells you whether older people drive more.
When to Use Bivariate Analysis
Variable Pairing              Suggested Charts
Categorical × Categorical     Grouped bar, Stacked bar, Heatmap
Categorical × Numeric         Box plot, Grouped bar (means)
Numeric × Numeric             Scatter plot, Line chart
  • Use bivariate when your research question involves a comparison or relationship between two specific variables
  • The chart type depends on what kind of data each variable contains
  • MuggsOfViz suggests the best chart type based on your variables' data types
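
The pairing table above can be read as a small lookup. The sketch below encodes it directly; the function name `suggestCharts` and the simplification that every non-numeric type (nominal, ordinal, likert) counts as categorical are assumptions, not a description of the tool's internal logic.

```javascript
// Hypothetical lookup based on the pairing table above.
// Any type other than "numeric" is treated as categorical (an assumption).
function suggestCharts(typeA, typeB) {
  const numericCount = [typeA, typeB].filter((type) => type === "numeric").length;
  if (numericCount === 0) return ["Grouped bar", "Stacked bar", "Heatmap"];
  if (numericCount === 1) return ["Box plot", "Grouped bar (means)"];
  return ["Scatter plot", "Line chart"];
}
```

For example, Age Group (ordinal) paired with Transportation Mode (nominal) falls in the first branch, which is why a grouped bar is the first suggestion in that case.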
0 Select Two Variables
Bivariate = two variables at once
Age Group ordinal
Transportation Mode nominal
Parish
Safety Rating
2 variables selected for bivariate analysis
  • Check exactly two variables to explore their relationship
  • "Bivariate" means examining how two variables interact — do older respondents prefer different transportation?
  • The tool labels each variable's type (ordinal, nominal) from the codebook
1 Clean Both Variables
Age Group ✓ Auto
18–24    1
25–34    2
35–44    3
Transportation ✓ Auto
Car      1
Bus      2
Walk     3
  • Both variables get recoded in the Clean Data step
  • Auto-detection works for each variable independently
  • Review and accept or customize each mapping before proceeding
2 Review Each Variable First
Age Group
18–24: 72
25–34: 115
35–44: 98
Transportation Mode
Car: 189
Bus: 110
Walk: 80
  • Before combining variables, review each one individually
  • Univariate stats help you understand the shape of each variable on its own
  • This prevents surprises when you combine them in a bivariate chart
2 Build a Grouped Bar Chart
Primary: Age Group · Secondary: Transportation Mode
Choose a chart type:
▮▮ Grouped Bar
▦ Stacked Bar
◉ Heatmap
⋯ Scatter
★ Grouped Bar suggested for two categorical variables
  • With two variables selected, bivariate chart types become available
  • Grouped bar places bars side by side for easy comparison across categories
  • The primary variable goes on the x-axis; the secondary variable creates the groups
2 Your Chart
Transportation Mode by Age Group
Age Group    Car    Bus    Walk
18–24        32     22     18
25–34        58     36     21
  • Each age group shows three bars: one per transportation mode
  • You can instantly see that car use increases with age while walking decreases
  • Hover any bar for a tooltip with exact count and percentage
3 Interpret & Pin
Younger respondents (18–24) walk and take the bus at higher rates than older groups.
Car usage rises steadily with age: from 44% in 18–24 to 68% in 55+.
This bivariate pattern may reflect income, access, or urban density differences across age groups.
📌 Pin to Dashboard 📷 Export PNG
  • After building your chart, write an interpretation — what does the pattern mean?
  • Pin your best charts to the dashboard for your final paper
  • Bivariate charts tell the story of how two variables relate to each other
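
The 44% car share quoted for the 18–24 group can be checked by hand: the chart shows 32 car, 22 bus, and 18 walk responses, which add up to the 72 respondents in that age group, and 32 / 72 is about 44%. A minimal sketch of that row-percentage arithmetic follows; `rowPercentages` is a hypothetical name, and rounding to whole percentages is an assumption about how the tooltips are formatted.

```javascript
// Hypothetical helper: convert one group's counts into whole-number percentages.
function rowPercentages(counts) {
  const total = Object.values(counts).reduce((sum, count) => sum + count, 0);
  return Object.fromEntries(
    Object.entries(counts).map(([category, count]) => [
      category,
      total === 0 ? 0 : Math.round((100 * count) / total),
    ])
  );
}

// 18–24 group from the grouped bar chart above.
const youngest = rowPercentages({ Car: 32, Bus: 22, Walk: 18 });
// youngest is { Car: 44, Bus: 31, Walk: 25 }
```

Comparing percentages within each age group, rather than raw counts, keeps larger groups such as 25–34 from dominating the interpretation.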
0 Select Three Variables
Multivariate = three or more variables
Safety Rating primary
Age Group secondary
Parish group-by
Transportation Mode
3 variables: primary + secondary + group-by
  • Pick a primary variable (what you're measuring), a secondary (what you're comparing), and a group-by (how you split the data)
  • The group-by variable creates separate charts or layers for each category
  • Example: "How does safety perception vary by age, broken down by parish?"
1 Clean All Three Variables
Safety
Very Safe       5
Safe            4
Neutral         3
Age Group
18–24           1
25–34           2
35–44           3
Parish nominal
Orleans         1
Jefferson       2
St. Tammany     3
  • All three variables are recoded in Step 1 before analysis
  • Likert scales (Safety) and ordinal ranges (Age) are auto-detected
  • Nominal variables like Parish get simple numeric codes for grouping
2 Build a Stacked Bar Chart
Primary: Safety Rating · Secondary: Age Group · Group by: Parish
Choose a chart type:
▮ Stacked Bar
◉ Heatmap
⋯ Parallel
◉ Sankey
★ Stacked Bar suggested for 3 categorical variables
  • With three variables, multivariate chart types unlock: stacked bar, heatmap, parallel coordinates, Sankey
  • Stacked bars show composition within each group — how safety ratings stack up by age in each parish
  • The group-by variable (Parish) creates separate chart panels for comparison
2 Your Chart
Safety Rating by Age Group (by Parish)
Panels (parish tabs): Orleans | Jefferson | St. Tammany
Bars (age groups): 18–24, 25–34, 35–44, 45–54
Legend: Very Safe, Safe, Neutral
  • Each bar shows the composition of safety ratings for one age group in Orleans Parish
  • Click the parish tabs to switch between panels and compare the same age groups across parishes
  • Stacked bars make it easy to see proportional differences at a glance
2 Compare Across Groups
In Orleans, younger respondents (18–24) report feeling less safe than those 35+.
In St. Tammany, safety ratings are high across all age groups — less age variation.
The third variable (Parish) reveals patterns invisible in a two-variable chart.
  • Multivariate analysis reveals what two variables alone can't show
  • The group-by variable lets you ask: "Is this pattern the same everywhere, or does it depend on context?"
  • This is the key skill: moving from description to explanation
3 Dashboard
NHH Survey Analysis
📷 Export PNG 🔗 Publish
Age Group Distribution
univariate · bar chart
Transportation by Age
bivariate · grouped bar
Safety × Age × Parish
multivariate · stacked bar
  • Pin multiple charts to build a complete analysis dashboard
  • Mix univariate, bivariate, and multivariate charts to tell a layered story
  • Export individual charts or the whole dashboard for your final paper
2 Select a Text Variable
Univariate
Charts
Thematic
Open-Ended Text Columns
Neighborhood Feelings text
Improvement Suggestions text
"Why do you feel safe or unsafe in your neighborhood?"
  • In Step 2's Thematic tab, select an open-ended text column from your dataset
  • Thematic analysis works on free-text responses where students wrote answers in their own words
  • The codebook description tells you what question was asked
2 AI Generates Codebook
Proposed Themes
Safety & Crime
Mentions of crime, police, feeling safe or unsafe at night
Community & Neighbors
References to knowing neighbors, community events, mutual trust
Infrastructure
Streetlights, roads, sidewalks, public transit access
Environment
Green spaces, flooding, noise, pollution, cleanliness
AI proposed 4 themes from 410 responses
  • The AI reads all responses and proposes a set of themes (a "codebook")
  • Each theme has a name and description so you know what it covers
  • The AI is a starting point — you'll review and edit the themes next
2 Review & Edit Themes
Edit theme names, merge themes, or add your own:
Safety & Crime edit | delete
Community & Neighbors edit | delete
Infrastructure edit | delete
+ Add new theme
  • This is where your judgment matters — the AI proposes, you decide
  • Rename themes to better match the data, merge similar ones, or add new themes the AI missed
  • A strong codebook leads to more meaningful analysis
2 Code Responses
Tag each response with a theme:
"I feel safe because my neighbors look out for each other."
Community & Neighbors
"The streetlights are broken and the road is full of potholes."
Infrastructure
"Too many break-ins on my block this year."
Safety & Crime | Community | Infrastructure | Environment
Safety: 42
Community: 29
Infrastructure: 19
  • Read each response and click the theme that best matches it
  • The live frequency chart updates as you code — you'll see themes emerge in real time
  • This manual step ensures you engage deeply with the data, not just accept AI output
2 Finalize & Create Variable
All 410 responses coded. Codebook finalized with 4 themes.
New Variable Created
Neighborhood_Theme — categorical, 4 levels
Safety & Crime (42) | Community (29) | Infrastructure (19) | Environment (10)
✓ Variable added to your dataset
  • When all responses are coded, click "Finalize" to create a new categorical variable
  • The new variable appears in your dataset alongside the original text column
  • You've turned qualitative text into quantitative data you can chart
2 Use Themes in Charts
Primary: Neighborhood_Theme
Neighborhood Themes
Safety: 42
Community: 29
Infrastructure: 19
Environment: 10
  • Your theme variable works just like any other categorical variable in the chart builder
  • Use it in univariate charts, or combine it with other variables for bivariate and multivariate analysis
  • Example: "Do neighborhood safety themes differ by parish?" — combine Theme × Parish for a grouped bar chart
Frequently Asked Questions
What data can I use?
Upload any CSV file up to 2 MB. Your teacher may also add datasets to the Course Library that the whole class can use. The tool works with survey results, public data, or any spreadsheet exported as CSV.
What is pattern scanning?
In Step 0, after you select variables, click "Scan for Patterns." A rule engine analyzes every column for skewness, outliers, tight clustering, high variability, and cross-variable correlations. Notable findings appear as colored dots on your variable cards. You can then click "Get AI Guidance" to have an AI model explain the findings in plain English and suggest what to investigate next.
What chart types are available?
18 chart types, grouped into seven categories:

Comparison: bar, grouped bar, stacked bar, radar
Distribution: pie/donut, box plot, funnel
Relationship: scatter, line, heatmap
Composition: treemap, sunburst, stacked area
Flow: Sankey diagram, parallel coordinates
Geographic: choropleth map
Specialized: pictorial bar, gauge

You work through them in three analysis stages: first review each variable on its own (univariate), then build two-variable charts (bivariate), then layer in a third variable (multivariate). Use the Chart Library to explore all types with live previews.
What is thematic analysis?
Thematic analysis is for open-ended text responses (like "Why do you feel safe in your neighborhood?"). In Step 2's Thematic tab, you select a text column and the AI generates a codebook of themes. You review and edit the themes, then manually tag each response. The tool tracks theme frequencies with a live bar chart, and when you finalize, it creates a new categorical variable you can use in your charts.
How does data cleaning work?
Step 1 lets you recode text responses into numbers so you can run calculations. For example, "Strongly Agree" becomes 5, "Agree" becomes 4, and so on. The tool auto-detects Likert scales (agreement, frequency, satisfaction), ordinal ranges, and multi-select columns. You can accept the suggested mappings or create your own. Every downstream chart uses your recoded values.
Can I customize how my charts look?
Yes. Step 3 lets you pick a color theme, change fonts, toggle legends and gridlines, and add titles. Your charts should match the style of your final paper.
Is my work saved automatically?
Yes. Every question you ask and every chart you create is saved to the server immediately. Your project settings auto-save after 3 seconds of inactivity. Come back anytime and pick up where you left off.
So how does this thing work, exactly?
Here's the full pipeline, from the moment you upload a file to the chart on your screen.
1. Data is uploaded and parsed in your browser
When you drop a CSV file, JavaScript running in your browser reads every row and column. It detects which columns are numbers and which are text (categorical), and identifies Likert scales, ordinal ranges, and multi-select columns automatically. Nothing has left your computer yet.
Tech: Client-side FileReader API + custom CSV parser. Types are inferred by attempting Number() conversion and scanning for known ordinal patterns (e.g. "Strongly Agree...Strongly Disagree").
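
To make the type-inference idea concrete, here is a minimal sketch of classifying one column by attempting `Number()` conversion and then checking for a known Likert vocabulary. The function name `inferColumnType`, the returned labels, and the five-item agreement scale are assumptions for illustration; the tool's real parser may recognize more patterns.

```javascript
// Assumed agreement scale used only for this illustration.
const LIKERT_AGREEMENT = [
  "strongly disagree", "disagree", "neutral", "agree", "strongly agree",
];

// Hypothetical helper: classify a column as "numeric", "likert", "categorical", or "empty".
function inferColumnType(values) {
  // Drop blanks first, because Number("") is 0 and would look numeric.
  const present = values.map((v) => String(v).trim()).filter((v) => v !== "");
  if (present.length === 0) return "empty";

  // A column is numeric only if every remaining cell converts to a finite number.
  if (present.every((v) => Number.isFinite(Number(v)))) return "numeric";

  // Otherwise check whether every cell belongs to the known Likert vocabulary.
  if (present.every((v) => LIKERT_AGREEMENT.includes(v.toLowerCase()))) return "likert";

  return "categorical";
}
```

A range label such as "18–24" fails the `Number()` check and is not a Likert term, so it lands in "categorical" here and would need a separate ordinal-range rule.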
2. The raw file and parsed data are stored
Your original CSV goes to R2 (Cloudflare's object storage) as a permanent backup. The parsed JSON goes to KV (Cloudflare's key-value store) for fast retrieval. A project row is created in D1 (Cloudflare's SQL database) to track your progress, settings, and which dataset you're working with.
Tech: R2 key = dataviz/{username}/{datasetId}.csv. KV key = dataviz-data:{username}:{datasetId}. D1 table = projects (one row per student, keyed by username).
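
The two key formats above can be written as one small function. This is only a sketch: the name `buildStorageKeys` is hypothetical, and the example dataset ID is made up.

```javascript
// Hypothetical helper: derive the R2 and KV keys from the formats described above.
function buildStorageKeys(username, datasetId) {
  return {
    r2Key: `dataviz/${username}/${datasetId}.csv`,      // original CSV in object storage
    kvKey: `dataviz-data:${username}:${datasetId}`,     // parsed JSON in the key-value store
  };
}
```

Calling `buildStorageKeys("guest", "wave1")` would give `dataviz/guest/wave1.csv` and `dataviz-data:guest:wave1`, so both copies of a dataset can be located from the same two identifiers.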
3. Pattern scanning runs in your browser
When you click "Scan for Patterns," a rule engine analyzes every variable without any server calls. It computes skewness (mean vs. median), detects outliers (values 3+ standard deviations out), identifies tight clustering, checks for uniform distributions, and looks for correlated pairs. Notable findings appear as colored dots on your variable cards.
Tech: All computation is client-side JavaScript. Insights are stored in state.sweepInsights and rendered as badges on variable cards. No data leaves the browser during scanning.
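
A minimal sketch of two of these rules for a numeric column follows. The 3-standard-deviation outlier rule comes from the description above; the function name `scanColumn`, the use of the population standard deviation, and the skew threshold (mean and median differing by more than 0.1 standard deviations) are assumptions chosen for illustration.

```javascript
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

function median(xs) {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical rule check for one numeric column (assumes a non-empty array).
function scanColumn(xs) {
  const m = mean(xs);
  const med = median(xs);
  // Population standard deviation (an assumption; a sample SD would also work).
  const sd = Math.sqrt(mean(xs.map((x) => (x - m) ** 2)));

  // Outliers: values 3 or more standard deviations from the mean.
  const outliers = sd === 0 ? [] : xs.filter((x) => Math.abs(x - m) >= 3 * sd);

  // Skew: compare mean and median; the 0.1 * sd threshold is an assumed cutoff.
  let skew = "roughly symmetric";
  if (sd > 0 && m - med > 0.1 * sd) skew = "right-skewed";
  else if (sd > 0 && med - m > 0.1 * sd) skew = "left-skewed";

  return { mean: m, median: med, sd, skew, outliers };
}
```

Because the mean is pulled toward extreme values while the median is not, a mean well above the median suggests a long right tail, which is the intuition behind the "mean vs. median" skew check.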
4. AI interprets the patterns (optional)
If you click "Get AI Guidance," a summary of the pattern scan results (not your raw data) is sent to a Cloudflare Worker, which forwards it to a language model. The AI reads the statistical findings and writes a plain-English interpretation: what's interesting, what to investigate, and which chart types might reveal the story.
Tech: The LLM (Gemma 3 12B) runs locally on a GPU via Ollama, accessed through a Cloudflare Tunnel. It receives only the aggregated findings, never your individual rows. Rate-limited per session.
5. Data cleaning: text becomes numbers
In the Clean Data step, you recode categorical responses (like "Strongly Agree") to numeric values (5, 4, 3, 2, 1). The tool auto-detects common Likert and ordinal patterns and suggests mappings. You accept or customize them. All recoded values are stored and applied to every downstream chart.
Tech: Recode mappings are stored in state.recodeMappings as key-value objects per column. Applied at chart-generation time by the Worker, so original data is never modified.
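
One way to apply per-column mappings without touching the original rows is to build recoded copies on demand. The sketch below assumes rows are plain objects keyed by column name and that mappings look like `{ column: { label: code } }`; `applyRecodes` is a hypothetical name, not the Worker's actual function.

```javascript
// Hypothetical helper: return recoded copies of the rows, leaving the originals untouched.
function applyRecodes(rows, recodeMappings) {
  return rows.map((row) => {
    const recoded = { ...row }; // shallow copy so the source row is not mutated
    for (const [column, mapping] of Object.entries(recodeMappings)) {
      const value = recoded[column];
      // Only replace values that have an explicit code; anything else passes through.
      if (Object.prototype.hasOwnProperty.call(mapping, value)) {
        recoded[column] = mapping[value];
      }
    }
    return recoded;
  });
}
```

With a mapping such as `{ Safety: { "Very Safe": 5, "Safe": 4, "Neutral": 3 } }`, a row with "Very Safe" comes back as 5 while the stored dataset still holds the original text, so a mapping can be changed later without re-uploading.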
6. Univariate review: every variable at a glance
The Analyze step computes descriptive statistics for every selected column entirely in your browser: count, mean, median, standard deviation, min, max for numbers; frequency counts, mode, and unique values for categories. A mini chart is drawn for each variable. You confirm which variables to carry forward.
Tech: All computation is client-side JavaScript. Mini charts are independent ECharts instances, created and destroyed as you navigate. Confirming your selection unlocks the Build a Visual tab.
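
For a categorical column, the summary card's numbers (N, unique values, mode, mode %) can be produced with a single frequency count. The following is a sketch under assumptions: `summarizeCategorical` is a hypothetical name, ties for the mode go to the value seen first, and the mode percentage is rounded to a whole number.

```javascript
// Hypothetical helper: summary statistics for one categorical column.
function summarizeCategorical(values) {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) ?? 0) + 1);
  }

  // The mode is the most frequent value; on ties the first one encountered wins.
  let mode = null;
  let modeCount = 0;
  for (const [value, count] of counts) {
    if (count > modeCount) {
      mode = value;
      modeCount = count;
    }
  }

  return {
    n: values.length,
    unique: counts.size,
    mode,
    modePercent: values.length === 0 ? 0 : Math.round((100 * modeCount) / values.length),
    counts: Object.fromEntries(counts),
  };
}
```

Run on the Age Group column from the walkthrough (72, 115, 98, 78, and 47 responses across five groups), this gives N = 410, 5 unique values, and a mode of 25–34 at 28%, matching the card shown earlier.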
7. Chart generation: your variables, computed on the server
When you pick a chart type and variables, the Worker loads your parsed data from KV and runs the appropriate aggregation: grouping, averaging, cross-tabulation, box-plot quartiles, hierarchical nesting, or flow counting depending on chart type. It returns an ECharts "option" object (a JSON description of the chart) plus a plain-English summary.
Tech: No AI is involved in chart generation. It's deterministic computation in the Worker. The option JSON tells ECharts exactly what to draw: axes, series, labels, colors. The chart is saved to D1 with its analysis_level (bivariate or multivariate).
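
As an illustration of deterministic chart generation, the sketch below cross-tabulates two categorical columns and turns the counts into an ECharts option for a grouped bar chart. The helper names `crossTabulate` and `toGroupedBarOption` and the row shape are assumptions; only the option fields (`title`, `tooltip`, `legend`, `xAxis`, `yAxis`, `series`) are standard ECharts.

```javascript
// Hypothetical helper: count rows for every (primary, secondary) combination.
function crossTabulate(rows, primary, secondary) {
  const primaryLevels = [...new Set(rows.map((row) => row[primary]))];
  const secondaryLevels = [...new Set(rows.map((row) => row[secondary]))];
  const counts = {};
  for (const level of secondaryLevels) {
    counts[level] = Object.fromEntries(primaryLevels.map((p) => [p, 0]));
  }
  for (const row of rows) {
    counts[row[secondary]][row[primary]] += 1;
  }
  return { primaryLevels, secondaryLevels, counts };
}

// Hypothetical helper: describe a grouped bar chart as an ECharts option object.
function toGroupedBarOption(table, title) {
  return {
    title: { text: title },
    tooltip: { trigger: "axis" },
    legend: { data: table.secondaryLevels },
    xAxis: { type: "category", data: table.primaryLevels }, // primary variable on the x-axis
    yAxis: { type: "value" },
    // One bar series per secondary level; ECharts draws them side by side.
    series: table.secondaryLevels.map((level) => ({
      name: level,
      type: "bar",
      data: table.primaryLevels.map((p) => table.counts[level][p]),
    })),
  };
}
```

The same input always produces the same option object, which is what makes this step reproducible. Adding a shared `stack` value to each series would turn the grouped bars into stacked bars.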
8. The chart renders in your browser
ECharts (an open-source charting library) reads the option JSON and draws an interactive chart on an HTML canvas. You can hover for tooltips, zoom, and pan. Theme colors, fonts, titles, and toggles from Step 3 are applied as a layer on top.
Tech: ECharts v5, canvas renderer. Themes are applied client-side via applyThemeToOption() which deep-clones the option and injects your color palette, font family, and axis/legend visibility.
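
The function name `applyThemeToOption` comes from the note above, but its body is not shown, so the following is only a guess at the general shape: clone the option, then overlay palette, font, and legend visibility. The theme object's fields (`palette`, `fontFamily`, `showLegend`) and the hex colors in the usage note are assumptions.

```javascript
// Sketch of a theme overlay; the real applyThemeToOption may differ.
function applyThemeToOption(option, theme) {
  // Deep clone via JSON, which is safe here because the option is plain JSON data.
  const themed = JSON.parse(JSON.stringify(option));

  // ECharts reads the top-level "color" array as the series palette.
  themed.color = [...theme.palette];
  themed.textStyle = { ...(themed.textStyle || {}), fontFamily: theme.fontFamily };
  themed.legend = { ...(themed.legend || {}), show: theme.showLegend };

  return themed;
}
```

For instance, calling it with `{ palette: ["#1f6f8b", "#99a8b2"], fontFamily: "Georgia", showLegend: false }` returns a restyled copy. Because the server's option is never mutated, switching themes only requires re-running the overlay on the saved option, not regenerating the chart.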
9. Thematic analysis: AI + human coding
For open-ended text columns, you can run thematic analysis. The AI reads your responses and generates a codebook of themes (e.g., "Safety concerns," "Community pride"). You review, edit, and approve the themes, then manually tag each response. The tool builds a frequency chart of themes and, when finalized, creates a new categorical variable you can use in charts.
Tech: Codebook generation sends response text to the LLM via the Worker. The coding step is entirely client-side. The new variable is injected into state.parsedData and persists with the project.
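
Creating the new variable amounts to attaching one theme label to each row. The sketch below assumes the parsed data is an array of row objects and that the coded themes are stored in the same order as the rows; both are assumptions, as is the helper name `addThemeVariable`.

```javascript
// Hypothetical helper: add a theme column to copies of the rows.
// codedThemes[i] is the theme tagged for rows[i]; untagged rows get null.
function addThemeVariable(rows, codedThemes, newColumn) {
  return rows.map((row, index) => ({
    ...row,
    [newColumn]: codedThemes[index] ?? null,
  }));
}
```

The original text column stays in place next to the new one, so a coded theme can always be checked against the response it came from.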
10. Everything auto-saves
Project settings (theme, colors, titles, step progress, curated variables, recode mappings) auto-save to D1 after 3 seconds of inactivity. Charts are saved immediately when created. When you log in again, the Worker loads everything back and your browser reconstructs the full UI state, including which analysis stages are unlocked.
Tech: Debounced save (3s) via scheduleSave(). On load, the Worker returns the full project in one response. A project snapshot is automatically saved on each load as a safety net.
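
A debounced save waits until the user has stopped making changes before writing. The sketch below shows the general technique with the 3-second delay described above; the factory `createScheduleSave`, the `save` callback, and the injectable `timers` argument are assumptions added for illustration and testing, not the tool's actual code.

```javascript
// Sketch of a debounced save. Each call cancels the pending timer and starts a new one,
// so only the most recent settings are saved once activity stops.
function createScheduleSave(
  save,
  delayMs = 3000,
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (id) => clearTimeout(id),
  }
) {
  let pending = null;
  return function scheduleSave(settings) {
    if (pending !== null) timers.clear(pending);
    pending = timers.set(() => {
      pending = null;
      save(settings);
    }, delayMs);
  };
}
```

If a student changes the theme three times in quick succession, only the last choice is written, which avoids a burst of database writes for what is effectively one edit.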

What about cost? The language model (Gemma 3) is open-source and runs on hardware the school already owns. Cloudflare's free and paid tiers handle the storage and serverless compute. There are no per-question API fees.
Can I try it without logging in?
Yes. Use the guest account (username: guest, password: 123456) to explore the tool. Guest data is saved in your browser only and won't persist across devices.