Here's the full pipeline, from the moment you upload a file to the chart on your screen.
1. Data is uploaded and parsed in your browser
When you drop a CSV file, JavaScript running in your browser reads every row and column. It detects which columns are numbers and which are text (categorical), and identifies Likert scales, ordinal ranges, and multi-select columns automatically. Nothing has left your computer yet.
Tech: Client-side FileReader API + custom CSV parser. Types are inferred by attempting Number() conversion and scanning for known ordinal patterns (e.g. "Strongly Agree...Strongly Disagree").
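The inference step can be sketched as a small pure function. This is a minimal illustration, not the tool's actual code; the function name and the exact ordinal word list are assumptions.

```javascript
// Known ordinal pattern to scan for (one of presumably several).
const LIKERT_AGREEMENT = [
  "strongly disagree", "disagree", "neutral", "agree", "strongly agree",
];

function inferColumnType(values) {
  const nonEmpty = values.filter((v) => v !== "" && v != null);
  // Numeric if every non-empty cell survives Number() conversion.
  if (nonEmpty.length > 0 && nonEmpty.every((v) => !Number.isNaN(Number(v)))) {
    return "number";
  }
  // Ordinal if every distinct value matches the known Likert pattern.
  const distinct = [...new Set(nonEmpty.map((v) => String(v).trim().toLowerCase()))];
  if (distinct.length > 1 && distinct.every((v) => LIKERT_AGREEMENT.includes(v))) {
    return "ordinal";
  }
  return "categorical";
}
```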
2. The raw file and parsed data are stored
Your original CSV goes to R2 (Cloudflare's object storage) as a permanent backup. The parsed JSON goes to KV (Cloudflare's key-value store) for fast retrieval. A project row is created in D1 (Cloudflare's SQL database) to track your progress, settings, and which dataset you're working with.
Tech: R2 key = dataviz/{username}/{datasetId}.csv. KV key = dataviz-data:{username}:{datasetId}. D1 table = projects (one row per student, keyed by username).
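A sketch of the storage step as it might look inside the Worker, assuming R2, KV, and D1 bindings named BUCKET, KV, and DB (the binding names, column names, and function names here are assumptions; the key formats are the ones listed above).

```javascript
function storageKeys(username, datasetId) {
  return {
    r2Key: `dataviz/${username}/${datasetId}.csv`,   // raw CSV backup in R2
    kvKey: `dataviz-data:${username}:${datasetId}`,  // parsed JSON in KV
  };
}

async function persistDataset(env, username, datasetId, rawCsv, parsedRows) {
  const { r2Key, kvKey } = storageKeys(username, datasetId);
  await env.BUCKET.put(r2Key, rawCsv);                 // permanent backup
  await env.KV.put(kvKey, JSON.stringify(parsedRows)); // fast retrieval copy
  // D1: one project row per student, keyed by username.
  await env.DB.prepare(
    "UPDATE projects SET dataset_id = ? WHERE username = ?"
  ).bind(datasetId, username).run();
  return { r2Key, kvKey };
}
```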
3. Pattern scanning runs in your browser
When you click "Scan for Patterns," a rule engine analyzes every variable without any server calls. It computes skewness (mean vs. median), detects outliers (values 3+ standard deviations out), identifies tight clustering, checks for uniform distributions, and looks for correlated pairs. Notable findings appear as colored dots on your variable cards.
Tech: All computation is client-side JavaScript. Insights are stored in state.sweepInsights and rendered as badges on variable cards. No data leaves the browser during scanning.
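Two of the scan rules (skew and outliers) can be sketched like this; the real engine checks more patterns, and the function name and skew threshold are illustrative assumptions.

```javascript
function scanNumericColumn(name, values) {
  const n = values.length;
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const sorted = [...values].sort((a, b) => a - b);
  const median = n % 2
    ? sorted[(n - 1) / 2]
    : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
  const sd = Math.sqrt(values.reduce((a, v) => a + (v - mean) ** 2, 0) / n);

  const insights = [];
  // Skew: mean pulled well away from the median relative to the spread.
  if (sd > 0 && Math.abs(mean - median) / sd > 0.3) {
    insights.push({ column: name, type: "skew", detail: mean > median ? "right" : "left" });
  }
  // Outliers: values 3+ standard deviations from the mean.
  const outliers = values.filter((v) => sd > 0 && Math.abs(v - mean) / sd >= 3);
  if (outliers.length) {
    insights.push({ column: name, type: "outliers", detail: outliers });
  }
  return insights;
}
```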
4. AI interprets the patterns (optional)
If you click "Get AI Guidance," a summary of the pattern scan results (not your raw data) is sent to a Cloudflare Worker, which forwards it to a language model. The AI reads the statistical findings and writes a plain-English interpretation: what's interesting, what to investigate, and which chart types might reveal the story.
Tech: The LLM (Gemma 3 12B) runs locally on a GPU via Ollama, accessed through a Cloudflare Tunnel. It receives only the aggregated findings, never your individual rows. Rate-limited per session.
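The privacy boundary here is easy to show in code: the payload is built from the aggregated findings only, so raw rows can never leak into the request. The route path, payload shape, and function names below are assumptions for illustration.

```javascript
function buildGuidancePayload(sweepInsights, columnTypes) {
  return {
    // Destructuring keeps only the aggregate fields; anything else
    // attached to an insight (e.g. raw values) is dropped here.
    insights: sweepInsights.map(({ column, type, detail }) => ({ column, type, detail })),
    columns: columnTypes, // e.g. { age: "number", mood: "ordinal" }
  };
}

async function requestGuidance(sweepInsights, columnTypes) {
  const res = await fetch("/api/ai-guidance", { // hypothetical Worker route
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGuidancePayload(sweepInsights, columnTypes)),
  });
  return res.json(); // plain-English interpretation from the model
}
```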
5. Data cleaning: text becomes numbers
In the Clean Data step, you recode ordered categorical responses ("Strongly Agree" down to "Strongly Disagree") to numeric values (5 down to 1). The tool auto-detects common Likert and ordinal patterns and suggests mappings. You accept or customize them. All recoded values are stored and applied to every downstream chart.
Tech: Recode mappings are stored in state.recodeMappings as key-value objects per column. Applied at chart-generation time by the Worker, so original data is never modified.
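Applying the mappings without touching the originals might look like this (the mapping shape mirrors the per-column key-value objects described above; the function name is an assumption).

```javascript
const recodeMappings = {
  satisfaction: {
    "Strongly Agree": 5, "Agree": 4, "Neutral": 3,
    "Disagree": 2, "Strongly Disagree": 1,
  },
};

function applyRecodes(rows, mappings) {
  return rows.map((row) => {
    const out = { ...row }; // shallow copy: stored rows stay untouched
    for (const [col, map] of Object.entries(mappings)) {
      if (col in out && out[col] in map) out[col] = map[out[col]];
    }
    return out;
  });
}
```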
6. Univariate review: every variable at a glance
The Analyze step computes descriptive statistics for every selected column entirely in your browser: count, mean, median, standard deviation, min, max for numbers; frequency counts, mode, and unique values for categories. A mini chart is drawn for each variable. You confirm which variables to carry forward.
Tech: All computation is client-side JavaScript. Mini charts are independent ECharts instances, created and destroyed as you navigate. Confirming your selection unlocks the Build a Visual tab.
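The descriptives listed above reduce to two small pure functions, sketched here (function names are assumptions).

```javascript
function describeNumeric(values) {
  const n = values.length;
  const sorted = [...values].sort((a, b) => a - b);
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const median = n % 2
    ? sorted[(n - 1) / 2]
    : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
  const sd = Math.sqrt(values.reduce((a, v) => a + (v - mean) ** 2, 0) / n);
  return { count: n, mean, median, sd, min: sorted[0], max: sorted[n - 1] };
}

function describeCategorical(values) {
  const freq = {};
  for (const v of values) freq[v] = (freq[v] || 0) + 1;
  const mode = Object.keys(freq).reduce((a, b) => (freq[b] > freq[a] ? b : a));
  return { count: values.length, freq, mode, unique: Object.keys(freq).length };
}
```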
7. Chart generation: your variables, computed on the server
When you pick a chart type and variables, the Worker loads your parsed data from KV and runs the appropriate aggregation: grouping, averaging, cross-tabulation, box-plot quartiles, hierarchical nesting, or flow counting, depending on the chart type. It returns an ECharts "option" object (a JSON description of the chart) plus a plain-English summary.
Tech: No AI is involved in chart generation. It's deterministic computation in the Worker. The option JSON tells ECharts exactly what to draw: axes, series, labels, colors. The chart is saved to D1 with its analysis_level (bivariate or multivariate).
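One of those aggregations (group and average, for a bar chart) can be sketched end-to-end; the real Worker covers many chart types, and the function name and rounding are assumptions.

```javascript
function groupAverageOption(rows, groupCol, valueCol) {
  const sums = new Map();
  for (const row of rows) {
    const key = row[groupCol];
    const s = sums.get(key) || { total: 0, n: 0 };
    s.total += Number(row[valueCol]);
    s.n += 1;
    sums.set(key, s);
  }
  const categories = [...sums.keys()];
  const averages = categories.map((k) => {
    const { total, n } = sums.get(k);
    return Math.round((total / n) * 100) / 100;
  });
  // Deterministic JSON description: ECharts draws exactly this.
  return {
    xAxis: { type: "category", data: categories },
    yAxis: { type: "value" },
    series: [{ type: "bar", data: averages }],
  };
}
```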
8. The chart renders in your browser
ECharts (an open-source charting library) reads the option JSON and draws an interactive chart on an HTML canvas. You can hover for tooltips, zoom, and pan. Theme colors, fonts, titles, and toggles from Step 3 are applied as a layer on top.
Tech: ECharts v5, canvas renderer. Themes are applied client-side via applyThemeToOption() which deep-clones the option and injects your color palette, font family, and axis/legend visibility.
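A minimal sketch of the theming layer, assuming a simple theme object (the theme shape is an assumption; only the deep-clone-then-inject approach comes from the description above).

```javascript
function applyThemeToOption(option, theme) {
  const out = structuredClone(option); // never mutate the saved option
  out.color = theme.palette;                       // color palette
  out.textStyle = { fontFamily: theme.fontFamily };// font family
  if (theme.title) out.title = { text: theme.title };
  if (out.legend || theme.showLegend !== undefined) {
    out.legend = { ...(out.legend || {}), show: theme.showLegend !== false };
  }
  return out;
}
```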
9. Thematic analysis: AI + human coding
For open-ended text columns, you can run thematic analysis. The AI reads your responses and generates a codebook of themes (e.g., "Safety concerns," "Community pride"). You review, edit, and approve the themes, then manually tag each response. The tool builds a frequency chart of themes and, when finalized, creates a new categorical variable you can use in charts.
Tech: Codebook generation sends response text to the LLM via the Worker. The coding step is entirely client-side. The new variable is injected into state.parsedData and persists with the project.
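The finalization step (tags become a new categorical column, plus the frequency count behind the theme chart) can be sketched as follows; function names and the "Uncoded" fallback are assumptions.

```javascript
function addThemeVariable(rows, tags, newColumn) {
  // tags[i] is the approved theme for rows[i] (or null if untagged).
  return rows.map((row, i) => ({ ...row, [newColumn]: tags[i] ?? "Uncoded" }));
}

function themeFrequencies(rows, column) {
  const freq = {};
  for (const row of rows) freq[row[column]] = (freq[row[column]] || 0) + 1;
  return freq;
}
```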
10. Everything auto-saves
Project settings (theme, colors, titles, step progress, curated variables, recode mappings) auto-save to D1 after 3 seconds of inactivity. Charts are saved immediately when created. When you log in again, the Worker loads everything back and your browser reconstructs the full UI state, including which analysis stages are unlocked.
Tech: Debounced save (3s) via scheduleSave(). On load, the Worker returns the full project in one response. A project snapshot is automatically saved on each load as a safety net.
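The debounce itself is a few lines: every edit restarts the 3-second window, so a burst of edits collapses into one write. This sketch makes the timer injectable for testing; the real scheduleSave() presumably uses setTimeout directly.

```javascript
function makeScheduleSave(saveFn, delayMs = 3000, schedule = setTimeout, cancel = clearTimeout) {
  let timer = null;
  return function scheduleSave(projectState) {
    if (timer !== null) cancel(timer); // restart the inactivity window
    timer = schedule(() => {
      timer = null;
      saveFn(projectState); // e.g. POST the settings to the Worker, then D1
    }, delayMs);
  };
}
```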
What about cost? The language model (Gemma 3) is open-source and runs on hardware the school already owns. Cloudflare's free and paid tiers handle the storage and serverless compute. There are no per-question API fees.