Scoring

A lyric score that actually means something

Every lyric is measured on 12 metrics grouped into three tiers: Craft, Expression, and Impact. Scores are deliberately hard. A 50 is average. An 80+ is strong. A 90+ is rare.

12 metrics · Evidence-based scoring · Anti-inflation built in

12 metrics across Craft, Expression, and Impact

Weighted composite: Expression counts most (40%)

Anti-inflation rules prevent meaningless high scores

Every score includes per-metric reasoning and evidence

Deliberately hard — a 50 is average, not a failing grade

Craft (25%)

Can this person write? Mechanics, structure, rhyme, and word choice.

Expression (40%)

Does it say something worth hearing? Specificity, originality, truth, and voice.

Impact (35%)

Will anyone remember it tomorrow? Transcendence, arc, stickiness, and genre fit.

Sample scorecard

What an actual evaluation looks like — annotated.

Composite

Grade B+

Genre

Country

Top 22% in genre

Prosody: 74

Strong natural rhythm, one forced rhyme in V2

Structure: 80

Clean arc, bridge earns its place

Rhyme: 72

Good slant rhyme use, one predictable end-rhyme

Economy: 76

Tight overall, two filler words in chorus

Specificity: 85

"Tangerines and someone else's smile" — earned

Imagery: 82

Original governing image, one stock metaphor

Emotion: 79

Rings true. Bridge vulnerability is genuine.

Voice: 77

Consistent narrator, one POV slip in V3

Transcendence: 81

Line 14 is the one. "Drove home with the windows down to forget it."

Arc: 75

Moves from avoidance to acceptance. Could push further.

Memorable: 73

Chorus hook is sticky, verses less so

Genre: 80

Authentic country with modern specificity

We built anti-inflation into the scoring system so that high scores actually mean something.

Gravity Rule

The default is 50, not 80. Every point above average must be earned with specific evidence from the lyrics.

Burden of Proof

Scores above 80 require the scorer to cite specific lines and explain why they justify the number.

Antagonist Ceiling

A dedicated critical voice challenges every score. If it finds a real weakness, the score drops.

Historical Context

Scores are anchored to professional craft standards. A 90+ means near-flawless execution across all 12 metrics — intentionally rare.
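The Gravity Rule and Burden of Proof can be read as simple checks on each metric score. The sketch below is illustrative only: `MetricScore`, `violations`, and the field names are hypothetical, and the real scorer may enforce these rules differently. The 50 baseline and the 80 threshold are the figures stated above.

```python
from dataclasses import dataclass, field

BASELINE = 50            # Gravity Rule: the default score
EVIDENCE_THRESHOLD = 80  # Burden of Proof: above this, cited lines are required


@dataclass
class MetricScore:
    """Hypothetical record for one metric's score and its justification."""
    metric: str
    score: int
    reasoning: str = ""
    cited_lines: list[str] = field(default_factory=list)


def violations(ms: MetricScore) -> list[str]:
    """Return the anti-inflation rules this score fails (empty list if none)."""
    problems = []
    # Gravity Rule: anything above the baseline must be explained.
    if ms.score > BASELINE and not ms.reasoning:
        problems.append("above-baseline score has no reasoning")
    # Burden of Proof: scores above 80 must cite specific lines.
    if ms.score > EVIDENCE_THRESHOLD and not ms.cited_lines:
        problems.append("score above 80 cites no lines")
    return problems


justified = MetricScore(
    "Specificity", 85, "Concrete, earned image",
    ["Tangerines and someone else's smile"],
)
unsupported = MetricScore("Rhyme", 88)

print(violations(justified))    # no violations
print(violations(unsupported))  # fails both rules
```

A score that fails either check would go back for evidence rather than stand as written.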

Methodology: how scoring works

Every song is scored by a separate AI evaluation pass — not the same model that wrote the lyrics. Multiple evaluators with different perspectives must reach consensus on each of the 12 metrics.

A dedicated critical voice challenges every score. If it identifies a real weakness — a cliché, a broken meter, a forced rhyme — the score drops. Unresolved objections cap the composite depending on severity.

This multi-voice process is designed to prevent the score inflation that single-pass AI evaluation tends to produce. Scores are calibrated against professional songwriting craft, not against other AI output.

What “deliberately hard” means: a single-pass AI scorer tends to score most output at 80 or above. Our multi-voice process produces a distribution centered around 50, because the default assumption is “average until proven otherwise.” Scores above 80 require the scorer to cite specific lines. Scores above 90 require near-flawless execution across all 12 metrics, so they are rare as a consequence of the standard, not because of an arbitrary quota.
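One way to picture the consensus step and the objection cap is the sketch below. It rests on two assumptions that this page does not confirm: consensus is approximated by the median of the evaluators' scores, and the severity names and cap values in `SEVERITY_CAPS` are invented for illustration. The actual process and ceilings are not published here.

```python
from statistics import median

# Hypothetical severity-to-ceiling table; the real values are not published.
SEVERITY_CAPS = {"minor": 89, "major": 79, "critical": 64}


def consensus(evaluator_scores):
    """Stand-in for consensus: the median of the evaluators' scores (an assumption)."""
    return median(evaluator_scores)


def apply_caps(composite, unresolved_severities):
    """Cap the composite at the lowest ceiling among unresolved objections."""
    caps = [SEVERITY_CAPS[s] for s in unresolved_severities]
    return min([composite, *caps])


print(consensus([72, 78, 75]))       # three evaluators on one metric
print(apply_caps(86.0, ["major"]))   # an unresolved major objection lowers the ceiling
print(apply_caps(86.0, []))          # no unresolved objections: score unchanged
```

The cap only ever lowers a composite. A song already below the ceiling keeps its score.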

Grade Scale

S+

95-100

Near-flawless across all 12 metrics. Exceptionally rare in practice.

90-94

Exceptional. Every line earns its place with cited evidence.

A+

85-89

Outstanding. Minor imperfections only.

80-84

Strong. Craft is evident throughout.

B+

75-79

Good. Solid work with room to grow.

70-74

Competent. Foundation is there.

C+

65-69

Developing. Moments of promise.

55-64

Average. Functional but unremarkable.

40-54

Below average. Significant gaps.

0-39

Needs fundamental rework.
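For anyone working with scores programmatically, the scale above amounts to a lookup by lower bound. This is a minimal sketch: `BANDS` and `band_for` are hypothetical names, letter grades not shown on this page are left as `None` and not guessed, and fractional scores are assumed to fall into the band of the whole number below them.

```python
# (lower bound, letter grade if shown on this page, description)
BANDS = [
    (95, "S+", "Near-flawless"),
    (90, None, "Exceptional"),
    (85, "A+", "Outstanding"),
    (80, None, "Strong"),
    (75, "B+", "Good"),
    (70, None, "Competent"),
    (65, "C+", "Developing"),
    (55, None, "Average"),
    (40, None, "Below average"),
    (0, None, "Needs fundamental rework"),
]


def band_for(score):
    """Return (letter grade or None, description) for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # BANDS is sorted from highest to lowest, so the first match wins.
    for lower, grade, description in BANDS:
        if score >= lower:
            return grade, description


print(band_for(78))  # ('B+', 'Good')
print(band_for(92))  # (None, 'Exceptional')
```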

How the composite score works

Each metric scores 0-100. The composite is a weighted average across the three tiers:

Craft (25%) + Expression (40%) + Impact (35%) = Composite
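As a worked example, here is the weighted average applied to the sample scorecard. Two things are assumed and not stated on this page: each tier score is the plain mean of its four metrics, and the metrics are grouped into tiers in the order they appear on the scorecard. The function and variable names are illustrative.

```python
from statistics import mean

# Tier weights as stated on this page.
TIER_WEIGHTS = {"craft": 0.25, "expression": 0.40, "impact": 0.35}

# Metric scores copied from the sample scorecard.
# The grouping into tiers is an assumption based on scorecard order.
SAMPLE = {
    "craft": {"prosody": 74, "structure": 80, "rhyme": 72, "economy": 76},
    "expression": {"specificity": 85, "imagery": 82, "emotion": 79, "voice": 77},
    "impact": {"transcendence": 81, "arc": 75, "memorable": 73, "genre": 80},
}


def composite(scores, weights=TIER_WEIGHTS):
    """Weighted average of tier scores.

    Assumption: a tier score is the unweighted mean of its metrics.
    """
    return sum(
        weights[tier] * mean(metrics.values())
        for tier, metrics in scores.items()
    )


print(round(composite(SAMPLE), 1))  # 78.2
```

Under these assumptions the sample works out to about 78.2 (tier means of 75.5, 80.75, and 77.25), which falls in the 75-79 band and matches the B+ shown on the scorecard.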

What a score should help you do

Spot the exact weakness in a lyric

Decide whether to refine or move on

Compare multiple versions objectively

Know whether a song is worth taking into production

See it in action

Every song you forge or evaluate gets a full 12-metric breakdown with reasoning per metric.
