How Content is Graded

Score Range: 0-100 points across 7 categories

Letter Grades:

  • A+ (97-100), A (93-96), A- (90-92)
  • B+ (87-89), B (83-86), B- (80-82)
  • C+ (77-79), C (73-76), C- (70-72)
  • D+ (67-69), D (63-66), D- (60-62)
  • F (0-59)
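
The grade bands above can be expressed as a small lookup. This is a minimal sketch, not the platform's code: the names `GRADE_BANDS` and `letter_grade` are hypothetical, and it assumes a fractional score (penalties can be 0.5 points) falls into the band whose lower bound it meets.

```python
# Lower bound of each letter grade band, highest first (taken from the list above).
GRADE_BANDS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(score: float) -> str:
    """Return the letter grade for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, letter in GRADE_BANDS:
        if score >= lower_bound:
            return letter
    # Anything below 60 is an F.
    return "F"
```

For example, `letter_grade(88)` returns `"B+"` and `letter_grade(59)` returns `"F"`.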

Category Breakdown:

  1. Substance (22 points) - Content depth, accuracy, value delivery
  2. Voice (17 points) - Brand consistency, tone, conversational markers
  3. Structure (17 points) - H2/H3 hierarchy, section balance, flow
  4. Clarity (17 points) - Readability, jargon-free language, sentence length
  5. Engagement (17 points) - Hooks, CTAs, reader connection
  6. SEO (10 points) - Keyword placement, meta elements, search optimization
  7. Visual (10 points) - Formatting, scannability (not penalized - images generated separately)
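
The point values listed for Substance through SEO add up to 100. The sketch below sums those six category scores and leaves Visual out because it is not penalized. The names are hypothetical, and summing category scores is an assumption about how the overall score is combined.

```python
# Maximum points per penalized category, from the breakdown above.
CATEGORY_MAX = {
    "Substance": 22,
    "Voice": 17,
    "Structure": 17,
    "Clarity": 17,
    "Engagement": 17,
    "SEO": 10,
}


def overall_score(category_scores: dict) -> float:
    """Sum category scores after checking each one against its maximum.

    Categories missing from the input count as 0 (an assumption).
    """
    total = 0.0
    for name, maximum in CATEGORY_MAX.items():
        score = category_scores.get(name, 0.0)
        if not 0 <= score <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")
        total += score
    return total
```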

What Gets Detected

53 Issue Types Across 8 Categories:

Hook & Opening (5 types)

  • Weak opening - Generic first sentence
  • Generic intro - "In today's world..." phrases
  • Buried lede - Key point not in first 50 words
  • No promise - Opening doesn't state reader value
  • Slow start - Too much preamble

Structure (6 types)

  • Poor hierarchy - H3 without H2, skipped levels
  • Unbalanced sections - Sections with wildly different lengths
  • Missing H2 headers - No H2 tags at all (knocks Structure to 0)
  • Too few sections - Fewer than 3 H2 sections
  • Wall of text - Paragraphs exceeding 200 words
  • Overly thin sections - Sections under 100 words
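
Several of the structure checks above are simple counts. Here is a rough sketch for simple, well-formed HTML, using the thresholds from the list (fewer than 3 H2 sections, paragraphs over 200 words, sections under 100 words). Apart from `no_h2_headers`, which appears in the Penalty System section, the issue identifiers and function names are hypothetical. The regex-based parsing is a simplification, not the platform's detector.

```python
import re


def word_count(html: str) -> int:
    """Count words after removing HTML tags."""
    return len(re.sub(r"<[^>]+>", " ", html).split())


def structure_issues(html: str) -> list:
    issues = []
    # Split on opening <h2> tags; the first chunk is the intro before any H2.
    sections = re.split(r"<h2\b[^>]*>", html, flags=re.IGNORECASE)[1:]
    if not sections:
        issues.append("no_h2_headers")
    elif len(sections) < 3:
        issues.append("too_few_sections")
    for section in sections:
        # Drop the heading text itself before counting body words.
        body = re.sub(r"^.*?</h2>", "", section, count=1,
                      flags=re.IGNORECASE | re.DOTALL)
        if word_count(body) < 100:
            issues.append("thin_section")
    paragraphs = re.findall(r"<p\b[^>]*>(.*?)</p>", html,
                            flags=re.IGNORECASE | re.DOTALL)
    for paragraph in paragraphs:
        if word_count(paragraph) > 200:
            issues.append("wall_of_text")
    return issues
```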

Substance (7 types)

  • Thin content - Total word count too low
  • Lack of specifics - Vague claims without evidence
  • No examples - Missing concrete examples
  • Unsupported claims - Claims without backing
  • Missing data - No statistics or facts
  • Surface level - Doesn't go deep enough
  • Repetitive - Redundant points

Voice (12 types)

  • AI-isms - "Delve," "navigate," "realm," "landscape," "robust," "streamline," etc.
  • Clichés - "At the end of the day," "game-changer," "cutting-edge"
  • Conversational markers - Excessive "you," "your," "we"
  • Corporate jargon - "Leverage," "synergy," "holistic"
  • Fake specificity - Precise-sounding promises ("5 easy steps") that the content never backs up
  • Formulaic openings - "When it comes to," "In today's world"
  • Repetitive voice framing - Overuse of same framing device
  • Negative phrasing - "Don't miss," "avoid," "prevent"
  • Vague language - "Some," "many," "various"
  • Passive voice - Exceeding 10% of sentences
  • Em-dashes - Prohibited punctuation (highest priority AI signal)
  • Fabrication patterns - Phrases that signal made-up content
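
Two of the voice checks above, AI-isms and em-dashes, come down to pattern counting. The sketch below uses only the example terms quoted in the list, since the platform's full word lists are not shown here. The function names are hypothetical.

```python
import re

# Example terms quoted in the list above; the real list is likely longer.
AI_ISMS = ["delve", "navigate", "realm", "landscape", "robust", "streamline"]
EM_DASH = "\u2014"

# Match each term at a word start, allowing simple suffixes ("delves", "realms").
AI_ISM_PATTERN = re.compile(r"\b(?:" + "|".join(AI_ISMS) + r")\w*", re.IGNORECASE)


def count_ai_isms(text: str) -> int:
    """Count case-insensitive occurrences of the listed AI-ism terms."""
    return len(AI_ISM_PATTERN.findall(text))


def count_em_dashes(text: str) -> int:
    """Count em-dash characters in the text."""
    return text.count(EM_DASH)
```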

Clarity (7 types)

  • Long sentences - Average sentence length above 35 words
  • Dense paragraphs - Paragraphs over 150 words
  • Unclear antecedents - Ambiguous pronouns
  • Run-on sentences - Multiple clauses without breaks
  • Complex vocabulary - Unnecessarily difficult words
  • Jargon heavy - Too much industry terminology
  • Poor transitions - Abrupt section changes
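
The long-sentence check above depends on average sentence length. Here is a minimal sketch, assuming a naive split on `.`, `!`, and `?` that will miscount abbreviations and decimals. The function names are hypothetical.

```python
import re


def average_sentence_length(text: str) -> float:
    """Average number of words per sentence, using a naive sentence split."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def has_long_sentences(text: str, limit: int = 35) -> bool:
    """True when the average sentence length exceeds the limit (35 words above)."""
    return average_sentence_length(text) > limit
```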

Engagement (6 types)

  • No hook - Missing engaging opening
  • Weak CTA - Call-to-action is generic or missing
  • No questions - Content doesn't engage reader thinking
  • Rhetorical questions - Overused rhetorical devices
  • No reader benefit - Doesn't explain "what's in it for me"
  • Boring subheads - H2/H3 tags don't create curiosity

SEO (5 types)

  • Missing keyword - Target keyword not in first 100 words
  • Keyword stuffing - Keyword density exceeds 3%
  • No keyword in title - Target keyword missing from H1/title
  • No bold keywords - Target keyword never bolded
  • Thin meta - Missing or poor meta description
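
The keyword checks above can be sketched as follows. The density formula is an assumption: words belonging to keyword matches divided by total words. The platform may compute it differently. All function names are hypothetical.

```python
import re


def tokenize(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_in_first_words(text: str, keyword: str, limit: int = 100) -> bool:
    """True when the keyword phrase appears within the first `limit` words."""
    head = " ".join(tokenize(text)[:limit])
    phrase = " ".join(tokenize(keyword))
    # Pad with spaces so "cat" does not match inside "category".
    return bool(phrase) and f" {phrase} " in f" {head} "


def keyword_density(text: str, keyword: str) -> float:
    """Share of words that belong to a keyword match (assumed definition)."""
    tokens = tokenize(text)
    key = tokenize(keyword)
    if not tokens or not key:
        return 0.0
    n = len(key)
    hits = sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == key)
    return hits * n / len(tokens)


def is_keyword_stuffed(text: str, keyword: str, limit: float = 0.03) -> bool:
    """True when density exceeds the 3% threshold listed above."""
    return keyword_density(text, keyword) > limit
```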

Formatting (5 types)

  • BR tags - Using <br> instead of <p> tags
  • Markdown remnants - **bold** left in place instead of HTML bold tags
  • Placeholders - [INSERT X], [ADD Y] still present
  • H1 tags in body - Title should be separate
  • Inconsistent formatting - Mixed styles

Total: 53 issue types monitored

Auto-Fixing Process

Content goes through a three-phase auto-fixing pipeline before grading:

Phase 1: Programmatic Fixes (No AI cost)

  • BR tags → <p> paragraphs
  • Markdown → HTML conversion
  • Placeholder removal ([INSERT], [ADD])
  • H1 tag stripping
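
The Phase 1 fixes are plain text transformations, which is why they carry no AI cost. The sketch below shows one way they could look with regular expressions. It is not the platform's implementation. It assumes Markdown bold maps to `<strong>` and that `<br>` runs sit inside an existing `<p>` element. The function name is hypothetical.

```python
import re


def programmatic_fixes(html: str) -> str:
    # Markdown bold -> HTML bold (the <strong> tag is an assumption).
    html = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", html)
    # Remove bracketed placeholders such as [INSERT X] or [ADD Y].
    html = re.sub(r"\[(?:INSERT|ADD)\b[^\]]*\]", "", html)
    # Strip H1 elements from the body; the title is kept separately.
    html = re.sub(r"<h1\b[^>]*>.*?</h1>", "", html,
                  flags=re.IGNORECASE | re.DOTALL)
    # Turn runs of <br> inside a paragraph into paragraph breaks.
    html = re.sub(r"\s*(?:<br\s*/?>\s*)+", "</p><p>", html,
                  flags=re.IGNORECASE)
    return html
```

For example, `<p>First.<br><br>Second.</p>` becomes `<p>First.</p><p>Second.</p>`.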

Phase 2: Em-Dash Fixing (Highest priority)

  • Separate AI call
  • Context-aware replacements
  • Critical signal of AI-generated text

Phase 3: Style & Word Count (Combined call)

  • Negative phrasing removal
  • Rhetorical question removal
  • Cliché elimination
  • Formulaic opening fixes
  • Word count reduction (if over target by 10%+)

What is NOT Auto-Fixed:

  • Structure issues (heading hierarchy)
  • Substance issues (missing examples, thin content)
  • Most engagement issues (weak CTAs, no hook)
  • SEO issues (keyword placement)

These require regeneration with updated prompt instructions.

Penalty System

Issue Severity:

  • Critical: 3 points per occurrence (no cap)
  • High: 2 points per occurrence (capped at 10)
  • Medium: 1 point per occurrence (capped at 6)
  • Low: 0.5 points per occurrence (capped at 4)

Knockout Issues (Zero out entire category):

  • no_h2_headers - No H2 tags → Structure = 0 points

Example Scoring:

  • Content has 5 AI-isms (critical): -15 points from Voice
  • Content has 8 clichés (high): -10 points (capped) from Voice
  • Content has weak opening (high): -2 points from Engagement
  • Missing keyword in title (medium): -1 point from SEO
  • Total penalty: -28 points
  • Final score: 72/100 (C-)
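
The example above can be reproduced in a few lines. This sketch assumes each cap applies per issue type, which matches the clichés line (8 × 2 = 16, capped at 10). It does not model knockout issues or any floor on a category's score. The issue identifiers and function name are hypothetical.

```python
# (points per occurrence, cap or None) from the severity list above.
SEVERITY = {
    "critical": (3.0, None),
    "high": (2.0, 10.0),
    "medium": (1.0, 6.0),
    "low": (0.5, 4.0),
}


def issue_penalty(severity: str, occurrences: int) -> float:
    """Penalty for one issue type, with the severity cap applied."""
    per_occurrence, cap = SEVERITY[severity]
    penalty = per_occurrence * occurrences
    return penalty if cap is None else min(penalty, cap)


# The four issues from the example: (hypothetical id, severity, occurrences).
example_issues = [
    ("ai_isms", "critical", 5),
    ("cliches", "high", 8),
    ("weak_opening", "high", 1),
    ("no_keyword_in_title", "medium", 1),
]

total_penalty = sum(issue_penalty(sev, n) for _, sev, n in example_issues)
final_score = 100 - total_penalty
```

Here `total_penalty` is 28 and `final_score` is 72, which the letter grade list places in the C- band (70-72).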

Where You See Grades

Content Hub - Grade badge next to each content piece

Content Detail Page - Full breakdown:

  • Overall grade (letter + score)
  • Category scores (7 individual scores)
  • Issues detected (full list with severity)
  • Issues fixed (what was auto-corrected)
  • Issues remaining (what needs manual attention)

Brief Detail Page - If content generated from brief

Email Reports - Content generation completion emails include grade

Improving Your Grade

For Substance Issues:

  • Regenerate with "Add specific examples" instruction
  • Manually add data, statistics, case studies
  • Deepen analysis in weak sections

For Voice Issues:

  • Use Learning Quality System (platform learns from your edits)
  • Mark false positives as "Not an issue" to suppress detection
  • Regenerate with updated brand voice samples

For Structure Issues:

  • Manually fix heading hierarchy (H2 → H3 nesting)
  • Add more H2 sections if too few
  • Break up long sections into subsections

For Clarity Issues:

  • Break long sentences into shorter ones
  • Simplify complex vocabulary
  • Add transitions between sections

For Engagement Issues:

  • Add strong opening hook (first 2 sentences)
  • Include clear CTAs in conclusion
  • Make subheadings more curiosity-driven

For SEO Issues:

  • Manually bold target keyword 2-3 times
  • Add keyword to first 100 words
  • Ensure keyword in title/H1

Learning Quality System

The platform learns from your edits to improve over time:

How It Works:

  1. Content is graded with issues detected
  2. You edit the content or mark issues as "Not an issue"
  3. Platform tracks which issues you consistently ignore or fix
  4. Future content auto-prevents recurring issues per tenant
  5. False-positive detections get suppressed

Result: Quality improves 60% in first month of use

Example:

  • Week 1: "Rhetorical questions" flagged 12 times
  • You mark 8 as "Not an issue" (these are strategic questions)
  • Week 4: Platform stops flagging strategic questions in your content
  • Only excessive rhetorical questions get flagged

Quality Monitoring System

ILLIXIS runs background monitoring to detect quality issues across your content generation pipeline.

Brief Quality Monitoring

Schedule: Daily

Checks for:

  • Invariant violations: Briefs missing required fields (keyword, title, content strategy)
  • SERP pending: Briefs stuck waiting for SERP analysis
  • Incomplete data: Briefs with partial analysis data
  • Error rates: Percentage of briefs failing during creation

What happens when issues are detected:

  • Dashboard notification appears in Strategy Hub
  • If error rate exceeds 10%, system alert is logged
  • Admins can view details in Settings > System Health

Content Generation Monitoring

Schedule: Daily

Monitors:

  • Truncation rate: Content cut off due to token limits
  • Chart failures: Charts that failed to render
  • Image failures: Hero images that didn't generate
  • Success rates by brief type: Which brief types produce best results

Quality metrics tracked:

| Metric | Threshold | Action if exceeded |
|--------|-----------|-------------------|
| Truncation rate | >5% | Prompt optimization review |
| Chart failure rate | >10% | Playwright health check |
| Image failure rate | >15% | AI image generation investigation |
| Overall error rate | >10% | System alert generated |
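
The thresholds in the table can be read as a simple rule set. In the sketch below, the metric keys and function name are hypothetical, and rates are fractions (0.05 means 5%). An action triggers only when the rate is strictly above its threshold, matching the ">" in the table.

```python
# metric name -> (threshold as a fraction, action from the table above)
THRESHOLDS = {
    "truncation_rate": (0.05, "Prompt optimization review"),
    "chart_failure_rate": (0.10, "Playwright health check"),
    "image_failure_rate": (0.15, "AI image generation investigation"),
    "overall_error_rate": (0.10, "System alert generated"),
}


def triggered_actions(metrics: dict) -> list:
    """Return the actions whose metric is strictly above its threshold.

    Metrics missing from the input are treated as 0 (an assumption).
    """
    return [
        action
        for name, (limit, action) in THRESHOLDS.items()
        if metrics.get(name, 0.0) > limit
    ]
```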

Viewing quality metrics: Settings > System Health shows:

  • 7-day rolling quality metrics
  • Breakdown by brief type (keyword, trend, link magnet, etc.)
  • Comparison to previous week

Automation Schedule

Content grading runs automatically at key points to ensure your content quality is always tracked:

On Content Generation:

  • Grading runs immediately when new content is generated
  • Auto-fixing pipeline (Phase 1-3) executes before grading
  • Grade is assigned and stored with the content piece
  • Improvement suggestions are generated alongside the grade

Weekly Bulk Re-Grading:

  • Existing content is re-graded every Saturday at 2:00 AM UTC
  • This catches improvements from Learning Quality System updates
  • Re-grading uses latest detection rules and thresholds
  • Only content that has changed since last grading is re-processed
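
To work out when the next weekly run falls relative to a given moment, a short sketch with the standard library is enough. The function name is hypothetical, and it expects a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone


def next_regrade_run(now: datetime) -> datetime:
    """Return the next Saturday 02:00 UTC strictly after `now`."""
    now = now.astimezone(timezone.utc)
    # Monday is 0 in datetime.weekday(), so Saturday is 5.
    days_ahead = (5 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=2, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # Already past this week's run, so move to next Saturday.
        candidate += timedelta(days=7)
    return candidate
```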

Grade History:

  • Every grading run is stored for trend analysis
  • View grade changes over time in Content Detail page
  • Track how your content quality evolves month-over-month
  • Identify patterns in which categories improve or decline

Improvement Suggestions:

  • Fresh suggestions generate with each grading run
  • Suggestions prioritized by impact (highest point gains first)
  • Suggestions become more refined as platform learns your preferences
  • Stale suggestions (issues already fixed) are automatically cleared

Grade vs. Publish Decision

Grade ≠ Ready to Publish

A C-grade article with strong substance may outperform an A+ article with thin value.

Consider:

  • B+ or higher - Usually ready with minor tweaks
  • B to C+ - Review issues, fix critical ones
  • C to D- - Significant issues, consider regeneration
  • F - Major problems, regenerate with better prompt

Strategic Exceptions:

  • Pillar content targeting high-competition keywords: aim for A- or higher
  • Quick blog posts for long-tail keywords: B is often sufficient
  • Social repurposing content: C+ is fine (brevity over depth)
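
The guidance above can be turned into a quick triage helper. The score cutoffs come from the letter grade bands (B+ starts at 87, C+ at 77, A- at 90, B at 83). Treating D- like the rest of the D range is an assumption, and the content type keys and function names are hypothetical. As the section says, the result is a starting point, not a publish decision.

```python
def publish_guidance(score: float) -> str:
    """Map a 0-100 score to the guidance listed above."""
    if score >= 87:  # B+ or higher
        return "Usually ready with minor tweaks"
    if score >= 77:  # B down to C+
        return "Review issues, fix critical ones"
    if score >= 60:  # C down through the D range (D- included by assumption)
        return "Significant issues, consider regeneration"
    return "Major problems, regenerate with better prompt"


# Minimum score per content type, from the strategic targets above.
TARGET_MIN_SCORE = {
    "pillar": 90,      # A- or higher
    "quick_blog": 83,  # B
    "social": 77,      # C+
}


def meets_target(score: float, content_type: str) -> bool:
    """True when the score reaches the target for this content type."""
    return score >= TARGET_MIN_SCORE[content_type]
```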

Viewing Grade Details

Navigate: Content Hub → [Content Piece] → View Details

You'll see:

  • Overall grade badge
  • Score breakdown by category
  • Full issue list (detected, fixed, remaining)
  • Recommendations for improvement
  • Grading checklist (what was checked)

To regrade after manual edits:

  • Click "Regrade Content" button
  • Platform re-runs detection on current content
  • New grade calculated based on remaining issues

FAQs

Q: Why did my content get a C when it looks good?
A: Check the category breakdown. You may have high substance but poor structure (no H2 headers) or excessive AI-isms in voice.

Q: Can I disable certain issue detections?
A: Use the Learning Quality System. Mark false positives as "Not an issue" and the platform will stop flagging them for your tenant.

Q: Why aren't all issues auto-fixed?
A: Some issues (structure, substance, engagement) require regeneration with updated prompts. Auto-fixing is limited to style and formatting issues that the AI can safely correct without changing meaning.

Q: Does a higher grade mean better SEO performance?
A: Not necessarily. The grade measures content quality, not whether the article matches search intent. A B-grade article that matches search intent can outrank an A+ article targeting the wrong intent.

Q: What's the average grade for generated content?
A: First-generation content typically scores B to B+ (83-89). After auto-fixing and learning from your edits, the average grade improves to A- (90-92) within 30 days.

Q: Why is Visual worth 10 points if it's not penalized?
A: Visual scoring was removed in December 2024 because images are generated separately. The 10 points were redistributed to other categories (Substance, Voice, Structure, Clarity, and Engagement each gained 2 points).

Q: Can I export grade data?
A: Yes. Export grade data via Content Hub filters or API access.
