SharePoint Syntex: AI-Powered Document Understanding

Introduction

Tagging documents with metadata by hand becomes a bottleneck as document volumes grow. Syntex applies AI to classify documents, extract their metadata, and route them automatically.

Prerequisites

  • SharePoint Online
  • Syntex license or trial
  • Document library with sample files

Syntex Model Types

Model Type                        Technique                Use Case
--------------------------------  -----------------------  ------------------------------------
Unstructured Document Processing  ML training              Contracts, proposals, reports
Freeform Selection                Layout-based extraction  Invoices, forms with fixed structure
Structured Document Processing    Pre-built AI             Receipts, invoices (prebuilt)

Step-by-Step Guide

Step 1: Create Content Center

New-SPOSite -Url https://contoso.sharepoint.com/sites/contentcenter -Title "Content Center" -Template CONTENTCTR#0 -Owner admin@contoso.com -StorageQuota 1024

Step 2: Train Unstructured Model

Scenario: Classify and extract data from contracts

  1. Navigate to Content Center → Create Model → Unstructured
  2. Add example files (10+ positive examples, 5+ negative)
  3. Teach model to identify "Contract" vs "Not Contract"
  4. Create extractors:
    • Client Name
    • Contract Value
    • Expiration Date
  5. Train each extractor with labeled examples
  6. Test model accuracy
  7. Publish to target library
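
Item 6 above (testing model accuracy) can be made concrete with a small scoring script. The following is a minimal sketch, assuming you have recorded by hand the expected and predicted label for each test document; the sample records and the Get-ClassificationMetrics function are hypothetical illustrations and are not part of Syntex.

```powershell
# Hypothetical test results: one record per held-out document.
# Expected = label assigned by a person, Predicted = label assigned by the model.
$results = @(
    [pscustomobject]@{ File = 'msa-2023.docx';   Expected = 'Contract';     Predicted = 'Contract' }
    [pscustomobject]@{ File = 'nda-acme.pdf';    Expected = 'Contract';     Predicted = 'Contract' }
    [pscustomobject]@{ File = 'sow-draft.docx';  Expected = 'Contract';     Predicted = 'Not Contract' }
    [pscustomobject]@{ File = 'proposal-q3.pdf'; Expected = 'Not Contract'; Predicted = 'Contract' }
    [pscustomobject]@{ File = 'minutes.docx';    Expected = 'Not Contract'; Predicted = 'Not Contract' }
)

function Get-ClassificationMetrics {
    param(
        [Parameter(Mandatory)] [object[]] $Results,
        [string] $PositiveLabel = 'Contract'
    )
    # Count true positives, false positives, false negatives, and overall hits.
    $tp = @($Results | Where-Object { $_.Expected -eq $PositiveLabel -and $_.Predicted -eq $PositiveLabel }).Count
    $fp = @($Results | Where-Object { $_.Expected -ne $PositiveLabel -and $_.Predicted -eq $PositiveLabel }).Count
    $fn = @($Results | Where-Object { $_.Expected -eq $PositiveLabel -and $_.Predicted -ne $PositiveLabel }).Count
    $correct = @($Results | Where-Object { $_.Expected -eq $_.Predicted }).Count

    [pscustomobject]@{
        Accuracy  = [math]::Round($correct / $Results.Count, 2)
        Precision = if ($tp + $fp) { [math]::Round($tp / ($tp + $fp), 2) } else { 0 }
        Recall    = if ($tp + $fn) { [math]::Round($tp / ($tp + $fn), 2) } else { 0 }
    }
}

$metrics = Get-ClassificationMetrics -Results $results
$metrics
```

For the sample data this reports an accuracy of 0.6 and precision and recall of 0.67 each. Low precision suggests adding negative examples; low recall suggests adding more varied positive examples.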

Step 3: Freeform Selection Model

For Invoices:

  1. Create model → Freeform document processing
  2. Upload sample invoice
  3. Draw selection boxes around fields:
    • Invoice Number
    • Vendor Name
    • Total Amount
    • Due Date
  4. Add more samples (15+)
  5. Publish to invoices library
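
Once the four fields are being extracted, it can help to sanity-check the values before anything downstream relies on them. This is a rough sketch, assuming the extracted values have been read into a hashtable with the hypothetical keys InvoiceNumber, VendorName, TotalAmount, and DueDate, and that dates are expected in yyyy-MM-dd form; Test-InvoiceFields is an illustrative helper, not a Syntex feature.

```powershell
function Test-InvoiceFields {
    param([Parameter(Mandatory)] [hashtable] $Fields)

    $culture  = [cultureinfo]::InvariantCulture
    $problems = @()

    # Every field drawn on the sample invoice should come back non-empty.
    foreach ($name in 'InvoiceNumber', 'VendorName', 'TotalAmount', 'DueDate') {
        if ([string]::IsNullOrWhiteSpace($Fields[$name])) { $problems += "$name is missing" }
    }

    # Total Amount should parse as a number (invariant culture: 1,250.00).
    $amount = [decimal]0
    if ($Fields['TotalAmount'] -and -not [decimal]::TryParse(
            $Fields['TotalAmount'], [System.Globalization.NumberStyles]::Currency, $culture, [ref]$amount)) {
        $problems += 'TotalAmount is not a number'
    }

    # Due Date should match the assumed yyyy-MM-dd format.
    $date = [datetime]::MinValue
    if ($Fields['DueDate'] -and -not [datetime]::TryParseExact(
            $Fields['DueDate'], 'yyyy-MM-dd', $culture, [System.Globalization.DateTimeStyles]::None, [ref]$date)) {
        $problems += 'DueDate is not in yyyy-MM-dd format'
    }

    # The leading comma keeps the result an array even when it is empty.
    ,$problems
}

$good = Test-InvoiceFields -Fields @{ InvoiceNumber = 'INV-1001'; VendorName = 'Fabrikam'; TotalAmount = '1,250.00'; DueDate = '2025-03-31' }
$bad  = Test-InvoiceFields -Fields @{ InvoiceNumber = 'INV-1002'; VendorName = 'Fabrikam'; TotalAmount = 'abc'; DueDate = '' }
```

Recurring problems on the same field usually mean the selection box needs adjusting or more samples are needed for that field.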

Step 4: Apply Model to Library

Via UI:

Library Settings → Syntex models → Add model → Select trained model

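Via PowerShell (PnP.PowerShell module, connected to the Content Center):

Connect-PnPOnline -Url https://contoso.sharepoint.com/sites/contentcenter -Interactive
Publish-PnPSyntexModel -Model "Contract" -ListWebUrl https://contoso.sharepoint.com/sites/contracts -List "Documents"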

Step 5: Automate with Power Automate

Flow outline:

Trigger: When a file is created or modified (properties only) (SharePoint)
Condition: Content Type equals "Contract"
Action: Start and wait for an approval
  - Assigned to: Manager
  - Details: Contract Value, Expiration Date
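
The flow's decision logic can be prototyped outside Power Automate before you build it. The sketch below is only an illustration of the condition and the approval details: New-ContractApprovalRequest and the property names (Name, Manager, ContractValue, ExpirationDate) are hypothetical and would need to match your library's columns.

```powershell
# Hypothetical helper mirroring the flow's condition and approval details.
function New-ContractApprovalRequest {
    param([Parameter(Mandatory)] [hashtable] $Item)

    # Condition: only files classified as Contract continue.
    if ($Item['ContentType'] -ne 'Contract') { return $null }

    # Action: build the approval payload from the extracted columns.
    [pscustomobject]@{
        Title    = "Approve contract: $($Item['Name'])"
        Approver = $Item['Manager']
        Details  = "Contract Value: $($Item['ContractValue']); Expiration Date: $($Item['ExpirationDate'])"
    }
}

$contract = @{
    Name = 'msa-2023.docx'; ContentType = 'Contract'; Manager = 'manager@contoso.com'
    ContractValue = '48000'; ExpirationDate = '2026-12-31'
}
$memo = @{ Name = 'memo.docx'; ContentType = 'Document' }

$request = New-ContractApprovalRequest -Item $contract   # approval payload
$skipped = New-ContractApprovalRequest -Item $memo       # $null, no approval sent
```

Files with any other content type fall through without an approval, which matches the "no" branch of the condition in the flow.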

Step 6: Content Assembly

Generate documents from templates:

  1. Create modern template in Word with placeholders
  2. Upload to Syntex library
  3. Power Automate action: Generate document from template
  4. Populate placeholders from Dataverse or SharePoint list
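
To illustrate what "populating placeholders" means, here is a small stand-alone sketch of token substitution. It is not how Syntex content assembly works internally; the {{Name}} token syntax, the Expand-Template function, and the sample values are all assumptions made for the example.

```powershell
function Expand-Template {
    param(
        [Parameter(Mandatory)] [string] $Template,
        [Parameter(Mandatory)] [hashtable] $Values
    )
    $result = $Template
    foreach ($key in $Values.Keys) {
        # Literal (non-regex) replace, so values containing $ or . are safe.
        $result = $result.Replace("{{$key}}", [string]$Values[$key])
    }
    # Warn about placeholders that no value was supplied for.
    foreach ($match in [regex]::Matches($result, '\{\{(\w+)\}\}')) {
        Write-Warning "No value supplied for placeholder $($match.Groups[1].Value)"
    }
    $result
}

$template = 'Agreement with {{ClientName}} for {{ContractValue}}, expiring {{ExpirationDate}}.'
$values   = @{ ClientName = 'Fabrikam'; ContractValue = '$12,000'; ExpirationDate = '2026-12-31' }
$document = Expand-Template -Template $template -Values $values
$document
```

In a real workflow the values hashtable would be replaced by a row from Dataverse or a SharePoint list item.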

Step 7: Compliance & Retention

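Retain classified contracts automatically. The command below, run in Security & Compliance PowerShell, updates an existing retention rule so that content whose content type is Contract is kept for 2,555 days (about seven years):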

Set-RetentionComplianceRule -Identity "Contract Retention" -ContentMatchQuery "ContentType:Contract" -RetentionDuration 2555
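
The -RetentionDuration value is expressed in days, so it is worth checking the arithmetic when you translate a policy stated in years. A quick sketch, using a hypothetical contract date, shows that the flat 365-day conversion and the calendar difference can disagree by a couple of days because of leap years:

```powershell
$years = 7

# Flat conversion: 365 days per year, the value used in the rule above.
$flatDays = $years * 365                                   # 2555

# Calendar conversion from a hypothetical contract date, including leap days.
$start = [datetime]::new(2024, 1, 1)
$calendarDays = ($start.AddYears($years) - $start).Days    # 2557

"Flat: $flatDays days; calendar: $calendarDays days"
```

If your retention requirement is defined as exact calendar years, consider rounding the duration up rather than relying on the flat figure.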

Advanced Scenarios

Multi-Language Support

Train models with example documents in each language you expect to process; Syntex detects the document language automatically.

Explanation Mode

Review why the model classified a document the way it did, and check the confidence score for each extractor.

Model Versioning

Retrain a model with new examples, then compare its accuracy against the current version before publishing the new one.

Performance Tuning

  • Provide 15+ positive examples for accurate classification
  • Use 5+ negative examples to reduce false positives
  • Test model on documents not in training set
  • Monitor classification confidence scores
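
Monitoring confidence scores can start as a simple threshold check. The sketch below assumes the classification results have been exported into objects with File, ContentType, and Confidence properties; the sample records and the 0.80 threshold are hypothetical and should be tuned to your own tolerance for errors.

```powershell
# Hypothetical export of classification results from a library.
$items = @(
    [pscustomobject]@{ File = 'msa-2023.docx';  ContentType = 'Contract'; Confidence = 0.97 }
    [pscustomobject]@{ File = 'nda-acme.pdf';   ContentType = 'Contract'; Confidence = 0.91 }
    [pscustomobject]@{ File = 'scan-0042.pdf';  ContentType = 'Contract'; Confidence = 0.62 }
    [pscustomobject]@{ File = 'sow-draft.docx'; ContentType = 'Contract'; Confidence = 0.88 }
)

$threshold = 0.80   # assumed cut-off below which a person reviews the file

# Files the model was unsure about go to a manual review queue.
$needsReview = @($items | Where-Object { $_.Confidence -lt $threshold })

# The average gives a rough trend line to watch after each retraining.
$average = ($items | Measure-Object -Property Confidence -Average).Average

"{0} of {1} files need review; average confidence {2:N2}" -f $needsReview.Count, $items.Count, $average
$needsReview | Select-Object File, Confidence
```

Files that repeatedly land in the review queue are good candidates to add as training examples.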

Monitoring & Reporting

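# List each model in the Content Center and how many libraries it is published to.
# Requires the PnP.PowerShell module.
Connect-PnPOnline -Url https://contoso.sharepoint.com/sites/contentcenter -Interactive
foreach ($model in Get-PnPSyntexModel) {
    $publications = @(Get-PnPSyntexModelPublication -Model $model)
    Write-Host "$($model.Name): published to $($publications.Count) libraries"
}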

Power BI Dashboard:

Connect Power BI to a SharePoint list that tracks model application results, and visualize accuracy trends over time.

Cost Management

Component       Cost Driver         Optimization
--------------  ------------------  ----------------------------
Model training  Per-model setup     Reuse models across sites
Processing      Per-document        Apply models selectively
Storage         Extracted metadata  Retain only essential fields
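
Because processing is driven by document count, a back-of-the-envelope estimate shows what selective application saves. This is a sketch only: the per-document rate of 0.05 and the document volumes are made-up numbers, and Get-MonthlyProcessingCost is a hypothetical helper. Substitute the rate from your own licensing or billing terms.

```powershell
function Get-MonthlyProcessingCost {
    param(
        [Parameter(Mandatory)] [int] $DocumentsPerMonth,
        [Parameter(Mandatory)] [decimal] $RatePerDocument
    )
    # Per-document cost driver: volume multiplied by the unit rate.
    $DocumentsPerMonth * $RatePerDocument
}

$rate = 0.05   # hypothetical rate per processed document

# Model applied to every library versus only the contracts library.
$allLibraries  = Get-MonthlyProcessingCost -DocumentsPerMonth 10000 -RatePerDocument $rate
$contractsOnly = Get-MonthlyProcessingCost -DocumentsPerMonth 2500  -RatePerDocument $rate

"All libraries: $allLibraries; contracts only: $contractsOnly; saved: $($allLibraries - $contractsOnly)"
```

With these assumed figures, restricting the model to the contracts library cuts the estimate from 500 to 125 per month.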

Troubleshooting

Issue: Low classification accuracy
Solution: Add more training examples; include edge cases

Issue: Extractor missing data
Solution: Verify field exists in all training documents; adjust selection region

Issue: Model not applied to new files
Solution: Check library content type configuration; verify model published

Best Practices

  • Start with high-value, high-volume document types
  • Train models iteratively; test before broad rollout
  • Combine Syntex with retention policies for compliance automation
  • Use Power Automate to route classified documents

Key Takeaways

  • Syntex automates metadata extraction and classification via AI.
  • Unstructured models handle varied layouts; freeform models suit fixed forms.
  • Integration with Power Automate enables end-to-end automation.
  • Monitoring model accuracy ensures ongoing value.

Next Steps

  • Deploy contract management automation
  • Integrate with compliance center for automatic labeling
  • Build content assembly workflows


Which document type will you automate first?