SharePoint Syntex: AI-Powered Document Understanding
Introduction
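Manual metadata tagging is a bottleneck: people have to open each document, decide what it is, and type its properties in by hand, which is slow and inconsistent at volume. Syntex applies AI to classify documents, extract their metadata, and route them automatically.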
Prerequisites
- SharePoint Online
- Syntex license or trial
- Document library with sample files
Syntex Model Types
| Model Type | Technique | Use Case |
|---|---|---|
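| Unstructured document processing | Teaching method (model trained on example files) | Contracts, proposals, reports |
| Freeform document processing | Freeform selection method (fields selected on sample files) | Invoices and other documents whose field positions vary |
| Structured document processing | Layout method (fields mapped on a fixed layout) | Forms and invoices with a fixed structure |
| Prebuilt | Pretrained model, no training required | Receipts, invoices |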
Step-by-Step Guide
Step 1: Create Content Center
New-SPOSite -Url https://contoso.sharepoint.com/sites/contentcenter -Template "CONTENTCTR#0" -Owner admin@contoso.com -StorageQuota 1024
Step 2: Train Unstructured Model
Scenario: Classify and extract data from contracts
- Navigate to Content Center → Create Model → Unstructured
- Add example files (aim for 10+ positive examples and 5+ negative; the minimum Syntex accepts is 5 positive and 1 negative)
- Teach model to identify "Contract" vs "Not Contract"
- Create extractors:
  - Client Name
  - Contract Value
  - Expiration Date
- Train each extractor with labeled examples
- Test model accuracy
- Publish to target library
Step 3: Freeform Selection Model
For Invoices:
- Create model → Freeform document processing
- Upload sample invoice
- Draw selection boxes around fields:
  - Invoice Number
  - Vendor Name
  - Total Amount
  - Due Date
- Add more samples (15+)
- Publish to invoices library
Step 4: Apply Model to Library
Via UI:
Library Settings → Syntex models → Add model → Select trained model
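Via PowerShell (PnP.PowerShell module, connected to the Content Center; "Contract.classifier" is an example model name):
Publish-PnPSyntexModel -Model "Contract.classifier" -ListWebUrl "https://contoso.sharepoint.com/sites/contracts" -List "Documents"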
Step 5: Automate with Power Automate
Flow Trigger:
When a file is created or modified (properties only) (SharePoint)
Condition: Content Type equals "Contract"
Action: Start and wait for an approval
  - Approver: Manager
  - Details: Contract Value, Expiration Date
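The flow logic above can be sketched in a few lines to make the condition and the approval payload explicit. This is an illustrative Python model, not Power Automate code: the dictionary keys (`ContentType`, `Manager`, `ContractValue`, `ExpirationDate`) and the `build_approval_request` helper are assumptions made for the example.

```python
from typing import Optional


def build_approval_request(item: dict) -> Optional[dict]:
    """Return an approval payload for contracts, or None for anything else.

    `item` stands in for the file properties the trigger hands to the flow;
    the key names are hypothetical.
    """
    # Condition: only documents classified as "Contract" go to approval.
    if item.get("ContentType") != "Contract":
        return None
    return {
        "approver": item.get("Manager"),
        "details": {
            "Contract Value": item.get("ContractValue"),
            "Expiration Date": item.get("ExpirationDate"),
        },
    }


contract = {
    "ContentType": "Contract",
    "Manager": "manager@contoso.com",
    "ContractValue": 125000,
    "ExpirationDate": "2026-12-31",
}
print(build_approval_request(contract))
print(build_approval_request({"ContentType": "Invoice"}))
```

Filtering on the content type first keeps invoices and other classified documents out of the approval queue.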
Step 6: Content Assembly
Generate documents from templates:
- Create modern template in Word with placeholders
- Upload to Syntex library
- Power Automate action: Generate document using Microsoft Syntex (the exact action name may differ by release)
- Populate placeholders from Dataverse or SharePoint list
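Conceptually, content assembly is placeholder substitution: each placeholder in the template is filled from a column of one list row. The sketch below uses Python's `string.Template` as a stand-in for a modern template; the template text and the field names (`client_name`, `contract_value`, `expiration_date`) are hypothetical.

```python
from string import Template

# Stand-in for a modern template. Real templates are Word files whose
# placeholders are mapped to list columns; this text is made up.
TEMPLATE = Template(
    "Service agreement between Contoso and $client_name.\n"
    "Total value: $contract_value. Expires on $expiration_date."
)


def generate_document(row: dict) -> str:
    """Fill every placeholder from one list row.

    Raises KeyError if the row lacks a field the template needs, which
    surfaces incomplete source data instead of producing a broken document.
    """
    return TEMPLATE.substitute(row)


row = {
    "client_name": "Fabrikam",
    "contract_value": "USD 125,000",
    "expiration_date": "2026-12-31",
}
print(generate_document(row))
```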
Step 7: Compliance & Retention
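Retain classified contracts automatically. The command below runs in Security & Compliance PowerShell and assumes a retention rule named "Contract Retention" already exists; 2555 days is seven 365-day years: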
Set-RetentionComplianceRule -Identity "Contract Retention" -ContentMatchQuery "ContentType:Contract" -RetentionDuration 2555
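The duration in the command above is expressed in days, so it helps to check what 2555 days means on a calendar. A minimal sketch, assuming the period is counted from a known start date (which date retention actually starts from depends on how the rule is configured):

```python
from datetime import date, timedelta

RETENTION_DAYS = 2555  # the value passed to -RetentionDuration above


def retention_end(start: date, days: int = RETENTION_DAYS) -> date:
    """Date on which a retention period of `days` days ends, counted from `start`."""
    return start + timedelta(days=days)


print(RETENTION_DAYS == 7 * 365)        # True: seven 365-day years
print(retention_end(date(2024, 1, 1)))  # 2030-12-30
```

Because 2555 ignores leap days, the period ends slightly before the seventh calendar anniversary; use a larger value if the requirement is seven full calendar years.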
Advanced Scenarios
Multi-Language Support
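Train models with example documents in each language you expect to process. Syntex detects the document language automatically, but language coverage varies by model type, so confirm support before rollout.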
Explanation Mode
Review why the model classified a document the way it did, and check the confidence score reported for each extractor.
Model Versioning
Retrain the model with new examples, then compare its accuracy against the current version before publishing the new one.
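The comparison step can be made concrete by scoring both versions against the same hand-labeled documents. The sketch below is an assumption about how you might record and compare results yourself; `accuracy`, `should_publish`, and the sample labels are hypothetical and not part of Syntex.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of documents whose predicted class matches the human label."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("need equally sized, non-empty sequences")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def should_publish(current: float, candidate: float, min_gain: float = 0.0) -> bool:
    """Publish the retrained model only if it beats the current one by min_gain."""
    return candidate >= current + min_gain


# Human labels and each version's predictions for the same five documents.
labels = ["Contract", "Contract", "Other", "Contract", "Other"]
current_preds = ["Contract", "Other", "Other", "Contract", "Contract"]
candidate_preds = ["Contract", "Contract", "Other", "Contract", "Contract"]

current_acc = accuracy(current_preds, labels)
candidate_acc = accuracy(candidate_preds, labels)
print(current_acc, candidate_acc, should_publish(current_acc, candidate_acc))
```

Scoring both versions on the same documents keeps the comparison fair; a nonzero `min_gain` avoids republishing for improvements that are within noise.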
Performance Tuning
- Provide 15+ positive examples for accurate classification
- Use 5+ negative examples to reduce false positives
- Test model on documents not in training set
- Monitor classification confidence scores
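Testing on held-out documents and monitoring confidence, as listed above, can be done with a small script once results are exported. This is a sketch under assumptions: the tuple layout and the 0.7 review threshold are hypothetical, and the sample values are made up for illustration.

```python
from typing import List, Tuple

# (predicted_is_contract, confidence, actually_is_contract) for documents
# that were NOT part of the training set. Values are illustrative only.
HOLDOUT: List[Tuple[bool, float, bool]] = [
    (True, 0.95, True),
    (True, 0.62, False),
    (False, 0.88, False),
    (True, 0.91, True),
    (False, 0.55, True),
]


def precision_recall(results: List[Tuple[bool, float, bool]]) -> Tuple[float, float]:
    """Precision and recall of the positive ("Contract") class."""
    tp = sum(1 for pred, _, actual in results if pred and actual)
    fp = sum(1 for pred, _, actual in results if pred and not actual)
    fn = sum(1 for pred, _, actual in results if not pred and actual)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def needs_review(results: List[Tuple[bool, float, bool]], threshold: float = 0.7) -> List[int]:
    """Indexes of documents whose confidence falls below the review threshold."""
    return [i for i, (_, conf, _) in enumerate(results) if conf < threshold]


print(precision_recall(HOLDOUT))
print(needs_review(HOLDOUT))  # [1, 4]
```

Low precision points to too few negative examples (false positives), while low recall suggests the positive examples do not cover enough variation.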
Monitoring & Reporting
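# List the models in the Content Center (PnP.PowerShell; run Connect-PnPOnline against the Content Center first)
$models = Get-PnPSyntexModel
foreach ($model in $models) {
    # ModelLastTrained shows when each model was last retrained. Processed-document counts are
    # assumed not to be returned by this cmdlet; read usage from the model's page in the Content Center.
    Write-Host "$($model.Name): last trained $($model.ModelLastTrained)"
}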
Power BI Dashboard:
Connect Power BI to a SharePoint list that tracks model application results, then visualize accuracy trends over time.
Cost Management
| Component | Cost Driver | Optimization |
|---|---|---|
| Model training | Per-model setup | Reuse models across sites |
| Processing | Per-document | Apply models selectively |
| Storage | Extracted metadata | Retain only essential fields |
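To see why applying models selectively matters, the per-document cost driver in the table can be turned into a quick estimate. The rate and volumes below are hypothetical placeholders, not published pricing; substitute the figures from your own agreement.

```python
def processing_cost(documents: int, rate_per_document: float) -> float:
    """Estimated processing cost for a number of documents at a flat per-document rate."""
    if documents < 0 or rate_per_document < 0:
        raise ValueError("documents and rate must be non-negative")
    return documents * rate_per_document


RATE = 0.05  # hypothetical cost per processed document

all_libraries = processing_cost(20_000, RATE)  # model applied everywhere
selective = processing_cost(6_000, RATE)       # model applied only where needed

print(f"All libraries:      {all_libraries:.2f}")
print(f"Selected libraries: {selective:.2f}")
print(f"Saved per month:    {all_libraries - selective:.2f}")
```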
Troubleshooting
Issue: Low classification accuracy
Solution: Add more training examples; include edge cases
Issue: Extractor missing data
Solution: Verify field exists in all training documents; adjust selection region
Issue: Model not applied to new files
Solution: Check library content type configuration; verify model published
Best Practices
- Start with high-value, high-volume document types
- Train models iteratively; test before broad rollout
- Combine Syntex with retention policies for compliance automation
- Use Power Automate to route classified documents
Key Takeaways
- Syntex automates metadata extraction and classification via AI.
- Unstructured models classify documents with varied layouts; freeform and structured models extract fields from invoices and forms.
- Integration with Power Automate enables end-to-end automation.
- Monitoring model accuracy ensures ongoing value.
Next Steps
- Deploy contract management automation
- Integrate with compliance center for automatic labeling
- Build content assembly workflows
Which document type will you automate first?