📚 User Guide

Welcome to the PDF Keyword Search Tool! This guide will help you understand how to use the tool effectively and what to do when you encounter quality warnings.
⚡ Quick Start:
1️⃣ Upload your PDFs
2️⃣ Add your keywords
3️⃣ Choose Simple Search (recommended for most users)
4️⃣ Click Search PDFs
5️⃣ If you see quality warnings, try the Enhanced Search button that appears

🔍 Basic Usage

1. Upload PDFs

2. Add Keywords

You can add keywords in two ways:

Keyword Format Examples:
✅ Good: obesity
✅ Good: machine-learning
✅ Good: research methods
❌ Avoid: obesity, diversity, health (use separate lines instead)
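Why separate lines matter can be shown with a small sketch. It assumes the tool treats each line of the keyword box as one keyword; the function name `parse_keywords` and the case-insensitive de-duplication are illustrative assumptions, not the tool's actual code.

```python
def parse_keywords(raw: str) -> list[str]:
    """Split a keyword box into one keyword per line (hypothetical helper).

    Blank lines are dropped and duplicates are removed case-insensitively;
    both behaviours are assumptions made for this sketch.
    """
    seen = set()
    keywords = []
    for line in raw.splitlines():
        keyword = line.strip()
        if not keyword:
            continue
        key = keyword.lower()
        if key not in seen:
            seen.add(key)
            keywords.append(keyword)
    return keywords


# One keyword per line: three separate keywords, phrases and hyphens kept intact.
print(parse_keywords("obesity\nmachine-learning\nresearch methods\n"))

# A comma-separated line stays ONE keyword under this assumption,
# so it would only match the literal text "obesity, diversity, health".
print(parse_keywords("obesity, diversity, health"))
```

Under these assumptions a comma-separated line is searched as a single literal phrase, which is rarely what you want.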

3. Choose Search Mode

Select between two search methods based on your PDF quality and needs:

🟢 Simple Search (Recommended)
How it works: Uses the best single text extraction method for each PDF page
Best for: Most PDFs created from Word, Excel, or other digital documents
Results: Accurate keyword counts without inflation
When to use: Start here for every search - it works well for the large majority of PDFs (roughly 90%)
🟡 Enhanced Search
How it works: Uses 4 different text extraction methods simultaneously and combines results
Best for: Scanned PDFs, image-heavy documents, or PDFs with formatting issues
Results: Finds more keyword occurrences but may inflate counts (the same occurrence can be counted once per extraction method)
When to use: When Simple Search misses keywords you know exist, or for known problematic PDFs

4. Search and Download

⚠️ PDF Quality Warnings

The tool analyzes your PDFs and may display quality warnings. Here's what they mean:

🚨 Poor Quality PDFs (Red Warning)

What it means: The PDF is likely scanned, image-based, or has very little searchable text.
Impact: Keywords may be missed entirely.
What to do: Try to recreate the PDF from the original document using "Print to PDF".

⚠️ Minor Issues (Yellow Warning)

What it means: The PDF has some formatting issues but should work reasonably well.
Impact: Most keywords will be found, but some might be missed.
What to do: Check the results for accuracy; recreate the PDF if they seem wrong.

✅ Good Quality PDFs (No Warning)

What it means: The PDF has clean, searchable text.
Impact: Keywords should be found accurately.
What to do: Nothing - results should be reliable.
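This guide does not say how the tool decides which warning to show. One plausible heuristic is to look at how much searchable text each page yields. The sketch below is only an illustration: the function name `classify_quality` and both thresholds are hypothetical and are not taken from the tool.

```python
def classify_quality(page_texts: list[str],
                     poor_below: int = 50,
                     minor_below: int = 500) -> str:
    """Label a PDF by its average extractable characters per page.

    The thresholds are hypothetical values chosen for illustration only.
    Returns "poor" (red warning), "minor" (yellow warning) or "good".
    """
    if not page_texts:
        return "poor"
    average = sum(len(text.strip()) for text in page_texts) / len(page_texts)
    if average < poor_below:
        return "poor"
    if average < minor_below:
        return "minor"
    return "good"


# A scanned PDF typically yields little or no text per page.
print(classify_quality(["", "   "]))
# A clean digital PDF yields plenty of text.
print(classify_quality(["Lorem ipsum dolor sit amet. " * 40]))
```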

🔧 Search Modes Explained

🟢 Simple Search (Recommended)

Technical Details:
• Tries up to 4 text extraction methods: Standard, Block-by-block, HTML, and XML
• For each page, keeps the text from a single method: the first one in the priority order that extracts any text
• Priority order: Standard → Blocks → HTML → XML
• Provides accurate, non-inflated keyword counts
• Handles hyphenated words and PDF line breaks automatically
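The per-page selection and the line-break handling described above can be sketched roughly as follows. The sketch assumes each page's text has already been extracted by the four methods into a dictionary; the names `pick_page_text`, `normalize` and `count_keyword` are hypothetical, and the regular expressions are one possible approach rather than the tool's actual rules.

```python
import re

# Priority order from the guide: Standard -> Blocks -> HTML -> XML
PRIORITY = ["standard", "blocks", "html", "xml"]


def pick_page_text(extractions: dict[str, str]) -> tuple[str, str]:
    """Return (method, text) for the first method that extracted any text."""
    for method in PRIORITY:
        text = extractions.get(method, "")
        if text.strip():
            return method, text
    return "none", ""


def normalize(text: str) -> str:
    """Undo PDF line wrapping so keywords split across lines still match."""
    # Rejoin words hyphenated across a line break: "obe-\nsity" -> "obesity"
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse remaining line breaks and runs of whitespace into single spaces
    return re.sub(r"\s+", " ", text).strip()


def count_keyword(text: str, keyword: str) -> int:
    """Count whole-word, case-insensitive matches in the normalized text."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, normalize(text), flags=re.IGNORECASE))


page = {
    "standard": "",  # nothing extracted, so the next method is tried
    "blocks": "Obe-\nsity and research\nmethods. Obesity.",
    "html": "<p>Obesity and research methods. Obesity.</p>",
}
method, text = pick_page_text(page)
print(method)                                   # blocks
print(count_keyword(text, "obesity"))           # 2
print(count_keyword(text, "research methods"))  # 1
```

One limitation of this simple rejoining rule: a genuinely hyphenated term such as machine-learning that happens to wrap at its hyphen would be joined into one word and missed.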

🟡 Enhanced Search (For Problematic PDFs)

Technical Details:
• Uses the same 4 text extraction methods as Simple Search
• Searches for keywords in ALL extraction methods simultaneously
• Combines results from all methods, which may count the same keyword multiple times
• Useful when PDFs have layers, hidden text, or complex formatting
• Trade-off: Finds more keyword occurrences but may inflate counts
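A short sketch shows where the inflation comes from. It assumes that Enhanced Search simply adds up the matches found in every extraction method; the guide only says results are "combined", so the summing and the function names here are assumptions.

```python
import re


def count_in(text: str, keyword: str) -> int:
    """Count whole-word, case-insensitive matches of keyword in text."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def simple_count(extractions: dict[str, str], keyword: str) -> int:
    """Count in the first method (by priority) that extracted any text."""
    for method in ("standard", "blocks", "html", "xml"):
        text = extractions.get(method, "")
        if text.strip():
            return count_in(text, keyword)
    return 0


def enhanced_count(extractions: dict[str, str], keyword: str) -> int:
    """Count in ALL methods and add the results (assumed combination rule)."""
    return sum(count_in(text, keyword) for text in extractions.values())


# The same sentence as seen by four extraction methods (made-up sample data).
page = {
    "standard": "Obesity rates rose.",
    "blocks": "Obesity rates rose.",
    "html": "<p>Obesity rates rose.</p>",
    "xml": "<line>Obesity rates rose.</line>",
}
print(simple_count(page, "obesity"))    # 1
print(enhanced_count(page, "obesity"))  # 4
```

The word appears once on the page, yet the combined count is 4 because each method sees it. Treat Enhanced counts as an upper bound rather than an exact figure.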

📈 Progressive Enhancement Workflow

Even if you start with Simple Search, you can still use Enhanced extraction:

  1. Start with Simple Search - Most reliable for clean PDFs
  2. Review quality warnings - Tool identifies problematic PDFs
  3. Try Enhanced for problematic PDFs only - Click the "Try Enhanced Search for Problematic PDFs" button that appears
  4. Compare results - Enhanced results are shown separately from Simple results
Smart Enhancement: When you use the "Try Enhanced Search for Problematic PDFs" button after a Simple search, only PDFs with quality issues are re-processed with Enhanced mode. Clean PDFs keep their accurate Simple Search results.
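The workflow above can be sketched in a few lines. The quality labels, the function name `progressive_search` and the two search callables passed in are all hypothetical stand-ins; the point is only that clean PDFs are never re-processed and that Enhanced results are stored alongside, not over, the Simple results.

```python
from typing import Callable

Counts = dict[str, int]


def progressive_search(pdf_quality: dict[str, str],
                       simple_search: Callable[[str], Counts],
                       enhanced_search: Callable[[str], Counts]) -> dict[str, dict[str, Counts]]:
    """Run Simple Search on every PDF, then Enhanced Search only on flagged ones.

    pdf_quality maps a file name to "good", "minor" or "poor" (assumed labels).
    """
    results: dict[str, dict[str, Counts]] = {}
    for name, quality in pdf_quality.items():
        results[name] = {"simple": simple_search(name)}
        if quality != "good":
            # Kept under a separate key so both result sets can be compared
            results[name]["enhanced"] = enhanced_search(name)
    return results


# Stand-in search functions returning made-up counts, for demonstration only.
def fake_simple(name: str) -> Counts:
    return {"obesity": 1}


def fake_enhanced(name: str) -> Counts:
    return {"obesity": 3}


report = progressive_search(
    {"clean.pdf": "good", "scan.pdf": "poor"}, fake_simple, fake_enhanced
)
print(report["clean.pdf"])  # only Simple results
print(report["scan.pdf"])   # Simple and Enhanced results side by side
```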

📊 Understanding Results

Report Information

Download Formats

💡 Best Practices

For Best Results:

  1. Use clean PDFs: Create PDFs using "Print to PDF" from Word, not "Save as PDF"
  2. Remove tracked changes: Accept all changes and delete comments before creating the PDF
  3. Avoid scanned documents: Use original digital files when possible
  4. Test your PDFs: If you can't copy/paste text normally, recreate the PDF

Keyword Tips:

❓ Troubleshooting

Keywords Not Found?

Too Many Matches?

Which Search Mode Should I Use?

Upload Issues?

Still having issues? The tool is designed to handle most PDF types, but some heavily formatted or corrupted files may not work well. When in doubt, recreate the PDF from the original source document.