
Full-Text Search

Tamsaek goes beyond file name search by indexing the actual content of your documents.

Category         Extensions
Documents        .pdf, .docx, .doc, .odt, .rtf, .txt
Spreadsheets     .xlsx, .xls, .csv, .ods
Presentations    .pptx, .ppt, .odp
E-books          .epub, .mobi
Code             .js, .ts, .py, .rs, .go, .java, .c, .cpp, .md, .json, .yaml, .xml, .html, .css
Email            .eml, .msg
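
To check in a script whether a file falls into one of the categories above, the table can be expressed as a lookup. This is only a sketch: the `SUPPORTED` dict and the `category_for` function are hypothetical names, and the case-insensitive matching is an assumption rather than documented Tamsaek behavior.

```python
from pathlib import Path

# Extension table copied from the list above. The dict layout and the
# function name are illustrative, not Tamsaek internals.
SUPPORTED = {
    "Documents": {".pdf", ".docx", ".doc", ".odt", ".rtf", ".txt"},
    "Spreadsheets": {".xlsx", ".xls", ".csv", ".ods"},
    "Presentations": {".pptx", ".ppt", ".odp"},
    "E-books": {".epub", ".mobi"},
    "Code": {".js", ".ts", ".py", ".rs", ".go", ".java", ".c", ".cpp",
             ".md", ".json", ".yaml", ".xml", ".html", ".css"},
    "Email": {".eml", ".msg"},
}

def category_for(path):
    """Return the category whose extension set contains the file's suffix, or None."""
    suffix = Path(path).suffix.lower()  # assumption: matching ignores case
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None
```

For example, `category_for("notes.md")` returns `"Code"`, and a file such as `photo.png` returns `None` because its extension is not in the table.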
  1. Text Extraction: Tamsaek extracts text from supported file types
  2. Tokenization: Text is broken into searchable terms
  3. Indexing: Terms are added to a high-performance Tantivy index
  4. Search: Queries match against the full-text index
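
The four steps above can be illustrated with a toy inverted index. This is a minimal sketch, not how Tamsaek or Tantivy is implemented: text extraction is assumed to have already produced plain strings, the tokenizer is a simple regular expression, and requiring every query term to match is an assumption about how multi-word queries behave.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Step 2: lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Step 3: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Step 4: return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Step 1 (text extraction) is assumed done: each value is extracted text.
docs = {
    "q3.txt": "Quarterly revenue grew in Q3.",
    "notes.txt": "Revenue targets for next year.",
}
index = build_index(docs)
```

With this data, `search(index, "quarterly revenue")` returns `{"q3.txt"}`, while `search(index, "revenue")` returns both files.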

Find files containing specific words:

quarterly revenue

Use quotes for exact phrases:

"annual report"

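Exact-phrase matching generally requires the index to record where each term occurs, not just which files contain it. The sketch below shows the idea with a positional index; the function names and sample files are hypothetical, and a production engine such as Tantivy stores positions far more compactly.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_positional_index(docs):
    """Map term -> {doc_id: [positions at which the term occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term][doc_id].append(pos)
    return index

def phrase_search(index, phrase):
    """Return ids of documents where the phrase terms appear consecutively."""
    terms = tokenize(phrase)  # surrounding quotes are dropped by the tokenizer
    if not terms or any(t not in index for t in terms):
        return set()
    hits = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # Term i of the phrase must sit exactly i positions after the first.
            if all(start + i in index[t].get(doc_id, [])
                   for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits

docs = {
    "a.txt": "The annual report is due.",
    "b.txt": "Report on annual leave.",
}
pos_index = build_positional_index(docs)
```

Here `phrase_search(pos_index, '"annual report"')` returns only `{"a.txt"}`: `b.txt` contains both words, but not next to each other in that order.
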
Combine terms:

budget AND 2024
budget OR expenses
budget NOT draft
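
Conceptually, the three operators correspond to set operations on the files that match each term. The following sketch handles only the simple `term OP term` form shown above; the `evaluate` function and the sample index are hypothetical and say nothing about operator precedence or nesting in Tamsaek's real query parser.

```python
def evaluate(index, query):
    """Evaluate 'left OP right', where OP is AND, OR, or NOT."""
    left, op, right = query.split()
    a = index.get(left.lower(), set())
    b = index.get(right.lower(), set())
    if op == "AND":
        return a & b   # files containing both terms
    if op == "OR":
        return a | b   # files containing either term
    if op == "NOT":
        return a - b   # files containing the first term but not the second
    raise ValueError(f"unsupported operator: {op}")

# Hypothetical term -> files mapping used only for illustration.
index = {
    "budget": {"plan.xlsx", "draft.docx", "fy2024.pdf"},
    "2024": {"fy2024.pdf", "plan.xlsx"},
    "expenses": {"costs.csv"},
    "draft": {"draft.docx"},
}
```

With this index, `evaluate(index, "budget NOT draft")` returns `{"plan.xlsx", "fy2024.pdf"}`.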

Partial matching:

mark*

Matches “marketing”, “markdown”, “marker”, etc.
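
One common way to resolve a trailing wildcard is to keep the indexed terms sorted and scan forward from the first term that could match. The sketch below does this with Python's `bisect` module; it is an illustration of the technique, not a description of Tantivy's term dictionary, and it assumes that `mark*` also matches the bare word "mark".

```python
from bisect import bisect_left

def prefix_terms(sorted_terms, pattern):
    """Return all terms in a sorted list that start with the pattern's prefix."""
    prefix = pattern.rstrip("*").lower()
    start = bisect_left(sorted_terms, prefix)  # first term >= prefix
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
    return matches

# Hypothetical vocabulary used only for illustration.
terms = sorted(["mark", "markdown", "marker", "marketing", "margin", "market", "mars"])
```

Calling `prefix_terms(terms, "mark*")` returns `["mark", "markdown", "marker", "market", "marketing"]` and skips "margin" and "mars".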

By default, Tamsaek searches both file names and content. To search only content:

content:quarterly report

To search only names:

name:report.pdf
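
A front end could separate the scope prefix from the rest of the query before searching. The sketch below is hypothetical: `parse_scope` is not a Tamsaek function, and it assumes the prefix applies to everything that follows it, which the examples above suggest but do not state.

```python
def parse_scope(query):
    """Split an optional 'content:' or 'name:' prefix off the query.

    Returns (fields_to_search, remaining_query_text).
    """
    for field in ("content", "name"):
        prefix = field + ":"
        if query.startswith(prefix):
            return [field], query[len(prefix):].strip()
    # No prefix: search both file names and content, as is the default.
    return ["name", "content"], query.strip()
```

For example, `parse_scope("name:report.pdf")` returns `(["name"], "report.pdf")`, and an unprefixed query returns both fields.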
  • Tantivy engine: Built on Tantivy, a full-text search library written in Rust and modeled on Apache Lucene
  • Incremental updates: Only new/changed files are re-indexed
  • Compressed index: Efficient storage for large collections
  • Sub-second results: Even across millions of indexed terms
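
Incremental updates are typically driven by comparing each file's current state with what was recorded at the last indexing run. The sketch below uses modification times for that comparison; this is an assumption for illustration, since the mechanism Tamsaek uses to detect changes is not described here, and `files_to_reindex` is a hypothetical name.

```python
import os

def files_to_reindex(paths, seen):
    """Return the paths that are new or modified since the previous run.

    `seen` maps path -> modification time (ns) recorded on the previous run
    and is updated in place so the next call only reports later changes.
    """
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if seen.get(path) != mtime:
            changed.append(path)
            seen[path] = mtime
    return changed
```

On the first call every file is reported. On later calls with the same `seen` dict, only files whose modification time has changed are returned, so unchanged files are never re-read.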