Weaviate's metadata filtering lets developers combine semantic search with structured constraints on properties such as category, tenant, role, date, numeric value, status, null state, array membership, and object metadata.
This matters because production search rarely means “search everything.” Most AI applications need retrieval inside a product line, workspace, user permission boundary, document type, language, time range, or business status.
Short Answer
Weaviate supports metadata filters for equality, inequality, numeric and date ranges, text matching, array membership, null state, property length, object IDs, timestamps, and logical combinations such as AND and OR.
These filters can be used with vector search, keyword search, hybrid search, object fetches, and RAG-style retrieval. The exact performance and availability depend on the collection schema and inverted index configuration.
What Metadata Filtering Is Used For
Metadata filtering narrows the set of objects that a query is allowed to consider.
Common examples include:
- search only public documents
- search inside one tenant or workspace
- retrieve only documents from a product line
- limit results to a language, region, or document type
- filter by publication date or update time
- exclude archived or draft content
- find objects with missing metadata for cleanup
Filtering With Vector Search
Weaviate filters can be applied to vector searches such as near_text.
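from weaviate.classes.query import Filter

# Assumes `client` is an already connected Weaviate client.
collection = client.collections.use("Articles")

response = collection.query.near_text(
    query="billing dashboard setup",
    limit=10,
    # Only objects whose "product" property equals "analytics" are eligible.
    filters=Filter.by_property("product").equal("analytics"),
)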
The vector query finds semantically similar objects, while the metadata filter restricts the eligible set.
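The interaction between the filter and the vector query can be illustrated without a running Weaviate instance. The following is a minimal in-memory sketch, not Weaviate's implementation: the `Doc` type, the toy two-dimensional vectors, and the `filtered_search` helper are hypothetical names introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    product: str
    vector: tuple


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(docs, query_vector, predicate, limit):
    """Keep only docs that pass the predicate, then rank them by similarity."""
    eligible = [d for d in docs if predicate(d)]
    ranked = sorted(eligible, key=lambda d: cosine(d.vector, query_vector), reverse=True)
    return ranked[:limit]


docs = [
    Doc("a", "analytics", (0.9, 0.1)),
    Doc("b", "billing", (1.0, 0.0)),   # closest to the query, but not eligible
    Doc("c", "analytics", (0.2, 0.8)),
]

results = filtered_search(docs, (1.0, 0.0), lambda d: d.product == "analytics", limit=10)
result_ids = [d.doc_id for d in results]
```

Document "b" is the nearest neighbor of the query vector, yet it never appears in the results because the filter removes it from the eligible set before ranking.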
Filtering With Hybrid Search
Metadata filters can also be used with hybrid search, where keyword and vector signals are combined.
This is useful when a query needs both lexical precision and semantic recall, but still must stay inside a structured boundary such as tenant, status, or document type.
Equality and Inequality Filters
Equality filters match objects whose property value is exactly the given value. Inequality filters match objects whose value differs from it.
Filter.by_property("status").equal("published")
Filter.by_property("status").not_equal("deprecated")
These are useful for categorical metadata such as status, language, product, region, tier, document type, and source system.
Numeric and Date Range Filters
Weaviate supports comparison-style filters for numbers and dates.
from datetime import datetime, timezone
from weaviate.classes.query import Filter

# Objects published on or after 1 January 2025 (UTC).
recent = Filter.by_property("published_at").greater_or_equal(
    datetime(2025, 1, 1, tzinfo=timezone.utc)
)

# Objects whose priority is strictly greater than 7.
high_priority = Filter.by_property("priority").greater_than(7)
Range filters are useful for prices, ratings, scores, version numbers, timestamps, and priority fields.
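A bounded date window is usually expressed as two comparisons: a lower bound that is included and an upper bound that is excluded. The sketch below shows that half-open logic in plain Python. The `in_window` helper is a hypothetical name, and in Weaviate the equivalent would presumably be a `greater_or_equal` filter combined with a `less_than` filter on the same property.

```python
from datetime import datetime, timezone


def in_window(value, start, end):
    """True when start <= value < end (half-open, so adjacent windows never overlap)."""
    return start <= value < end


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
end = datetime(2026, 1, 1, tzinfo=timezone.utc)

published = datetime(2025, 6, 15, 12, 30, tzinfo=timezone.utc)
inside = in_window(published, start, end)
boundary = in_window(end, start, end)  # the end bound itself is excluded

# Timezone-aware datetimes serialize with an explicit UTC offset.
start_text = start.isoformat()
```

Using timezone-aware values throughout avoids ambiguity about which instant a bound refers to. Python also refuses to compare naive and aware datetimes, which surfaces mixed inputs early.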
Text Matching Filters
For text properties, Weaviate supports filter patterns such as exact equality, contains-style operations, and wildcard-like matching.
Filter.by_property("title").equal("AI Search Guide")
Filter.by_property("title").like("*search*")
Text filters are different from semantic search. They operate on structured or tokenized field values, not on vector similarity.
Array Filters
Array fields are useful for tags, roles, topics, categories, entities, permissions, and labels.
Filter.by_property("tags").contains_any(["rag", "search"])
Filter.by_property("roles").contains_all(["admin", "reviewer"])
Array filters help when a document can belong to more than one category or permission group.
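The difference between the two array operators is the difference between "at least one" and "every one". A minimal sketch of that logic in plain Python follows; the function names mirror the filter methods for readability, but they are local helpers written for this example, not Weaviate calls.

```python
def contains_any(values, wanted):
    """True if at least one wanted item appears in values."""
    return not set(values).isdisjoint(wanted)


def contains_all(values, wanted):
    """True if every wanted item appears in values."""
    return set(wanted).issubset(values)


doc_tags = ["rag", "evaluation"]
doc_roles = ["admin"]

any_match = contains_any(doc_tags, ["rag", "search"])       # one shared tag is enough
all_match = contains_all(doc_roles, ["admin", "reviewer"])  # "reviewer" is missing
```

For permission-style fields the choice matters: an any-match rule grants access when the object carries at least one listed role, while an all-match rule requires the object to carry every listed role.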
Null-State Filters
Weaviate can filter for null and non-null property states when null-state indexing is enabled.
Filter.by_property("department").is_none(True)
Filter.by_property("department").is_none(False)
This is useful for data cleanup, enrichment workflows, and retrieval rules that depend on whether metadata is complete.
Property-Length Filters
Property-length filtering helps with empty or non-empty text and array fields when the collection is configured to index property length.
empty_tags = Filter.by_property("tags", length=True).equal(0)
long_body = Filter.by_property("body", length=True).greater_than(500)
This is useful for finding untagged documents, short documents, empty fields, or unusually long fields.
Timestamp Filters
Weaviate can filter by object creation and update timestamps when timestamp indexing is enabled.
from datetime import datetime, timezone
from weaviate.classes.query import Filter

created_recently = Filter.by_creation_time().greater_than(
    datetime(2025, 1, 1, tzinfo=timezone.utc)
)

updated_before = Filter.by_update_time().less_than(
    datetime(2026, 1, 1, tzinfo=timezone.utc)
)
Timestamp filters are useful for freshness, lifecycle management, incremental indexing, and audit workflows.
Object ID Filters
Weaviate can filter by object ID when an application needs a specific object or known set of objects.
Filter.by_id().equal("00037775-1432-35e5-bc59-443baaef7d80")
ID filters are useful for lookup flows, debugging, result pinning, and controlled retrieval tests.
Logical Filter Combinations
Filters can be combined with logical operators.
filters = (
    Filter.by_property("product").equal("analytics")
    & Filter.by_property("status").equal("published")
    & Filter.by_property("language").equal("en")
)

alternative = Filter.any_of([
    Filter.by_property("region").equal("us"),
    Filter.by_property("region").equal("emea"),
])
This lets applications express real business rules instead of relying on one metadata field at a time.
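To make the composition semantics concrete, here is a small in-memory model of combinable rules. It is a sketch under stated assumptions: `Rule` and `prop_equals` are hypothetical helpers, and the `|` operator in this toy model plays the role that `Filter.any_of` plays in the example above.

```python
class Rule:
    """A tiny predicate wrapper that supports & (AND) and | (OR) composition."""

    def __init__(self, test):
        self.test = test

    def __call__(self, obj):
        return self.test(obj)

    def __and__(self, other):
        return Rule(lambda obj: self(obj) and other(obj))

    def __or__(self, other):
        return Rule(lambda obj: self(obj) or other(obj))


def prop_equals(name, value):
    """Rule that passes when the key exists and its value equals the given one."""
    return Rule(lambda obj: name in obj and obj[name] == value)


# Python evaluates & before |, so the OR group needs explicit parentheses.
rule = (
    prop_equals("product", "analytics")
    & prop_equals("status", "published")
    & (prop_equals("region", "us") | prop_equals("region", "emea"))
)

doc_ok = {"product": "analytics", "status": "published", "region": "emea"}
doc_draft = {"product": "analytics", "status": "draft", "region": "us"}
doc_apac = {"product": "analytics", "status": "published", "region": "apac"}

matches = [rule(doc_ok), rule(doc_draft), rule(doc_apac)]
```

Only the first document satisfies all three conditions. The second fails the status check and the third fails the region group.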
Index Settings That Affect Filtering
Weaviate filtering depends on inverted index configuration.
- indexFilterable supports match-based filtering.
- indexRangeFilters supports efficient numeric and date range filters.
- indexSearchable supports BM25 and keyword-oriented search over text.
- indexNullState supports null and not-null filtering.
- indexPropertyLength supports property-length filtering.
- indexTimestamps supports creation and update timestamp filters.
Some options are enabled by default for common fields, while others should be enabled only when the application needs them.
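As a rough sketch of where these settings live, the following Python dict mirrors the JSON shape of a collection definition. The collection and property names are hypothetical. The placement shown here (indexFilterable, indexSearchable, and indexRangeFilters on individual properties; indexNullState, indexPropertyLength, and indexTimestamps under `invertedIndexConfig`) reflects my understanding of the schema and should be verified against the documentation for your Weaviate version.

```python
import json

articles_schema = {
    "class": "Articles",
    # Collection-level inverted index options.
    "invertedIndexConfig": {
        "indexNullState": True,
        "indexPropertyLength": True,
        "indexTimestamps": True,
    },
    # Property-level index options.
    "properties": [
        {"name": "title", "dataType": ["text"],
         "indexFilterable": True, "indexSearchable": True},
        {"name": "status", "dataType": ["text"],
         "indexFilterable": True, "indexSearchable": False},
        {"name": "priority", "dataType": ["int"],
         "indexFilterable": True, "indexRangeFilters": True},
        {"name": "published_at", "dataType": ["date"],
         "indexRangeFilters": True},
    ],
}

# Properties that opt in to the range index.
range_enabled = [
    p["name"] for p in articles_schema["properties"] if p.get("indexRangeFilters")
]
schema_json = json.dumps(articles_schema, indent=2)
```

In this sketch the categorical `status` field is filterable but not keyword-searchable, while the two fields used for comparisons opt in to range indexing.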
When to Enable Range Filters
Range filters are most useful for fields that frequently use greater-than or less-than comparisons.
Examples include price, rating, age, timestamp, score, quantity, priority, and numeric version fields.
If a field is only used for equality, a match-based filter index may be enough. If it is frequently used for ranges, range indexing can be important.
When to Enable Null and Length Indexing
Null-state and property-length indexing add overhead, so they should be enabled when the application actually needs those capabilities.
Enable them when users or workflows need to find missing fields, incomplete metadata, empty arrays, short content, untagged records, or documents that need enrichment.
Filtering and Access Control
Metadata filters are often used for access control, but the schema must be designed carefully.
Use explicit fields such as tenant_id, visibility, allowed_roles, or workspace_id. Avoid relying on missing or nullable permission metadata as a safe access rule.
For sensitive systems, filters should be part of a broader authorization design, not the only security layer.
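The deny-by-default idea can be sketched as a plain Python check. This is an illustrative model, not a Weaviate feature: the `is_visible` helper and the shape of the `user` dict are assumptions, while the field names come from the list above.

```python
def is_visible(doc, user):
    """Deny by default: a document is visible only when explicit metadata grants access."""
    # Missing or null tenant metadata means "no access", never "public".
    tenant = doc.get("tenant_id")
    if tenant is None or tenant != user["tenant_id"]:
        return False
    if doc.get("visibility") == "public":
        return True
    allowed = doc.get("allowed_roles") or []
    return any(role in allowed for role in user["roles"])


user = {"tenant_id": "t1", "roles": ["reviewer"]}

docs = [
    {"id": 1, "tenant_id": "t1", "visibility": "public"},
    {"id": 2, "tenant_id": "t1", "visibility": "restricted", "allowed_roles": ["admin"]},
    {"id": 3, "tenant_id": "t1", "visibility": "restricted", "allowed_roles": ["reviewer"]},
    {"id": 4, "tenant_id": "t2", "visibility": "public"},   # other tenant
    {"id": 5, "tenant_id": "t1"},                           # no permission metadata
    {"id": 6, "visibility": "public"},                      # no tenant at all
]

visible_ids = [d["id"] for d in docs if is_visible(d, user)]
```

Documents 5 and 6 show why nullable permission fields are risky: a rule written as "not restricted" would expose them, whereas a rule that requires an explicit grant does not.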
Filtering and RAG Quality
In RAG systems, filters affect both quality and safety.
A good filter can keep retrieval focused on the right product, language, source, customer, or permission scope. A bad filter can remove useful context or retrieve documents that are not allowed for the user.
Evaluate filters with real queries, not only with synthetic examples.
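One way to make such an evaluation concrete is to track two numbers per query: how much of the relevant, permitted content was retrieved, and whether anything outside the permitted set leaked through. The sketch below assumes labeled document IDs are available for each test query; `evaluate_filtered_query` is a hypothetical helper.

```python
def evaluate_filtered_query(retrieved_ids, relevant_ids, allowed_ids):
    """Return (recall, leaked) for one query.

    recall: share of relevant, allowed documents that were retrieved.
    leaked: retrieved documents the user was not allowed to see.
    """
    retrieved = set(retrieved_ids)
    allowed = set(allowed_ids)
    reachable = set(relevant_ids) & allowed
    recall = len(retrieved & reachable) / len(reachable) if reachable else 1.0
    leaked = sorted(retrieved - allowed)
    return recall, leaked


recall, leaked = evaluate_filtered_query(
    retrieved_ids=["d1", "d4", "d9"],
    relevant_ids=["d1", "d2", "d4"],
    allowed_ids=["d1", "d2", "d3", "d4"],
)
```

Here the query retrieves two of the three relevant documents and returns one document ("d9") that lies outside the allowed set. Low recall points to a filter that is too strict or a weak ranking. A non-empty leak list points to a filter that is too loose.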
Common Mistakes
- Forgetting to configure indexes for the filter capabilities the app needs.
- Including metadata fields in the vectorized text when they should serve only as filters.
- Using nullable access-control fields.
- Assuming text filters and semantic search behave the same way.
- Filtering on high-cardinality fields without testing performance.
- Using range queries on fields that are not configured for efficient range filtering.
Best Practices
- Decide filter fields during schema design.
- Separate semantic content from operational metadata.
- Use categorical fields for product, region, language, status, and document type.
- Use range-capable fields for dates, prices, scores, and numeric values.
- Enable null, length, and timestamp indexing only when needed.
- Test filters together with vector and hybrid search.
- Benchmark common and worst-case filter patterns.
Summary
Weaviate metadata filtering capabilities cover equality, inequality, ranges, text matching, arrays, null state, property length, timestamps, object IDs, and logical filter combinations.
These filters can be used with vector search, hybrid search, keyword search, and object retrieval.
The strongest results come from designing the schema intentionally: choose the right filter fields, enable the right index options, and test the filters against realistic production queries.