This article explains the workflow and uses REST for illustration. Once you understand the basic workflow, continue with the Azure SDK code samples in the azure-search-vector-samples repository for guidance on using these features in test and production code.
If you choose HNSW on a field, you can opt in for exhaustive KNN at query time. But the other direction doesn’t work: if you choose exhaustive, you can’t later request HNSW search because the extra data structures that enable approximate search don’t exist.
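The one-way relationship described above can be written as a small lookup. This is only an illustrative sketch: the helper name `allowed_query_modes` is hypothetical, it isn't part of any Azure SDK, and it doesn't call the search service.

```python
def allowed_query_modes(field_algorithm_kind: str) -> set:
    """Return the search modes a query can request for a vector field.

    Hypothetical helper that encodes the rule stated in the article.
    """
    if field_algorithm_kind == "hnsw":
        # An HNSW field can serve approximate queries and exhaustive KNN queries.
        return {"hnsw", "exhaustiveKnn"}
    if field_algorithm_kind == "exhaustiveKnn":
        # The data structures for approximate search were never built.
        return {"exhaustiveKnn"}
    raise ValueError(f"Unknown algorithm kind: {field_algorithm_kind}")


print(sorted(allowed_query_modes("hnsw")))
print(sorted(allowed_query_modes("exhaustiveKnn")))
```

If there's any chance you'll need approximate search later, the rule suggests choosing HNSW for the field up front.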
A vector configuration also specifies quantization methods for reducing vector size:
Add a vectorSearch section in the index that specifies the search algorithms used to create the embedding space.
"vectorSearch": {
    "compressions": [
        {
            "name": "scalar-quantization",
            "kind": "scalarQuantization",
            "rerankWithOriginalVectors": true,
            "defaultOversampling": 10.0,
            "scalarQuantizationParameters": {
                "quantizedDataType": "int8"
            }
        },
        {
            "name": "binary-quantization",
            "kind": "binaryQuantization",
            "rerankWithOriginalVectors": true,
            "defaultOversampling": 10.0
        }
    ],
    "algorithms": [
        {
            "name": "hnsw-1",
            "kind": "hnsw",
            "hnswParameters": {
                "m": 4,
                "efConstruction": 400,
                "efSearch": 500,
                "metric": "cosine"
            }
        },
        {
            "name": "hnsw-2",
            "kind": "hnsw",
            "hnswParameters": {
                "m": 8,
                "efConstruction": 800,
                "efSearch": 800,
                "metric": "hamming"
            }
        },
        {
            "name": "eknn",
            "kind": "exhaustiveKnn",
            "exhaustiveKnnParameters": {
                "metric": "euclidean"
            }
        }
    ],
    "profiles": [
        {
            "name": "vector-profile-hnsw-scalar",
            "compression": "scalar-quantization",
            "algorithm": "hnsw-1"
        }
    ]
}
Key points:
Names for each configuration of compression, algorithm, and profile must be unique for its type within the index.
vectorSearch.compressions.kind can be scalarQuantization or binaryQuantization.
vectorSearch.compressions.rerankWithOriginalVectors uses the original, uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if stored is false. This property is optional. Default is true.
vectorSearch.compressions.defaultOversampling considers a broader set of potential results to offset the reduction in information from quantization. The number of potential results is the k in the query multiplied by the oversampling factor. For example, if the query specifies a k of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top k reranked results are returned. This property is optional. Default is 4.
vectorSearch.compressions.scalarQuantizationParameters.quantizedDataType must be set to int8. This is the only primitive data type supported at this time. This property is optional. Default is int8.
vectorSearch.algorithms.kind is either "hnsw" or "exhaustiveKnn". These are the algorithms used to organize vector content during indexing and to find nearest neighbors at query time. Only HNSW is an Approximate Nearest Neighbors (ANN) algorithm; exhaustive KNN compares the query against every vector.
vectorSearch.algorithms.m is the bi-directional link count. Default is 4. The range is 4 to 10. Lower values should return less noise in the results.
vectorSearch.algorithms.efConstruction is the number of nearest neighbors used during indexing. Default is 400. The range is 100 to 1,000.
vectorSearch.algorithms.efSearch is the number of nearest neighbors used during search. Default is 500. The range is 100 to 1,000.
vectorSearch.algorithms.metric should be "cosine" if you're using Azure OpenAI, otherwise use the similarity metric associated with the embedding model you're using. Supported values are cosine, dotProduct, euclidean, and hamming (used for indexing binary data).
vectorSearch.profiles add a layer of abstraction for accommodating richer definitions. A profile is defined in vectorSearch, and then referenced by name on each vector field. It's a combination of compression and algorithm configurations. This is the property that you assign to a vector field, and it determines the field's algorithm and compression.
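The oversampling arithmetic in the key points can be checked with a few lines of Python. This is a minimal sketch: the function name `rerank_candidate_count` is hypothetical, and truncating a fractional product is an assumption made here, not documented service behavior.

```python
def rerank_candidate_count(k: int, oversampling: float = 4.0) -> int:
    """Number of documents retrieved for reranking: k multiplied by the oversampling factor.

    The default of 4.0 mirrors the documented default for defaultOversampling.
    Truncation of fractional results is an assumption of this sketch.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return int(k * oversampling)


print(rerank_candidate_count(5, 20))  # 100, the example given in the key points
print(rerank_candidate_count(5))      # 20, using the default oversampling of 4
```

Only the top k of those candidates are returned after reranking, so a larger factor trades query cost for a better chance of recovering results lost to quantization.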
2024-05-01-preview is the most recent preview version. It supports:
vectorSearch.algorithms with support for HNSW and exhaustive KNN.
vectorSearch.compressions with properties for scalar (but not binary) quantization, oversampling, and reranking with original vectors.
vectorSearch.profiles for multiple combinations of algorithm and compression configurations.
Inclusive of 2024-03-01-preview.
Inclusive of 2023-10-01-preview.
Inclusive of 2023-11-01 vectorSearch.algorithms and vectorSearch.profiles.
Use the Create or Update Index Preview REST API to create the index.
Add a vectorSearch section in the index that specifies compression settings and the search algorithms used to create the embedding space. For more information, see Configure vector quantization and reduced storage.
"vectorSearch": {
    "compressions": [
        {
            "name": "my-scalar-quantization",
            "kind": "scalarQuantization",
            "rerankWithOriginalVectors": true,
            "defaultOversampling": 10.0,
            "scalarQuantizationParameters": {
                "quantizedDataType": "int8"
            }
        }
    ],
    "algorithms": [
        {
            "name": "hnsw-1",
            "kind": "hnsw",
            "hnswParameters": {
                "m": 4,
                "efConstruction": 400,
                "efSearch": 500,
                "metric": "cosine"
            }
        },
        {
            "name": "hnsw-2",
            "kind": "hnsw",
            "hnswParameters": {
                "m": 8,
                "efConstruction": 800,
                "efSearch": 800,
                "metric": "hamming"
            }
        },
        {
            "name": "eknn",
            "kind": "exhaustiveKnn",
            "exhaustiveKnnParameters": {
                "metric": "euclidean"
            }
        }
    ],
    "profiles": [
        {
            "name": "vector-profile-hnsw-1",
            "algorithm": "hnsw-1"
        }
    ]
}
Key points:
vectorSearch.compressions.kind must be scalarQuantization.
vectorSearch.compressions.rerankWithOriginalVectors uses the original, uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if stored is false. This property is optional. Default is true.
vectorSearch.compressions.defaultOversampling considers a broader set of potential results to offset the reduction in information from quantization. The number of potential results is the k in the query multiplied by the oversampling factor. For example, if the query specifies a k of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top k reranked results are returned. This property is optional. Default is 4.
vectorSearch.compressions.scalarQuantizationParameters.quantizedDataType must be set to int8. This is the only primitive data type supported at this time. This property is optional. Default is int8.
vectorSearch.algorithms.kind is either "hnsw" or "exhaustiveKnn". These are the algorithms used to organize vector content during indexing and to find nearest neighbors at query time. Only HNSW is an Approximate Nearest Neighbors (ANN) algorithm; exhaustive KNN compares the query against every vector.
vectorSearch.algorithms.m is the bi-directional link count. Default is 4. The range is 4 to 10. Lower values should return less noise in the results.
vectorSearch.algorithms.efConstruction is the number of nearest neighbors used during indexing. Default is 400. The range is 100 to 1,000.
vectorSearch.algorithms.efSearch is the number of nearest neighbors used during search. Default is 500. The range is 100 to 1,000.
vectorSearch.algorithms.metric should be "cosine" if you're using Azure OpenAI, otherwise use the similarity metric associated with the embedding model you're using. Supported values are cosine, dotProduct, euclidean, and hamming (used for indexing binary data).
vectorSearch.profiles add a layer of abstraction for accommodating richer definitions. A profile is defined in vectorSearch, and then referenced by name on each vector field. It's a combination of compression and algorithm configurations. This is the property that you assign to a vector field, and it determines the field's algorithm and compression.
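Several of the rules in this article can be checked locally before a schema is sent to the service: names must be unique per type, HNSW parameters have documented ranges, and each profile must reference an algorithm (and optionally a compression) that exists. The following is a minimal sketch of such a check. The function `validate_vector_search` is a hypothetical helper, not part of any Azure SDK, and it covers only the rules quoted in this article.

```python
def validate_vector_search(vector_search: dict) -> list:
    """Return a list of problems found in a vectorSearch section (empty if none)."""
    problems = []
    names = {}
    for section in ("compressions", "algorithms", "profiles"):
        seen = [entry["name"] for entry in vector_search.get(section, [])]
        names[section] = set(seen)
        # Names must be unique within each configuration type.
        for name in sorted(names[section]):
            if seen.count(name) > 1:
                problems.append(f"duplicate {section} name: {name}")

    # Ranges quoted in the key points for HNSW parameters.
    ranges = {"m": (4, 10), "efConstruction": (100, 1000), "efSearch": (100, 1000)}
    for algo in vector_search.get("algorithms", []):
        params = algo.get("hnswParameters", {}) if algo.get("kind") == "hnsw" else {}
        for key, (low, high) in ranges.items():
            if key in params and not low <= params[key] <= high:
                problems.append(f"{algo['name']}: {key}={params[key]} is outside {low}-{high}")

    # A profile combines an algorithm with an optional compression, both by name.
    for profile in vector_search.get("profiles", []):
        if profile.get("algorithm") not in names["algorithms"]:
            problems.append(f"{profile['name']}: unknown algorithm {profile.get('algorithm')}")
        compression = profile.get("compression")
        if compression is not None and compression not in names["compressions"]:
            problems.append(f"{profile['name']}: unknown compression {compression}")
    return problems


config = {
    "compressions": [{"name": "my-scalar-quantization", "kind": "scalarQuantization"}],
    "algorithms": [
        {"name": "hnsw-1", "kind": "hnsw",
         "hnswParameters": {"m": 4, "efConstruction": 400, "efSearch": 500, "metric": "cosine"}}
    ],
    "profiles": [{"name": "vector-profile-hnsw-1", "algorithm": "hnsw-1"}],
}
print(validate_vector_search(config))  # []
```

A check like this catches misspelled references, such as a profile that points at "hsnw-1" when the algorithm is named "hnsw-1", before the service rejects the request.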
Add a vector field to the fields collection
The fields collection must include a field for the document key, vector fields, and any other fields that you need for hybrid search scenarios.
Vector fields are characterized by their data type, a dimensions property based on the embedding model used to output the vectors, and a vector profile.
2024-07-01
2024-05-01-preview
Use Create or Update Index to create the index.
Define a vector field with the following attributes. You can store one generated embedding per field. For each vector field:
type must be a vector data type. Collection(Edm.Single) is the most common for embedding models.
dimensions is the number of dimensions generated by the embedding model. For text-embedding-ada-002, it's 1536.
vectorSearchProfile is the name of a profile defined elsewhere in the index.
searchable must be true.
retrievable can be true or false. True returns the raw vector values (1536 of them) as plain text and consumes storage space. Set to true if you're passing a vector result to a downstream app.
stored can be true or false. It determines whether an extra copy of vectors is stored for retrieval. For more information, see Reduce vector size.
filterable, facetable, sortable must be false.
Add filterable nonvector fields to the collection, such as "title" with filterable set to true, if you want to invoke prefiltering or postfiltering on the vector query.
Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
You should also add fields that are useful in the query or in its response. The following example shows vector fields for title and content ("titleVector", "contentVector") that hold the embeddings of those fields. It also provides the equivalent textual fields ("title", "content"), which are useful for sorting, filtering, and reading in a search result.
The following example shows the fields collection:
PUT https://my-search-service.search.windows.net/indexes/my-index?api-version=2024-07-01&allowIndexDowntime=true
Content-Type: application/json
api-key: {{admin-api-key}}

{
    "name": "{{index-name}}",
    "fields": [
        {
            "name": "id",
            "type": "Edm.String",
            "key": true,
            "filterable": true
        },
        {
            "name": "title",
            "type": "Edm.String",
            "searchable": true,
            "filterable": true,
            "sortable": true,
            "retrievable": true
        },
        {
            "name": "titleVector",
            "type": "Collection(Edm.Single)",
            "searchable": true,
            "retrievable": true,
            "stored": true,
            "dimensions": 1536,
            "vectorSearchProfile": "vector-profile-1"
        },
        {
            "name": "content",
            "type": "Edm.String",
            "searchable": true,
            "retrievable": true
        },
        {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "searchable": true,
            "retrievable": false,
            "stored": false,
            "dimensions": 1536,
            "vectorSearchProfile": "vector-profile-1"
        }
    ],
    "vectorSearch": {
        "algorithms": [
            {
                "name": "hnsw-1",
                "kind": "hnsw",
                "hnswParameters": {
                    "m": 4,
                    "efConstruction": 400,
                    "efSearch": 500,
                    "metric": "cosine"
                }
            }
        ],
        "profiles": [
            {
                "name": "vector-profile-1",
                "algorithm": "hnsw-1"
            }
        ]
    }
}
Supports all vector data types.
Inclusive of 2024-03-01-preview, with new support for indexing binary data for vector search.
Use the Create or Update Index Preview REST API to define the fields collection of an index.
Add vector fields to the fields collection. You can store one generated embedding per document field. For each vector field:
type can be Collection(Edm.Single), Collection(Edm.Half), Collection(Edm.Int16), Collection(Edm.SByte), or Collection(Edm.Byte) for binary data.
dimensions is the number of dimensions generated by the embedding model. For text-embedding-ada-002, it's 1536.
vectorSearchProfile is the name of a profile defined elsewhere in the index.
searchable must be true.
retrievable can be true or false. True returns the raw vector values (1536 of them) as plain text and consumes storage space. Set to true if you're passing a vector result to a downstream app. False is required if stored is false.
stored is a new boolean property that applies to vector fields only. True stores a copy of vectors returned in search results. False discards that copy during indexing. You can search on vectors, but can't return vectors in results.
filterable, facetable, sortable must be false.
Add filterable nonvector fields to the collection, such as "title" with filterable set to true, if you want to invoke prefiltering or postfiltering on the vector query.
Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
You should also add fields that are useful in the query or in its response. For example, vector fields such as "titleVector" and "contentVector" are typically paired with their textual equivalents ("title", "content"), which are useful for sorting, filtering, and reading in a search result. The fields collection example for this API version instead shows one vector field per data type.
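The attribute rules listed for vector fields lend themselves to a quick local check before the schema is submitted. The sketch below is illustrative only: `check_vector_field` is a hypothetical helper, not an SDK function, and it encodes only the rules stated in this section (searchable must be true; filterable, facetable, and sortable must be false; retrievable must be false when stored is false; a profile must be assigned).

```python
def check_vector_field(field: dict) -> list:
    """Return rule violations for one vector field definition (empty if none)."""
    problems = []
    name = field.get("name", "<unnamed>")
    if field.get("searchable") is not True:
        problems.append(f"{name}: searchable must be true")
    for attribute in ("filterable", "facetable", "sortable"):
        if field.get(attribute):
            problems.append(f"{name}: {attribute} must be false")
    # Vectors that aren't stored can't be returned in results.
    if field.get("stored") is False and field.get("retrievable"):
        problems.append(f"{name}: retrievable must be false when stored is false")
    if not field.get("vectorSearchProfile"):
        problems.append(f"{name}: vectorSearchProfile is missing")
    return problems


field = {
    "name": "firstVectorfield-float32-embeddings",
    "type": "Collection(Edm.Single)",
    "searchable": True,
    "retrievable": False,
    "stored": False,
    "dimensions": 1536,
    "vectorSearchProfile": "vector-profile-1",
}
print(check_vector_field(field))  # []
```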
The following example shows the fields collection:
PUT https://my-search-service.search.windows.net/indexes/my-index?api-version=2024-05-01-preview&allowIndexDowntime=true
Content-Type: application/json
api-key: {{admin-api-key}}

{
    "name": "{{index-name}}",
    "fields": [
        {
            "name": "id",
            "type": "Edm.String",
            "key": true,
            "filterable": true
        },
        {
            "name": "firstVectorfield-float32-embeddings",
            "type": "Collection(Edm.Single)",
            "searchable": true,
            "retrievable": false,
            "stored": false,
            "dimensions": 1536,
            "vectorSearchProfile": "vector-profile-1"
        },
        {
            "name": "secondVectorfield-float16-embeddings",
            "type": "Collection(Edm.Half)",
            "searchable": true,
            "retrievable": false,
            "stored": false,
            "dimensions": 1536,
            "vectorSearchProfile": "vector-profile-1"
        },
        {
            "name": "thirdVectorfield-int8-embeddings-for-my-custom-quantization-output",
            "type": "Collection(Edm.SByte)",
            "searchable": true,
            "retrievable": false,
            "stored": false,
            "dimensions": 1536,
            "vectorSearchProfile": "vector-profile-1"
        },
        {
            "name": "fourthVectorfield-for-binary-data",
            "type": "Collection(Edm.Byte)",
            "searchable": true,
            "retrievable": false,
            "stored": false,
            "dimensions": 1536,
            "vectorSearchProfile": "vector-profile-1"
        }
    ],
    "vectorSearch": {
        "algorithms": [
            {
                "name": "hnsw-1",
                "kind": "hnsw",
                "hnswParameters": {
                    "m": 4,
                    "efConstruction": 400,
                    "efSearch": 500,
                    "metric": "cosine"
                }
            }
        ],
        "profiles": [
            {
                "name": "vector-profile-1",
                "algorithm": "hnsw-1"
            }
        ]
    }
}
Load vector data for indexing
Content that you provide for indexing must conform to the index schema and include a unique string value for the document key. Prevectorized data is loaded into one or more vector fields, which can coexist with other fields containing alphanumeric content.
You can use either push or pull methodologies for data ingestion.
Push APIs
Pull APIs (indexers)
Use Documents - Index to load vector and nonvector data into an index. The push APIs for indexing are identical across all stable and preview versions. Use any of the following APIs to load documents:
2024-07-01
2024-05-01-preview
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/index?api-version=2024-07-01

{
    "value": [
        {
            "id": "1",
            "title": "Azure App Service",
            "content": "Azure App Service is a fully managed platform for building, deploying, and scaling web apps. You can host web apps, mobile app backends, and RESTful APIs. It supports a variety of programming languages and frameworks, such as .NET, Java, Node.js, Python, and PHP. The service offers built-in auto-scaling and load balancing capabilities. It also provides integration with other Azure services, such as Azure DevOps, GitHub, and Bitbucket.",
            "category": "Web",
            "titleVector": [
                -0.02250031754374504,
                . . .
            ],
            "contentVector": [
                -0.024740582332015038,
                . . .
            ],
            "@search.action": "upload"
        },
        {
            "id": "2",
            "title": "Azure Functions",
            "content": "Azure Functions is a serverless compute service that enables you to run code on-demand without having to manage infrastructure. It allows you to build and deploy event-driven applications that automatically scale with your workload. Functions support various languages, including C#, F#, Node.js, Python, and Java. It offers a variety of triggers and bindings to integrate with other Azure services and external services. You only pay for the compute time you consume.",
            "category": "Compute",
            "titleVector": [
                -0.020159931853413582,
                . . .
            ],
            "contentVector": [
                -0.02780858241021633,
                . . .
            ],
            "@search.action": "upload"
        },
        . . .
    ]
}
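When documents are assembled in code, the request body for the push API can be built from plain dictionaries. The sketch below produces the same "value" envelope with an upload action on each document. The helper name `build_upload_batch` is hypothetical, the three-element vector is placeholder data rather than a real embedding, and sending the body (for example with an HTTP client or an Azure SDK) is left out.

```python
import json


def build_upload_batch(documents: list) -> str:
    """Wrap documents in the {"value": [...]} envelope, marking each one for upload.

    Any "@search.action" already present on a document is overwritten.
    """
    value = [{**doc, "@search.action": "upload"} for doc in documents]
    return json.dumps({"value": value})


body = build_upload_batch([
    {
        "id": "1",
        "title": "Azure App Service",
        "category": "Web",
        "titleVector": [0.1, 0.2, 0.3],  # placeholder values, not a real embedding
    },
])
print(body)
```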
Pull APIs refer to indexers, which automate multiple indexing steps, from data retrieval and refresh, to integrated vectorization that encodes content for vector search.
Data sources must be a supported type.
Skillsets provide the Text Split skill for data chunking, plus skills that connect to embedding models. A few are generally available, others are still in preview. Skills and vectorizers are used to generate embeddings. The skill you choose for indexing should be paired with an equivalent vectorizer for queries. For vectorization during indexing, choose from the following skills:
AzureOpenAIEmbedding skill
Custom Web API skill
Azure AI Vision multimodal embeddings skill (preview)
AML skill (preview) to generate embeddings for models hosted in the Azure AI Studio model catalog. See How to implement integrated vectorization using models from Azure AI Studio for details.
Indexes provide the vector field definitions and vector search configurations. Those definitions are described in this article.
Indexers drive the indexing pipeline. For more information, see Create an indexer.
If you're familiar with indexers and skillsets:
Field mappings, output field mappings, and deletion detection settings apply to vector and nonvector fields equally.
If vector data is sourced in files, we recommend a nondefault parsingMode such as json, jsonLines, or csv based on the shape of the data.
For data sources, Azure blob indexers and Azure Cosmos DB for NoSQL indexers with one of the aforementioned parsingModes have been tested and confirmed to work.
The dimensions of all vectors from the data source must be the same and match their index definition for the field they're mapping to. The indexer throws an error on any documents that don’t match.
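Because a dimension mismatch causes an indexing error, it can be worth checking source documents before an indexer run. The following is a minimal sketch under stated assumptions: documents are already loaded as Python dictionaries, the key field is named "id", and `find_dimension_mismatches` is a hypothetical helper rather than an SDK function. Documents that lack the vector field are skipped, because how they're handled isn't covered here.

```python
def find_dimension_mismatches(documents, vector_field, expected_dimensions, key_field="id"):
    """Return the keys of documents whose vector length differs from the index definition."""
    mismatched = []
    for doc in documents:
        vector = doc.get(vector_field)
        # Only vectors that are present are compared against the expected length.
        if vector is not None and len(vector) != expected_dimensions:
            mismatched.append(doc[key_field])
    return mismatched


docs = [
    {"id": "1", "contentVector": [0.0] * 1536},
    {"id": "2", "contentVector": [0.0] * 768},
    {"id": "3", "title": "No vector on this document"},
]
print(find_dimension_mismatches(docs, "contentVector", 1536))  # ['2']
```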
Check your index for vector content
For validation purposes, you can query the index using Search Explorer in the Azure portal or a REST API call. Because Azure AI Search can't convert a vector to human-readable text, try to return fields from the same document that provide evidence of the match. For example, if the vector query targets the "titleVector" field, you could select "title" for the search results.
Fields must be attributed as "retrievable" to be included in the results.
Azure portal
REST API
Review the indexes in Search management > Indexes to view index size all-up and vector index size. A positive vector index size indicates vectors are present.
Use Search Explorer to query an index. Search Explorer has two views: Query view (default) and JSON view.
Set Query options > Hide vector values in search results for more readable results.
Use the JSON view for vector queries. You can either paste in a JSON definition of the vector query you want to execute, or use the built-in text-to-vector or image-to-vector conversion if your index has a vectorizer assignment. For more information about image search, see Quickstart: Search for images in Search Explorer.
Use the default Query view for a quick confirmation that the index contains vectors. The query view is for full text search. Although you can't use it for vector queries, you can send an empty search (search=*) to check for content. The content of all fields, including vector fields, is returned as plain text.
See Create a vector query for more details.
The following REST API example is a vector query, but it returns only nonvector fields (title, content, category). Only fields marked as "retrievable" can be returned in search results.
POST https://my-search-service.search.windows.net/indexes/my-index/docs/search?api-version=2024-07-01
Content-Type: application/json
api-key: {{admin-api-key}}

{
    "vectorQueries": [
        {
            "kind": "vector",
            "vector": [
                -0.009154141,
                0.018708462,
                . . .
                -0.02178128,
                -0.00086512347
            ],
            "fields": "contentVector",
            "k": 5
        }
    ],
    "select": "title, content, category"
}
Update a vector store
To update a vector store, modify the schema and if necessary, reload documents to populate new fields. APIs for schema updates include Create or Update Index (REST), CreateOrUpdateIndex in the Azure SDK for .NET, create_or_update_index in the Azure SDK for Python, and similar methods in other Azure SDKs.
The standard guidance for updating an index is covered in Update or rebuild an index.
Key points include:
Drop and rebuild is often required for updates to and deletion of existing fields.
However, you can update an existing schema with the following modifications, with no rebuild required:
Add new fields to a fields collection.
Add new vector configurations, assigned to new fields but not existing fields that have already been vectorized.
Change "retrievable" (values are true or false) on an existing field. Vector fields must be searchable and retrievable, but if you want to disable access to a vector field in situations where drop and rebuild isn't feasible, you can set retrievable to false.
Next steps
As a next step, we recommend Query vector data in a search index.
Code samples in the azure-search-vector-samples repository demonstrate end-to-end workflows that include schema definition, vectorization, indexing, and queries.
There's demo code for Python, C#, and JavaScript.