2023-06-14

Differentiating Between Upsert and Update in Pinecone

Introduction

Pinecone provides two APIs for changing stored vector data: upsert, which overwrites the entire record (values and metadata), and update, which changes only the fields you specify. This article explains the difference between the two.
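
As a rough mental model of that distinction, the sketch below mimics both operations on a plain Python dictionary. It is a toy model, not the Pinecone client: the `store` dictionary, the two functions, and the choice to raise `KeyError` for a missing id are assumptions made for illustration.

```python
# Toy in-memory model of the two operations (not the Pinecone client itself).
from typing import Dict, List, Optional

Store = Dict[str, dict]


def upsert(store: Store, vector_id: str, values: List[float],
           metadata: Optional[dict] = None) -> None:
    """Full write: replace the whole record, or create it if it is missing."""
    store[vector_id] = {"values": list(values), "metadata": dict(metadata or {})}


def update(store: Store, vector_id: str, values: Optional[List[float]] = None,
           set_metadata: Optional[dict] = None) -> None:
    """Partial write: change only the fields that are passed in."""
    record = store[vector_id]  # assumption of this toy model: the id must exist
    if values is not None:
        record["values"] = list(values)
    if set_metadata is not None:
        # Listed fields are added or overwritten; other fields are kept.
        record["metadata"].update(set_metadata)


store: Store = {}
upsert(store, "1", [0.0, 0.0], {"content": "original", "lang": "en"})

update(store, "1", set_metadata={"content": "edited"})
after_update = dict(store["1"]["metadata"])

upsert(store, "1", [1.0, 1.0], {"content": "replaced"})
after_upsert = dict(store["1"]["metadata"])

print(after_update)
print(after_upsert)
```

Running it prints `{'content': 'edited', 'lang': 'en'}` after the partial update, then `{'content': 'replaced'}` after the full overwrite, which drops the `lang` field because upsert replaces the whole record.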

Creating an Index

Install the Python client and create an index with the following commands and code. Replace API_KEY and ENVIRONMENT with your own values.

bash
$ pip install -U pinecone-client
python
import pinecone

pinecone.init(
    api_key="API_KEY",
    environment="ENVIRONMENT"
)

if "sample" not in pinecone.list_indexes():
    pinecone.create_index("sample", dimension=1536)

index = pinecone.Index("sample")

Adding Data

Register a vector with upsert, then fetch it to confirm that the metadata was stored.

python
index.upsert(
    vectors=[{
        "id": "1",
        "values": [0.0] * 1536,
        "metadata": {
            "content": "This is a sample vector"
        }
    }],
    namespace="my_namespace"
)

res = index.fetch(
    ids=["1"],
    namespace="my_namespace"
)

print(res["vectors"]["1"]["metadata"])
{'content': 'This is a sample vector'}
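
Because the index in this article is created with dimension=1536, every vector sent to it needs exactly that many values. The following is a minimal sketch of a local check that could run before calling upsert. The helper name `make_record` and its parameters are hypothetical and not part of the Pinecone client.

```python
from typing import Dict, List, Optional


def make_record(vector_id: str, values: List[float],
                metadata: Optional[Dict[str, str]] = None,
                dimension: int = 1536) -> dict:
    """Build one upsert record, checking the vector length locally first."""
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(values)}")
    record = {"id": vector_id, "values": list(values)}
    if metadata:
        record["metadata"] = dict(metadata)
    return record


record = make_record("1", [0.0] * 1536, {"content": "This is a sample vector"})
# The result could then be passed as:
#   index.upsert(vectors=[record], namespace="my_namespace")
```

Checking the length locally is only a convenience: it surfaces a mismatched vector before any request is sent.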

Updating Data

To change an existing vector, pass its id to either upsert or update. With upsert, the whole record (values and metadata) is sent again and overwritten:

python
index.upsert(
    vectors=[{
        "id": "1",
        "values": [0.0] * 1536,
        "metadata": {
            "content": "Updated"
        }
    }],
    namespace="my_namespace"
)

res = index.fetch(
    ids=["1"],
    namespace="my_namespace"
)

print(res["vectors"]["1"]["metadata"])
{'content': 'Updated'}

Creating vectors consumes computing resources. When the vector values do not need to be regenerated, use update to change only the metadata:

python
index.update(
    id="1",
    set_metadata={
        "content": "Only metadata updated"
    },
    namespace="my_namespace"
)

res = index.fetch(
    ids=["1"],
    namespace="my_namespace"
)

print(res["vectors"]["1"]["metadata"])
{'content': 'Only metadata updated'}
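
The advice in this section can be sketched as a small helper that chooses the call for you: update when no new values are supplied, upsert otherwise. The names `write_vector` and `RecordingIndex` are hypothetical. The sketch assumes the Python client's `set_metadata` keyword for update, and it uses a stand-in object so it can run without a Pinecone account.

```python
from typing import Any, List, Optional


def write_vector(index: Any, vector_id: str, namespace: str,
                 values: Optional[List[float]] = None,
                 metadata: Optional[dict] = None) -> str:
    """Send a partial update when no new values are given, else a full upsert."""
    if values is None:
        if metadata is None:
            raise ValueError("nothing to write: pass values, metadata, or both")
        # Metadata-only change: the stored values are left untouched.
        index.update(id=vector_id, set_metadata=metadata, namespace=namespace)
        return "update"
    # New values: resend the whole record, which overwrites the stored one.
    record = {"id": vector_id, "values": list(values)}
    if metadata is not None:
        record["metadata"] = metadata
    index.upsert(vectors=[record], namespace=namespace)
    return "upsert"


class RecordingIndex:
    """Hypothetical stand-in that records calls instead of contacting Pinecone."""

    def __init__(self) -> None:
        self.calls = []

    def update(self, **kwargs) -> None:
        self.calls.append(("update", kwargs))

    def upsert(self, **kwargs) -> None:
        self.calls.append(("upsert", kwargs))


fake = RecordingIndex()
first = write_vector(fake, "1", "my_namespace",
                     metadata={"content": "Only metadata updated"})
second = write_vector(fake, "1", "my_namespace",
                      values=[0.0] * 1536, metadata={"content": "Updated"})
print(first, second)
```

With a real index object in place of `RecordingIndex`, the same helper would issue the two calls shown in this article. Note that the two paths are not interchangeable for metadata: the upsert path replaces all stored metadata, while the update path changes only the listed fields.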

Ryusei Kakujo

