2023-06-14

When to Use Upsert vs. Update in Pinecone

Introduction

Pinecone provides two APIs for modifying vector data: upsert, which overwrites an entire record, and update, which changes only part of a record. This article explains when to use each.
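To make the difference concrete, here is a toy in-memory model of the two operations. It is only a sketch of the semantics, not the Pinecone client: the `store` dict and the `upsert`/`update` functions below are hypothetical stand-ins written for illustration.

```python
# Toy in-memory stand-in for an index; this is NOT the Pinecone client.
store = {}


def upsert(record):
    # Full write: the whole record (values and metadata) is replaced.
    store[record["id"]] = {
        "values": list(record["values"]),
        "metadata": dict(record.get("metadata", {})),
    }


def update(vector_id, values=None, set_metadata=None):
    # Partial write: only the parts that are supplied change.
    record = store[vector_id]
    if values is not None:
        record["values"] = list(values)
    if set_metadata is not None:
        record["metadata"].update(set_metadata)


upsert({"id": "1", "values": [0.1, 0.2],
        "metadata": {"content": "original", "lang": "en"}})

# update touches only the named metadata field; "lang" survives.
update("1", set_metadata={"content": "edited"})
after_update = dict(store["1"]["metadata"])

# upsert replaces the record; metadata fields not resent are gone.
upsert({"id": "1", "values": [0.1, 0.2], "metadata": {"content": "replaced"}})
after_upsert = dict(store["1"]["metadata"])

print(after_update)  # {'content': 'edited', 'lang': 'en'}
print(after_upsert)  # {'content': 'replaced'}
```

Note that the second upsert had to resend the vector values even though only the metadata changed, which is the cost the rest of this article is about.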

Creating an index

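Run the following command and code to create a Pinecone index. The code in this article uses the pinecone.init interface of the 2.x pinecone-client, so the install command pins that version range.

bash

$ pip install -U "pinecone-client<3"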

python

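import pinecone

# Connect to Pinecone; replace the placeholders with your own values.
pinecone.init(
    api_key="API_KEY",
    environment="ENVIRONMENT"
)

# Create the index only if it does not exist yet.
if "sample" not in pinecone.list_indexes():
    pinecone.create_index("sample", dimension=1536)

# Handle for data operations on the index.
index = pinecone.Index("sample")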

Adding data

Register a vector with upsert, then fetch it to confirm what was stored.

python

index.upsert(
    vectors=[{
        "id": "1",
        "values": [0.0] * 1536,
        "metadata": {
            "content": "This is a sample vector"
        }
    }],
    namespace='my_namespace'
)

res = index.fetch(
    ids=["1"],
    namespace="my_namespace"
)

print(res["vectors"]["1"]["metadata"])

{'content': 'This is a sample vector'}

Updating data

To update data, call either the update or the upsert API with the id of the target vector. With upsert, it looks like this:

python

index.upsert(
    vectors=[{
        "id": "1",
        "values": [0.0] * 1536,
        "metadata": {
            "content": "Updated"
        }
    }],
    namespace="my_namespace"
)

res = index.fetch(
    ids=["1"],
    namespace="my_namespace"
)

print(res["vectors"]["1"]["metadata"])

{'content': 'Updated'}

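Because upsert overwrites the whole record, it requires the vector values to be sent again, and generating those values costs compute resources. When the vector itself does not need to be regenerated, it is better to use update and change only the metadata.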
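As a minimal sketch of the metadata-only path: in the 2.x client, index.update accepts id, set_metadata, and namespace. The helper name update_metadata_only is hypothetical and exists only for illustration; index is assumed to be the pinecone.Index object created earlier in this article.

```python
def update_metadata_only(index, vector_id, metadata, namespace):
    """Change metadata fields of an existing vector without resending its values.

    `index` is assumed to be a pinecone.Index (2.x client). This helper is
    illustrative and not part of the Pinecone library.
    """
    # set_metadata adds or overwrites only the listed fields;
    # the stored vector values are left untouched.
    return index.update(
        id=vector_id,
        set_metadata=metadata,
        namespace=namespace,
    )


# Usage with the index from this article (requires a live Pinecone index):
# update_metadata_only(index, "1", {"content": "Updated"}, "my_namespace")
```

Compared with the upsert call above, no values list is passed, so nothing has to be re-embedded to change the content field.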

Ryusei Kakujo