Processing batches of addresses with Arbiter


1. Overview of the batch API

Arbiter is designed to allow you to efficiently analyse large numbers of addresses. To this end, the API allows you to upload large CSV files of data which will be processed asynchronously.

The process of using the API is as follows:

  1. The client uploads a CSV file to the files API
  2. The client creates a new Arbiter address batch, providing the id of the file object from the previous step
  3. Arbiter will begin processing the batch immediately, and the client may poll the batch's API endpoint to check on its status.
  4. After the batch has finished processing, the batch object will contain a link where the results may be downloaded. The results are compiled as another CSV file.

For more detail, see the sequence diagram below:

[Sequence diagram of the batch workflow; image not available]

Please consult the linked API documentation to see the details of each request and response format. If you are unable to see the Arbiter-specific resources in your API documentation, ensure you are signed in to your Observer account, or contact support for assistance.

2. Creating a file upload

Our API typically follows the JSON:API spec, but uploading large files is an exception. To make uploads convenient, the files endpoint instead accepts a multipart/form-data request containing two parameters:

  • file is the file content itself
  • purpose is the reason for the file upload, and must be set to arbiter_addresses for the purposes of this API

Here is an example of uploading a file using curl:

curl -fLsS -XPOST https://api.getpylon.com/v1/files \
    --header "Authorization: Bearer $API_TOKEN" \
    --header "Accept: application/vnd.api+json" \
    --form "file=@/path/to/file" \
    --form "purpose=arbiter_addresses"

The address file must be a CSV file with the following format:

address,latitude,longitude
Example address 1,-33.871864,151.120888
Example address 2,-32.168909,115.862132

The column order is not significant. Values may be wrapped in double quotes (") to preserve their content, for example when an address itself contains a comma.
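The quoting rule above can be illustrated with a small shell sketch. The helper names csv_field and csv_row are hypothetical, and the handling of embedded double quotes (doubling them, as in RFC 4180) is an assumption, since this guide does not say how Arbiter expects them to be escaped.

```shell
# csv_field VALUE
# Wraps VALUE in double quotes so embedded commas stay inside one column.
# ASSUMPTION: embedded double quotes are escaped by doubling them (RFC 4180).
csv_field() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# csv_row ADDRESS LATITUDE LONGITUDE
# Prints one data row in the column order address,latitude,longitude.
csv_row() {
    printf '%s,%s,%s\n' "$(csv_field "$1")" "$2" "$3"
}

# Example (not run here): build a file with a header and one row.
#   { echo 'address,latitude,longitude'
#     csv_row 'Unit 1, Example address' -33.871864 151.120888
#   } > addresses.csv
```

Only the address is quoted here because the coordinates are plain numbers.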

Read more: File API documentation

3. Creating an address batch

The "address batch" is a record of the processing state of a large chunk of addresses. When creating a batch, you must provide the ID of the file for it to process.

Here is an example of creating an address batch using curl:

curl -fLsS -XPOST https://api.getpylon.com/v1/arbiter_address_batches \
    --header "Authorization: Bearer $API_TOKEN" \
    --header "Accept: application/vnd.api+json" \
    --header "Content-Type: application/vnd.api+json" \
    --data '{
        "data": {
            "attributes": {},
            "relationships": {
                "file": {
                    "data": {
                        "type": "files",
                        "id": "'$FILE_ID'"
                    }
                }
            }
        }
    }'

Note that in this script, $FILE_ID is the ID of the file created in the previous step. The deep nesting of the file relationship is required by the JSON:API relationships spec.
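To connect the upload step to the batch step, a script needs to capture the file ID from the upload response. The following is a rough sketch only: upload_addresses and batch_payload are hypothetical helper names, the location of the ID at data.id is an assumption based on JSON:API conventions rather than on the File API documentation, and jq is assumed to be installed.

```shell
# upload_addresses PATH
# Uploads a CSV file and prints the ID of the created file object.
# ASSUMPTION: the response is a JSON:API document with the ID at data.id.
upload_addresses() {
    curl -fLsS -XPOST https://api.getpylon.com/v1/files \
        --header "Authorization: Bearer $API_TOKEN" \
        --header "Accept: application/vnd.api+json" \
        --form "file=@$1" \
        --form "purpose=arbiter_addresses" |
        jq -er '.data.id'
}

# batch_payload FILE_ID
# Prints the address batch request body shown above, with FILE_ID substituted.
batch_payload() {
    printf '{"data":{"attributes":{},"relationships":{"file":{"data":{"type":"files","id":"%s"}}}}}\n' "$1"
}

# Example (not run here):
#   FILE_ID=$(upload_addresses /path/to/file) &&
#       curl -fLsS -XPOST https://api.getpylon.com/v1/arbiter_address_batches \
#           --header "Authorization: Bearer $API_TOKEN" \
#           --header "Accept: application/vnd.api+json" \
#           --header "Content-Type: application/vnd.api+json" \
#           --data "$(batch_payload "$FILE_ID")"
```

Building the body in a function keeps the shell quoting in one place, at the cost of assuming the ID contains no characters that need JSON escaping.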

Read more: Address batch API documentation

4. Obtaining the result file

Processing large files may take a significant amount of time. Polling the address batch endpoint is currently the only way to find out when the results are ready.

Once the batch has finished processing, its API response will contain a result_download_url property. When the client follows this link, it receives a CSV file served with the Content-Disposition: attachment header.
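The polling and download steps could be sketched as follows. This is a minimal sketch under stated assumptions, not a documented client: poll_until and batch_result_url are hypothetical helper names, the per-batch URL (/v1/arbiter_address_batches/<id>) and the placement of result_download_url under data.attributes are guesses based on JSON:API conventions, and jq is assumed to be installed. Check the Address batch API documentation for the actual response shape.

```shell
# poll_until INTERVAL_SECONDS MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it exits 0, sleeping INTERVAL_SECONDS between attempts.
# Returns 0 on success, or 1 once MAX_ATTEMPTS attempts have failed.
poll_until() {
    interval=$1
    max_attempts=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$max_attempts" ] && return 1
        attempt=$((attempt + 1))
        sleep "$interval"
    done
}

# batch_result_url BATCH_ID
# Prints result_download_url if the batch has one; otherwise prints nothing
# and exits non-zero, so that poll_until tries again.
# ASSUMPTIONS: endpoint path and data.attributes location (see lead-in).
batch_result_url() {
    curl -fLsS "https://api.getpylon.com/v1/arbiter_address_batches/$1" \
        --header "Authorization: Bearer $API_TOKEN" \
        --header "Accept: application/vnd.api+json" |
        jq -er '.data.attributes.result_download_url // empty'
}

# Example (not run here): check every 30 seconds, up to 120 times,
# then save the result CSV.
#   RESULT_URL=$(poll_until 30 120 batch_result_url "$BATCH_ID") &&
#       curl -fLsS -o results.csv "$RESULT_URL"
```

Whether the download link needs the Authorization header is not stated in this guide, so the example omits it. The attempt limit keeps a stalled batch from blocking the script indefinitely.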