1. Get Your API Key

Go to this page and click the New API Key button. Copy and save the key to a secure place.

2. Create a Dataset

Before you can start your job, you need to create a dataset. You can do that with the Upsert Dataset endpoint. Because you are creating a new dataset, be sure to pass id_column and name in the request body.

Make sure to take note of the dataset_id in the response as you'll need it for starting a job with the API. You can always get the dataset_id by using the Get Datasets endpoint.

If you want to match between two datasets, create both datasets and make a note of each dataset_id.
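Here's a minimal sketch of what creating a dataset could look like in Python. The base URL and Bearer header match the pagination example in step 4, but the /datasets path and the assumption that dataset_id is a top-level field in the response are hypothetical, so check the Upsert Dataset reference for the real values. The http_post argument is any function that behaves like requests.post.

```python
from typing import Any, Callable

# Hypothetical path: this guide names the Upsert Dataset endpoint but not its URL.
UPSERT_DATASET_URL = "https://app.getaugerdata.com/api/v1/datasets"


def create_dataset(
    http_post: Callable[..., Any], api_key: str, name: str, id_column: str
) -> str:
    """Create a new dataset and return its dataset_id."""
    response = http_post(
        UPSERT_DATASET_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        # name and id_column are required when creating a new dataset.
        json={"name": name, "id_column": id_column},
    )
    response.raise_for_status()
    # Assumption: dataset_id is a top-level field in the JSON response.
    return response.json()["dataset_id"]
```

With the requests library installed, you could call it as create_dataset(requests.post, "xxxxx", "customers", "customer_id"), where the dataset name and column name are made-up examples.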

3. Start a Match Job

Once your datasets have been created, it’s time to kick off a match job. You can do that with the Start Match Job endpoint. Because you are creating a new match job, be sure to pass in the base_dataset_id, match_dataset_id, and name parameters in the body.

The base_dataset and match_dataset can be the same if you want to match records within a single dataset. Otherwise, for every record in the base_dataset, the system will try to find matching records in the match_dataset.
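As a rough sketch, starting a match job could look like the following in Python. The /match path is hypothetical (this guide names the Start Match Job endpoint but not its URL), and so is the assumption that job_id is a top-level field in the response. The http_post argument is any function that behaves like requests.post. Leaving match_dataset_id out reuses the base dataset, which matches records within a single dataset.

```python
from typing import Any, Callable, Optional

# Hypothetical path: this guide names the Start Match Job endpoint but not its URL.
START_MATCH_JOB_URL = "https://app.getaugerdata.com/api/v1/match"


def start_match_job(
    http_post: Callable[..., Any],
    api_key: str,
    name: str,
    base_dataset_id: str,
    match_dataset_id: Optional[str] = None,
) -> str:
    """Start a match job and return its job_id."""
    if match_dataset_id is None:
        # Same dataset on both sides: match records within one dataset.
        match_dataset_id = base_dataset_id
    response = http_post(
        START_MATCH_JOB_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "name": name,
            "base_dataset_id": base_dataset_id,
            "match_dataset_id": match_dataset_id,
        },
    )
    response.raise_for_status()
    # Assumption: job_id is a top-level field in the JSON response.
    return response.json()["job_id"]
```

With the requests library installed, you would pass requests.post as http_post along with the dataset_id values you noted in step 2.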

Since matching can take a while, the system returns a response that contains the job_id, which you can use to poll for the status of the job with the Get Match Job Status endpoint. Alternatively, you can pass a callback_url in the Start Match Job request, and the system will send a POST request to that URL once the job is complete.
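If you go the polling route, a loop along these lines is one way to do it. This is only a sketch: the status URL is hypothetical, and because this guide doesn't describe the status response, the caller supplies an is_done function that decides when the job has finished. The http_get argument is any function that behaves like requests.get.

```python
import time
from typing import Any, Callable

# Hypothetical path: this guide names the Get Match Job Status endpoint but not its URL.
JOB_STATUS_URL = "https://app.getaugerdata.com/api/v1/match/status/{job_id}"


def wait_for_job(
    http_get: Callable[..., Any],
    api_key: str,
    job_id: str,
    is_done: Callable[[dict], bool],
    poll_seconds: float = 10.0,
    max_polls: int = 360,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the job status until is_done(status) is true, then return that status."""
    url = JOB_STATUS_URL.format(job_id=job_id)
    for attempt in range(max_polls):
        if attempt > 0:
            # Wait between checks, but not before the first one.
            sleep(poll_seconds)
        response = http_get(url, headers={"Authorization": f"Bearer {api_key}"})
        response.raise_for_status()
        status = response.json()
        if is_done(status):
            return status
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} status checks")
```

For example, if the status response turned out to have a status field that becomes "complete" (an assumption, not something this guide confirms), you could pass is_done=lambda s: s.get("status") == "complete". The max_polls limit keeps a stuck job from looping forever.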

4. Retrieve Your Matches

Once a match job is complete, you can call the Get All Matches endpoint with the job_id from the previous step.

When you retrieve results with the API, the response is paginated. Here's an example of how you can handle that in Python to retrieve all of the results:

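import requests

API_KEY = "xxxxx"  # the API key from step 1
job_type = "<job_type>"  # placeholder: the job type segment of the endpoint path
job_id = "<job_id>"  # placeholder: the job_id returned when you started the job

page = 1
data = []
while page is not None:
    # Request one page of results.
    response = requests.get(
        f"https://app.getaugerdata.com/api/v1/{job_type}/all/{job_id}?page={page}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    response.raise_for_status()
    body = response.json()
    data.extend(body["results"])
    # Move on to the next page, or stop once the last page has been read.
    page = page + 1 if page < body["pagination"]["total_pages"] else None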