Client

class Client(config=None, profile=None, default_project_id=None)

Client object to interact with the Qualcomm AI Hub API.

Examples

Create a client using credentials from ~/.qai_hub/client.ini:

import qai_hub as hub

client = hub.Client()
model = client.upload_model("model.pt")

Create a client using a named profile from ~/.qai_hub/my_profile.ini:

import qai_hub as hub

client = hub.Client(profile="my_profile")
model = client.upload_model("model.pt")
get_dataset(dataset_id)

Returns a dataset for a given id.

Parameters:

dataset_id (str) – id of a dataset.

Returns:

dataset – The dataset for the id.

Return type:

Dataset

Examples

Get a dataset and print information about it (assuming you provide a valid dataset ID):

import qai_hub as hub
client = hub.Client()

dataset = client.get_dataset("dabc123")
print("Dataset information:", dataset)
get_datasets(offset=0, limit=50)

Returns a list of datasets visible to you.

Parameters:
  • offset (int) – Offset the query to get even older datasets.

  • limit (int) – Maximum number of datasets to return.

Returns:

dataset_list – List of datasets.

Return type:

list[Dataset]

Examples

Fetch Dataset objects for your five most recent datasets:

import qai_hub as hub
client = hub.Client()

datasets = client.get_datasets(limit=5)
get_device_attributes()

Returns the superset of available device attributes.

Any of these attributes can be used to filter devices when using get_devices().

Returns:

attribute_list – Superset of all available device attributes.

Return type:

list[str]

Examples

import qai_hub as hub
client = hub.Client()
attributes = client.get_device_attributes()
get_devices(name='', os='', attributes=[])

Returns a list of available devices.

The returned devices are compatible with the supplied name, os, and attributes. The name must exactly match an existing device, and os can be either a version (“15.2”) or a version range (“[14,15)”).

Parameters:
  • name (str) – Only devices with this exact name will be returned.

  • os (str) – Only devices with an OS version compatible with this os are returned.

  • attributes (str | list[str]) – Only devices that have all requested properties are returned.

Returns:

device_list – List of available devices, compatible with the supplied filters.

Return type:

list[Device]

Examples

import qai_hub as hub
client = hub.Client()

# Get all devices
devices = client.get_devices()

# Get all devices matching this operating system
devices = client.get_devices(os="12")

# Get all devices matching this chipset
devices = client.get_devices(attributes=["chipset:qualcomm-snapdragon-8gen2"])

# Get all devices matching hardware
devices = client.get_devices(name="Samsung Galaxy S23")
get_frameworks()

Returns a list of available ML frameworks.

Returns:

framework_list – List of available frameworks.

Return type:

list[Framework]

Examples

import qai_hub as hub
client = hub.Client()

# Get all frameworks
frameworks = client.get_frameworks()
get_job(job_id, job_type=None)

Returns a job for a given id.

Parameters:
  • job_id (str) – id of a job.

  • job_type (JobType | None) – Type of the job. If this is not None and the target job is not this type, this method will raise.

Returns:

job – The job for the id.

Return type:

Job

Examples

Get a job and print its status. The job ID is an alphanumeric string starting with j that you can get from the job’s URL (/jobs/<job ID>):

import qai_hub as hub
client = hub.Client()

job = client.get_job("jabc123")
status = job.get_status()
get_job_summaries(offset=0, limit=50, creator=None, state=None, type=None)

Returns summary information for jobs matching the specified filters.

Parameters:
  • creator (str | None) – Fetch only jobs created by the specified creator. If unspecified, fetch all jobs owned by your organization.

  • state (State | None | list[State]) – Fetch only jobs that are currently in the specified state(s).

  • type (JobType | None) – Fetch only jobs of the specified type (compile, profile, etc.).

  • limit (int) – Maximum number of jobs to return.

  • offset (int) – How many jobs to skip over (in order to retrieve older jobs).

Returns:

List of job summaries in reverse chronological order (i.e., most recent first).

Return type:

list[JobSummary]

Examples

Print a selection of recent jobs:

import qai_hub as hub
client = hub.Client()

running = client.get_job_summaries(limit=10, state=hub.JobStatus.all_running_states())
failed = client.get_job_summaries(limit=10, state=hub.JobStatus.State.FAILED)
more_failed = client.get_job_summaries(offset=10, limit=10, state=hub.JobStatus.State.FAILED)
for j in running + failed + more_failed:
    print(f"{j.job_id}: {j.name} running since {j.date}: currently {j.status.code}")
get_model(model_id)

Returns a model for a given id.

Parameters:

model_id (str) – id of a model.

Returns:

model – The model for the id.

Return type:

Model

get_models(offset=0, limit=50)

Returns a list of models.

Parameters:
  • offset (int) – Offset the query to get even older models.

  • limit (int) – Maximum number of models to return.

Returns:

model_list – List of models.

Return type:

list[Model]

Examples

Fetch Model objects for your five most recent models:

import qai_hub as hub
client = hub.Client()

models = client.get_models(limit=5)
set_verbose(verbose=True)

If True, API calls may print progress to standard output.

Parameters:

verbose (bool) – Verbosity.

Return type:

None

submit_compile_and_link_jobs(models, device, name=None, input_specs=None, graph_names=None, compile_options='', link_options='', retry=True, project=None)

Compiles and links multiple models, or model(s) with multiple input_specs variants, into a single weight-shared (multi-graph) QNN context binary. To specify multiple input_specs variants, the model must be ONNX with dynamic shapes or TorchScript (.pt).

Parameters:
  • models (Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None, list[Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None]]]) – A list of models. To represent multiple variants for a model, its entries in the list are repeated.

  • device (Device | list[Device]) – Device or list of devices. Results are per-device.

  • input_specs (Union[None, Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]], list[Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]] | None]]) – None | InputSpecs | list[InputSpecs | None]. Each InputSpecs in list corresponds to the model at the same index in models. Mandatory for TorchScript models. Please refer to the following example for usage.

  • graph_names (Optional[list[str]]) – list[str] | None. Graph names are used as keys to access model variants from the generated QNN context binary. If a list of models is provided, graph_names are mandatory. All graph names must be unique. Each graph name in the list corresponds to the model at the same index in models.

  • name (Optional[str]) – Optional name for both Compile and Link jobs. Job names need not be unique.

  • compile_options (str | list[str]) – CLI-like flag options for the compile job. See Compile Options. --target_runtime qnn_dlc is automatically appended (the only supported target_runtime for this API). Can be a single string (broadcast to all input_specs) or a list[str] corresponding to the model at the same index in models. Do not specify a graph name in compile_options; use the graph_names argument instead.

  • link_options (str) – CLI-like flags for the link job. See Link Options. A single string (broadcast to all devices).

  • retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.

Return type:

tuple[list[CompileJob], LinkJob | None] | list[tuple[list[CompileJob], LinkJob | None]]

Returns:

  • If a single device – (list[CompileJob], LinkJob | None)

  • If multiple devices – list[tuple[list[CompileJob], LinkJob | None]]

  • LinkJob is None if any compile job failed.

Constraints / Validation

  • If multiple variants are provided for a model, that model needs to be ONNX with dynamic shapes or TorchScript (.pt).

  • InputSpecs must be provided for TorchScript models.

  • Number of models, input_specs variants, compile_options variants, and graph_names must match.

  • All graph names must be unique.

  • Do not specify a graph name in compile_options; use the graph_names argument instead.

  • --target_runtime flag in compile_options is auto-set to --target_runtime qnn_dlc.

Examples

Submit two models with multiple I/O spec variants for compilation and linking:

import torch
import numpy as np
import qai_hub as hub

client = hub.Client()
pt_model1 = torch.jit.load("encoder.pt")
pt_model2 = torch.jit.load("decoder.pt")

input_specs1 = [
    {"x": ((1, 3, 224, 224), "float32")},
    {"x": ((1, 3, 192, 192), "float32")},
]
# Compile options are repeated to match the number of model input_specs variants
# Each input_spec can have its own compile options
compile_options1 = ["--force_channel_last_input x --quantize_io"] * 2

input_specs2 = [
    {"x": ((1, 3, 224, 224), "float32")},
    {"x": ((1, 3, 192, 192), "float32")},
    {"x": ((1, 3, 160, 160), "float32")},
]
compile_options2 = ["--qnn_options default_graph_htp_precision=FLOAT16"] * 3

# Model entries in list are repeated to match their respective number of input_specs variants
models = [pt_model1, pt_model1, pt_model2, pt_model2, pt_model2]

jobs = client.submit_compile_and_link_jobs(
    models,
    device=hub.Device("Samsung Galaxy S23"),
    name="encoder + decoder",
    input_specs=[*input_specs1, *input_specs2],
    graph_names=[
        "encoder_224", "encoder_192",
        "decoder_224", "decoder_192", "decoder_160",
    ],
    compile_options=[*compile_options1, *compile_options2],
    link_options="--qnn_options default_graph_htp_optimizations=O=3",
)
submit_compile_and_profile_jobs(model, device, name=None, input_specs=None, compile_options='', profile_options='', single_compile=True, calibration_data=None, retry=True, project=None)

Submits a compilation job and a profile job.

Parameters:
  • model (Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None]) – Model to compile and profile.

  • device (Device | list[Device]) – Devices on which to run the jobs.

  • name (Optional[str]) – Optional name for both the jobs. Job names need not be unique.

  • input_specs (Optional[Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]]]) –

    Required if model is a PyTorch model. Keys in the dict (which is ordered in Python 3.7+) define the input names for the target model (e.g., TFLite model) created by this job, and may differ from the names in the PyTorch model.

    An input shape can either be a tuple[int, …], e.g. (1, 2, 3), or a tuple[tuple[int, …], str], e.g. ((1, 2, 3), “int32”). The latter form specifies the type of the input; if a type is not specified, it defaults to “float32”. Currently, only “float32”, “int8”, “int16”, “int32”, “int64”, “uint8”, and “uint16” are accepted types.

    For example, a PyTorch module with forward(self, x, y) may have input_specs=dict(a=(1,2), b=(1, 3)). When using the resulting target model (e.g., a TFLite model) from this job, the inputs must have keys a and b, not x and y. Similarly, if this target model is used in an inference job (see submit_inference_job()), the dataset must have entries a, b in this order, not x, y.

    If model is an ONNX model, input_specs are optional. input_specs can be used to overwrite the model’s input names and the dynamic extents for the input shapes. If input_specs is not None, it must be compatible with the model, or the server will return an error.

  • single_compile (bool) –

    If True, create a single compile job on a single device compatible with all devices. The CompileJob in every tuple in the returned list will point to the same AI Hub job.

    If False, create a compile job for each device.

  • compile_options (str) – CLI-like flag options for the compile job. See Compile Options.

  • profile_options (str) – CLI-like flag options for the profile job. See Profile & Inference Options.

  • calibration_data (Union[Dataset, Mapping[str, list[ndarray]], str, None]) – Data, Dataset, or Dataset ID to use for post-training quantization. PTQ will be applied to the model during translation.

  • retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.

Returns:

jobs – A tuple of (CompileJob, ProfileJob) if a single device is supplied; otherwise a list of such tuples, one per device.

Return type:

tuple[CompileJob, ProfileJob | None] | list[tuple[CompileJob, ProfileJob | None]]

Examples

Submit a traced Torch model for profiling as a QNN DLC on a Samsung Galaxy S23:

import qai_hub as hub
import torch

client = hub.Client()
pt_model = torch.jit.load("mobilenet.pt")

input_shapes = (1, 3, 224, 224)

model = client.upload_model(pt_model)

jobs = client.submit_compile_and_profile_jobs(
    model, device=hub.Device("Samsung Galaxy S23"),
    name="mobilenet (1, 3, 224, 224)",
    input_specs=dict(x=input_shapes),
    compile_options="--target_runtime qnn_dlc"
)

For more examples, see Compiling Models and Profiling Models.

submit_compile_job(model, device, name=None, input_specs=None, options='', single_compile=True, calibration_data=None, retry=True, project=None)

Submits a compile job.

Parameters:
  • model (Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None]) – Model to compile. The model must be a PyTorch or an ONNX / ONNX wrappable model (eg: QNN Context Binary).

  • device (Device | list[Device]) – Devices for which to compile the input model.

  • name (Optional[str]) – Optional name for the job. Job names need not be unique.

  • input_specs (Optional[Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]]]) –

    Required if model is a PyTorch model. Keys in the dict (which is ordered in Python 3.7+) define the input names for the target model (e.g., TFLite model) created from this compile job, and may differ from the names in the PyTorch model.

    An input shape can either be a tuple[int, …], e.g. (1, 2, 3), or a tuple[tuple[int, …], str], e.g. ((1, 2, 3), “int32”). The latter form specifies the type of the input; if a type is not specified, it defaults to “float32”. Currently, only “float32”, “int8”, “int16”, “int32”, “int64”, “uint8”, and “uint16” are accepted types.

    For example, a PyTorch module with forward(self, x, y) may have input_specs=dict(a=(1,2), b=(1, 3)). When using the resulting target model (e.g., a TFLite model) from this compile job, the inputs must have keys a and b, not x and y. Similarly, if this target model is used in an inference job (see submit_inference_job()), the dataset must have entries a, b in this order, not x, y.

    If model is an ONNX model, input_specs are optional. input_specs can be used to overwrite the model’s input names and the dynamic extents for the input shapes. If input_specs is not None, it must be compatible with the model, or the server will return an error.

  • options (str) – CLI-like flag options. See Compile Options.

  • single_compile (bool) – If True, submits a single compile job that creates an asset compatible with all devices. If False, create a compile job for each device.

  • calibration_data (Union[Dataset, Mapping[str, list[ndarray]], str, None]) – Data, Dataset, or Dataset ID to use for post-training quantization. PTQ will be applied to the model during translation.

  • retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.

Returns:

job – Returns the compile job(s): always a single job if single_compile is True, and possibly multiple jobs if it is False.

Return type:

CompileJob | list[CompileJob]

Examples

Submit a traced Torch model for compilation on a Samsung Galaxy S23:

import qai_hub as hub
import torch

client = hub.Client()
pt_model = torch.jit.load("mobilenet.pt")

input_shape = (1, 3, 224, 224)

model = client.upload_model(pt_model)

job = client.submit_compile_job(model, device=hub.Device("Samsung Galaxy S23"),
                                name="mobilenet (1, 3, 224, 224)",
                                input_specs=dict(x=input_shape))

For more examples, see Compiling Models.

submit_inference_job(model, device, inputs, name=None, options='', retry=True, project=None)

Submits an inference job.

Parameters:
  • model (Model | bytes | str | Path | None) – Model to run inference with. Must be one of the following: (1) a Model object from a compile job, via get_target_model(); (2) any TargetModel; (3) a path to any TargetModel.

  • device (Device | list[Device]) – Devices on which to run the job.

  • inputs (Dataset | Mapping[str, list[ndarray]] | str) –

    If a Dataset, it must have a schema matching the model. For example, if model is a target model from a compile job that was submitted with input_specs=dict(a=(1, 2), b=(1, 3)), the dataset must also be created with dict(a=<list_of_np_array>, b=<list_of_np_array>). See submit_compile_job() for details.

    If a dict, it is uploaded as a new Dataset, equivalent to calling upload_dataset() with an arbitrary name. Note that dicts are ordered in Python 3.7+, and we rely on the order to match the schema. See the paragraph above for an example.

    If a str, it is a path to an h5 file containing a saved Dataset.

  • name (Optional[str]) – Optional name for the job. Job names need not be unique.

  • options (str) – CLI-like flag options. See Profile & Inference Options.

  • retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.

Returns:

job – Returns the inference jobs.

Return type:

InferenceJob | list[InferenceJob]

Examples

Submit a TFLite model for inference on a Samsung Galaxy S23:

import qai_hub as hub
import numpy as np

client = hub.Client()

# TFLite model path
tflite_model = "squeeze_net.tflite"

# Setup input data
input_tensor = np.random.random((1, 3, 227, 227)).astype(np.float32)

# Submit inference job
job = client.submit_inference_job(
    tflite_model,
    device=hub.Device("Samsung Galaxy S23"),
    name="squeeze_net (1, 3, 227, 227)",
    inputs=dict(image=[input_tensor]),
)

# Load the output data into a dictionary of numpy arrays
output_tensors = job.download_output_data()

For more examples, see Running Inference.

submit_link_job(models, name=None, options='', retry=True, project=None)

Submits a link job.

A link job generates a context binary model from one or more input models. The input models must all be QNN DLC models. This is particularly useful if the input models contain overlapping weights, since the weights will be shared between the graphs.

To profile or run inference on a multi-graph QNN context binary, use --qnn_options context_enable_graphs=<graph name> to select the graph.

Parameters:
  • models (Model | str | Path | None | list[Model | str | Path | None]) – Models to link. Each model in the list must be a QNN DLC model.

  • name (Optional[str]) – Optional name for the job. Job names need not be unique.

  • options (str) – CLI-like flag options. See Link Options.

Returns:

job – Returns the link job.

Return type:

LinkJob

submit_profile_job(model, device, name=None, options='', retry=True, project=None)

Submits a profile job.

Parameters:
  • model (Model | bytes | str | Path | None) – Model to profile. Must not be a PyTorch model.

  • device (Device | list[Device]) – Devices on which to run the profile job.

  • name (Optional[str]) – Optional name for the job. Job names need not be unique.

  • options (str) – CLI-like flag options. See Profile & Inference Options.

  • retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.

Returns:

job – Returns the profile jobs.

Return type:

ProfileJob | list[ProfileJob]

Examples

Submit a TFLite model for profiling on a Samsung Galaxy S23:

import qai_hub as hub
client = hub.Client()

model = client.upload_model("mobilenet.tflite")

job = client.submit_profile_job(model,
                                device=hub.Device("Samsung Galaxy S23"),
                                name="mobilenet (1, 3, 224, 224)")

For more examples, see Profiling Models.

submit_quantize_job(model, calibration_data, weights_dtype=QuantizeDtype.INT8, activations_dtype=QuantizeDtype.INT8, name=None, options='', project=None)

Submits a quantize job. The input model must be ONNX. The resulting target model on a completed job will be a quantized ONNX model in QDQ format.

Parameters:
  • model (Model | ModelProto | str | Path | None) – Model to quantize. The model must be an ONNX model.

  • calibration_data (Dataset | Mapping[str, list[ndarray]] | str) – Data, Dataset, or Dataset ID used to calibrate quantization parameters.

  • name (Optional[str]) – Optional name for the job. Job names need not be unique.

  • weights_dtype (QuantizeDtype) – The data type to which weights will be quantized.

  • activations_dtype (QuantizeDtype) – The data type to which activations will be quantized.

  • options (str) – CLI-like flag options. See Quantize Options.

Returns:

job – Returns the quantize job.

Return type:

QuantizeJob

Examples

Submit an ONNX model for quantization:

import numpy as np
import qai_hub as hub

client = hub.Client()
model_file = "mobilenet_v2.onnx"
calibration_data = {"t.1": [np.random.randn(1, 3, 224, 224).astype(np.float32)]}
job = client.submit_quantize_job(
    model_file,
    calibration_data,
    weights_dtype=hub.QuantizeDtype.INT8,
    activations_dtype=hub.QuantizeDtype.INT8,
    name="mobilenet",
)
upload_dataset(data, name=None, project=None)

Upload a dataset that expires in 30 days. A Dataset has an ordered named schema. For example, dict(x=…, y=…) has a different schema than dict(y=…, x=…).

Parameters:
  • data (Mapping[str, list[ndarray]] | str) –

    If data is a dict, its ordered string keys define the dataset schema. The length of each list is the number of samples, which must be the same for all features.

    If string, it must be an h5 path (str) to a saved dataset.

  • name (str | None) – Optional name of the dataset. If a name is not specified, it is decided either based on the data or the file name.

Returns:

dataset – Returns a dataset object if successful.

Return type:

Dataset

Examples

import qai_hub as hub
import numpy as np

# Define dataset
array = np.reshape(np.array(range(15)), (3, 5)).astype(np.float32)

# Upload dataset
client = hub.Client()
client.upload_dataset(dict(x=[array]), 'simplenet_dataset')
upload_model(model, name=None, project=None)

Uploads a model.

Parameters:
  • model (Union[TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str]) – In memory representation or filename of the model to upload.

  • name (Optional[str]) – Optional name of the model. If a name is not specified, it is decided either based on the model or the file name.

Returns:

model – Returns a model if successful.

Return type:

Model

Raises:

UserError – Failure in the model input.

Examples

import qai_hub as hub
import torch

client = hub.Client()
pt_model = torch.jit.load("model.pt")

# Upload model
model = client.upload_model(pt_model)

# Jobs can now be scheduled using this model
device = hub.Device("Samsung Galaxy S23", "12")
cjob = client.submit_compile_job(model, device=device,
                                 name="pt_model (1, 3, 256, 256)",
                                 input_specs=dict(x=(1, 3, 256, 256)))
model = cjob.get_target_model()
pjob = client.submit_profile_job(model, device=device,
                                 name="pt_model (1, 3, 256, 256)")