Client
- class Client(config=None, profile=None, default_project_id=None)
Client object to interact with the Qualcomm AI Hub API.
Examples
Create a client using credentials from ~/.qai_hub/client.ini:

import qai_hub as hub

client = hub.Client()
model = client.upload_model("model.pt")
Create a client using a named profile from ~/.qai_hub/my_profile.ini:

import qai_hub as hub

client = hub.Client(profile="my_profile")
model = client.upload_model("model.pt")
- get_dataset(dataset_id)
Returns a dataset for a given id.
- Parameters:
dataset_id (str) – id of a dataset.
- Returns:
dataset – The dataset for the id.
- Return type:
Dataset
Examples
Get a dataset and print information about it (provided you supply a valid dataset ID):

import qai_hub as hub

client = hub.Client()
dataset = client.get_dataset("dabc123")
print("Dataset information:", dataset)
- get_datasets(offset=0, limit=50)
Returns a list of datasets visible to you.
- Parameters:
offset (int) – Offset the query to get even older datasets.
limit (int) – Maximum number of datasets to return.
- Returns:
dataset_list – List of datasets.
- Return type:
list[Dataset]
Examples
Fetch Dataset objects for your five most recent datasets:

import qai_hub as hub

client = hub.Client()
datasets = client.get_datasets(limit=5)
- get_device_attributes()
Returns the superset of available device attributes.
Any of these attributes can be used to filter devices when using get_devices().
- Returns:
attribute_list – Superset of all available device attributes.
- Return type:
list[str]
Examples
import qai_hub as hub

client = hub.Client()
attributes = client.get_device_attributes()
- get_devices(name='', os='', attributes=[])
Returns a list of available devices.
The returned list of devices is compatible with the supplied name, os, and attributes. The name must be an exact match with an existing device, and os can be either a version ("15.2") or a version range ("[14,15)").
- Parameters:
name (str) – Only devices with this exact name will be returned.
os (str) – Only devices with an OS version that is compatible with this os are returned.
attributes (str | list[str]) – Only devices that have all requested properties are returned.
- Returns:
device_list – List of available devices, compatible with the supplied filters.
- Return type:
list[Device]
Examples
import qai_hub as hub

client = hub.Client()

# Get all devices
devices = client.get_devices()

# Get all devices matching this operating system
devices = client.get_devices(os="12")

# Get all devices matching this chipset
devices = client.get_devices(attributes=["chipset:qualcomm-snapdragon-8gen2"])

# Get all devices matching hardware
devices = client.get_devices(name="Samsung Galaxy S23")
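The os version-range notation above uses half-open interval syntax. As a purely local illustration (this helper is hypothetical and not part of the qai_hub API), a range string such as "[14,15)" can be interpreted like this:

```python
def os_in_range(version: float, range_str: str) -> bool:
    """Interpret interval notation such as "[14,15)" or "(13,15]".

    Square brackets include the endpoint; parentheses exclude it.
    Illustrative sketch only, not part of qai_hub.
    """
    lo_inclusive = range_str[0] == "["
    hi_inclusive = range_str[-1] == "]"
    lo_s, hi_s = range_str[1:-1].split(",")
    lo, hi = float(lo_s), float(hi_s)
    above = version >= lo if lo_inclusive else version > lo
    below = version <= hi if hi_inclusive else version < hi
    return above and below

print(os_in_range(14.0, "[14,15)"))  # True: 14 is included
print(os_in_range(15.0, "[14,15)"))  # False: 15 is excluded
```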
- get_frameworks()
Returns a list of available ML frameworks.
- Returns:
framework_list – List of available frameworks.
- Return type:
list[Framework]
Examples
import qai_hub as hub

client = hub.Client()

# Get all frameworks
frameworks = client.get_frameworks()
- get_job(job_id, job_type=None)
Returns a job for a given id.
- Parameters:
job_id (str) – id of a job.
job_type (JobType | None) – Type of the job. If this is not None and the target job is not this type, this method will raise.
- Returns:
job – The job for the id.
- Return type:
Job
Examples
Get a job and print its status. The job ID is an alphanumeric string starting with j, which you can get from the job's URL (/jobs/<job ID>):
import qai_hub as hub

client = hub.Client()
job = client.get_job("jabc123")
status = job.get_status()
- get_job_summaries(offset=0, limit=50, creator=None, state=None, type=None)
Returns summary information for jobs matching the specified filters.
- Parameters:
creator (str | None) – Fetch only jobs created by the specified creator. If unspecified, fetch all jobs owned by your organization.
state (State | None | list[State]) – Fetch only jobs that are currently in the specified state(s).
type (JobType | None) – Fetch only jobs of the specified type (compile, profile, etc.).
limit (int) – Maximum number of jobs to return.
offset (int) – How many jobs to skip over (in order to retrieve older jobs).
- Returns:
List of job summaries in reverse chronological order (i.e., most recent first).
- Return type:
list[JobSummary]
Examples
Print a selection of recent jobs:
import qai_hub as hub

client = hub.Client()
running = client.get_job_summaries(limit=10, state=hub.JobStatus.all_running_states())
failed = client.get_job_summaries(limit=10, state=hub.JobStatus.State.FAILED)
more_failed = client.get_job_summaries(offset=10, limit=10, state=hub.JobStatus.State.FAILED)
for j in running + failed + more_failed:
    print(f"{j.job_id}: {j.name} running since {j.date}: currently {j.status.code}")
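The offset/limit semantics used for paging through results can be sketched locally. The helper below is illustrative only (it is not part of qai_hub), with a plain list standing in for the server-side job list:

```python
def page(items, offset=0, limit=50):
    """Mimic offset/limit paging as described for get_job_summaries:
    skip `offset` newest entries, then return up to `limit` entries.
    Purely local sketch; `items` stands in for the server-side list,
    already in reverse chronological order (most recent first)."""
    return items[offset:offset + limit]

jobs = [f"j{i:03d}" for i in range(100)]

first_page = page(jobs, offset=0, limit=10)
second_page = page(jobs, offset=10, limit=10)
print(first_page[0], second_page[0])  # j000 j010
```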
- get_model(model_id)
Returns a model for a given id.
- Parameters:
model_id (str) – id of a model.
- Returns:
model – The model for the id.
- Return type:
Model
- get_models(offset=0, limit=50)
Returns a list of models.
- Parameters:
offset (int) – Offset the query to get even older models.
limit (int) – Maximum number of models to return.
- Returns:
model_list – List of models.
- Return type:
list[Model]
Examples
Fetch Model objects for your five most recent models:

import qai_hub as hub

client = hub.Client()
models = client.get_models(limit=5)
- set_verbose(verbose=True)
If True, API calls may print progress to standard output.
- Parameters:
verbose (bool) – Verbosity.
- Return type:
None
- submit_compile_and_link_jobs(models, device, name=None, input_specs=None, graph_names=None, compile_options='', link_options='', retry=True)
Compiles and links multiple models, or model(s) with multiple input_specs variants, into a single weight-shared (multi-graph) QNN context binary. To specify multiple input_specs variants, the model needs to be ONNX with dynamic shapes or TorchScript (.pt).
- Parameters:
models (Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None, list[Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None]]]) – A list of models. To represent multiple variants for a model, its entries in the list are repeated.
device (Device | list[Device]) – Device or list of devices. Results are per-device.
input_specs (Union[None, Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]], list[Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]] | None]]) – None | InputSpecs | list[InputSpecs | None]. Each InputSpecs in the list corresponds to the model at the same index in models. Mandatory for TorchScript models. Please refer to the following example for usage.
graph_names (Optional[list[str]]) – list[str] | None. Graph names are used as keys to access model variants from the generated QNN context binary. If a list of models is provided, then graph_names is mandatory. All graph names must be unique. Each graph name in the list corresponds to the model at the same index in models.
name (Optional[str]) – Optional name for both the compile and link jobs. Job names need not be unique.
compile_options (str | list[str]) – CLI-like flag options for the compile job. See Compile Options. --target_runtime qnn_dlc is automatically appended (the only supported target_runtime for this API). Can be a single string (broadcast to all input_specs) or a list[str] corresponding to the model at the same index in models. Do not specify a graph name in compile_options; use the graph_names argument instead.
link_options (str) – CLI-like flag options for the link job. See Link Options. It is a single string (broadcast to all devices).
retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.
- Return type:
tuple[list[CompileJob], LinkJob | None] | list[tuple[list[CompileJob], LinkJob | None]]
- Returns:
If a single device – (list[CompileJob], LinkJob | None)
If multiple devices – list[tuple[list[CompileJob], LinkJob | None]]
LinkJob is None if any compile job failed.
Constraints / Validation
If multiple variants are provided for a model, that model needs to be ONNX with dynamic shapes or TorchScript (.pt).
InputSpecs must be provided for TorchScript models.
Number of models, input_specs variants, compile_options variants, and graph_names must match.
All graph names must be unique.
Do not specify a graph name in compile_options; use the graph_names argument instead.
The --target_runtime flag in compile_options is auto-set to --target_runtime qnn_dlc.
Examples
Submit two models with multiple I/O spec variants for compilation and linking:
import torch
import numpy as np
import qai_hub as hub

client = hub.Client()

pt_model1 = torch.jit.load("encoder.pt")
pt_model2 = torch.jit.load("decoder.pt")

input_specs1 = [
    {"x": ((1, 3, 224, 224), "float32")},
    {"x": ((1, 3, 192, 192), "float32")},
]

# Compile options are repeated to match the number of model input_specs variants
# Each input_spec can have its own compile options
compile_options1 = ["--force_channel_last_input x --quantize_io"] * 2

input_specs2 = [
    {"x": ((1, 3, 224, 224), "float32")},
    {"x": ((1, 3, 192, 192), "float32")},
    {"x": ((1, 3, 160, 160), "float32")},
]
compile_options2 = ["--qnn_options default_graph_htp_precision=FLOAT16"] * 3

# Model entries in the list are repeated to match their respective number of
# input_specs variants
models = [pt_model1, pt_model1, pt_model2, pt_model2, pt_model2]

jobs = client.submit_compile_and_link_jobs(
    models,
    device=hub.Device("Samsung Galaxy S23"),
    name="encoder + decoder",
    input_specs=[*input_specs1, *input_specs2],
    graph_names=[
        "encoder_224",
        "encoder_192",
        "decoder_224",
        "decoder_192",
        "decoder_160",
    ],
    compile_options=[*compile_options1, *compile_options2],
    link_options="--qnn_options default_graph_htp_optimizations=O=3",
)
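The length and uniqueness constraints listed above can be checked locally before submission. This validator is an illustrative sketch, not part of the qai_hub API:

```python
def validate_link_inputs(models, input_specs, compile_options, graph_names):
    """Check the documented constraints for submit_compile_and_link_jobs:
    matching list lengths and unique graph names. Illustrative only."""
    n = len(models)
    if not (len(input_specs) == n and len(graph_names) == n):
        raise ValueError("models, input_specs, and graph_names must have equal length")
    # compile_options may be a single string (broadcast) or a per-model list
    if isinstance(compile_options, list) and len(compile_options) != n:
        raise ValueError("compile_options list must match the number of models")
    if len(set(graph_names)) != len(graph_names):
        raise ValueError("all graph names must be unique")
    return True

specs = [
    {"x": ((1, 3, 224, 224), "float32")},
    {"x": ((1, 3, 192, 192), "float32")},
]
print(validate_link_inputs(["m", "m"], specs, "--quantize_io", ["g224", "g192"]))  # True
```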
- submit_compile_and_profile_jobs(model, device, name=None, input_specs=None, compile_options='', profile_options='', single_compile=True, calibration_data=None, retry=True, project=None)
Submits a compilation job and a profile job.
- Parameters:
model (Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None]) – Model to compile and profile.
device (Device | list[Device]) – Devices on which to run the jobs.
name (Optional[str]) – Optional name for both jobs. Job names need not be unique.
input_specs (Optional[Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]]]) – Required if model is a PyTorch model. Keys in the dict (which is ordered in Python 3.7+) define the input names for the target model (e.g., a TFLite model) created by this job, and may differ from the names in the PyTorch model.
An input shape can either be a tuple[int, ...], i.e. (1, 2, 3), or a tuple[tuple[int, ...], str], i.e. ((1, 2, 3), "int32"). The latter form can be used to specify the type of the input. If a type is not specified, it defaults to "float32". Currently, only "float32", "int8", "int16", "int32", "int64", "uint8", and "uint16" are accepted types.
For example, a PyTorch module with forward(self, x, y) may have input_specs=dict(a=(1, 2), b=(1, 3)). When using the resulting target model (e.g., a TFLite model) from this job, the inputs must have keys a and b, not x and y. Similarly, if this target model is used in an inference job (see submit_inference_job()), the dataset must have entries a, b in this order, not x, y.
If model is an ONNX model, input_specs are optional. input_specs can be used to overwrite the model's input names and the dynamic extents for the input shapes. If input_specs is not None, it must be compatible with the model, or the server will return an error.
single_compile (bool) – If True, create a single compile job on a single device compatible with all devices. The CompileJob in every tuple in the returned list will point to the same AI Hub job. If False, create a compile job for each device.
compile_options (str) – CLI-like flag options for the compile job. See Compile Options.
profile_options (str) – CLI-like flag options for the profile job. See Profile & Inference Options.
calibration_data (Union[Dataset, Mapping[str, list[ndarray]], str, None]) – Data, Dataset, or Dataset ID to use for post-training quantization. PTQ will be applied to the model during translation.
retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.
- Returns:
jobs – Returns a tuple of CompileJob and ProfileJob.
- Return type:
tuple[CompileJob, ProfileJob | None] | list[tuple[CompileJob, ProfileJob | None]]
Examples
Submit a traced Torch model for profiling as a QNN DLC on a Samsung Galaxy S23:
import qai_hub as hub
import torch

client = hub.Client()

pt_model = torch.jit.load("mobilenet.pt")
input_shapes = (1, 3, 224, 224)
model = client.upload_model(pt_model)

jobs = client.submit_compile_and_profile_jobs(
    model,
    device=hub.Device("Samsung Galaxy S23"),
    name="mobilenet (1, 3, 224, 224)",
    input_specs=dict(x=input_shapes),
    compile_options="--target_runtime qnn_dlc",
)
For more examples, see Compiling Models and Profiling Models.
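The input_specs format described above (a shape tuple, or a (shape, dtype) pair defaulting to "float32") can be exercised locally. The helper below and the names image and mask are hypothetical illustrations, not part of qai_hub:

```python
import numpy as np

def sample_inputs(input_specs):
    """Build random sample arrays from an input_specs mapping.

    Each value is either a shape tuple, e.g. (1, 3, 224, 224), or a
    (shape, dtype) pair, e.g. ((1, 3, 224, 224), "int32"); the dtype
    defaults to "float32" when omitted. Illustrative sketch only.
    """
    inputs = {}
    for name, spec in input_specs.items():
        if len(spec) == 2 and isinstance(spec[1], str):
            shape, dtype = spec          # typed form: (shape, dtype)
        else:
            shape, dtype = spec, "float32"  # untyped form: shape only
        inputs[name] = [np.random.random(shape).astype(dtype)]
    return inputs

specs = {"image": (1, 3, 224, 224), "mask": ((1, 224, 224), "uint8")}
data = sample_inputs(specs)
print(data["image"][0].dtype, data["mask"][0].dtype)  # float32 uint8
```

A dict built this way has the same ordered schema as the spec, which is the shape datasets for submit_inference_job() are expected to follow.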
- submit_compile_job(model, device, name=None, input_specs=None, options='', single_compile=True, calibration_data=None, retry=True, project=None)
Submits a compile job.
- Parameters:
model (Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None]) – Model to compile. The model must be a PyTorch model or an ONNX / ONNX-wrappable model (e.g., a QNN context binary).
device (Device | list[Device]) – Devices for which to compile the input model.
name (Optional[str]) – Optional name for the job. Job names need not be unique.
input_specs (Optional[Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]]]) – Required if model is a PyTorch model. Keys in the dict (which is ordered in Python 3.7+) define the input names for the target model (e.g., a TFLite model) created by this job, and may differ from the names in the PyTorch model.
An input shape can either be a tuple[int, ...], i.e. (1, 2, 3), or a tuple[tuple[int, ...], str], i.e. ((1, 2, 3), "int32"). The latter form can be used to specify the type of the input. If a type is not specified, it defaults to "float32". Currently, only "float32", "int8", "int16", "int32", "int64", "uint8", and "uint16" are accepted types.
For example, a PyTorch module with forward(self, x, y) may have input_specs=dict(a=(1, 2), b=(1, 3)). When using the resulting target model (e.g., a TFLite model) from this job, the inputs must have keys a and b, not x and y. Similarly, if this target model is used in an inference job (see submit_inference_job()), the dataset must have entries a, b in this order, not x, y.
If model is an ONNX model, input_specs are optional. input_specs can be used to overwrite the model's input names and the dynamic extents for the input shapes. If input_specs is not None, it must be compatible with the model, or the server will return an error.
options (str) – CLI-like flag options. See Compile Options.
single_compile (bool) – If True, submits a single compile job that creates an asset compatible with all devices. If False, create a compile job for each device.
calibration_data (Union[Dataset, Mapping[str, list[ndarray]], str, None]) – Data, Dataset, or Dataset ID to use for post-training quantization. PTQ will be applied to the model during translation.
retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.
- Returns:
job – Returns the compile jobs: always one job if single_compile is True, and possibly multiple jobs if it is False.
- Return type:
CompileJob | list[CompileJob]
Examples
Submit a traced Torch model for compilation on a Samsung Galaxy S23:

import qai_hub as hub
import torch

client = hub.Client()

pt_model = torch.jit.load("mobilenet.pt")
input_specs = (1, 3, 224, 224)
model = client.upload_model(pt_model)

job = client.submit_compile_job(
    model,
    device=hub.Device("Samsung Galaxy S23"),
    name="mobilenet (1, 3, 224, 224)",
    input_specs=dict(x=input_specs),
)
For more examples, see Compiling Models.
- submit_inference_job(model, device, inputs, name=None, options='', retry=True, project=None)
Submits an inference job.
- Parameters:
model (Model | bytes | str | Path | None) – Model to run inference with. Must be one of the following: (1) a Model object from a compile job via get_target_model(), (2) any TargetModel, (3) a path to any TargetModel.
device (Device | list[Device]) – Devices on which to run the job.
inputs (Dataset | Mapping[str, list[ndarray]] | str) – If a Dataset, it must have a schema matching the model. For example, if model is a target model from a compile job, and the compile job was submitted with input_specs=dict(a=(1, 2), b=(1, 3)), the dataset must also be created with dict(a=<list_of_np_array>, b=<list_of_np_array>). See submit_compile_job() for details.
If a dict, it is uploaded as a new Dataset, equivalent to calling upload_dataset() with an arbitrary name. Note that dicts are ordered in Python 3.7+, and we rely on the order to match the schema. See the paragraph above for an example.
If a str, it is an h5 path to a Dataset.
name (Optional[str]) – Optional name for the job. Job names need not be unique.
options (str) – CLI-like flag options. See Profile & Inference Options.
retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.
- Returns:
job – Returns the inference jobs.
- Return type:
InferenceJob | list[InferenceJob]
Examples
Submit a TFLite model for inference on a Samsung Galaxy S23:
import qai_hub as hub
import numpy as np

client = hub.Client()

# TFLite model path
tflite_model = "squeeze_net.tflite"

# Set up input data
input_tensor = np.random.random((1, 3, 227, 227)).astype(np.float32)

# Submit inference job
job = client.submit_inference_job(
    tflite_model,
    device=hub.Device("Samsung Galaxy S23"),
    name="squeeze_net (1, 3, 227, 227)",
    inputs=dict(image=[input_tensor]),
)

# Load the output data into a dictionary of numpy arrays
output_tensors = job.download_output_data()
For more examples, see Running Inference.
- submit_link_job(models, device, name=None, options='', project=None)
Submits a link job.
A link job generates a context binary model from one or more input models. The input models must all be QNN DLC models. This is particularly useful if the input models contain overlapping weights, since the weights will be shared between the graphs.
To profile or run inference on a multi-graph QNN context binary, use --qnn_options context_enable_graphs=<graph name> to select the graph.
- Parameters:
models (Model | str | Path | None | list[Model | str | Path | None]) – Models to link. Each model in the list must be a QNN DLC model.
name (Optional[str]) – Optional name for the job. Job names need not be unique.
options (str) – CLI-like flag options. See Link Options.
- Returns:
job – Returns the link job.
- Return type:
LinkJob
- submit_profile_job(model, device, name=None, options='', retry=True, project=None)
Submits a profile job.
- Parameters:
model (Model | bytes | str | Path | None) – Model to profile. Must not be a PyTorch model.
device (Device | list[Device]) – Devices on which to run the profile job.
name (Optional[str]) – Optional name for the job. Job names need not be unique.
options (str) – CLI-like flag options. See Profile & Inference Options.
retry (bool) – If job creation fails due to rate-limiting, keep retrying periodically until creation succeeds.
- Returns:
job – Returns the profile jobs.
- Return type:
ProfileJob | list[ProfileJob]
Examples
Submit a TFLite model for profiling on a Samsung Galaxy S23:

import qai_hub as hub

client = hub.Client()
model = client.upload_model("mobilenet.tflite")
job = client.submit_profile_job(
    model,
    device=hub.Device("Samsung Galaxy S23"),
    name="mobilenet (1, 3, 224, 224)",
)
For more examples, see Profiling Models.
- submit_quantize_job(model, calibration_data, weights_dtype=QuantizeDtype.INT8, activations_dtype=QuantizeDtype.INT8, name=None, options='', project=None)
Submits a quantize job. The input model must be ONNX. The resulting target model on a completed job will be a quantized ONNX model in QDQ format.
- Parameters:
model (Model | ModelProto | str | Path | None) – Model to quantize. The model must be an ONNX model.
calibration_data (Dataset | Mapping[str, list[ndarray]] | str) – Data, Dataset, or Dataset ID used to calibrate quantization parameters.
name (Optional[str]) – Optional name for the job. Job names need not be unique.
weights_dtype (QuantizeDtype) – The data type to which weights will be quantized.
activations_dtype (QuantizeDtype) – The data type to which activations will be quantized.
options (str) – CLI-like flag options. See Quantize Options.
- Returns:
job – Returns the quantize job.
- Return type:
QuantizeJob
Examples
Submit an ONNX model for quantization:

import numpy as np
import qai_hub as hub

client = hub.Client()

model_file = "mobilenet_v2.onnx"
calibration_data = {"t.1": [np.random.randn(1, 3, 224, 224).astype(np.float32)]}

job = client.submit_quantize_job(
    model_file,
    calibration_data,
    weights_dtype=hub.QuantizeDtype.INT8,
    activations_dtype=hub.QuantizeDtype.INT8,
    name="mobilenet",
)
- upload_dataset(data, name=None, project=None)
Uploads a dataset that expires in 30 days. A Dataset has an ordered, named schema. For example, dict(x=..., y=...) has a different schema than dict(y=..., x=...).
- Parameters:
data (Mapping[str, list[ndarray]] | str) – If data is a dict, its ordered string keys define the dataset schema. The length of each list is the number of samples and must be the same for all features.
If data is a string, it must be an h5 path (str) to a saved dataset.
name (str | None) – Optional name of the dataset. If a name is not specified, it is decided either based on the data or the file name.
- Returns:
dataset – Returns a dataset object if successful.
- Return type:
Dataset
Examples
import qai_hub as hub
import numpy as np

# Define dataset
array = np.reshape(np.array(range(15)), (3, 5)).astype(np.float32)

# Upload dataset
client = hub.Client()
client.upload_dataset(dict(x=[array]), 'simplenet_dataset')
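The schema-ordering point above (dict(x=..., y=...) vs dict(y=..., x=...)) can be demonstrated locally. This check is illustrative only and does not call AI Hub:

```python
import numpy as np

def schema(data):
    """A dataset's schema is its ordered sequence of feature names
    (dict insertion order in Python 3.7+). Illustrative sketch."""
    return tuple(data.keys())

x = np.zeros((1, 3), dtype=np.float32)
y = np.ones((1, 5), dtype=np.float32)

a = dict(x=[x], y=[y])
b = dict(y=[y], x=[x])

print(schema(a))  # ('x', 'y')
print(schema(b))  # ('y', 'x')
print(schema(a) == schema(b))  # False: same features, different schema
```

This is why a dataset passed to submit_inference_job() must list its features in the same order as the input_specs of the compile job that produced the target model.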
- upload_model(model, name=None, project=None)
Uploads a model.
- Parameters:
model (Union[TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str]) – In-memory representation or filename of the model to upload.
name (Optional[str]) – Optional name of the model. If a name is not specified, it is decided either based on the model or the file name.
- Returns:
model – Returns a model if successful.
- Return type:
Model
- Raises:
UserError – Failure in the model input.
Examples
import qai_hub as hub
import torch

client = hub.Client()

pt_model = torch.jit.load("model.pt")

# Upload model
model = client.upload_model(pt_model)

# Jobs can now be scheduled using this model
device = hub.Device("Samsung Galaxy S23", "12")
cjob = client.submit_compile_job(
    model,
    device=device,
    name="pt_model (1, 3, 256, 256)",
    input_specs=dict(x=(1, 3, 256, 256)),
)
model = cjob.get_target_model()
pjob = client.submit_profile_job(
    model,
    device=device,
    name="pt_model (1, 3, 256, 256)",
)