qai_hub.submit_compile_and_link_jobs
- submit_compile_and_link_jobs(models, device, name=None, input_specs=None, graph_names=None, compile_options='', link_options='', retry=True)
Compiles and links multiple models, or model(s) with multiple input_specs
variants, into a single weight-shared (multi-graph) QNN context binary. To
specify multiple input_specs variants, the model must be ONNX with dynamic
shapes or TorchScript (.pt).
- Parameters:
  - models (Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None, list[Union[Model, TopLevelTracedModule, ScriptModule, ExportedProgram, ModelProto, bytes, str, Path, None]]]) – A list of models. To represent multiple variants of a model, repeat its entry in the list.
  - device (Device | list[Device]) – Device or list of devices. Results are per-device.
  - input_specs (Union[None, Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]], list[Mapping[str, tuple[int, ...] | tuple[tuple[int, ...], str]] | None]]) – None | InputSpecs | list[InputSpecs | None]. Each InputSpecs in the list corresponds to the model at the same index in models. Mandatory for TorchScript models. See the example below for usage.
  - graph_names (Optional[list[str]]) – list[str] | None. Graph names are used as keys to access model variants in the generated QNN context binary. If a list of models is provided, graph_names is mandatory. All graph names must be unique. Each graph name corresponds to the model at the same index in models.
  - name (Optional[str]) – Optional name for both the compile and link jobs. Job names need not be unique.
  - compile_options (str | list[str]) – CLI-like flag options for the compile jobs. See Compile Options. --target_runtime qnn_dlc is appended automatically (the only supported target_runtime for this API). Can be a single string (broadcast to all input_specs) or a list[str] whose entries correspond to the models at the same indices in models. Do not specify a graph name in compile_options; use the graph_names argument instead.
  - link_options (str) – CLI-like flags for the link job. See Link Options. A single string, broadcast to all devices.
  - retry (bool) – If job creation fails due to rate limiting, keep retrying periodically until creation succeeds.
- Return type:
  tuple[list[CompileJob], LinkJob | None] | list[tuple[list[CompileJob], LinkJob | None]]
- Returns:
If a single device – (list[CompileJob], LinkJob | None)
If multiple devices – list[tuple[list[CompileJob], LinkJob | None]]
LinkJob is None if any compile job failed.
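Because the return shape depends on whether one or several devices were passed, downstream code often needs to handle both forms. The sketch below is a minimal, hypothetical helper (not part of qai_hub) that flattens both shapes into a list of (compile_jobs, link_job) pairs; the placeholder strings stand in for real CompileJob and LinkJob objects.

```python
def normalize_result(result):
    """Flatten both return shapes to a list of (compile_jobs, link_job) tuples."""
    if isinstance(result, tuple):
        # Single-device call: one (list[CompileJob], LinkJob | None) tuple.
        return [result]
    # Multi-device call: already a list of such tuples.
    return result

# Placeholder values standing in for CompileJob / LinkJob objects:
single = (["compile_job_a"], "link_job")
multi = [(["compile_job_a"], "link_job"), (["compile_job_b"], None)]
```

With this in place, a caller can always write `for compile_jobs, link_job in normalize_result(jobs): ...` and check `link_job is None` to detect a failed compile on that device.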
Constraints / Validation
If multiple variants are provided for a model, that model needs to be ONNX with dynamic shapes or TorchScript (.pt).
InputSpecs must be provided for TorchScript models.
Number of models, input_specs variants, compile_options variants, and graph_names must match.
All graph names must be unique.
Do not specify a graph name in compile_options; use the graph_names argument instead.
The --target_runtime flag in compile_options is set automatically to --target_runtime qnn_dlc.
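The rules above can be checked client-side before submitting anything. This is a hypothetical pre-flight check (not part of qai_hub, and name and error messages are assumptions) that mirrors the length and uniqueness constraints so mismatched arguments fail fast:

```python
def validate_multi_graph_args(models, input_specs, compile_options, graph_names):
    """Mirror the API's validation rules for list-of-models submissions."""
    n = len(models)
    # graph_names is mandatory for a list of models, one name per entry.
    if graph_names is None or len(graph_names) != n:
        raise ValueError("graph_names is mandatory and must match the number of models")
    # All graph names must be unique.
    if len(set(graph_names)) != n:
        raise ValueError("all graph names must be unique")
    # input_specs, when given as a list, must be parallel to models.
    if input_specs is not None and len(input_specs) != n:
        raise ValueError("input_specs must match the number of models")
    # compile_options, when given as a list, must be parallel to models.
    if isinstance(compile_options, list) and len(compile_options) != n:
        raise ValueError("compile_options list must match the number of models")
```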
Examples
Submit two models with multiple I/O spec variants for compilation and linking:
```python
import torch

import qai_hub as hub

client = hub.Client()

pt_model1 = torch.jit.load("encoder.pt")
pt_model2 = torch.jit.load("decoder.pt")

input_specs1 = [
    {"x": ((1, 3, 224, 224), "float32")},
    {"x": ((1, 3, 192, 192), "float32")},
]
# Compile options are repeated to match the number of model input_specs variants.
# Each input_spec can have its own compile options.
compile_options1 = ["--force_channel_last_input x --quantize_io"] * 2

input_specs2 = [
    {"x": ((1, 3, 224, 224), "float32")},
    {"x": ((1, 3, 192, 192), "float32")},
    {"x": ((1, 3, 160, 160), "float32")},
]
compile_options2 = ["--qnn_options default_graph_htp_precision=FLOAT16"] * 3

# Model entries in the list are repeated to match their respective number of
# input_specs variants.
models = [pt_model1, pt_model1, pt_model2, pt_model2, pt_model2]

jobs = client.submit_compile_and_link_jobs(
    models,
    device=hub.Device("Samsung Galaxy S23"),
    name="encoder + decoder",
    input_specs=[*input_specs1, *input_specs2],
    graph_names=[
        "encoder_224",
        "encoder_192",
        "decoder_224",
        "decoder_192",
        "decoder_160",
    ],
    compile_options=[*compile_options1, *compile_options2],
    link_options="--qnn_options default_graph_htp_optimizations=O=3",
)
```
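Writing the repeated, parallel lists by hand is error-prone when the variant counts differ per model. The following is a hypothetical convenience helper (not part of qai_hub; the name `expand_variants` and its tuple layout are assumptions) that expands a per-model description into the parallel `models`, `input_specs`, `graph_names`, and `compile_options` lists the API expects:

```python
def expand_variants(entries):
    """entries: list of (model, base_name, variants, options), where variants
    is a list of (suffix, input_spec) pairs. Returns four parallel lists in
    which each model is repeated once per variant."""
    models, input_specs, graph_names, compile_options = [], [], [], []
    for model, base_name, variants, options in entries:
        for suffix, spec in variants:
            models.append(model)
            input_specs.append(spec)
            graph_names.append(f"{base_name}_{suffix}")  # unique per variant
            compile_options.append(options)
    return models, input_specs, graph_names, compile_options
```

The four returned lists can then be passed directly as the `models`, `input_specs`, `graph_names`, and `compile_options` arguments of `submit_compile_and_link_jobs`.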