iOS quickstart · CoreML · A17 Pro Neural Engine

iPhone 15 Pro: export, bundle, run on device.

There is no scp path to a stock iPhone. CoreML .mlpackage files ship inside an iOS app bundle through Xcode, deployed via local install, TestFlight, or App Store. This page walks the bundle path. You need a Mac, Xcode, and a paid Apple Developer Program account for any iOS path beyond a 7-day personal-team install.

Chip: A17 Pro
RAM: 8 GB
Neural Engine: 16-core, 35 TOPS
Backend: CoreML
Throughput (3B int4): 10 to 15 tok/s
iOS: 17 or later

Step 1. Install coremltools on the Mac.

CoreML conversion runs on a Mac, not the phone. Use a venv. coremltools ships the converter and the model-runner used at export time.

$ python3 -m venv ~/kolm-coreml
$ source ~/kolm-coreml/bin/activate
$ pip install --upgrade pip
$ pip install coremltools
# sanity check
$ python -c "import coremltools; print(coremltools.__version__)"

Also install Xcode from the App Store. Xcode 15 or later is required for iOS 17 deployment targets.

Step 2. Export from your .kolm.

Run on the Mac. CoreML export produces an .mlpackage directory that Xcode treats as a resource. The --quant tier you request is preserved in the exported weights.

$ kolm export your-artifact.kolm \
              --backend coreml \
              --device "iPhone 15 Pro (8GB)" \
              --quant int4 \
              --out ./exports/

The output is ./exports/your-artifact-int4.mlpackage. This is the file Xcode needs.
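Before handing the package to Xcode, it can help to confirm its on-disk size (an .mlpackage is a directory bundle, not a single file, so a plain stat understates it). A minimal stdlib sketch; the path is the export location from the step above:

```python
import os

def package_size_mb(path: str) -> float:
    """Sum the on-disk bytes of an .mlpackage directory bundle."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / 1e6

# e.g. package_size_mb("./exports/your-artifact-int4.mlpackage")
```

A 3B int4 export should land in roughly the 1.5 GB range; a package far off that suggests the quant tier was not applied.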

Step 3. Bundle into an Xcode project.

Create or open an iOS app target in Xcode. Drag the .mlpackage into the Project Navigator. When prompted, check your app target under Target Membership and enable "Copy items if needed".

In Swift, Xcode auto-generates a class for the bundled package, named after the file without the extension (hyphens and spaces are sanitized, so check the generated interface for the exact class name). Load and run:

import CoreML

do {
    let config = MLModelConfiguration()
    config.computeUnits = .all  // CPU + GPU + Neural Engine
    let model = try YourArtifactInt4(configuration: config)
    let result = try model.prediction(input: input)
    print(result.output)
} catch {
    print("CoreML inference failed: \(error)")
}

Step 4. Run on device.

Two deployment paths:

1. Direct install: connect the phone over USB, select it as the run destination in Xcode, and hit Run. With a free personal team the app expires after 7 days; a paid Developer Program account lifts that limit.
2. TestFlight / App Store: archive the app, upload through Xcode's Organizer, and distribute via TestFlight or App Store review.

There is no other path to a stock iPhone. Apple does not permit sideloading .ipa files outside the App Store or TestFlight in the US (the EU allows third-party stores under the Digital Markets Act). If you need to ship an LLM to consumer iPhones at scale, this is the path. For an internal app at a single company, TestFlight internal testing is the right path and comes with a paid Developer account at no extra cost.

What fits on an iPhone 15 Pro.

With 8 GB of shared RAM and an iOS per-app memory ceiling, int4 is the practical tier: a 3B model at int4 is roughly 1.5 GB of weights, leaving headroom for the KV cache and the app itself. A 7B model at int4 is roughly 3.5 GB and pushes against the per-app limit.
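As a rough sizing rule, raw weight bytes are parameter count times bits per weight divided by 8. A minimal sketch of the arithmetic (decimal GB, i.e. 1e9 bytes; KV cache and runtime overhead not included):

```python
def weight_gb(params: float, bits: int) -> float:
    """Raw weight size in GB: params * bits-per-weight / 8 bits-per-byte / 1e9."""
    return params * bits / 8 / 1e9

# 3B at int4: 3e9 * 4 / 8 bytes = 1.5 GB of weights.
# 7B at int4: 3.5 GB; 3B at int8 doubles to 3.0 GB.
```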

Verify the artifact stayed honest.

Run verification on the Mac before bundling. The .kolm manifest stamps the quant tier, the K-score on the embedded eval pack, and the SHA-256 of the resulting payload. A reviewer can recompute everything later.

$ kolm verify your-artifact.kolm --binder report.html
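The payload hash in the manifest is a plain SHA-256, so the same digest kolm verify recomputes can be reproduced with stdlib Python. A minimal sketch (the payload path is illustrative; the manifest layout itself is not documented here):

```python
import hashlib

def sha256_file(path: str, chunk: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large payloads never load fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Compare against the digest stamped in the .kolm manifest, e.g.:
# sha256_file("./exports/your-artifact-int4.mlpackage/Data/weights.bin")
```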

For reviewer-grade evidence, /verify-prod accepts the same .kolm in the browser and runs the same six checks.
