XOOMAR
Technology · June 16, 2026 · 18 min read · By XOOMAR Insights Team

BentoML vs Ray Serve Forces a Costly AI Serving Bet

XOOMAR Intelligence

Analyst Take

Choosing between BentoML and Ray Serve is really a choice between two production philosophies: package-and-ship model services with minimal ceremony, or run distributed Python inference pipelines on a Ray cluster. This BentoML vs Ray Serve comparison focuses on the practical buying criteria teams care about in 2026: deployment workflow, scaling, latency, observability, infrastructure fit, and total cost of ownership.

Both platforms are capable Python AI model serving options. The better choice depends on whether your team needs a clean model packaging lifecycle or a distributed serving layer for multi-stage, traffic-shaped inference.


BentoML vs Ray Serve: Quick Comparison Table

| Category | BentoML | Ray Serve |
| --- | --- | --- |
| Best fit | General-purpose ML serving, fast Python-native deployment, single-model or modest multi-model APIs | Distributed, high-throughput serving, compound AI systems, multi-stage pipelines |
| Core abstraction | Service classes, runners, and “Bentos” | Ray deployments composed into graphs |
| Packaging model | Standard Bento packaging format; reproducible OCI-compatible container concept in BentoML 1.3 | Python deployments running inside a Ray cluster |
| Model lifecycle support | Stronger focus on model packaging, model management, and CI/CD workflows | Focuses on serving and orchestration inside the Ray ecosystem |
| Adaptive batching | Built in | Built in |
| Autoscaling granularity | Per-service / replica | Per-deployment with Ray actors |
| Horizontal scaling | Good; Kubernetes with Yatai / Helm noted in source data | Excellent; Ray cluster native |
| GPU support | Yes, per-runner | Yes, including fractional GPU placement |
| LLM tooling | OpenLLM and vLLM runner mentioned in source data | Ray Serve LLM, built on vLLM |
| Multi-stage composition | Supported via runners | First-class deployment graphs |
| Cold start, warm container | ~2.5s in the PythonDataBench comparison | ~6s including cluster init in the PythonDataBench comparison |
| p50 CPU latency overhead | ~8ms in the PythonDataBench comparison | ~12ms in the PythonDataBench comparison |
| Operational complexity | Low in the PythonDataBench comparison | Medium-high in the PythonDataBench comparison |
| Kubernetes story | Yatai / Helm | KubeRay |
| Setup time in PyTorch stack test | 1 hour in Markaicode’s table; under 15 minutes noted for first single-model deployment cycle | 4 hours in Markaicode’s table |
| License / self-hosted cost | Apache License 2.0; $0 OSS self-hosted, optional BentoCloud mentioned | Apache License 2.0; $0 self-hosted |
| Open-source project signal from LibHunt | 8,672 GitHub stars, Python, Apache License 2.0 | 42,860 GitHub stars, Python, Apache License 2.0 |

Key takeaway: BentoML is usually the more straightforward serving framework for packaging and deploying Python ML services. Ray Serve becomes more compelling when your application is a distributed inference graph with independent scaling needs.


Who Should Use BentoML?

Choose BentoML if your team wants a Python-native model serving platform that turns trained models into reproducible APIs without requiring a distributed systems layer first.

The source data consistently positions BentoML as the better general-purpose model serving option for most Python ML teams. PythonDataBench describes BentoML 1.3 as the “most balanced choice” because it includes model packaging, runners, Bento images, and adaptive batching out of the box, with sub-10ms framework overhead on CPU.

BentoML is a strong fit when you need:

  • Fast deployment: Markaicode reports BentoML as the fastest path from notebook to REST endpoint, with a first-deployment cycle under 15 minutes for a single model.
  • Reproducible packaging: BentoML’s Bento format packages model weights, dependencies, runtime configuration, and inference code into a deployable artifact.
  • Lower learning curve: VIPS Learn rates BentoML’s learning curve as low for Python developers.
  • General ML serving: PythonDataBench lists BentoML as best for general ML serving, while Ray Serve is framed around multi-model pipelines.
  • Multiple framework support: Markaicode highlights BentoML’s integration with MLflow, Hugging Face, and Scikit-learn, not just PyTorch.
  • Simpler operations: PythonDataBench rates BentoML’s operational complexity as low, compared with medium-high for Ray Serve.

BentoML is especially attractive when your team needs to productionize models but does not want to own a Ray cluster. It is also a natural choice when model packaging, promotion through environments, and CI/CD workflows are as important as request routing.

Where BentoML may be less ideal

BentoML is not always the best choice for cluster-wide distributed inference. PythonDataBench notes that BentoML’s weaker area is multi-cluster orchestration. Its open-source Yatai control plane exists, but the same source says it can lag BentoCloud’s commercial features.

If your application requires retrievers, rerankers, LLMs, guardrails, and tool-calling stages to scale independently across many GPUs, Ray Serve’s deployment graph model is likely a better fit.


Who Should Use Ray Serve?

Choose Ray Serve if your model serving problem is really a distributed Python application problem.

Ray Serve is built on Ray, which LibHunt describes as an AI compute engine with a distributed runtime and AI libraries for accelerating ML workloads. The source data also describes Ray Serve as a scalable model serving library for online inference APIs that is framework-agnostic, covering PyTorch, TensorFlow, Keras, Scikit-learn, and arbitrary Python business logic.

Ray Serve is a strong fit when you need:

  • Multi-stage inference pipelines: VIPS Learn says Ray Serve’s deployment graphs are hard to beat when pipelines include retrievers, rerankers, LLMs, guards, and function-calling stages.
  • Independent scaling per component: Each Ray Serve deployment can have its own replica count, hardware requirements, and autoscaling policy.
  • Traffic-driven autoscaling: PythonDataBench highlights Ray Serve’s autoscaling based on target_ongoing_requests, which adapts to request shape rather than CPU utilization alone.
  • Fractional GPU placement: PythonDataBench gives an example where num_gpus: 0.25 lets four lightweight models share a single A10G, cutting cost per prediction by 3–4x for embedding models that do not saturate a full GPU.
  • Existing Ray adoption: VIPS Learn recommends Ray Serve when teams already use Ray for training or data processing.
  • High-throughput distributed serving: Markaicode describes Ray Serve as excelling for high-throughput, distributed deployments.

Where Ray Serve may be less ideal

Ray Serve adds a Ray cluster as an operational dependency. PythonDataBench explicitly calls this an operational burden, and its comparison table rates Ray Serve’s operational complexity as medium-high.

Markaicode’s PyTorch deployment stack comparison also reports that Ray Serve required 4 hours of setup time, compared with 1 hour for BentoML. The same source says Ray Serve required substantial cluster tuning before stabilizing in its test environment.

Ray Serve is powerful, but the power comes with infrastructure assumptions. If you do not need Ray’s actor model, deployment graphs, or distributed scheduling, the added operational surface may not pay off.


Model Packaging and Deployment Workflow

The biggest BentoML vs Ray Serve difference is packaging philosophy.

BentoML centers the workflow around a packaged service artifact. Ray Serve centers the workflow around deployments running in a Ray cluster.

BentoML workflow: package the model as a Bento

BentoML’s source-backed strength is its model packaging lifecycle. A Bento packages inference code, dependencies, model artifacts, and runtime configuration into a deployable image.

PythonDataBench describes the Bento image as a reproducible, OCI-compatible container concept in BentoML 1.3. A GitHub discussion answered by a BentoML maintainer also emphasizes BentoML’s standard model packaging format and model management component for CI/CD and model deployment lifecycle management.

A simplified BentoML-style service pattern from the source data looks like this:

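import bentoml
import numpy as np


@bentoml.service(
    resources={"cpu": "2", "memory": "2Gi"},
    traffic={"timeout": 30, "max_concurrency": 64},
)
class FraudDetector:
    # Reference to the stored model, resolved from the BentoML model store.
    model_ref = bentoml.models.get("xgb_fraud:latest")

    def __init__(self) -> None:
        # Load the model once at service startup, not per request.
        # Assumption: the saved model exposes predict_proba (scikit-learn-style
        # XGBoost wrapper); a raw Booster would need predict() on a DMatrix.
        self.model = bentoml.xgboost.load_model(self.model_ref)

    @bentoml.api(
        batchable=True,
        batch_dim=0,          # batch along the row axis
        max_batch_size=128,   # upper bound on rows per batch
        max_latency_ms=20,    # latency budget (ms) for accumulating a batch
    )
    def score(self, features: np.ndarray) -> np.ndarray:
        # Return the positive-class probability for each row in the batch.
        return self.model.predict_proba(features)[:, 1]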

The important production detail is not just the Python decorator syntax. It is the combination of packaging, model loading, resource configuration, traffic limits, and batching in one service definition.

Ray Serve workflow: compose deployments into an application graph

Ray Serve uses deployments, actors, and graphs. This is better suited to applications where “the model” is actually a chain of components.

A simplified Ray Serve pattern from the source data looks like this:

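from ray import serve


@serve.deployment(
    num_replicas="auto",
    autoscaling_config={
        "min_replicas": 1,
        "max_replicas": 8,
        # Scale on in-flight requests per replica rather than CPU utilization.
        "target_ongoing_requests": 5,
    },
    # A quarter of a GPU per replica, so up to four replicas can share one device.
    ray_actor_options={"num_gpus": 0.25},
)
class Embedder:
    def __init__(self):
        # Imported here so the dependency loads inside the replica process.
        from sentence_transformers import SentenceTransformer
        self.model = SentenceTransformer("BAAI/bge-base-en-v1.5", device="cuda")

    async def __call__(self, texts: list[str]) -> list[list[float]]:
        return self.model.encode(texts, batch_size=32).tolist()


@serve.deployment
class Reranker:
    def __init__(self, embedder):
        # Handle to the Embedder deployment, injected by bind() below.
        self.embedder = embedder

    # Simplified signature: over HTTP, an ingress __call__ receives a
    # Starlette Request, so query and docs would be parsed from it.
    async def __call__(self, query: str, docs: list[str]) -> list[str]:
        # One remote call embeds the query and all documents together.
        q_vec, *d_vecs = await self.embedder.remote([query] + docs)
        # scoring logic omitted
        return docs[:5]


# Compose the graph: Reranker depends on Embedder.
embedder = Embedder.bind()
app = Reranker.bind(embedder)
serve.run(app, route_prefix="/rerank")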

This workflow is more complex, but it enables a valuable architecture: each stage can scale differently and request traffic can flow through a graph of Python components.
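
The Reranker example leaves its scoring step out. One common way to fill that kind of gap is cosine similarity between the query vector and each document vector, and the sketch below shows that idea in plain Python. The helper names `cosine` and `top_k_by_cosine` and the toy vectors are illustrative assumptions; they are not part of the source example or of the Ray Serve API.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_by_cosine(q_vec, d_vecs, docs, k=5):
    """Return up to k docs ordered by descending similarity to the query."""
    scored = sorted(
        zip(docs, d_vecs),
        key=lambda pair: cosine(q_vec, pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in scored[:k]]


# Toy vectors standing in for the embedder's output (hypothetical values).
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
docs = ["orthogonal", "near match", "partial match"]
ranked = top_k_by_cosine(query_vec, doc_vecs, docs, k=2)
```

In a real pipeline the scoring stage could just as well be a cross-encoder running as its own deployment, which is where independent per-stage scaling starts to matter.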


Scaling, Autoscaling, and Distributed Inference

Both BentoML and Ray Serve support scaling, but they optimize for different scaling units.

| Scaling Dimension | BentoML | Ray Serve |
| --- | --- | --- |
| Primary scaling unit | Service / runner / replica | Deployment / Ray actor |
| Autoscaling granularity | Per-service / replica | Per-deployment |
| Cluster model | Works well with Kubernetes via Yatai / Helm | Native to Ray clusters; KubeRay for Kubernetes |
| Best scaling pattern | Scale an API service or model runner predictably | Scale multi-stage pipelines independently |
| GPU placement | Per-runner GPU support | Fractional GPU placement supported |
| Distributed inference fit | Good for many production services; less focused on Ray-style distributed graphs | Strong fit for distributed, graph-based inference |

BentoML’s scaling story is service-oriented. It fits teams that want to scale model APIs in familiar deployment environments, especially Kubernetes.

Ray Serve’s scaling story is cluster-oriented. It fits teams that want each component of a compound AI system to scale based on request pressure.

Autoscaling signal matters

PythonDataBench specifically calls out Ray Serve’s autoscaling based on target_ongoing_requests. This is important because CPU utilization can be a poor signal for I/O-bound or GPU-bound inference services.
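
To see why a request-based signal behaves differently from CPU utilization, it helps to write the target out as arithmetic. The sketch below is a simplified model of the idea, not Ray Serve's actual autoscaler: it assumes the desired replica count is the number of in-flight requests divided by the per-replica target, rounded up and clamped to the configured bounds. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(ongoing_requests: int, target_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified request-based scaling: ceil(load / target), clamped to bounds."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(ongoing_requests / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Bounds and target mirror the Embedder config: 1 to 8 replicas, target of 5.
quiet = desired_replicas(3, 5, 1, 8)    # light load stays at the minimum
busy = desired_replicas(22, 5, 1, 8)    # 22 in flight / 5 per replica, rounded up
spike = desired_replicas(90, 5, 1, 8)   # demand for 18 replicas is capped at 8
```

A GPU-bound replica can sit at low CPU utilization while its request queue grows, so a rule of this shape reacts to the backlog that a CPU-based rule would miss.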

BentoML also supports adaptive batching, which can materially improve throughput. In a tabular XGBoost example from PythonDataBench, batching concurrent requests up to 20ms or until a batch size of 128 produced a 3.8x throughput improvement versus single-request scoring, while p99 latency stayed under 40ms at 800 QPS.
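
The batching rule in that example (close a batch after 20ms or at 128 requests, whichever comes first) can be sketched as a small simulation. This is an illustrative model only, not BentoML's scheduler; `form_batches` and the arrival timestamps are hypothetical.

```python
def form_batches(arrivals_ms: list[float], max_batch_size: int = 128,
                 max_latency_ms: float = 20.0) -> list[list[float]]:
    """Group sorted arrival times into batches.

    A batch closes when it reaches max_batch_size, or when the next request
    arrives more than max_latency_ms after the batch's first request.
    """
    batches: list[list[float]] = []
    current: list[float] = []
    for t in arrivals_ms:
        window_expired = bool(current) and t - current[0] > max_latency_ms
        if window_expired or len(current) >= max_batch_size:
            batches.append(current)
            current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches


# Five requests in a burst, then one straggler 50ms after the first.
burst = form_batches([0, 2, 5, 9, 18, 50])
# A tiny size cap forces a split even inside the time window.
capped = form_batches([0, 1, 2, 3, 4], max_batch_size=2)
```

The burst collapses into a single model call, which is where a throughput gain of the kind reported would come from; the straggler pays for its own call.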


Latency, Throughput, and Performance Trade-Offs

Performance comparisons need care because results vary with model type, hardware, batching profile, and network topology. The source data provides useful directional numbers, but not a universal benchmark for every workload.

| Performance Factor | BentoML | Ray Serve |
| --- | --- | --- |
| p50 CPU latency overhead | ~8ms in PythonDataBench’s comparison | ~12ms in PythonDataBench’s comparison |
| Cold start, warm container | ~2.5s | ~6s, including cluster init |
| Batching | Built-in adaptive batching | Built-in automatic request batching |
| Throughput strengths | Efficient packaged services; reported 3.8x batching gain in XGBoost example | High-throughput distributed deployments; Markaicode reports strong throughput in a T4 PyTorch test |
| GPU efficiency | Per-runner GPU support; model runner abstraction noted for reducing OOM errors in multi-model tests | Fractional GPU scheduling can improve utilization for lightweight models |

PythonDataBench reports BentoML’s CPU overhead as lower than Ray Serve’s in its comparison: ~8ms versus ~12ms p50 overhead. It also reports faster warm-container cold start for BentoML: ~2.5s versus ~6s for Ray Serve, where Ray cluster initialization contributes to the difference.

That does not mean BentoML always outperforms Ray Serve. Ray Serve is designed for distributed scaling and multi-component inference. Markaicode reports that Ray Serve managed 1,200 req/min for a ResNet-50 model on a T4 in its PyTorch stack test, compared with 400 req/min for TorchServe, though that result is not a direct BentoML-versus-Ray benchmark.
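
Because these figures are directional, teams will usually want p50 and p99 numbers from their own load tests. A minimal sketch of a nearest-rank percentile calculation follows; the helper name `percentile` and the sample latencies are made up for illustration.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical per-request latencies in milliseconds: 1, 2, ..., 100.
latencies_ms = [float(i) for i in range(1, 101)]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

Comparing the two values side by side is the useful part: a framework with low median overhead can still show a long tail under bursty traffic.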

Practical performance guidance

  • Low-latency single model: BentoML often has the simpler and lighter path, based on the source data’s lower overhead and lower operational complexity.
  • Traffic spikes: Ray Serve is stronger when you need elastic scaling and request-queue-aware autoscaling.
  • Multi-model GPU sharing: Ray Serve’s fractional GPU placement is a major advantage when models do not need a full GPU.
  • Batch-friendly workloads: Both platforms support batching; BentoML’s source example shows a clear throughput improvement from adaptive batching.

Monitoring, Logging, and Production Operations

Production fit is not just about serving requests. Teams also need health checks, logs, metrics, rollouts, and debugging paths.

| Operations Area | BentoML | Ray Serve |
| --- | --- | --- |
| Operational complexity | Low in PythonDataBench’s comparison | Medium-high in PythonDataBench’s comparison |
| Metrics | Markaicode lists Prometheus metrics via plugin | Markaicode highlights Ray dashboard |
| Observability | Packaging and deployment lifecycle are central strengths | Built-in observability via Ray dashboard noted by Markaicode |
| A/B deployment support | Markaicode lists Bento tagging for model deployment scenarios | Markaicode lists Ray deployment groups |
| Tracing | Source data says support varies across stacks | Source data says support varies across stacks |

BentoML’s operational advantage is that it narrows the serving surface area. The platform is focused on packaging, serving, and deployment workflows. For teams that already have Kubernetes, CI/CD, and container observability, BentoML can fit into those patterns without requiring a separate distributed runtime.

Ray Serve’s operational advantage is visibility into Ray-native workloads. Markaicode specifically calls out built-in observability via the Ray dashboard. That matters when debugging distributed actor placement, request queues, and multi-deployment applications.

The trade-off is that Ray Serve operations require Ray knowledge. VIPS Learn rates the learning curve as moderate because teams need to absorb Ray concepts. PythonDataBench similarly identifies the Ray cluster as an added operational burden.


Kubernetes, Cloud, and Infrastructure Compatibility

BentoML has the broader deployment-platform story in the source data, while Ray Serve has the stronger Ray-cluster story.

A BentoML maintainer comparison states that BentoML can deploy to many platforms, including Kubernetes, OpenShift, AWS SageMaker, AWS Lambda, Azure ML, GCP, Heroku, and batch inference jobs on Apache Spark and Apache Airflow. The same comparison frames Ray Serve as operating inside a Ray cluster.

PythonDataBench lists BentoML’s Kubernetes story as Yatai / Helm and Ray Serve’s as KubeRay. VIPS Learn similarly describes BentoML horizontal scaling as good with Kubernetes and Ray Serve horizontal scaling as excellent because it is Ray cluster native.

Infrastructure fit by environment

| Environment / Need | Better Fit Based on Source Data | Why |
| --- | --- | --- |
| Standard Kubernetes model APIs | BentoML | Yatai / Helm support and lower operational complexity |
| Ray-based ML platform | Ray Serve | Native integration with Ray clusters |
| Multi-stage LLM pipeline across many GPUs | Ray Serve | Deployment graphs and per-stage scaling |
| Single LLM behind REST or gRPC | BentoML | VIPS Learn says BentoML plus OpenLLM is lower ceremony |
| Batch inference integrations | BentoML | Source data mentions Spark and Airflow batch jobs |
| Existing Ray Train / Ray Data / Ray Tune users | Ray Serve | Same ecosystem and distributed runtime |

VIPS Learn summarizes the LLM-specific trade-off clearly: BentoML is a better fit when one LLM is behind a REST or gRPC endpoint with minimum ceremony, while Ray Serve is better when the pipeline has retrievers, rerankers, LLMs, guards, and other stages that need independent replication and scaling.


Pricing and Total Cost of Ownership Considerations

Both BentoML and Ray Serve are open-source and self-hostable. The source data lists both under Apache License 2.0.

| Cost Factor | BentoML | Ray Serve |
| --- | --- | --- |
| License | Apache License 2.0 | Apache License 2.0 |
| Self-hosted software cost in Markaicode table | $0 OSS, optional BentoCloud mentioned | $0 self-hosted |
| Setup time in Markaicode table | 1 hour | 4 hours |
| Operational cost driver | Packaging, deployment, and standard service operations | Ray cluster operations, tuning, and distributed debugging |
| GPU utilization lever | Adaptive batching and per-runner GPU support | Fractional GPU placement and per-deployment scaling |
| Commercial offering mentioned | Optional BentoCloud | No specific commercial Ray Serve pricing in the provided source data |

The key TCO lesson from PythonDataBench is that cost per prediction depends more on batching and GPU utilization than the framework name. The practical question is which platform exposes the right knobs for your workload.

BentoML TCO profile

BentoML can reduce engineering cost when teams need a clean path from model artifact to production service. Its lower learning curve, Bento packaging, and lower operational complexity can matter more than raw throughput if the team is small or the deployment pattern is straightforward.

Ray Serve TCO profile

Ray Serve can reduce infrastructure cost when independent scaling and fractional GPU placement improve utilization. The source data gives a concrete example: fractional GPU placement can allow four lightweight models to share one A10G, reducing cost per prediction by 3–4x for embedding models that do not saturate a full GPU.
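
The arithmetic behind that claim can be sketched as follows. The hourly GPU price and request volume are placeholder assumptions, not figures from the source; the point is only that cost per prediction scales with the share of the GPU each model is charged for, provided the model does not need the full device to sustain its throughput.

```python
def cost_per_prediction(gpu_hourly_usd: float, gpu_fraction: float,
                        predictions_per_hour: float) -> float:
    """Cost of one prediction when a model is charged for a fraction of a GPU."""
    if not 0 < gpu_fraction <= 1:
        raise ValueError("gpu_fraction must be in (0, 1]")
    if predictions_per_hour <= 0:
        raise ValueError("predictions_per_hour must be positive")
    return gpu_hourly_usd * gpu_fraction / predictions_per_hour


GPU_HOURLY_USD = 1.00            # placeholder price, not a quoted A10G rate
PREDICTIONS_PER_HOUR = 100_000   # placeholder volume per model

dedicated = cost_per_prediction(GPU_HOURLY_USD, 1.0, PREDICTIONS_PER_HOUR)
shared = cost_per_prediction(GPU_HOURLY_USD, 0.25, PREDICTIONS_PER_HOUR)
savings_factor = dedicated / shared  # ideal upper bound with four models per GPU
```

Real gains are likely to land below the ideal factor once memory limits and contention between co-located models come into play, which would be consistent with the 3–4x range quoted.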

The trade-off is engineering time. If a team needs to learn, deploy, monitor, and tune Ray clusters only to serve one model, Ray Serve may cost more operationally even when the software license is free.


Final Recommendation: BentoML or Ray Serve?

For most Python ML teams comparing BentoML vs Ray Serve, the decision should start with architecture, not popularity.

Choose BentoML when you want a production model service. Choose Ray Serve when you want a distributed inference application.

| If your priority is… | Choose | Reason |
| --- | --- | --- |
| Fast path from notebook to API | BentoML | Source data reports under 15 minutes for first single-model deployment cycle |
| Model packaging and CI/CD lifecycle | BentoML | Standard Bento packaging and model management are core strengths |
| Lower operational complexity | BentoML | Rated low complexity in PythonDataBench |
| Single LLM endpoint | BentoML | VIPS Learn recommends it for lower-ceremony single LLM APIs |
| Multi-stage LLM or compound AI pipeline | Ray Serve | First-class deployment graphs and per-stage scaling |
| Existing Ray ecosystem usage | Ray Serve | Works naturally with Ray Train, Ray Data, Tune, and Serve |
| Fractional GPU utilization | Ray Serve | Supports fractional GPU placement such as num_gpus: 0.25 |
| Traffic-shaped autoscaling | Ray Serve | Autoscaling can target ongoing requests per deployment |
| Many independent components across a cluster | Ray Serve | Ray cluster-native architecture is designed for this |

If your team is deploying a handful of Python models behind REST APIs, BentoML is usually the simpler and more balanced choice. If your team is building a distributed AI application with multiple inference stages, independent scaling requirements, and GPU placement constraints, Ray Serve is the stronger architectural fit.


Bottom Line

BentoML is the better default for teams that value packaging, reproducibility, lower operational complexity, and a clean Python developer experience. The source data supports this with lower reported CPU overhead, faster warm-container startup, built-in adaptive batching, and a strong model lifecycle story.

Ray Serve is the better choice for distributed inference systems. Its deployment graphs, Ray-native autoscaling, fractional GPU placement, and cluster-wide orchestration make it a better fit for multi-stage LLM and compound AI workloads.

In short: pick BentoML to ship model services faster; pick Ray Serve when serving is part of a larger distributed Ray application.


FAQ

Is BentoML better than Ray Serve for most Python ML teams?

Based on the provided source data, BentoML is the better general-purpose choice for most Python ML teams. PythonDataBench describes BentoML 1.3 as the most balanced option because it includes model packaging, runners, Bento images, and adaptive batching with sub-10ms CPU framework overhead.

Is Ray Serve overkill for a single model?

For a single-model or single-GPU deployment, the source data suggests Ray Serve can be overkill. VIPS Learn says BentoML is easier for a single-GPU single-model deployment, while Ray Serve starts to pay off when you have multiple stages or need cluster scaling.

Which has better autoscaling: BentoML or Ray Serve?

Ray Serve has the stronger autoscaling story for distributed systems. VIPS Learn describes Ray Serve autoscaling as per-deployment with Ray actors, and PythonDataBench highlights autoscaling based on target_ongoing_requests.

Which is faster: BentoML or Ray Serve?

The source data does not provide a universal winner. PythonDataBench reports lower p50 CPU latency overhead for BentoML at ~8ms versus ~12ms for Ray Serve, and faster warm-container cold start at ~2.5s versus ~6s. Ray Serve, however, is designed for high-throughput distributed deployments and can be stronger when scaling across components and GPUs.

Do both BentoML and Ray Serve support LLM serving?

Yes. VIPS Learn says both can integrate with vLLM. BentoML is associated with OpenLLM and vLLM runners, while Ray Serve LLM is built on vLLM. The difference is mainly packaging and orchestration.

Are BentoML and Ray Serve free to self-host?

The provided source data lists both as Apache License 2.0 projects. Markaicode lists BentoML as $0 OSS with optional BentoCloud and Ray Serve as $0 self-hosted. Infrastructure, GPU usage, and operations are still part of total cost.

Sources & References

Content sourced and verified on June 16, 2026

  1.
  2. ML Model Serving in Python (2026)
     https://pythondatabench.com/article/model-serving-python-bentoml-ray-serve-fastapi-triton-compared
  3. BentoML vs Ray Serve (LLM) | VIPS Learn
     https://learn.engineering.vips.edu/compare/bentoml-vs-ray-serve-llm
  4.
  5. BentoML vs Ray - compare differences and reviews? | LibHunt
     https://www.libhunt.com/compare-BentoML-vs-ray
  6. Bentoml vs Ray Serve (2026) Comparison - noizz.io
     https://noizz.io/compare/bentoml-vs-ray-serve
