Dynamo Disagg Skeleton
Created a PR to add a disagg skeleton example for Dynamo. This completes the Backend/Worker guide.
1 Processor
The Processor is similar to the hello world multinode example, with 2 changes:
- Adding depends(Router)
- Adding kv router mode, which connects to a dummy KV router for worker node selection and uses self.worker_client.direct(..., worker_id) to specify a worker
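The two changes above amount to a "ask the router, then call that worker directly" flow. The following is a minimal sketch of that flow using stand-in classes only: StubRouter, StubWorkerClient, pick_worker and the length-based placeholder policy are hypothetical names invented for illustration and are not part of the Dynamo API. Only the shape of direct(request, worker_id) follows the call mentioned above.

```python
import asyncio


class StubRouter:
    """Hypothetical stand-in for the dummy KV router."""

    def __init__(self, worker_ids):
        self.worker_ids = list(worker_ids)

    async def pick_worker(self, prompt: str) -> int:
        # Placeholder policy (assumption): a deterministic choice based on
        # prompt length, just so the sketch has something to return.
        return self.worker_ids[len(prompt) % len(self.worker_ids)]


class StubWorkerClient:
    """Hypothetical stand-in for the worker client; mimics only direct()."""

    async def direct(self, request: str, worker_id: int) -> str:
        return f"worker {worker_id} handled: {request}"


class Processor:
    def __init__(self, router, worker_client):
        self.router = router
        self.worker_client = worker_client

    async def generate(self, prompt: str) -> str:
        # 1. Ask the router which worker should serve this request.
        worker_id = await self.router.pick_worker(prompt)
        # 2. Send the request to that specific worker.
        return await self.worker_client.direct(prompt, worker_id)


async def main() -> str:
    processor = Processor(StubRouter([101, 102]), StubWorkerClient())
    return await processor.generate("hello")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping the selection policy behind the router means the Processor does not change when the routing logic does.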
2 KV Router
The router is also a client of the workers, so it uses the same logic as the Processor to get workers.
- It uses _cost_function to decide worker selection, and any customization is implemented in the cost function. In this dummy example, we use a string matching score as the cost.
from difflib import SequenceMatcher
hit_rate = SequenceMatcher(
    isjunk=None,  # junk chars to ignore
    a=self.kv_cache[curr_id],  # target string
    b=request_prompt,  # string to compare
).ratio()
- Adding the KV metrics aggregator example:
self.runtime = dynamo_context["runtime"]
kv_listener = self.runtime.namespace("dynamo-demo").component("DummyWorker")
await kv_listener.create_service()
self.metrics_aggregator = KvMetricsAggregator(kv_listener)
- In the vLLM backend, the use of this aggregator is built in. In this dummy example, we call the aggregator explicitly to get the KV metrics.
metrics = await self.metrics_aggregator.get_metrics()
for endpoint in metrics.endpoints:
    logger.info(f"KV metrics:{endpoint.worker_id}, {endpoint.num_requests_waiting}")
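To make the string-matching cost concrete, here is a minimal, self-contained sketch of picking a worker by hit rate. It assumes that kv_cache maps each worker id to the last prompt that worker served. The function name select_worker and that cache layout are assumptions for illustration, not the example's actual implementation.

```python
from difflib import SequenceMatcher
from typing import Dict


def select_worker(kv_cache: Dict[int, str], request_prompt: str) -> int:
    """Return the id of the worker whose cached prompt best matches the request."""
    best_id, best_hit_rate = None, -1.0
    for worker_id, cached_prompt in kv_cache.items():
        # ratio() is a similarity score in [0, 1]; 1.0 means identical strings.
        hit_rate = SequenceMatcher(None, cached_prompt, request_prompt).ratio()
        if hit_rate > best_hit_rate:
            best_id, best_hit_rate = worker_id, hit_rate
    if best_id is None:
        raise ValueError("no workers available")
    return best_id


# Hypothetical cache contents, for illustration only.
kv_cache = {
    1: "translate this sentence to French",
    2: "write a poem about the sea",
}
chosen = select_worker(kv_cache, "translate this sentence to German")
```

Here the request shares most of its text with worker 1's cached prompt, so worker 1 is chosen. A real router would likely combine such a score with load information, such as the KV metrics above.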
3 Worker
The KV metrics publisher is initialized in the worker and can publish KV metrics.
self.component = dynamo_context["component"]
self.metrics_publisher = KvMetricsPublisher()
# Register an endpoint for consumers of the KV Metrics
# (KvMetricsAggregator in kv_router) to listen/gather on.
self.metrics_publisher.create_endpoint(self.component)
self.metrics_publisher.publish(...)
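The publish/aggregate relationship between worker and router can be illustrated with an in-memory stand-in. This is only a sketch of the pattern: InMemoryMetricsBus and EndpointMetrics are hypothetical names, the real publisher and aggregator communicate through the Dynamo runtime, and the actual arguments to publish(...) are not reproduced here. The two fields used are the ones logged in the router example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EndpointMetrics:
    worker_id: int
    num_requests_waiting: int


class InMemoryMetricsBus:
    """Hypothetical stand-in for the channel between publisher and aggregator."""

    def __init__(self) -> None:
        self._latest: Dict[int, EndpointMetrics] = {}

    def publish(self, metrics: EndpointMetrics) -> None:
        # Worker side: keep only the most recent snapshot per worker.
        self._latest[metrics.worker_id] = metrics

    def get_metrics(self) -> List[EndpointMetrics]:
        # Router side: gather the latest snapshot from every worker.
        return sorted(self._latest.values(), key=lambda m: m.worker_id)


bus = InMemoryMetricsBus()
bus.publish(EndpointMetrics(worker_id=1, num_requests_waiting=3))
bus.publish(EndpointMetrics(worker_id=2, num_requests_waiting=0))
bus.publish(EndpointMetrics(worker_id=1, num_requests_waiting=1))  # newer snapshot

for endpoint in bus.get_metrics():
    print(f"KV metrics:{endpoint.worker_id}, {endpoint.num_requests_waiting}")
```

Each worker overwrites its own previous snapshot, so the reader always sees one current entry per worker.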