Laptop

Craylm can be run on your laptop for development purposes. It requires Docker.
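
Since Docker is the only stated prerequisite, it can help to confirm it is installed and running before cloning. The following is a minimal sketch, not part of Craylm itself. The helper names `require_command` and `docker_is_running` are hypothetical and chosen for illustration.

```shell
# Hypothetical helper: succeed if the named command is on PATH,
# otherwise print a hint to stderr and return 1.
require_command() {
    local name="$1"
    if command -v "$name" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing prerequisite: $name" >&2
    return 1
}

# Hypothetical helper: `docker info` exits non-zero when the Docker
# daemon cannot be reached, so this doubles as an "is it running?" check.
docker_is_running() {
    docker info >/dev/null 2>&1
}
```

A possible use is `require_command docker && docker_is_running || echo "Install and start Docker first"`.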

Clone the Craylm repository and start the server.

git clone git@github.com:cray-lm/cray-lm.git
cd cray-lm
./cray up

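You should see the server come up. On the first run the Docker image is built (about 217 seconds in the log below), after which Uvicorn reports that the API server is running on http://0.0.0.0:8000.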

(environment) gregorydiamos@Air-Gregory cray % ./cray up
+++ dirname ./cray
++ cd .
++ pwd
+ LOCAL_DIRECTORY=/Users/gregorydiamos/checkout/cray
+ /Users/gregorydiamos/checkout/cray/cmd/bashly.sh generate
++++ dirname /Users/gregorydiamos/checkout/cray/cmd/bashly.sh
+++ cd /Users/gregorydiamos/checkout/cray/cmd
+++ pwd
++ LOCAL_DIRECTORY=/Users/gregorydiamos/checkout/cray/cmd
+++ id -u
+++ id -g
++ docker run --rm -it --user 501:20 --volume /Users/gregorydiamos/checkout/cray/cmd:/app/cmd --volume /Users/gregorydiamos/checkout/cray/cmd/../scripts:/app/scripts --volume /Users/gregorydiamos/checkout/cray/cmd/bashly-settings.yml:/app/bashly-settings.yml dannyben/bashly generate
creating user files in cmd
skipped cmd/build_image_command.sh (exists)
skipped cmd/depot_build_command.sh (exists)
skipped cmd/up_command.sh (exists)
skipped cmd/test_command.sh (exists)
skipped cmd/deploy_command.sh (exists)
skipped cmd/serve_command.sh (exists)
skipped cmd/llm_plot_command.sh (exists)
skipped cmd/llm_logs_command.sh (exists)
skipped cmd/llm_ls_command.sh (exists)
skipped cmd/llm_squeue_command.sh (exists)
skipped cmd/diffusion_command.sh (exists)
created /app/scripts/cray
run /app/scripts/cray --help to test your bash script
+ /Users/gregorydiamos/checkout/cray/scripts/cray up
[+] Building 217.0s (39/39) FINISHED                                                                                                                                                     docker:desktop-linux
 => [vllm internal] load build definition from Dockerfile                                                                                                                                                0.0s
 => => transferring dockerfile: 4.35kB                                                                                                                                                                   0.0s
 => [vllm internal] load metadata for docker.io/library/ubuntu:24.04                                                                                                                                     0.3s
 => [vllm internal] load .dockerignore                                                                                                                                                                   0.0s
 => => transferring context: 2B                                                                                                                                                                          0.0s
 => CACHED [vllm cpu 1/6] FROM docker.io/library/ubuntu:24.04@sha256:80dd3c3b9c6cecb9f1667e9290b3bc61b78c2678c02cbdae5f0fea92cc6734ab                                                                    0.0s
 => => resolve docker.io/library/ubuntu:24.04@sha256:80dd3c3b9c6cecb9f1667e9290b3bc61b78c2678c02cbdae5f0fea92cc6734ab                                                                                    0.0s
 => [vllm internal] load build context                                                                                                                                                                   0.2s
 => => transferring context: 723.40kB                                                                                                                                                                    0.2s
 => [vllm cpu 2/6] RUN --mount=type=cache,target=/var/cache/apt     apt-get update -y     && apt-get install -y python3 python3-pip python3-venv     openmpi-bin libopenmpi-dev libpmix-dev             60.7s
 => [vllm cpu 3/6] RUN python3 -m venv /app/.venv                                                                                                                                                        2.6s
 => [vllm cpu 4/6] RUN . /app/.venv/bin/activate                                                                                                                                                         0.1s
 => [vllm cpu 5/6] RUN pip install uv                                                                                                                                                                    1.4s
 => [vllm cpu 6/6] RUN uv pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cpu                                                                                                      6.3s
 => [vllm vllm  1/17] RUN --mount=type=cache,target=/var/cache/apt     apt-get update -y     && apt-get install -y curl ccache git vim numactl gcc-12 g++-12 libomp-dev libnuma-dev     && apt-get ins  74.9s
 => [vllm vllm  2/17] COPY ./requirements.txt /app/cray/requirements.txt                                                                                                                                 0.1s
 => [vllm vllm  3/17] COPY ./test/requirements-pytest.txt /app/cray/requirements-pytest.txt                                                                                                              0.0s
 => [vllm vllm  4/17] COPY ./infra/requirements-vllm-build.txt /app/cray/requirements-vllm-build.txt                                                                                                     0.0s
 => [vllm vllm  5/17] RUN uv pip install --no-compile --no-cache-dir -r /app/cray/requirements.txt                                                                                                       8.6s
 => [vllm vllm  6/17] RUN uv pip install --no-compile --no-cache-dir -r /app/cray/requirements-vllm-build.txt                                                                                            1.6s
 => [vllm vllm  7/17] RUN uv pip install --no-compile --no-cache-dir -r /app/cray/requirements-pytest.txt                                                                                                0.5s
 => [vllm vllm  8/17] WORKDIR /app/cray                                                                                                                                                                  0.0s
 => [vllm vllm  9/17] COPY ./infra/cray_infra/vllm /app/cray/infra/cray_infra/vllm                                                                                                                       0.1s
 => [vllm vllm 10/17] COPY ./infra/setup.py /app/cray/infra/cray_infra/setup.py                                                                                                                          0.0s
 => [vllm vllm 11/17] COPY ./infra/CMakeLists.txt /app/cray/infra/cray_infra/CMakeLists.txt                                                                                                              0.0s
 => [vllm vllm 12/17] COPY ./infra/cmake /app/cray/infra/cray_infra/cmake                                                                                                                                0.0s
 => [vllm vllm 13/17] COPY ./infra/csrc /app/cray/infra/cray_infra/csrc                                                                                                                                  0.0s
 => [vllm vllm 14/17] COPY ./infra/requirements-vllm.txt /app/cray/infra/cray_infra/requirements.txt                                                                                                     0.0s
 => [vllm vllm 15/17] WORKDIR /app/cray/infra/cray_infra                                                                                                                                                 0.0s
 => [vllm vllm 16/17] RUN --mount=type=cache,target=/root/.cache/pip     --mount=type=cache,target=/root/.cache/ccache     MAX_JOBS=8 TORCH_CUDA_ARCH_LIST="7.5 8.6" VLLM_TARGET_DEVICE=cpu     python  39.0s
 => [vllm vllm 17/17] WORKDIR /app/cray                                                                                                                                                                  0.0s
 => [vllm infra  1/10] RUN apt-get update -y      && apt-get install -y slurm-wlm libslurm-dev     build-essential     less curl wget net-tools vim iputils-ping     && rm -rf /var/lib/apt/lists/*     15.0s
 => [vllm infra  2/10] COPY ./infra/slurm_src /app/cray/infra/slurm_src                                                                                                                                  0.0s
 => [vllm infra  3/10] RUN /app/cray/infra/slurm_src/compile.sh                                                                                                                                          0.2s
 => [vllm infra  4/10] RUN mkdir -p /app/cray/jobs                                                                                                                                                       0.1s
 => [vllm infra  5/10] COPY ./infra /app/cray/infra                                                                                                                                                      0.9s
 => [vllm infra  6/10] COPY ./sdk /app/cray/sdk                                                                                                                                                          0.0s
 => [vllm infra  7/10] COPY ./test /app/cray/test                                                                                                                                                        0.0s
 => [vllm infra  8/10] COPY ./cray /app/cray/cray                                                                                                                                                        0.0s
 => [vllm infra  9/10] COPY ./ml /app/cray/ml                                                                                                                                                            0.0s
 => [vllm infra 10/10] COPY ./scripts /app/cray/scripts                                                                                                                                                  0.0s
 => [vllm] exporting to image                                                                                                                                                                            4.0s
 => => exporting layers                                                                                                                                                                                  4.0s
 => => writing image sha256:2e9c8cea0daed2da4c4bd5bec4c875c1e4a773395b95cd4ddbd7823479c4ef83                                                                                                             0.0s
 => => naming to docker.io/library/cray-vllm                                                                                                                                                             0.0s
[+] Running 2/2
 ✔ Service vllm           Built                                                                                                                                                                        217.1s
 ✔ Container cray-vllm-1  Recreated                                                                                                                                                                      0.2s
Attaching to vllm-1
vllm-1  | +++ dirname /app/cray/scripts/start_one_server.sh
vllm-1  | ++ cd /app/cray/scripts
vllm-1  | ++ pwd
vllm-1  | + LOCAL_DIRECTORY=/app/cray/scripts
vllm-1  | + /app/cray/scripts/start_slurm.sh
vllm-1  | +++ dirname /app/cray/scripts/start_slurm.sh
vllm-1  | ++ cd /app/cray/scripts
vllm-1  | ++ pwd
vllm-1  | + LOCAL_DIRECTORY=/app/cray/scripts
vllm-1  | + python /app/cray/scripts/../infra/cray_infra/slurm/discovery/discover_clusters.py
vllm-1  | + slurmctld
vllm-1  | + slurmd
vllm-1  | + python -m cray_infra.one_server.main
vllm-1  | INFO 01-09 18:47:16 importing.py:10] Triton not installed; certain GPU-related functions will not be available.
vllm-1  | INFO:     Will watch for changes in these directories: ['/app/cray/infra/cray_infra']
vllm-1  | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
vllm-1  | INFO:     Started reloader process [38] using WatchFiles
vllm-1  | INFO 01-09 18:47:19 importing.py:10] Triton not installed; certain GPU-related functions will not be available.
vllm-1  | DEBUG:asyncio:Using selector: EpollSelector
vllm-1  | DEBUG:cray_infra.one_server.start_cray_server:Starting servers: ['api']
vllm-1  | DEBUG:cray_infra.one_server.start_cray_server:Starting API server
vllm-1  | INFO:persistqueue.serializers.pickle:Selected pickle protocol: '4'
vllm-1  | INFO:persistqueue:DBUtils may not be installed, install via 'pip install persist-queue[extra]'
vllm-1  | INFO:     Started server process [51]
vllm-1  | INFO:     Waiting for application startup.
vllm-1  | INFO:cray_infra.training.register_megatron_models:Registering Megatron models
vllm-1  | INFO:     Application startup complete.
vllm-1  | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
vllm-1  | ERROR:cray_infra.one_server.wait_for_vllm:Error getting health: Cannot connect to host localhost:8001 ssl:default [Connect call failed ('127.0.0.1', 8001)]
vllm-1  | INFO:cray_infra.training.register_megatron_models:VLLM is not ready. Skipping model registration
vllm-1  | INFO:cray_infra.training.restart_megatron_jobs:Restarting Megatron jobs
vllm-1  | INFO:cray_infra.training.restart_megatron_jobs:Slurm jobs running: []
vllm-1  | DEBUG:persistqueue.sqlbase:Initializing Sqlite3 Queue with path /app/cray/inference_work_queue.sqlite
vllm-1  | INFO:cray_infra.generate.clear_acked_requests_from_queue:Cleared 0 acked requests from the queue.
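
The log above shows Uvicorn listening on port 8000 inside the container. A script that needs to wait until the server answers could poll that port, as in the sketch below. It assumes that port 8000 is published to the host at `http://localhost:8000` and that `curl` is installed. The function name `wait_for_server` is hypothetical, and no particular Craylm endpoint is assumed.

```shell
# Hypothetical helper: poll a URL until it answers or a timeout expires.
# Arguments: URL, optional timeout in seconds (default 300, roughly the
# build time seen in the log plus some margin).
wait_for_server() {
    local url="$1"
    local timeout_seconds="${2:-300}"
    local waited=0

    # Without --fail, curl exits 0 on any HTTP response (even a 404),
    # so this checks only that something is answering HTTP at the URL.
    until curl --silent --output /dev/null --max-time 2 "$url"; do
        if [ "$waited" -ge "$timeout_seconds" ]; then
            echo "server at $url not reachable after ${timeout_seconds}s" >&2
            return 1
        fi
        sleep 1
        # Counts sleep time only, so the real wait is slightly longer.
        waited=$((waited + 1))
    done
    return 0
}
```

For example, after running `./cray up` in another terminal, `wait_for_server http://localhost:8000 600 && echo "server is up"` would block until the server responds or ten minutes pass.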