Build from Source
SiteOne Crawler is written in Rust and compiles to a single self-contained binary with zero runtime dependencies. Building from source is straightforward with Cargo, Rust's build tool.
Prerequisites
You need Rust 1.94 or later (the minimum supported version declared in Cargo.toml). The recommended way to install Rust is via rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
After installation, verify the toolchain:
rustc --version   # should report 1.94.0 or newer
cargo --version
Build the release binary
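Before cloning and building, a scripted toolchain check can catch an outdated compiler early. The following is a minimal sketch, not part of the project: the helper names version_ge, rustc_version_of, and check_toolchain are hypothetical, and it assumes a sort that supports -V (GNU coreutils and BusyBox do; it is not POSIX).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Assumes `sort -V` (version sort) is available; it is not part of POSIX.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: extract "1.94.0" from a line such as
# "rustc 1.94.0 (<hash> <date>)", dropping any "-nightly" or "-beta" suffix.
rustc_version_of() {
    printf '%s\n' "$1" | awk '{ print $2 }' | cut -d- -f1
}

# Minimum version taken from the prerequisites stated in this guide.
MIN_RUST="1.94.0"

# Hypothetical helper: report whether the installed rustc is new enough.
check_toolchain() {
    command -v rustc >/dev/null 2>&1 || {
        echo "rustc not found; install it via rustup" >&2
        return 1
    }
    installed="$(rustc_version_of "$(rustc --version)")"
    if version_ge "$installed" "$MIN_RUST"; then
        echo "rustc $installed is new enough"
    else
        echo "rustc $installed is older than $MIN_RUST; try 'rustup update'" >&2
        return 1
    fi
}

# Usage: check_toolchain && cargo build --release
```

The comparison is kept separate from the rustc call so that it can be exercised without a Rust installation.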
Clone the repository and build an optimized release binary:
git clone https://github.com/janreges/siteone-crawler.git
cd siteone-crawler
# Build the optimized release binary
cargo build --release
# Run it
./target/release/siteone-crawler --url=https://my.domain.tld
The compiled binary is located at ./target/release/siteone-crawler. Copy it anywhere on your PATH (e.g. /usr/local/bin/) to run it as siteone-crawler from any directory.
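Copying the binary onto your PATH can be scripted. This is a minimal sketch under stated assumptions: install_binary is a hypothetical helper, not something the project ships, and it relies on the standard install(1) utility found on Linux and macOS.

```shell
#!/bin/sh
# Hypothetical helper: copy a built binary into a directory on PATH.
#   $1 = path to the built binary
#   $2 = destination directory (for example /usr/local/bin)
install_binary() {
    src="$1"
    dest_dir="$2"
    [ -f "$src" ] || {
        echo "binary not found: $src (run 'cargo build --release' first)" >&2
        return 1
    }
    mkdir -p "$dest_dir" || return 1
    # install(1) copies the file and sets the executable mode in one step.
    install -m 0755 "$src" "$dest_dir/" || return 1
    echo "installed $dest_dir/$(basename "$src")"
}

# Usage (system directories usually require sudo):
#   install_binary ./target/release/siteone-crawler "$HOME/.local/bin"
```

A per-user directory such as $HOME/.local/bin avoids the need for root, provided that directory is already on your PATH.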
Browser rendering is built into the default build
The browser-rendering mode (--browser, screenshots, and console/JS/network diagnostics) is included in the default build and in the pre-built release binaries: cargo build --release already adds the ~6 MB chromiumoxide (Chrome DevTools Protocol) client. Only the client is bundled; the browser itself is detected or downloaded at runtime. Just pass --browser:
# The default build already includes browser rendering.
cargo build --release
# Render a JavaScript / SPA site and capture screenshots
./target/release/siteone-crawler --url=https://my.spa.tld --browser --screenshots
At runtime, the crawler auto-detects an installed Chrome/Chromium/Edge/Brave; if none is found, it offers to download a chrome-headless-shell build. You can also point it at a specific browser binary with --browser-path=<exe>. See Browser Rendering for the full set of options.
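If you prefer to pin a specific browser rather than rely on auto-detection, the path can be resolved in a wrapper script and passed via --browser-path. The sketch below is illustrative only: first_on_path is a hypothetical helper, and the candidate executable names in the usage comment are assumptions about common Linux installs, not the crawler's own detection list.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first candidate command
# found on PATH and return 0; return 1 if none of them exists.
first_on_path() {
    for name in "$@"; do
        if path="$(command -v "$name" 2>/dev/null)" && [ -n "$path" ]; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}

# Usage (candidate names are assumptions; adjust them for your system):
#   if browser="$(first_on_path chromium chromium-browser google-chrome)"; then
#       ./target/release/siteone-crawler --url=https://my.spa.tld \
#           --browser --browser-path="$browser"
#   fi
```

Passing the candidates as arguments keeps the preference order explicit and easy to change per machine.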
If you want a leaner binary without browser rendering (drops chromiumoxide, ~6 MB smaller), build with --no-default-features:
cargo build --release --no-default-features
Build a statically linked (musl) binary
By default, the Linux build links against glibc and requires a reasonably recent system (glibc 2.39+). If you need a binary that runs on any Linux distribution regardless of the installed glibc version (for example, older distributions or minimal container images), build a statically linked binary with the musl target:
# Install the musl toolchain (Ubuntu / Debian)
sudo apt-get install musl-tools
# Add the musl Rust target
rustup target add x86_64-unknown-linux-musl
# Build a fully static binary (no system dependencies)
cargo build --release --target x86_64-unknown-linux-musl
# Run it (works on any Linux distribution)
./target/x86_64-unknown-linux-musl/release/siteone-crawler --url=https://my.domain.tld
The resulting binary has no dynamic library dependencies at all.
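To confirm that a build really is static, you can inspect the description printed by the file utility. This is a minimal sketch: is_static_description is a hypothetical helper, and the matched phrases ("statically linked", "static-pie linked") are assumptions about typical file output, whose exact wording can vary between versions.

```shell
#!/bin/sh
# Hypothetical helper: decide from a `file -b` description whether
# a binary is statically linked. Returns 0 for static, 1 otherwise.
is_static_description() {
    case "$1" in
        *"statically linked"*|*"static-pie linked"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   bin=./target/x86_64-unknown-linux-musl/release/siteone-crawler
#   if is_static_description "$(file -b "$bin")"; then
#       echo "static binary"
#   else
#       echo "dynamically linked binary"
#   fi
```

Running ldd on the binary is an alternative check; for a static executable it typically reports that the file is not a dynamic executable.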
Next steps
- New to the crawler? Start with the command-line options reference.
- Want to enable JavaScript/SPA crawling and screenshots? See Browser Rendering.
- Planning to contribute code? Read the Contribution and Development guide.