# Write your Policy!

Welcome to the Butchunker Policy Development Guide. This guide explains how to create a custom chunking policy for Butchunker.

A chunking policy defines how data streams or files are split into chunks, which is a core task in data deduplication, storage, and transfer. Your policy decides where to split the data based on its content and your settings.

Before starting, you should know basic Rust and understand the Butchunker framework.

## Creating a Policy Crate

First, create a new Rust crate to host your chunking policy.

### Writing `Cargo.toml`

```toml
[package]
name = "butck_fixed_size" # Policy name
authors = ["Butchunker"]  # Author info
version = "0.1.0"
edition = "2024"

[dependencies]
```

## Implementing Policy Logic

### Writing `src/lib.rs`

In `src/lib.rs`, implement one or both of the following schemes:

#### Scheme 1: Streaming Processing Scheme

Suitable for large files: the data arrives piece by piece, so the policy cannot look ahead at later content, but the entire file never has to be loaded into memory.

```rust
use std::collections::HashMap;

// Streaming policy struct (must implement the Default trait)
#[derive(Default)]
pub struct YourPolicyStream {
    // Define your state fields here
}

// Streaming processing function
// Note: the return type parameter is assumed to be u32, matching the type of `len`.
pub async fn your_policy_stream(
    current_data: &[u8],            // Current data chunk
    len: u32,                       // Data length
    stream: &mut YourPolicyStream,  // Streaming processing context (the struct defined above)
    params: &HashMap<&str, &str>,   // Configuration parameters
) -> Option<u32> {
    // Implement your chunking logic
    // Return the split position (offset from the start of current_data), or None if no split
    None
}
```

#### Scheme 2: Simple Processing Scheme

Suitable for small to medium-sized files that can be loaded into memory in one go. Because the policy sees all of the data, it can take later content into account when choosing split points, which generally gives better results.

```rust
use std::collections::HashMap;

// Simple processing function
// Note: the element type is assumed to be u32, matching the offsets used by the streaming scheme.
pub async fn your_policy(
    raw_data: &[u8],              // Raw data
    params: &HashMap<&str, &str>, // Configuration parameters
) -> Vec<u32> {
    // Implement your chunking logic
    // Return a vector of all split positions (offsets from the start of raw_data)
    vec![]
}
```

## Registration and Usage

### Deploying the Policy

1. Place the completed policy crate into the `./policy/` directory of the Butchunker repository.
2. Use the `butckrepo-refresh` program to refresh the registry:
   - If the program is not yet installed, run the following in the root directory of the Butchunker repository:
     ```bash
     cargo install --path ./
     ```
3. After each update to a policy library, you must:
   - Run `butckrepo-refresh` to refresh the registry.
   - Reinstall the `butck` binary: `cargo install --path ./`.

### Calling the Policy

The policy is registered in Butchunker's registry automatically. Use the following command to call it, replacing `<policy_name>` with your policy's name:

```bash
butck write --policy <policy_name> --storage ./
```
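
### Example: a fixed-size policy sketch

To make the two schemes concrete, here is a minimal sketch of a fixed-size policy for `src/lib.rs`. It is an illustration under several assumptions, not the reference implementation: the `size` parameter key, the `DEFAULT_CHUNK_SIZE` value, the `bytes_in_chunk` field, and the helper names are all hypothetical. It also assumes that split positions are `u32` offsets, that the simple scheme reports only interior boundaries (not the end of the data), and that after a streaming split the framework calls the function again with the remaining bytes. The splitting logic lives in plain synchronous helpers so it can be checked without an async runtime.

```rust
use std::collections::HashMap;

// Hypothetical default, used when the `size` parameter is missing or invalid.
const DEFAULT_CHUNK_SIZE: u32 = 4096;

// Read the chunk size from the configuration parameters.
// The key name "size" is an assumption made for this sketch.
pub fn chunk_size_from(params: &HashMap<&str, &str>) -> u32 {
    params
        .get("size")
        .and_then(|s| s.parse::<u32>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_CHUNK_SIZE)
}

// Simple scheme: every multiple of `chunk_size` that lies strictly inside the data.
// Assumes the data length fits in a u32.
pub fn fixed_size_splits(raw_data: &[u8], chunk_size: u32) -> Vec<u32> {
    if chunk_size == 0 {
        return Vec::new();
    }
    let step = chunk_size as usize;
    (step..raw_data.len()).step_by(step).map(|pos| pos as u32).collect()
}

// Streaming state: number of bytes already assigned to the chunk being built.
#[derive(Default)]
pub struct FixedSizeStream {
    pub bytes_in_chunk: u32,
}

// Streaming scheme: return the offset inside the current buffer at which the
// chunk being built reaches `chunk_size`, or None if the buffer ends first.
pub fn next_split(stream: &mut FixedSizeStream, len: u32, chunk_size: u32) -> Option<u32> {
    if chunk_size == 0 {
        return None;
    }
    // Invariant: bytes_in_chunk < chunk_size, so `remaining` is at least 1.
    let remaining = chunk_size - stream.bytes_in_chunk;
    if len >= remaining {
        stream.bytes_in_chunk = 0; // The next chunk starts right after the split.
        Some(remaining)
    } else {
        stream.bytes_in_chunk += len;
        None
    }
}

// Async entry points with the signatures described in this guide.
pub async fn fixed_size(raw_data: &[u8], params: &HashMap<&str, &str>) -> Vec<u32> {
    fixed_size_splits(raw_data, chunk_size_from(params))
}

pub async fn fixed_size_stream(
    _current_data: &[u8],
    len: u32,
    stream: &mut FixedSizeStream,
    params: &HashMap<&str, &str>,
) -> Option<u32> {
    next_split(stream, len, chunk_size_from(params))
}
```

A fixed-size policy never inspects the bytes themselves, which is why the streaming entry point ignores `current_data` and tracks only a byte counter. A content-defined policy would replace that counter with, for example, rolling-hash state.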