Diffstat (limited to 'policy/README.md')
-rw-r--r-- policy/README.md 96
1 files changed, 96 insertions, 0 deletions
diff --git a/policy/README.md b/policy/README.md
new file mode 100644
index 0000000..c995506
--- /dev/null
+++ b/policy/README.md
@@ -0,0 +1,96 @@
+# Write your Policy!
+
+Welcome to the Butchunker Policy Development Guide. This guide explains how to create a custom chunking policy for Butchunker. A chunking policy defines how to split data streams or files into chunks. This is a core task for data deduplication, storage, and transfer.
+
+Before starting, you should know basic Rust and understand the Butchunker framework. Your policy will decide where to split the data based on its content and your settings.
+
+## Creating a Policy Crate
+
+First, create a new Rust library crate to host your chunking policy.
+
+### Writing `Cargo.toml`
+
+```toml
+[package]
+name = "butck_fixed_size" # Policy name
+authors = ["Butchunker"] # Author info
+version = "0.1.0"
+edition = "2024"
+
+[dependencies]
+```
+
+## Implementing Policy Logic
+
+### Writing `src/lib.rs`
+
+In `src/lib.rs`, implement one or both of the following schemes:
+
+#### Scheme 1: Streaming Processing Scheme
+
+Suitable for large files. Data arrives piece by piece, so the policy cannot look ahead at subsequent content, but the entire file never has to be loaded into memory.
+
+```rust
+use std::collections::HashMap;
+
+// Streaming policy struct (must implement the Default trait)
+#[derive(Default)]
+pub struct YourPolicyStream {
+    // Define your state fields here
+}
+
+// Streaming processing function
+pub async fn your_policy_stream(
+    current_data: &[u8],           // Current data chunk
+    len: u32,                      // Data length
+    stream: &mut YourPolicyStream, // Streaming processing context (the struct defined above)
+    params: &HashMap<&str, &str>,  // Configuration parameters
+) -> Option<u32> {
+    // Implement your chunking logic
+    // Return the split position (offset from the start of current_data), or None if no split
+    None
+}
+```
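As an illustration of the streaming template above, here is a minimal sketch of a fixed-size policy. It rests on several assumptions that the guide does not confirm: the `size` parameter name, the 4096-byte default, and the idea that after a split is returned the framework calls the function again with the bytes that follow the split. The helper `next_split` is a hypothetical name introduced only to keep the core logic synchronous and easy to test.

```rust
use std::collections::HashMap;

/// Hypothetical state for a fixed-size streaming policy.
#[derive(Default)]
pub struct FixedSizeStream {
    /// Bytes consumed since the last split.
    pending: u32,
}

/// Synchronous core (hypothetical helper): decides whether the current
/// buffer contains the next split and updates the running byte count.
fn next_split(pending: &mut u32, len: u32, chunk_size: u32) -> Option<u32> {
    // Bytes still missing to complete the current chunk.
    let needed = chunk_size.saturating_sub(*pending);
    if len >= needed {
        // The chunk ends inside this buffer: split `needed` bytes in.
        *pending = 0;
        Some(needed)
    } else {
        // Not enough data yet: remember what we have seen and keep going.
        *pending += len;
        None
    }
}

pub async fn fixed_size_stream(
    current_data: &[u8],
    len: u32,
    stream: &mut FixedSizeStream,
    params: &HashMap<&str, &str>,
) -> Option<u32> {
    // "size" is an assumed parameter name; fall back to 4096 bytes.
    let chunk_size = params
        .get("size")
        .and_then(|s| s.parse::<u32>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(4096);
    // Never report more bytes than the slice actually holds.
    let available = u32::try_from(current_data.len()).unwrap_or(u32::MAX);
    next_split(&mut stream.pending, len.min(available), chunk_size)
}
```

Keeping the counter in the stream struct is what lets a chunk span several calls: the struct starts from `Default` and carries state from one buffer to the next.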
+
+#### Scheme 2: Simple Processing Scheme
+
+Suitable for small to medium-sized files that can be loaded into memory at once. Because the policy sees the whole input, it can take subsequent data into account when choosing split points, which can give better results.
+
+```rust
+use std::collections::HashMap;
+
+// Simple processing function
+pub async fn your_policy(
+    raw_data: &[u8],              // Raw data
+    params: &HashMap<&str, &str>, // Configuration parameters
+) -> Vec<u32> {
+    // Implement your chunking logic
+    // Return a vector of all split positions (offsets from the start of raw_data)
+    vec![]
+}
+```
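The simple template above could be filled in along the following lines for fixed-size chunks. This is a sketch under stated assumptions: the `size` parameter name and the 4096-byte default are invented for illustration, and it assumes the framework expects only interior boundaries, so neither offset 0 nor the end of the data is reported. `split_positions` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Hypothetical helper: interior boundaries of `total_len` bytes,
/// cut every `chunk_size` bytes.
fn split_positions(total_len: usize, chunk_size: usize) -> Vec<u32> {
    if chunk_size == 0 {
        return Vec::new();
    }
    (chunk_size..total_len)
        .step_by(chunk_size)
        // Stop at the first offset that no longer fits the u32 return type.
        .map_while(|pos| u32::try_from(pos).ok())
        .collect()
}

pub async fn fixed_size(
    raw_data: &[u8],
    params: &HashMap<&str, &str>,
) -> Vec<u32> {
    // "size" is an assumed parameter name; fall back to 4096 bytes.
    let chunk_size = params
        .get("size")
        .and_then(|s| s.parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(4096);
    split_positions(raw_data.len(), chunk_size)
}
```

For a 10-byte input and `size` set to 4, the helper yields offsets 4 and 8, leaving a final 2-byte chunk.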
+
+## Registration and Usage
+
+### Deploying the Policy
+
+1. Place the completed policy crate into the `./policy/` directory of the Butchunker repository.
+2. Use the `butckrepo-refresh` program to refresh the registry:
+ - If the program is not yet installed, you can execute the following in the root directory of the Butchunker repository:
+
+ ```bash
+ cargo install --path ./
+ ```
+3. After each policy library update, you must:
+ - Execute `butckrepo-refresh` to refresh the registry.
+ - Reinstall the `butck` binary: `cargo install --path ./`.
+
+### Calling the Policy
+
+- The policy will be automatically registered in Butchunker's registry.
+
+ Use the following command to call the policy:
+
+ ```bash
+ butck write <file> --policy <policy_name> --storage ./
+ ```