Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.
You can try the online demos: Whisper, LLaMA2, T5, YOLO, Segment Anything.
Related articles and tutorials:
Candle installation guide: https://huggingface.github.io/candle/guide/installation.html
Installing Rust: https://blog.csdn.net/lovechris00/article/details/124808034
1. Installation
1.1 First, make sure CUDA is correctly installed
nvcc --version
This should print information about the CUDA compiler driver. Then run:
nvidia-smi --query-gpu=compute_cap --format=csv
This should print your GPU's compute capability, for example:
compute_cap
8.9
You can also compile the CUDA kernels for a specific compute capability by setting the CUDA_COMPUTE_CAP=<compute cap> environment variable (for example, CUDA_COMPUTE_CAP=89 for a GPU reporting 8.9).
If any of the above commands errors out, make sure to update your CUDA installation.
1.2 Create a new app and add candle-core with CUDA support
Start by creating a new cargo project:
cargo new myapp
cd myapp
Make sure to add the candle-core crate with the cuda feature:
cargo add --git https://github.com/huggingface/candle.git candle-core --features "cuda"
Run cargo build to make sure everything compiles correctly:
cargo build
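Optionally, to sanity-check that candle can actually see the GPU, you can drop a minimal program like this sketch into src/main.rs (Device::new_cuda(0) fails early if the driver, toolkit, or the cuda feature is misconfigured):

use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    // This errors out if CUDA is not usable from candle.
    let device = Device::new_cuda(0)?;
    // Allocate a small random tensor directly on the GPU.
    let t = Tensor::randn(0f32, 1.0, (2, 3), &device)?;
    println!("{t}");
    Ok(())
}

and run it with cargo run --release.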
1.3 Without CUDA support, create a new app and add candle-core as follows:
cargo new myapp
cd myapp
cargo add --git https://github.com/huggingface/candle.git candle-core
Finally, run cargo build to make sure everything compiles correctly:
cargo build
You can also have a look at the mkl feature, which can be interesting for getting faster inference on CPU.
2. Hello world!
We will now create the hello world of the ML world: building a model capable of solving the MNIST dataset.
Open src/main.rs and fill in this content:
use candle_core::{Device, Result, Tensor};

struct Model {
    first: Tensor,
    second: Tensor,
}

impl Model {
    fn forward(&self, image: &Tensor) -> Result<Tensor> {
        let x = image.matmul(&self.first)?;
        let x = x.relu()?;
        x.matmul(&self.second)
    }
}

fn main() -> Result<()> {
    // Use Device::new_cuda(0)?; to use the GPU.
    let device = Device::Cpu;
    let first = Tensor::randn(0f32, 1.0, (784, 100), &device)?;
    let second = Tensor::randn(0f32, 1.0, (100, 10), &device)?;
    let model = Model { first, second };
    let dummy_image = Tensor::randn(0f32, 1.0, (1, 784), &device)?;
    let digit = model.forward(&dummy_image)?;
    println!("Digit {digit:?} digit");
    Ok(())
}
Everything should now run with:
cargo run --release
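The digit value printed above is a (1, 10) tensor of raw class scores. As a small sketch of turning it into a predicted digit, this snippet would replace the println! at the end of main (argmax and to_vec1 are existing Tensor methods; with random weights the prediction is of course meaningless):

    // Pick the index of the highest score along the class dimension.
    let pred = digit.argmax(1)?; // shape (1,), dtype u32
    println!("predicted class: {}", pred.to_vec1::<u32>()?[0]);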
The Linear layer
Now that we have this, we might want to complexify things a bit, for instance by adding bias and creating the classical Linear layer. We can do as such:
struct Linear {
    weight: Tensor,
    bias: Tensor,
}

impl Linear {
    fn forward(&self, x: &Tensor) -> Result<Tensor> {
        let x = x.matmul(&self.weight)?;
        x.broadcast_add(&self.bias)
    }
}
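The broadcast_add is needed because the bias has shape (100,) while the activations have shape (batch, 100): candle's plain binary ops require identical shapes, and only the broadcast_* variants expand dimensions. A minimal sketch of that behavior, assuming dummy zero/one tensors on CPU:

use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    let device = Device::Cpu;
    // x: (1, 100), bias: (100,). `x.add(&bias)` would fail on the shape
    // mismatch; broadcast_add expands the bias over the batch dimension.
    let x = Tensor::zeros((1, 100), DType::F32, &device)?;
    let bias = Tensor::ones(100, DType::F32, &device)?;
    let y = x.broadcast_add(&bias)?;
    assert_eq!(y.dims(), &[1, 100]);
    Ok(())
}

With Linear in place, the model itself becomes: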
struct Model {
    first: Linear,
    second: Linear,
}

impl Model {
    fn forward(&self, image: &Tensor) -> Result<Tensor> {
        let x = self.first.forward(image)?;
        let x = x.relu()?;
        self.second.forward(&x)
    }
}
This changes the model-running code into a new main function:
fn main() -> Result<()> {
    // Use Device::new_cuda(0)?; to use the GPU.
    // Use Device::Cpu; to use the CPU.
    let device = Device::cuda_if_available(0)?;

    // Creating a dummy model
    let weight = Tensor::randn(0f32, 1.0, (784, 100), &device)?;
    let bias = Tensor::randn(0f32, 1.0, (100,), &device)?;
    let first = Linear { weight, bias };
    let weight = Tensor::randn(0f32, 1.0, (100, 10), &device)?;
    let bias = Tensor::randn(0f32, 1.0, (10,), &device)?;
    let second = Linear { weight, bias };
    let model = Model { first, second };

    let dummy_image = Tensor::randn(0f32, 1.0, (1, 784), &device)?;

    // Inference on the model
    let digit = model.forward(&dummy_image)?;
    println!("Digit {digit:?} digit");
    Ok(())
}
Now it works. This is a great way to create your own layers, but most of the classical layers are already implemented in candle-nn.
Using candle_nn
For instance, Linear is already there. This Linear is coded with the PyTorch layout in mind so as to better reuse existing models out there, so it uses the transpose of the weights rather than the weights directly.
So instead we can simplify our example:
cargo add --git https://github.com/huggingface/candle.git candle-nn
And rewrite our example using it:
use candle_core::{Device, Result, Tensor};
use candle_nn::{Linear, Module};

struct Model {
    first: Linear,
    second: Linear,
}

impl Model {
    fn forward(&self, image: &Tensor) -> Result<Tensor> {
        let x = self.first.forward(image)?;
        let x = x.relu()?;
        self.second.forward(&x)
    }
}

fn main() -> Result<()> {
    // Use Device::new_cuda(0)?; to use the GPU.
    let device = Device::Cpu;

    // This has changed (784, 100) -> (100, 784)!
    let weight = Tensor::randn(0f32, 1.0, (100, 784), &device)?;
    let bias = Tensor::randn(0f32, 1.0, (100,), &device)?;
    let first = Linear::new(weight, Some(bias));
    let weight = Tensor::randn(0f32, 1.0, (10, 100), &device)?;
    let bias = Tensor::randn(0f32, 1.0, (10,), &device)?;
    let second = Linear::new(weight, Some(bias));
    let model = Model { first, second };

    let dummy_image = Tensor::randn(0f32, 1.0, (1, 784), &device)?;
    let digit = model.forward(&dummy_image)?;
    println!("Digit {digit:?} digit");
    Ok(())
}
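A note on the shapes above: candle_nn's Linear stores its weight as (out_features, in_features), matching PyTorch's nn.Linear, and multiplies by the transpose in forward. Here is a minimal sketch (bias-free, CPU only) checking that equivalence against an explicit matmul:

use candle_core::{Device, Result, Tensor};
use candle_nn::{Linear, Module};

fn main() -> Result<()> {
    let device = Device::Cpu;
    // PyTorch layout: weight is (out_features, in_features) = (10, 100).
    let weight = Tensor::randn(0f32, 1.0, (10, 100), &device)?;
    let x = Tensor::randn(0f32, 1.0, (1, 100), &device)?;
    // Linear transposes the weight internally...
    let y1 = Linear::new(weight.clone(), None).forward(&x)?;
    // ...so it matches an explicit x @ W^T.
    let y2 = x.matmul(&weight.t()?)?;
    assert_eq!(y1.to_vec2::<f32>()?, y2.to_vec2::<f32>()?);
    Ok(())
}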
Feel free to modify this example to use Conv2d to create a classical convnet instead.
Now that we have the running dummy code, we can get to more advanced topics:
https://huggingface.github.io/candle/guide/cheatsheet.html#pytorch-cheatsheet
| | Using PyTorch | Using Candle |
|---|---|---|
| Creation | torch.Tensor([[1, 2], [3, 4]]) | Tensor::new(&[[1f32, 2.], [3., 4.]], &Device::Cpu)? |
| Creation | torch.zeros((2, 2)) | Tensor::zeros((2, 2), DType::F32, &Device::Cpu)? |
| Indexing | tensor[:, :4] | tensor.i((.., ..4))? |
| Operations | tensor.view((2, 2)) | tensor.reshape((2, 2))? |
| Operations | a.matmul(b) | a.matmul(&b)? |
| Arithmetic | a + b | &a + &b |
| Device | tensor.to(device="cuda") | tensor.to_device(&Device::new_cuda(0)?)? |
| Dtype | tensor.to(dtype=torch.float16) | tensor.to_dtype(&DType::F16)? |
| Saving | torch.save({"A": A}, "model.bin") | candle::safetensors::save(&HashMap::from([("A", A)]), "model.safetensors")? |
| Loading | weights = torch.load("model.bin") | candle::safetensors::load("model.safetensors", &device) |
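As a runnable recap of a few of these rows (CPU only; note that .i(...) indexing needs the IndexOp trait in scope, and that in code you depend on candle_core rather than the candle shorthand used in the table):

use candle_core::{DType, Device, IndexOp, Result, Tensor};
use std::collections::HashMap;

fn main() -> Result<()> {
    let device = Device::Cpu;
    let a = Tensor::new(&[[1f32, 2.], [3., 4.]], &device)?;
    let b = Tensor::zeros((2, 2), DType::F32, &device)?;
    let sum = (&a + &b)?; // arithmetic borrows its operands
    let prod = a.matmul(&b)?; // like a.matmul(b)
    let col = a.i((.., ..1))?; // like a[:, :1]
    let flat = a.reshape((4,))?; // like a.view(4)
    println!("{sum}\n{prod}\n{col}\n{flat}");
    // Save and reload a named tensor with safetensors.
    candle_core::safetensors::save(&HashMap::from([("a", a)]), "a.safetensors")?;
    let loaded = candle_core::safetensors::load("a.safetensors", &device)?;
    println!("{:?}", loaded.keys());
    Ok(())
}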
伊织 2024-03-23