Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved encoding and decoding speed of Vec<u8> #619

Merged
merged 4 commits into from
Mar 30, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,8 @@ rand = "0.8"
uuid = { version = "1.1", features = ["serde"] }
chrono = { version = "0.4", features = ["serde"] }
glam = { version = "0.21", features = ["serde"] }
bincode_1 = { version = "1.3", package = "bincode" }
serde = { version = "1.0", features = ["derive"] }

[[bench]]
name = "varint"
Expand All @@ -53,9 +55,14 @@ harness = false
name = "inline"
harness = false

[[bench]]
name = "string"
harness = false

[profile.bench]
codegen-units = 1
debug = 1
lto = true

[package.metadata.docs.rs]
all-features = true
Expand Down
73 changes: 73 additions & 0 deletions benches/string.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
// https://github.com/bincode-org/bincode/issues/618

use bincode::{Decode, Encode};
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Default, Encode, Decode)]
pub struct MyStruct {
pub v: Vec<String>,
pub string: String,
pub number: usize,
}

impl MyStruct {
#[inline]
pub fn new(v: Vec<String>, string: String, number: usize) -> Self {
Self { v, string, number }
}
}

fn build_data(size: usize) -> Vec<MyStruct> {
(0..size)
.map(|i| {
let vec: Vec<String> = (0..i).map(|i| i.to_string().repeat(100)).collect();
MyStruct::new(vec, size.to_string(), size)
})
.collect()
}

fn index_item_decode(c: &mut Criterion) {
let data = build_data(100);

c.bench_function("bench v1", |b| {
b.iter(|| {
let _ = black_box(bincode_1::serialize(black_box(&data))).unwrap();
});
});

let config = bincode::config::standard();
c.bench_function("bench v2 (standard)", |b| {
b.iter(|| {
let _ = black_box(bincode::encode_to_vec(black_box(&data), config)).unwrap();
});
});

let config = bincode::config::legacy();
c.bench_function("bench v2 (legacy)", |b| {
b.iter(|| {
let _ = black_box(bincode::encode_to_vec(black_box(&data), config)).unwrap();
});
});

let encodedv1 = bincode_1::serialize(&data).unwrap();
let encodedv2 = bincode::encode_to_vec(&data, config).unwrap();
assert_eq!(encodedv1, encodedv2);

c.bench_function("bench v1 decode", |b| {
b.iter(|| {
let _: Vec<MyStruct> =
black_box(bincode_1::deserialize(black_box(&encodedv1))).unwrap();
});
});

c.bench_function("bench v2 decode (legacy)", |b| {
b.iter(|| {
let _: (Vec<MyStruct>, _) =
black_box(bincode::decode_from_slice(black_box(&encodedv1), config)).unwrap();
});
});
}

criterion_group!(benches, index_item_decode);
criterion_main!(benches);
3 changes: 3 additions & 0 deletions src/enc/encoder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ impl<W: Writer, C: Config> EncoderImpl<W, C> {
}

/// Return the underlying writer
#[inline]
pub fn into_writer(self) -> W {
self.writer
}
Expand All @@ -42,10 +43,12 @@ impl<W: Writer, C: Config> Encoder for EncoderImpl<W, C> {

type C = C;

#[inline]
fn writer(&mut self) -> &mut Self::W {
&mut self.writer
}

#[inline]
fn config(&self) -> &Self::C {
&self.config
}
Expand Down
9 changes: 8 additions & 1 deletion src/enc/impls.rs
Original file line number Diff line number Diff line change
Expand Up @@ -295,10 +295,17 @@ impl Encode for char {

impl<T> Encode for [T]
where
T: Encode,
T: Encode + 'static,
Copy link

@stevenliebregt stevenliebregt Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change now gives me errors when trying to derive Encode for types with Vecs that contain references.

For example:

struct MyStruct<'a> {
  some_field: Vec<&'a str>
}

Argument requires that 'a must outlive 'static

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted this PR in #663 and tried a better fix in #667

{
fn encode<E: Encoder>(&self, encoder: &mut E) -> Result<(), EncodeError> {
super::encode_slice_len(encoder, self.len())?;

if core::any::TypeId::of::<T>() == core::any::TypeId::of::<u8>() {
let t: &[u8] = unsafe { core::mem::transmute(self) };
encoder.writer().write(t)?;
return Ok(());
}

for item in self {
item.encode(encoder)?;
}
Expand Down
15 changes: 15 additions & 0 deletions src/enc/write.rs
Original file line number Diff line number Diff line change
Expand Up @@ -65,3 +65,18 @@ impl<'storage> Writer for SliceWriter<'storage> {
Ok(())
}
}

/// A writer that counts how many bytes were written. This is useful for e.g. pre-allocating buffers bfeore writing to them.
#[derive(Default)]
pub struct SizeWriter {
/// the amount of bytes that were written so far
pub bytes_written: usize,
}
impl Writer for SizeWriter {
#[inline(always)]
fn write(&mut self, bytes: &[u8]) -> Result<(), EncodeError> {
self.bytes_written += bytes.len();

Ok(())
}
}
47 changes: 39 additions & 8 deletions src/features/impl_alloc.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,10 @@
use crate::{
de::{BorrowDecoder, Decode, Decoder},
enc::{self, Encode, Encoder},
de::{read::Reader, BorrowDecoder, Decode, Decoder},
enc::{
self,
write::{SizeWriter, Writer},
Encode, Encoder,
},
error::{DecodeError, EncodeError},
impl_borrow_decode, BorrowDecode, Config,
};
Expand All @@ -21,6 +25,12 @@ pub(crate) struct VecWriter {
}

impl VecWriter {
/// Create a new vec writer with the given capacity
pub fn with_capacity(cap: usize) -> Self {
Self {
inner: Vec::with_capacity(cap),
}
}
// May not be used in all feature combinations
#[allow(dead_code)]
pub(crate) fn collect(self) -> Vec<u8> {
Expand All @@ -29,6 +39,7 @@ impl VecWriter {
}

impl enc::write::Writer for VecWriter {
#[inline(always)]
fn write(&mut self, bytes: &[u8]) -> Result<(), EncodeError> {
self.inner.extend_from_slice(bytes);
Ok(())
Expand All @@ -40,7 +51,12 @@ impl enc::write::Writer for VecWriter {
/// [config]: config/index.html
#[cfg_attr(docsrs, doc(cfg(feature = "alloc")))]
pub fn encode_to_vec<E: enc::Encode, C: Config>(val: E, config: C) -> Result<Vec<u8>, EncodeError> {
let writer = VecWriter::default();
let size = {
let mut size_writer = enc::EncoderImpl::<_, C>::new(SizeWriter::default(), config);
val.encode(&mut size_writer)?;
size_writer.into_writer().bytes_written
};
let writer = VecWriter::with_capacity(size);
let mut encoder = enc::EncoderImpl::<_, C>::new(writer, config);
val.encode(&mut encoder)?;
Ok(encoder.into_writer().inner)
Expand Down Expand Up @@ -262,10 +278,20 @@ where

impl<T> Decode for Vec<T>
where
T: Decode,
T: Decode + 'static,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this bound rule out zero-copy deserialization of types like Vec<&'de str>?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer my own question: No, as this case goes though impl BorrowDecode for Vec<T>.

{
fn decode<D: Decoder>(decoder: &mut D) -> Result<Self, DecodeError> {
let len = crate::de::decode_slice_len(decoder)?;

if core::any::TypeId::of::<T>() == core::any::TypeId::of::<u8>() {
decoder.claim_container_read::<T>(len)?;
// optimize for reading u8 vecs
let mut vec = Vec::new();
vec.resize(len, 0u8);
decoder.reader().read(&mut vec)?;
// Safety: Vec<T> is Vec<u8>
return Ok(unsafe { core::mem::transmute(vec) });
}
decoder.claim_container_read::<T>(len)?;

let mut vec = Vec::with_capacity(len);
Expand Down Expand Up @@ -300,10 +326,15 @@ where

impl<T> Encode for Vec<T>
where
T: Encode,
T: Encode + 'static,
{
fn encode<E: Encoder>(&self, encoder: &mut E) -> Result<(), EncodeError> {
crate::enc::encode_slice_len(encoder, self.len())?;
if core::any::TypeId::of::<T>() == core::any::TypeId::of::<u8>() {
let slice: &[u8] = unsafe { core::mem::transmute(self.as_slice()) };
encoder.writer().write(slice)?;
return Ok(());
}
for item in self.iter() {
item.encode(encoder)?;
}
Expand Down Expand Up @@ -364,7 +395,7 @@ where

impl<T> Decode for Box<[T]>
where
T: Decode,
T: Decode + 'static,
{
fn decode<D: Decoder>(decoder: &mut D) -> Result<Self, DecodeError> {
let vec = Vec::decode(decoder)?;
Expand Down Expand Up @@ -444,7 +475,7 @@ where

impl<T> Decode for Rc<[T]>
where
T: Decode,
T: Decode + 'static,
{
fn decode<D: Decoder>(decoder: &mut D) -> Result<Self, DecodeError> {
let vec = Vec::decode(decoder)?;
Expand Down Expand Up @@ -513,7 +544,7 @@ where
#[cfg(target_has_atomic = "ptr")]
impl<T> Decode for Arc<[T]>
where
T: Decode,
T: Decode + 'static,
{
fn decode<D: Decoder>(decoder: &mut D) -> Result<Self, DecodeError> {
let vec = Vec::decode(decoder)?;
Expand Down