Motivations
I became interested in Rust when I heard about its memory safety guarantees. Buffer overruns and other kinds of memory mishandling cause a lot of vulnerabilities. There is an interesting discussion about that on Reddit, although it is a bit biased because it’s on /r/rust.
People claim that you just have to git gud (and usually claim that they write bug-free C or C++). Yet super experienced developers still make those bugs. Rust appears to be an interesting way of avoiding this kind of bug by design, and without runtime cost.
An interesting debate is whether or not we should rewrite everything in Rust (I’m not sure that’s a practical solution).
It was enough to spark my interest. Even if I don’t end up rewriting everything in Rust, I will probably learn interesting concepts along the way.
I’m not a total newbie, I have already started two projects:
- A web server: That was my first non “Hello World!” program, trying to understand how a web server might work and learning Rust at the same time. It was over a year ago.
- An SQLite clone: I started 4 months ago, based on cstack’s tutorial on how to write a db. I struggled because my knowledge of the language wasn’t good enough, especially when it came to managing allocations of Rows and Pages.
Then I was preparing for interviews, so I focused on that. Now that I got the job (Yay! 🎉), I can get back to it.
Resources
I will start with the Rust book.
I will then port a program I wrote for some homework: an HTTP log monitoring program.
From there, I have three options that I find interesting:
- Get back to the database
- Write an OS following CS140e from Stanford or Philipp Oppermann’s blog
- Contribute to an open source project
The plan
The plan is to spend at least an hour per day coding in Rust and to write about my learnings and my struggles.
Starting today!
Week 1
Day 1 (1h)
I read the chapters of the Rust Book (second edition) on ownership, structs and pattern matching. It’s nicer than the first edition: the explanations are more detailed.
I don’t have much to say today, because I am re-learning things I vaguely knew. I do feel that I have a better understanding of ownership now.
I don’t want to spend too much time reading the docs because I might lose interest without a real problem. On the cool side, I noticed a part about parsing CLI inputs in the book. That will be relevant for the http log monitoring program.
I will try to reach that part by tomorrow, so I can build something!
Day 2 (1h)
Today, I read some sections of the Rust book (modules and common collections). It is taking longer than expected because the doc is really thorough.
Section 8.2 about String and UTF-8 is very interesting.
Day 3 (1h)
I read about error handling, generic types and lifetimes. This has always been a part I never really understood. I think it’s clearer now, but I need to write some code to apply this knowledge.
Fortunately, tomorrow I will reach the Testing chapter, and then the I/O project, where I can write some real code.
Day 4 (45min)
I read about Testing and experimented a bit on my own. It’s nice to have everything built into the language and not have to import external libs (unittest or pytest come to mind).
It seems to lack assert_almost_equals and other syntactic sugar that I am used to.
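A missing "almost equal" assertion is easy to approximate by hand. The sketch below is only an illustration: `almost_equal` and its `tolerance` parameter are hypothetical names, not something provided by the standard library.

```rust
/// Hypothetical helper: true when `a` and `b` differ by at most `tolerance`.
fn almost_equal(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

fn main() {
    // 0.1 + 0.2 is not exactly 0.3 in floating point, so `assert_eq!` would fail here.
    assert!(almost_equal(0.1 + 0.2, 0.3, 1e-9));
    // Values that are clearly apart are still rejected.
    assert!(!almost_equal(1.0, 1.1, 1e-3));
}
```

Inside a `#[test]` function, the same helper can be wrapped in `assert!` exactly as in `main` above.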
It is also a bit depressing not to have reached the coding part yet.
Day 5 (45min)
Finally, I started coding! It took longer than expected but I’m getting there. It was frustrating to read interesting stuff and not code.
I went with the minigrep example, trying to code ahead of the code blocks. I don’t regret the time I spent reading the book because I understood all the compile errors I got. I even foresaw some of them. It is definitely a step up from last time, when I would try to add &, mut, str/String and lifetimes until it compiled. I stopped halfway through the chapter as I had to go to work.
Day 6 (2h)
I was excited to come home and continue coding! I went through the exercise, wrote some tests and refactored the code.
I also read the chapter on functional language features and the chapter on cargo. I will now work on the port of my log monitoring program.
I will stop reading the book in large chunks and focus more on coding. There are 31 chapters left, so I can take my time. I might need the chapter on concurrency for the port, but not right from the start.
Day 7 (1h)
It’s time to start my own project: rewriting my http log monitoring program in Rust!
I took the log generator that’s written in Python to get started quickly. For the first part of the project, I can reuse what I learned about building a CLI with the minigrep project.
Starting the project was similar to the minigrep project. I built a program that could parse input from the CLI and read a file according to the argument.
As I’m parsing structured files, I used a regex and the regex crate.
Fortunately, I had already written the regex in my earlier Python version. It wasn’t too hard to adapt.
I had a hard time escaping the double quotes because I thought r"..." was the regex string syntax, just like in Python.
In Rust, it denotes a raw string.
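To make the quoting issue concrete, here is a small sketch using only the standard library. The pattern itself is made up for illustration and is not the regex used in the project. A plain `r"..."` literal still ends at the first double quote, while the `r#"..."#` form allows double quotes inside.

```rust
/// The same (hypothetical) pattern written as a regular string literal:
/// every double quote and backslash has to be escaped.
fn escaped_pattern() -> &'static str {
    "\"GET /index.html\" \\d+"
}

/// Written as a raw string delimited by `r#"` and `"#`:
/// backslashes and double quotes are taken literally.
fn raw_pattern() -> &'static str {
    r#""GET /index.html" \d+"#
}

fn main() {
    // Both literals produce exactly the same text.
    assert_eq!(escaped_pattern(), raw_pattern());
    println!("{}", raw_pattern());
}
```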
Reading the file and displaying its content was straightforward. I also wrote some tests on the parsing.
Day 8 (1h)
I created my first Consumer that will ingest events and generate reports.
I will have multiple consumers that will all ingest events and generate reports.
This sounds like shared behaviour between different classes. It makes sense to create a Trait for this:
pub trait Consumer {
    fn ingest(&mut self, http_log: HttpLog);
    fn report(&self);
}
I implemented the first one, ErrorWatcher, which counts the number of 4xx and 5xx errors.
The struct holds a numeric counter and a hashmap whose keys are an Enum corresponding to the error codes and whose values are their respective counts.
I could have done it more easily with just numeric counters (error_4xx_count and error_5xx_count), but I wanted to experiment with std::collections::HashMap.
It was interesting to see that you need to implement 3 Traits to use something as a hashmap key.
For simple types, you can derive them automatically with the attribute: #[derive(PartialEq, Eq, Hash)]
I managed to write some tests for the ingestion.
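The description above can be sketched roughly as follows. This is not the project's actual code: the `HttpLog` struct with a single `status` field, the `ErrorClass` enum, the field names and the `count` accessor are all assumptions made for the example. Only the `Consumer` trait matches the definition shown earlier.

```rust
use std::collections::HashMap;

/// Hypothetical key type: the derives are what make it usable as a HashMap key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ErrorClass {
    Client, // 4xx
    Server, // 5xx
}

/// Hypothetical, stripped-down log event.
struct HttpLog {
    status: u16,
}

trait Consumer {
    fn ingest(&mut self, http_log: HttpLog);
    fn report(&self);
}

struct ErrorWatcher {
    total: u32, // assumed meaning of the numeric counter: total number of errors
    by_class: HashMap<ErrorClass, u32>,
}

impl ErrorWatcher {
    fn new() -> Self {
        ErrorWatcher { total: 0, by_class: HashMap::new() }
    }

    /// Number of errors seen for one class (0 if none yet).
    fn count(&self, class: ErrorClass) -> u32 {
        self.by_class.get(&class).copied().unwrap_or(0)
    }
}

impl Consumer for ErrorWatcher {
    fn ingest(&mut self, http_log: HttpLog) {
        let class = match http_log.status {
            400..=499 => ErrorClass::Client,
            500..=599 => ErrorClass::Server,
            _ => return, // not an error, nothing to count
        };
        self.total += 1;
        *self.by_class.entry(class).or_insert(0) += 1;
    }

    fn report(&self) {
        println!("{} errors: {:?}", self.total, self.by_class);
    }
}

fn main() {
    let mut watcher = ErrorWatcher::new();
    for status in vec![200, 404, 500, 503] {
        watcher.ingest(HttpLog { status });
    }
    watcher.report();
}
```

The `entry(...).or_insert(0)` call covers both the "first time this key is seen" and the "increment an existing count" cases in one line.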
Day 9 (3h)
I implemented another consumer: the Ranker, which reports the most active hosts and the most requested first directories of URLs (this-part in http://url.com/this-part/that-other-part).
I had more fun with hashmaps: the Ranker struct holds 2 HashMap<String, u32>.
To compute a ranking, I needed to sort the keys by their values. I built a Vec holding (key, value) tuples and sorted it on the 2nd value, in descending order:
fn rank<'a>(&self, container: &'a HashMap<String, u32>) -> Vec<(&'a String, &'a u32)> {
    let mut ranking: Vec<(&String, &u32)> = container.iter().collect();
    ranking.sort_by(|a, b| b.1.cmp(a.1));
    ranking.truncate(self.max_ranking);
    ranking
}
It doesn’t feel idiomatic though…
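For comparison, here is one possible variation, written as a free function so that it stands alone. It is a sketch, not a claim about what the idiomatic version is: copying the counts out of the map, passing `max_ranking` as a parameter and the alphabetical tie-break are all choices made for this example.

```rust
use std::collections::HashMap;

/// Returns at most `max_ranking` entries, highest count first.
/// Ties are broken alphabetically so the result is deterministic.
fn rank(container: &HashMap<String, u32>, max_ranking: usize) -> Vec<(&str, u32)> {
    let mut ranking: Vec<(&str, u32)> = container
        .iter()
        .map(|(key, &count)| (key.as_str(), count))
        .collect();
    ranking.sort_unstable_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    ranking.truncate(max_ranking);
    ranking
}

fn main() {
    let mut hosts = HashMap::new();
    hosts.insert("a.example".to_string(), 3);
    hosts.insert("b.example".to_string(), 5);
    hosts.insert("c.example".to_string(), 3);
    hosts.insert("d.example".to_string(), 1);
    println!("{:?}", rank(&hosts, 3));
}
```

Since `u32` is `Copy`, returning the counts by value avoids the `&'a u32` in the signature, and lifetime elision takes care of the borrowed keys.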
The consumer.rs file started to grow big. I refactored it as a module. I also implemented some tests.
Day 10 (1h)
My goal is to implement threading. That will allow the program to have one ingesting thread that continuously fills a queue with new log events, and another thread that computes stuff and displays it in the terminal.
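One way this producer/consumer split could look with the standard library is a channel from `std::sync::mpsc`. The sketch below is only an assumption about a possible design, not the implementation used in the project. `count_lines` is a hypothetical function that stands in for "compute stuff" by counting the events it receives.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends every line through a channel from one thread and counts them in another.
fn count_lines(lines: Vec<String>) -> usize {
    let (sender, receiver) = mpsc::channel::<String>();

    // Ingesting thread: pushes events into the queue.
    let producer = thread::spawn(move || {
        for line in lines {
            sender.send(line).expect("the receiving thread hung up");
        }
        // `sender` is dropped here, which closes the channel.
    });

    // Computing thread: drains the queue until the channel is closed.
    let consumer = thread::spawn(move || {
        let mut seen = 0;
        for _line in receiver {
            seen += 1;
        }
        seen
    });

    producer.join().expect("producer thread panicked");
    consumer.join().expect("consumer thread panicked")
}

fn main() {
    let lines = vec!["first event".to_string(), "second event".to_string()];
    println!("{} events processed", count_lines(lines));
}
```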
My first challenge was to have a container that holds values of a trait type. That is not directly possible, because the size of a trait is not known at compile time:
error[E0277]: the trait bound `consumer::Consumer: std::marker::Sized` is not satisfied
  --> src/lib.rs:40:21
   |
40 |     let consumers : Vec<Consumer> = Vec::new();
   |                     ^^^^^^^^^^^^^ `consumer::Consumer` does not have a constant size known at compile-time
   |
   = help: the trait `std::marker::Sized` is not implemented for `consumer::Consumer`
   = note: required by `std::vec::Vec`
The solution is quite simple (use a reference or a Box), but it raises an interesting question: should I use a & or a Box?
TL;DR: With a reference, the container only borrows the value; with a Box, it takes ownership and is now responsible for the Drop. In our case, we don’t want more responsibility: reference it is.
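The two options can be put side by side in a small sketch. The trait here is a simplified stand-in for the project's `Consumer` (it takes a bare status code), and `Counter` is a hypothetical implementor. Note that current Rust spells trait objects with the `dyn` keyword, as in `&mut dyn Consumer` and `Box<dyn Consumer>`.

```rust
/// Simplified stand-in for the project's trait.
trait Consumer {
    fn ingest(&mut self, status: u16);
    fn seen(&self) -> u32;
}

/// Hypothetical consumer that only counts what it ingests.
struct Counter {
    count: u32,
}

impl Consumer for Counter {
    fn ingest(&mut self, _status: u16) {
        self.count += 1;
    }
    fn seen(&self) -> u32 {
        self.count
    }
}

/// With references, the vector only borrows: the caller keeps ownership.
fn borrowed_demo() -> u32 {
    let mut errors = Counter { count: 0 };
    {
        let mut consumers: Vec<&mut dyn Consumer> = Vec::new();
        consumers.push(&mut errors);
        for consumer in consumers.iter_mut() {
            consumer.ingest(404);
        }
    }
    // `errors` is still owned by this function and can be read directly.
    errors.count
}

/// With boxes, the vector owns the consumers and drops them when it goes away.
fn owned_demo() -> u32 {
    let mut consumers: Vec<Box<dyn Consumer>> = Vec::new();
    consumers.push(Box::new(Counter { count: 0 }));
    for consumer in consumers.iter_mut() {
        consumer.ingest(500);
        consumer.ingest(503);
    }
    consumers[0].seen()
}

fn main() {
    println!("borrowed: {}, owned: {}", borrowed_demo(), owned_demo());
}
```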
I ran into an interesting problem. This code would not compile:
let mut consumers : Vec<&mut Consumer> = Vec::new();
let mut error_watcher = ErrorWatcher::new();
let mut ranker = Ranker::new();
consumers.push(&mut error_watcher);
consumers.push(&mut ranker);
error[E0597]: `error_watcher` does not live long enough
  --> src/lib.rs:45:25
   |
45 |     consumers.push(&mut error_watcher);
   |                         ^^^^^^^^^^^^^ borrowed value does not live long enough
...
72 | }
   | - `error_watcher` dropped here while still borrowed
   |
   = note: values in a scope are dropped in the opposite order they are created
The important thing here is the note at the bottom: values in a scope are dropped in the opposite order they are created.
At the end of the scope, error_watcher would get dropped first, then consumers.
In that short moment of time, consumers would hold a dangling reference.
The solution is to change the order:
-let mut consumers : Vec<&mut Consumer> = Vec::new();
let mut error_watcher = ErrorWatcher::new();
let mut ranker = Ranker::new();
+let mut consumers : Vec<&mut Consumer> = Vec::new();
consumers.push(&mut error_watcher);
consumers.push(&mut ranker);
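The drop order quoted in the compiler note can be observed directly. In this sketch, `Noisy` is a hypothetical type that records its name in a shared log when it is dropped. It has nothing to do with the project's consumers and is only there to make the order visible.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper that records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Creates two values in order and returns the order in which they were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Noisy { name: "first", log: Rc::clone(&log) };
        let _second = Noisy { name: "second", log: Rc::clone(&log) };
        // Both values go out of scope here: `_second` is dropped before `_first`.
    }
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```

Declaring the vector of references last, as in the fix above, means it is dropped first, before the values it borrows from.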
Day 11 (2h)
I started by reading more of the Rust Book: the chapters on Smart Pointers and Concurrency. That was an interesting read again, and I have some pointers (haha!) on how I will implement threading in the program.
I first wanted to benchmark the single threaded performance and compare it to the Python program. I started implementing the Alerter which raises an alert when there are too many requests.
I implemented the Alert struct that will hold the information for one alert.
Day 12 (2h)
I implemented the logic for the Alerter (the consumer that raises alerts). I spent way more time than I expected on handling timezones with the chrono crate.
At first, I couldn’t get the parsing of the date working:
extern crate chrono;
use chrono::{DateTime, Local, TimeZone, FixedOffset, Utc};

fn main() {
    let working_input_string = "05/Jun/2018:20:55:45 +0200";
    match Local.datetime_from_str(working_input_string, "%d/%b/%Y:%T %z") {
        Ok(d) => println!("{:?}", d),
        Err(err) => println!("{}", err),
    }
    let failing_input_string = "05/Jun/2018:20:55:45 +0300";
    match Local.datetime_from_str(failing_input_string, "%d/%b/%Y:%T %z") {
        Ok(d) => println!("{:?}", d),
        Err(err) => println!("{}", err),
    }
}
The first datetime would be parsed just fine, but the second one wouldn’t.
I got the following error message: no possible date and time matching input. Note that I live in a UTC+2 timezone.
At first, I thought that it was related to the offset (the lib has no way of knowing which timezone to convert to because the offset of my timezone changes with DST). But the same issue appears with UTC (this time only the +0000 offset gets parsed). I will raise an issue on the crate repo to clarify this. Edit: Done
Finally, I chose to parse the time with DateTime::parse_from_str and convert it to Local time.
Final Update
I failed at the 100 days challenge. I took a week of holidays in July and then I didn’t go back to coding in Rust (mostly out of laziness).
Things have settled a bit now (I switched jobs and apartments during the summer) and I still have an interest in Rust, so I will take up this challenge again, maybe in another form. The goal is to code in Rust for around 100 hours and see what progress I make.